Developers & Practitioners

Migrating a PHP application to use Cloud Spanner

PHP is used in ~78% of websites, making it a popular language for developers. If your application runs PHP and you want to take advantage of Google Cloud Spanner for its reliability and scalability, this post is for you!

We will highlight the common steps to migrate an existing PHP application to Spanner. The example we work through modifies the e-commerce platform Magento’s Catalog Module to use the Spanner PHP client library. But these principles can be used for any PHP application that makes use of the Spanner PHP library.

What we aim to accomplish

  • Use the PHP client library to configure a connection to the Spanner database and manage a session pool.

  • Execute CRUD operations on the configured database in a reliable manner with transactions.

  • Become acquainted with some useful snippets and queries inspired by the Magento application implementation.

  • Observe how the snippets can be used in an application, along with the usual CRUD operations.

Before we begin

  • Consider working through the Magento Codelab that leverages the magento-spanner-port. While the codelab describes the steps required to get the actual integration working, this post goes into more detail to explain what is happening under the hood.

  • This document shows a number of examples of how to use Magento with Spanner. The examples are taken from the SpannerAdapter and the AbstractDb implementation in the magento-spanner-port. These are the main files of the Magento Spanner adapter implementation.

Setup

Authentication

As the first step, please follow the Authentication guide to ensure the application is authenticated before executing any of the code snippets below.

Creating a session pool and the connection

After successfully authenticating, create a session pool to facilitate the connection between the application and Spanner. Creating a session is an expensive operation, so it is recommended to create a persistent pool of sessions that can be reused. 

The Spanner client library provides a handy way to alleviate this problem by having a cached session pool. More information on using a cached Session Pool and best practices is available here.

With SysVCacheItemPool, you can share the cached sessions among multiple processes. The CreateSessionPool() method creates and returns a session pool object. Each session connects to a single database and a session can only execute one transaction at a time. The variable minSessions can be used to set the minimum number of concurrent sessions that is expected. The pool is initialized with minSessions when it is created. Then if more than minSessions are required, new sessions will be created up until maxSessions is reached.

  /**
    * Creates the session pool for spanner connection
    * @return SessionPoolInterface
    * @throws Exception
    */
   protected function createSessionPool()
   {
       $cache = new SysVCacheItemPool( [
              // Use a different project identifier for ftok than the default
              'proj' => 'B'
        ]);
       return new CacheSessionPool($cache, [
'maxSessions' => $this->maxsessions, 
'minSessions' => $this->minsessions // Default value is 1
       ]);
   }

To learn more about project identifiers in PHP in relation to ftok, please refer to the section of the PHP documentation here.

Connect to the Spanner database

The created session pool can be used to make a connection to Spanner in the connect() method as shown below. The connection object can be used to perform CRUD operations on Spanner.

  /**
    * Creates a Cloud Spanner object and connects to the database.
    *
    * @return void
    */
   protected function _connect()
   {
       if ($this->_connection) {
           return;
       }
       $spanner = new Google\Cloud\Spanner\SpannerClient([ 'projectId' => $this->project_id ]);
       $sessionPool = $this->createSessionPool();
       $this->_connection = $spanner->connect($this->instance, $this->database, ['sessionPool' => $sessionPool]);
   }

Warm up the session pool

  // Will create sessions in the session pool
     $sessionPool->warmup();

Keeping the session alive

  // Maintain up to minSessions and drop expired sessions
     $sessionPool->maintain();

The maintain method refreshes sessions that would otherwise expire within the next 10 minutes. In some cases, it can also refresh sessions that would otherwise expire in more than 10 minutes in order to distribute refresh calls more evenly. Only minSessions sessions are maintained. Excess sessions are left to expire.

Note: You should set maxSessions high enough to handle peak concurrency with some additional buffer to allow for growth in peak concurrency. Be aware of memory usage of the process, to avoid the process being killed. Additionally, we recommend setting minSessions equal to maxSessions to avoid the latency cost of having to create new sessions when serving requests. This should minimize tail latencies.

Executing Transactions on Spanner 

After setting up the connection and creating the session pool successfully, we are all set to try some Insert, Read, Update, and Delete operations on the Spanner database.

The snippets below showcase executing CRUD operations using transactions, where they will be committed to the database if and only if all the enclosed queries are successful.

Insert

In the insert() method, we use runTransaction to create the sequence of instructions that need to be executed.

  /**
    * Insert multiple rows in single table
    * @param string $table
    * @param array $data - Example : ​​['wishlist_item_id' => "6136f3048f1e1d16565478b1", 'wishlist_id' => 23, 'product_id' =>  16, 'store_id'=> 1, 'qty' => 1 ]
    * @return Commit timestamp
    */
  public function insert(string $table, array $data)
   {
        $results = $this->_connection->runTransaction(function (Transaction $t) use ($table, $data) {
       	$t->insertBatch($table, [ $data ]);
               $t->commit();
        });
       return $results;
   }

We attempt to insert multiple rows using insertBatch, and if all inserts are successful the transaction is committed to the database.

Update

Very similar to insert, updating an existing batch of data can be done using updateBatch in the update() method.

  /**
    * Updating table data for modified columns
    * @param string $table
    * @param array $data - Example : ​​['wishlist_item_id' => "6136f3048f1e1d16565478b1", 'wishlist_id' => 23, 'product_id' =>  16, 'store_id'=> 1, 'qty' => 2 ]
    * @return Commit timestamp
    */
   public function update(string $table, array $data)
   {
   $results = $this->_connection->runTransaction(function (Transaction $t) use ($table, $data) {
         	$t->updateBatch($table, [ $data ]);
            	$t->commit();
        });
       return $results;
   }

Transactions allow you to rollback your updates in case any update of a row fails to process.

While updateBatch uses the mutations API, updates can also be performed using DML statements, with executeUpdate. The same is true for inserts mentioned previously. Refer to Comparing DML and Mutations for guidelines on when it's appropriate to use mutations vs. DML. Please keep in mind that Spanner limits the number of mutations that can be performed in a single transaction. To learn more about how the number of mutations is calculated, please review the documentation. To find the mutation count for a transaction and learn how to prevent your transactions from hitting this limit, see the commit statistics documentation. 

Delete

The delete() method deletes the row using transactions to safeguard against partial failed queries.

  /** 
    * Delete from table
    * @param string $table
    * @param string $where
    * @return Commit timestamp
    */
   public function delete(string $table, string $where)
   {
       $sql = "DELETE FROM ".$table." WHERE ".$where;
       $results = $this->_connection->runTransaction(function (Transaction $t) use ($sql, $params) {
       	$t->executeUpdate($sql, [
                   'parameters' => $params
             	]);
             	$t->commit();
        });
       return $results;
   }

In the above code snippet, $where contains the WHERE clause condition and $table is the name of the table to delete from. Spanner requires the presence of a WHERE clause to prevent accidental deletion of the entire table.

Reading from the database

Reading from the database is simpler compared to the transaction queries above, as read queries do not require explicit rollback or commit calls. The example fetchAll() method is shown below.

  /**
    * Fetch all rows
    *
    * @param string $sql
    * @return array
    */
   public function fetchAll(string $sql)
   {
        $result = $this->query($sql);
       return $this->fetch($result);
   }

Here $sql is an SQL string that would be used to query for the results over the database.


Running any SQL query on the Spanner database

Queries are made using the execute() function on the Spanner connection from the class's query() method. 

  /**
    *  Run query using Cloud Spanner connection
    *
    * @param string $sql
    * @return mixed|null
    */
   public function query(string $sql)
   {
       $results = $this->_connection->execute($sql);
       return $results;
   }

Closing the connection

The closeConnection() method releases the resources of the Spanner connection. This must be done after all logical operations have been successfully completed by the application, otherwise sessions will be held forever by the previous caller. This will eventually exhaust the session pool, preventing any further requests from being executed.
  /**
    * Closes the connection.
    * @return void
    */
   public function closeConnection()
   {
       if ($this->_connection) {
           $this->_connection->close();
       }
   }

More Useful Queries

So far, we described examples of basic CRUD operations. Here are a few more interesting operations inspired from the Magento e-commerce implementation using Spanner. 

Convert iterator into array

Copies the iterator object of results from a query into an array to be used by the application. It's important to note that large result sets will use a significant amount of memory, so best practice would be to limit results of your Spanner query.

  /**
    * Returns all rows
    *
    * @param object $data
    * @return array
    */
   public function fetchArray(object $data)
   {
       return iterator_to_array($data->rows());
   }

Generate a UUID 

Spanner does not support auto-increment fields and recommends using a UUID. This is because incremental primary keys can create hotspots when accessing data due to the manner in which data is distributed across Spanner servers. See Choosing a primary key for more details. You can use the generateUUID() method to generate random UUIDs.

  /**
    * Generate UUID
    *
    * @return string
    */
   public function generateUuid()
   {
       if (function_exists('com_create_guid') === true) {
           return trim(com_create_guid(), '{}');
       }
 
       return sprintf(
           '%04X%04X-%04X-%04X-%04X-%04X%04X%04X',
           mt_rand(0, 65535),
           mt_rand(0, 65535),
           mt_rand(0, 65535),
           mt_rand(16384, 20479),
           mt_rand(32768, 49151),
           mt_rand(0, 65535),
           mt_rand(0, 65535),
           mt_rand(0, 65535)
       );;   
   }

Sanitize SQL query

Since Spanner uses a strict type of formatting the query, you can use the following sanitizeSQL() method to ensure the value matches the Spanner data type. This is only an example function that handles sanitizing integer data types. It can be expanded in your application to handle other sanitization.

  /**
    * Formats the sql for Spanner
    * Example
    * Input SQL : <select statement> WHERE (`product_id` = '340') ORDER BY position  ASC
    * Output SQL <select statement> WHERE (`product_id` = 340) ORDER BY position  ASC
    * In the above example integer `340` is sanitized by removing single quotes.
    * Sanitization is required since Spanner has strict types.
    * @param string $sql
    * @return string $sql
    */
   public function sanitizeSql(string $sql)
   {
       if (preg_match_all("/('[^']*')/", $sql, $m)) {
           $matches = array_shift($m);
           for($i = 0; $i < count($matches); $i++) {
               $curr =  $matches[$i];
               $curr = filter_var($curr, FILTER_SANITIZE_NUMBER_INT);
               if (is_numeric($curr)) {
                   $sql = str_replace($matches[$i], $curr, $sql);
               }
           }
       }
       return $sql;
   }

Add casting by type to the columns in queries

As we just mentioned, Spanner is strict on the type of the data being read and written. Although Spanner performs implicit casting, sometimes queries fail when the default implicit casting fails. So, to be safe, we can explicitly specify the cast to be applied to the column using a snippet like the one below. More information on the cast() function can be found here.

  /**
    * Cast the column with type
    * @param string $sql
    * @param string $col
    * @param string $type
    * @return string
    */
   public function addCast(string $sql, string $col, string $type)
   {
      $cast = "cast(" . $col . " as " . $type . ")";
      return str_replace($col, $cast, $sql);
   }

Putting everything together

Taking a few excerpts from the Magento code lab that you worked through, let’s discuss how we can use our learnings in practice.

Fetching a catalog of items from the database

Fetching all the items from the database can be as simple as:

  $values = $this->getConnection()->fetchAll($select);

Here $select references to the SQL query to select from the database, as described in the “Reading from the database” section above.

Considering that Spanner enforces strict types, using the addCast() function we learned about, the entire operation of reading from the catalog of items from the database can be framed as: 

  $con = $this->getSpannerConnection();
 
    /**
     * Cloud Spanner follows strict type so cast the columns in common type
     */
    $select = $con->addCast($select, "`t_d`.`value`", 'string');
    $select = $con->addCast($select, "`t_s`.`value`", 'string');
    $select = $con->addCast($select, "IF(t_s.value_id IS NULL, t_d.value, t_s.value)", 'string');
                  
    $values = $con->fetchAll($select);

Getting Wishlist or Cart items

In fetching the required items in the catalog for the wishlist, we would need to run a SQL query for items again. This can be done as follows:

  $data = $this->getData();

We also learned about sanitizing the queries to adhere to Spanner’s strict type requirements. To do this, the simple getData() method should be modified as shown below:

  $sql = $this->getSelect()->__toString();
       $con = $this->getSpannerConnection();
       $sql = $con->sanitizeSql($sql);
       $data = $con->fetchAll($sql);

The sanitizeSql() function from the snippets discussed earlier takes care of the strict type formatting mandated on the queries.

Fetching data with conditions

You will need to ensure your query knows the type of the field. Consider the implementation below from the load() function. The  _getLoadSelect() method returns a simple MySQL query where strings and numeric values for field types are treated alike, i.e. as strings.


  $connection = $this->getConnection();
       if ($connection && $value !== null) {
           $select = $this->_getLoadSelect($field, $value, $object);
           $data = $connection->fetchRow($select);
           if ($data) {
               $object->setData($data);
           }
       }

To ensure that we are meeting Spanner’s type requirements, replace the function _getLoadSelect() with getLoadSelectForSpanner() as shown here:

  $con = $this->getSpannerConnection();
       if ($con && $value !== null) {
           $select = $this->getLoadSelectForSpanner($field, $value);
           $data = $con->fetchRow($select);
           if ($data) {
               $object->setData($data);
           }
       }

Where the implementation of getLoadSelectForSpanner() is as follows:

  /**
    * Retrieve select object for load object data
    *
    * @param string $field
    * @param string $value
    * @return string
    */
   protected function getLoadSelectForSpanner(string $field, string $value)
   {
       $select = "select * from ".$this->getMainTable()." where ".$field;
       if (is_numeric($value)) {
           $select = $select."=".$value."";
       } else {
           $select = $select."='".$value."'";
       }
       return $select;
   }

Saving and updating the objects in your Items  

Here we demonstrate the usage of generateUuid(), based on the fact that Spanner strongly recommends the use of random UUIDs instead of incremental numerals in primary keys. 

Consider a non-Spanner implementation as below:

  if ($this->isObjectNotNew($object)) {
          $this->updateObject($object);
    } else {
          $this->saveNewObject($object);
    }

Which can be replaced by the following:

  if ($this->isObjectNotNew($object)) {
        $this->updateObjectInSpanner($object);
    } else {
         $this->saveNewObjectInSpanner($object);
    }

updateObjectInSpanner() and saveNewObjectInSpanner() make use of our generateUuid() snippet explained earlier. 

To save an object:

  /**
    * Save New Object in Cloud Spanner
    *
    * @param \Magento\Framework\Model\AbstractModel $object
    * @throws LocalizedException
    * @return void
    */
   protected function saveNewObjectInSpanner(\Magento\Framework\Model\AbstractModel $object)
   {
       $bind = $this->_prepareDataForSave($object);
       $con = $this->getSpannerConnection();
       if ($this->_isPkAutoIncrement) {
           $bind[$this->getIdFieldName()] = $con->generateUuid();
       }
 
       if(isset($bind['added_at'])) {
           $bind['added_at'] =  $con->formatDate();
       }
 
      if($this->getMainTable() == 'quote_item' || $this->getMainTable() == 'quote_address') {
           $bind['created_at'] =  $con->formatDate();
           $bind['updated_at'] =  $con->formatDate();
           $bind['free_shipping'] =  1;
      }
 
       if(isset($bind['last_visit_at'])) {
           $bind['last_visit_at']  =  $con->formatDate();
       }
 
       $con->insert($this->getMainTable(), $bind);
 
       if ($this->_isPkAutoIncrement) {
           $object->setId($bind[$this->getIdFieldName()]);
       }
 
       if ($this->_useIsObjectNew) {
           $object->isObjectNew(false);
       }
   }

To update an existing object:

  /**
    * Update existing object
    *
    * @param \Magento\Framework\Model\AbstractModel $object
    * @throws LocalizedException
    * @return void
    */
   protected function updateObjectInSpanner(\Magento\Framework\Model\AbstractModel $object)
   {
       $con = $this->getSpannerConnection();
       $data = $this->prepareDataForSpannerUpdate($object);
       if ($this->_isPkAutoIncrement) {
           $data[$this->getIdFieldName()] = $object->getId();
       }
       if(isset($data['added_at'])) {
           $data['added_at'] =  $con->formatDate();
       }
 
       if(isset($data['created_at'])) {
           $data['created_at'] =  $con->formatDate();
       }
 
       if(isset($data['updated_at'])) {
           $data['updated_at'] =  $con->formatDate();
       }
 
       if(isset($data['customer_dob'])) {
           $data['customer_dob'] =  $con->convertDate($data['customer_dob']);
       }
 
       if(isset($data['last_visit_at'])) {
           $data['last_visit_at']  =  $con->formatDate();
       }
 
       if (!empty($data)) {
           $con->update($this->getMainTable(), $data);
       }
   }

For reference, the implementation of prepareDataForSpannerUpdate() in updateObjectInSpanner() above, is shown below:

  /**
    * Get the array of data fields that was changed or added
    *
    * @param \Magento\Framework\Model\AbstractModel $object
    * @return array
    * @throws LocalizedException
    */
   protected function prepareDataForSpannerUpdate(\Magento\Framework\Model\AbstractModel $object)
   {
       $data = $object->getData();
       foreach ($object->getStoredData() as $key => $value) {
           if (array_key_exists($key, $data) && $data[$key] === $value) {
               unset($data[$key]);
           }
       }
       $dataObject = clone $object;
       $dataObject->setData($data);
       $data = $this->_prepareDataForTable($dataObject, $this->getMainTable());
       unset($dataObject);
 
       return $data;
   }

Deleting an object

To complete the D in the CRUD, here is an example of safely deleting an object from Spanner in the context of Magento:

  /**
    * Delete from Cloud Spanner
    *
    * @param \Magento\Framework\Model\AbstractModel $object
    * @return void
    */
   public function deleteInSpanner(\Magento\Framework\Model\AbstractModel $object)
   {
       $con = $this->getSpannerConnection();
       if ($object->getId()) {
           $condition = $this->getIdFieldName() .'=@Value';
           $con->delete($this->getMainTable(), $condition, [ 'Value' => $object->getId() ]);
       }
   }

In the Magento example, id columns are generally integers. For other applications it may be useful to validate that the value matches the expected Spanner data type.

Wrapping it all up

The basic tools when interfacing with any database are its CRUD operations. This article introduced these operations for Spanner in the context of PHP and Magento, along with other nuances to keep an eye out for. Please refer to the official documentation here, to continue your journey with Spanner and PHP.