Google.Cloud.Spanner.Data
Google.Cloud.Spanner.Data
is the ADO.NET provider for Cloud Spanner. It is the recommended
package for regular Cloud Spanner database access from .NET.
Note:
This documentation is for version 3.5.0
of the library.
Some samples may not work with other versions.
The Google.Cloud.Spanner.Admin.Instance.V1 package should be used for Cloud Spanner instance administration, such as creating or deleting instances.
Installation
Install the Google.Cloud.Spanner.Data
package from NuGet. Add it to
your project in the normal way (for example by right-clicking on the
project in Visual Studio and choosing "Manage NuGet Packages...").
Authentication
When running on Google Cloud Platform, no action needs to be taken to authenticate.
Otherwise, the simplest way of authenticating your API calls is to
download a service account JSON file then set the GOOGLE_APPLICATION_CREDENTIALS
environment variable to refer to it.
The credentials will automatically be used to authenticate. See the Getting Started With
Authentication guide for more details.
Getting started
Operations generally follow ADO.NET conventions. However Cloud Spanner does not support DML (eg. "INSERT INTO ..."). Therefore you should use the corresponding API on SpannerConnection to create ADO command objects instead of setting CommandText manually.
See examples below:
Sample code
Once you have created your Google Cloud Project and Spanner Instance using the web console, you can start using the ADO.NET provider to create and modify a Cloud Spanner database.
Creating a Database and Table
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
SpannerCommand createDbCmd = connection.CreateDdlCommand($"CREATE DATABASE {databaseName}");
await createDbCmd.ExecuteNonQueryAsync();
SpannerCommand createTableCmd = connection.CreateDdlCommand(
@"CREATE TABLE TestTable (
Key STRING(MAX) NOT NULL,
StringValue STRING(MAX),
Int64Value INT64,
) PRIMARY KEY (Key)");
await createTableCmd.ExecuteNonQueryAsync();
}
DataAdapter support (.NET 4.5+ only)
Cloud Spanner supports a limited DataAdapter that provides basic CRUD operations on any table. Create a SpannerDataAdapter with the target table and its primary keys. You may replace any commands provided by SpannerDataAdapter with your own custom commands.
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
DataSet untypedDataSet = new DataSet();
// Provide the name of the Cloud Spanner table and primary key column names.
SpannerDataAdapter adapter = new SpannerDataAdapter(connection, "TestTable", "Key");
adapter.Fill(untypedDataSet);
// Insert a row
DataRow row = untypedDataSet.Tables[0].NewRow();
row["Key"] = Guid.NewGuid().ToString("N");
row["StringValue"] = "New String Value";
row["Int64Value"] = 0L;
untypedDataSet.Tables[0].Rows.Add(row);
adapter.Update(untypedDataSet.Tables[0]);
}
Modifying data
Cloud Spanner supports two approaches to modifying data: DML, and direct row modifications.
DML
DML is capable of affecting multiple rows with a single command. For example, you could delete all rows matching a query, or update rows to set the value of one column equal to another one.
DML can be executed in standard mode using ExecuteNonQuery
or ExecuteNonQueryAsync
:
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
// If the transaction is aborted, RunWithRetriableTransactionAsync will
// retry the whole unit of work with a fresh transaction each time.
// Please be aware that the whole unit of work needs to be prepared
// to be called more than once.
await connection.RunWithRetriableTransactionAsync(async transaction =>
{
SpannerCommand cmd = connection.CreateDmlCommand(
"UPDATE TestTable SET StringValue='Updated' WHERE Int64Value=@value");
cmd.Parameters.Add("value", SpannerDbType.Int64, 10);
cmd.Transaction = transaction;
int rowsAffected = await cmd.ExecuteNonQueryAsync();
Console.WriteLine($"{rowsAffected} rows updated...");
});
}
It is recommended that you execute DML statements within a retriable transaction. Queries within the same transaction are able to observe the changes already made by DML statements, and later DML statements can use the values created or updated by earlier ones. Also, if the transaction is aborted at any point, the whole unit of work will be retried automatically. Note that the code executing within a retriable transaction needs to be prepared to be called more than once. Please read the Cloud Spanner user documentation for details on when and why transactions can be aborted.
Some DML statements can be executed in a partitioned manner, enabling an efficient
update of large data sets. ExecutePartitionedUpdate
or ExecutePartitionedUpdateAsync
:
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
SpannerCommand cmd = connection.CreateDmlCommand(
"UPDATE TestTable SET TestTable.StringValue='Updated in partitions' WHERE TestTable.Int64Value=@value");
cmd.Parameters.Add("value", SpannerDbType.Int64, 9);
long rowsAffected = await cmd.ExecutePartitionedUpdateAsync();
Console.WriteLine($"{rowsAffected} rows updated...");
}
Partitioned DML updates cannot be performed within another transaction, and have "at least once" semantics: the update can be applied more than once to a row in some cases, and so is best used with idempotent updates.
Not all DML statements can be partitioned. Please read the Cloud Spanner user documentation for details on the restrictions.
Batch DML
Batch DML allows you to execute multiple DML commands in one batched operation.
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
// If the transaction is aborted, RunWithRetriableTransactionAsync will
// retry the whole unit of work with a fresh transaction each time.
// Please be aware that the whole unit of work needs to be prepared
// to be called more than once.
await connection.RunWithRetriableTransactionAsync(async (transaction) =>
{
SpannerBatchCommand cmd = transaction.CreateBatchDmlCommand();
cmd.Add(
"UPDATE TestTable SET StringValue='Updated' WHERE Int64Value=@value",
new SpannerParameterCollection { { "value", SpannerDbType.Int64, 5 } });
cmd.Add(
"DELETE FROM TestTable WHERE Int64Value=@value",
new SpannerParameterCollection { { "value", SpannerDbType.Int64, 250 } });
IEnumerable<long> rowsAffected = await cmd.ExecuteNonQueryAsync();
Console.WriteLine($"{rowsAffected.ElementAt(0)} rows updated...");
Console.WriteLine($"{rowsAffected.ElementAt(1)} rows deleted...");
});
}
Statements are executed sequentally in the same order they are provided. Changes made by one statement are visible to all subsequent statements in the batch.
When one of the statemens in the batch fails, execution is halted and all subsequent statements are not
executed. A SpannerBatchNonQueryException
will be thrown that contains both information about the error
and the number of rows affected by each of the statements that executed before the failed one. If you are executing
the batch DML command inside a non-retriable transaction you can simply commit the transaction when SpannerBatchNonQueryException
is thrown if partial success is acceptable in your application.
It is recommended to run Batch DML within a retriable transaction. If the transaction
aborts the whole unit of work will be retried automatically.
Note that the code executing within a retriable transaction needs to be prepared to be called more
than once.
Please read the Cloud Spanner user documentation
for details on when and why transactions can be aborted.
Direct row modifications
The following operations are supported for direct row modification:
- Insert
- Update
- Delete
- Insert or update (also known as "upsert")
Create a command from the SpannerConnection
, providing the table name to the appropriate method
(CreateInsertCommand
and the like), then use the command parameters to specify values for columns.
The command can be reused to perform the same kind of operation for multiple rows.
This sample inserts two rows, then reads them again:
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
// If the transaction is aborted, RunWithRetriableTransactionAsync will
// retry the whole unit of work with a fresh transaction each time.
await connection.RunWithRetriableTransactionAsync(async transaction =>
{
SpannerCommand cmd = connection.CreateInsertCommand("TestTable");
SpannerParameter keyParameter = cmd.Parameters.Add("Key", SpannerDbType.String);
SpannerParameter stringValueParameter = cmd.Parameters.Add("StringValue", SpannerDbType.String);
SpannerParameter int64ValueParameter = cmd.Parameters.Add("Int64Value", SpannerDbType.Int64);
cmd.Transaction = transaction;
// This executes 5 distinct insert commands using the retriable transaction.
// The mutations will be effective once the transaction has committed successfully.
for (int i = 0; i < 5; i++)
{
keyParameter.Value = Guid.NewGuid().ToString("N");
stringValueParameter.Value = $"StringValue{i}";
int64ValueParameter.Value = i;
int rowsAffected = await cmd.ExecuteNonQueryAsync();
Console.WriteLine($"{rowsAffected} rows written...");
}
});
}
This sample reads three keys using a SELECT
command, updates a row using an UPDATE
command, then
deletes a row using a DELETE
command:
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
// If the transaction is aborted, RunWithRetriableTransactionAsync will
// retry the whole unit of work with a fresh transaction each time.
// Please be aware that the whole unit of work needs to be prepared
// to be called more than once.
await connection.RunWithRetriableTransactionAsync(async (transaction) =>
{
// Read the first two keys in the database.
List<string> keys = new List<string>();
SpannerCommand selectCmd = connection.CreateSelectCommand("SELECT * FROM TestTable");
using (SpannerDataReader reader = await selectCmd.ExecuteReaderAsync())
{
while (keys.Count < 3 && await reader.ReadAsync())
{
keys.Add(reader.GetFieldValue<string>("Key"));
}
}
// Update the Int64Value of keys[0]
// Include the primary key and update columns.
SpannerCommand updateCmd = connection.CreateUpdateCommand("TestTable");
updateCmd.Parameters.Add("Key", SpannerDbType.String, keys[0]);
updateCmd.Parameters.Add("Int64Value", SpannerDbType.Int64, 0L);
await updateCmd.ExecuteNonQueryAsync();
// Delete row for keys[1]
SpannerCommand deleteCmd = connection.CreateDeleteCommand("TestTable");
deleteCmd.Parameters.Add("Key", SpannerDbType.String, keys[1]);
await deleteCmd.ExecuteNonQueryAsync();
});
When direct row modifications are performed in a transaction, they are only applied when the transaction is committed. Queries within the transaction will not observe any changes. It is recommended to run direct row modifications within a retriable transaction. If the transaction aborts the whole unit of work will be retried automatically. Note that the code executing within a retriable transaction needs to be prepared to be called more than once. Please read the Cloud Spanner user documentation for details on when and why transactions can be aborted.
Using a commit timestamp
Spanner allows you to write a commit timestamp for insert and update operations.
When using direct row modifications, the commit timestamp can be set
using SpannerParameter.CommitTimestamp
as the value for a
parameter, as shown below.
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
string createTableStatement =
@"CREATE TABLE UsersHistory (
Id INT64 NOT NULL,
CommitTs TIMESTAMP NOT NULL OPTIONS
(allow_commit_timestamp=true),
Name STRING(MAX) NOT NULL,
Email STRING(MAX),
Deleted BOOL NOT NULL,
) PRIMARY KEY(Id, CommitTs DESC)";
await connection.CreateDdlCommand(createTableStatement).ExecuteNonQueryAsync();
// Create a command for inserting rows.
SpannerCommand cmd = connection.CreateInsertCommand("UsersHistory",
new SpannerParameterCollection
{
{ "Id", SpannerDbType.Int64 },
{ "CommitTs", SpannerDbType.Timestamp, SpannerParameter.CommitTimestamp },
{ "Name", SpannerDbType.String },
{ "Deleted", SpannerDbType.Bool , false}
});
int rowsAffected = 0;
// If the transaction is aborted, RunWithRetriableTransactionAsync will
// retry the whole unit of work with a fresh transaction each time.
// Please be aware that the whole unit of work needs to be prepared
// to be called more than once.
await connection.RunWithRetriableTransactionAsync(async transaction =>
{
// Insert a first row
cmd.Parameters["Id"].Value = 10L;
cmd.Parameters["Name"].Value = "Demo 1";
cmd.Transaction = transaction;
rowsAffected += await cmd.ExecuteNonQueryAsync();
});
await connection.RunWithRetriableTransactionAsync(async transaction =>
{
// Insert a second row
cmd.Parameters["Id"].Value = 11L;
cmd.Parameters["Name"].Value = "Demo 2";
rowsAffected += await cmd.ExecuteNonQueryAsync();
});
Console.WriteLine($"{rowsAffected} rows written...");
// Display the inserted values
SpannerCommand selectCmd = connection.CreateSelectCommand("SELECT * FROM UsersHistory");
using (SpannerDataReader reader = await selectCmd.ExecuteReaderAsync())
{
while (await reader.ReadAsync())
{
long id = reader.GetFieldValue<long>("Id");
string name = reader.GetFieldValue<string>("Name");
DateTime timestamp = reader.GetFieldValue<DateTime>("CommitTs");
Console.WriteLine($"{id}: {name} - {timestamp:HH:mm:ss.ffffff}");
}
}
}
This parameter value cannot be used in DML. Instead, use the
PENDING_COMMIT_TIMESTAMP()
function in your DML statement.
See the Cloud Spanner documentation for more information about commit timestamps.
Transactions and Fault Handling
Cloud Spanner is fully ACID compliant.
Retries should be implemented at the transaction level (as opposed to individual call level)
because there is a chance in high stress situations that a Cloud Spanner session (transaction)
can be aborted due to causing a deadlock. In this case, the entire transaction should be
retried. The methods SpannerConnection.RunWithRetriableTransactionAsync(...)
will take care of
automatically retrying the entire callback passed as a paremeter using a fresh transaction
if previous attempts aborted.
- The callback doesn't need to handle the transaction lifecycle.
- The callback needs to set the transaction received on all the commands it creates.
- The callback might be called more than once, so the code should be written bearing this in mind.
Please read the Cloud Spanner user documentation for details on when and why transactions can be aborted.
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.OpenAsync();
// If the transaction is aborted, RunWithRetriableTransactionAsync will
// retry the whole unit of work with a fresh transaction each time.
await connection.RunWithRetriableTransactionAsync(async transaction =>
{
SpannerCommand cmd = connection.CreateInsertCommand("TestTable");
SpannerParameter keyParameter = cmd.Parameters.Add("Key", SpannerDbType.String);
SpannerParameter stringValueParameter = cmd.Parameters.Add("StringValue", SpannerDbType.String);
SpannerParameter int64ValueParameter = cmd.Parameters.Add("Int64Value", SpannerDbType.Int64);
cmd.Transaction = transaction;
// This executes 5 distinct insert commands using the retriable transaction.
// The mutations will be effective once the transaction has committed successfully.
for (int i = 0; i < 5; i++)
{
keyParameter.Value = Guid.NewGuid().ToString("N");
stringValueParameter.Value = $"StringValue{i}";
int64ValueParameter.Value = i;
int rowsAffected = await cmd.ExecuteNonQueryAsync();
Console.WriteLine($"{rowsAffected} rows written...");
}
});
}
Session pool management
Google.Cloud.Spanner.Data
manages Spanner sessions for you via a
session pool. The pool has various configuration
options, but additionally you may wish to take
advantage of additional SpannerConnection
methods at the start and
end of your application.
To ensure that the session pool is populated with the configured
minimum number of sessions before serving requests from your
application, you may wish to wait for the task returned by
SpannerConnection.WhenSessionPoolReady()
to complete before
marking your application as "ready to serve".
// This would usually be executed during application start-up, before any Spanner
// operations are performed. It can be used at any time, however. It is purely passive:
// it doesn't modify the session pool or trigger any other actions.
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.WhenSessionPoolReady();
}
When your application is shutting down, you may wish to delete the
sessions it has been using from the server. Most applications will
not need to do this: sessions automatically expire after an idle
time of an hour anyway, and the server-enforced session limit is
high. Applications using a large number of sessions may wish to
clean up sessions more actively, however. Again, SpannerConnection
provides a simple method to accomplish this:
// When your application is shutting down. Note that any pending or future requests
// for sessions will fail.
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
await connection.ShutdownSessionPoolAsync();
}
If you want to diagnose session management issues,
SpannerConnection
provides a method to fetch a snapshot of
the statistics for the session pool associated with the database
associated with the connection:
using (SpannerConnection connection = new SpannerConnection(connectionString))
{
SessionPool.DatabaseStatistics stats = connection.GetSessionPoolDatabaseStatistics();
if (stats is null)
{
Console.WriteLine("No session pool for this connection string yet");
}
else
{
// Access individual properties...
Console.WriteLine($"Database name: {stats.DatabaseName}");
Console.WriteLine($"Active sessions: {stats.ActiveSessionCount}");
Console.WriteLine($"Pooled read-only sessions: {stats.ReadPoolCount}");
Console.WriteLine($"Pooled read-write sessions: {stats.ReadWritePoolCount}");
// ... or just use the overridden ToString method to log all the statistics in one go:
Console.WriteLine(stats);
}
}