Deletes
This document describes how to delete data stored in Bigtable tables, discusses when you should use each approach, and provides examples. Before you read this page, you should be familiar with the Bigtable overview and understand the concepts involved in schema design.
For consistency, descriptions on this page refer to the API methods that are used for each type of request. However, we strongly recommend that you always use one of the Bigtable client libraries to access the Bigtable APIs instead of using REST or RPC.
Examples on this page use sample data similar to the data that you might store in Bigtable.
To learn the number of times that you can use the operations described on this page per day, see Quotas and limits.
How Bigtable deletes data
When you send a delete request, cells are marked for deletion and cannot be read. The data is removed up to a week later during compaction, a background process that continuously optimizes the table. Deletion metadata can cause your data to take up slightly more space (several KB per row) for a few days after you send a delete request, until the next compaction occurs.
You can always send a delete request, even if your cluster has exceeded the storage limit and reads and writes are blocked.
Delete a range of rows
If you want to delete a large amount of data stored in contiguous rows, use dropRowRange. This operation deletes all rows in a range that is identified by a starting and ending row or by a row key prefix.
The row key values that you provide when you delete a range of rows are treated as service data. For information about how service data is handled, see the Google Cloud Privacy Notice.
After a successful deletion is complete and you receive a response, you can safely write data to the same row range.
The dropRowRange operation has the following restrictions:
- You can't drop a range of rows from an authorized view.
- You can't call the dropRowRange method asynchronously. If you send a dropRowRange request to a table while another request is in progress, Bigtable returns an UNAVAILABLE error with the message "A DropRowRange operation is already ongoing". To resolve the error, send the request again.
- With instances that use replication, be aware that Bigtable might take a long time to complete the operation due to increased replication latency and CPU usage. To delete data from an instance that uses replication, use the Data API to read and then delete your data.
The following code samples show how to drop a range of rows that start with the row key prefix phone#5c10102:
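As a minimal sketch of dropping rows by prefix, the following Python function wraps the `drop_by_prefix` method of the `google-cloud-bigtable` client library. The function name `drop_rows_with_prefix` is hypothetical, and the sketch assumes that `table` is a `Table` object obtained from a client created with `admin=True`; the project, instance, and table IDs in the docstring are placeholders.

```python
def drop_rows_with_prefix(table, prefix):
    """Drop every row whose key starts with `prefix`.

    `table` is assumed to be a google.cloud.bigtable Table created from an
    admin-enabled client, for example (placeholder IDs):

        from google.cloud import bigtable
        client = bigtable.Client(project="my-project", admin=True)
        table = client.instance("my-instance").table("my-table")
    """
    if not prefix:
        # The DropRowRange API does not accept a zero-length prefix,
        # so fail early with a clear message.
        raise ValueError("prefix must be a non-empty string")
    # Sends a DropRowRange request and blocks until the call returns.
    table.drop_by_prefix(prefix)
```

For the prefix used on this page, the call would be `drop_rows_with_prefix(table, "phone#5c10102")`.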
To learn how to install and use the client library for Bigtable, see Bigtable client libraries.
To authenticate to Bigtable, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
Delete data using Data API methods
If you need to delete small amounts of non-contiguous data, deleting data using a method that calls the Cloud Bigtable API (Data API) is often the best choice. Use these methods if you are deleting MB, not GB, of data in a request. Using the Data API is the only way to delete data from a column (not column family).
Data API methods call MutateRows with one of three mutation types:
- DeleteFromColumn
- DeleteFromFamily
- DeleteFromRow
A delete request using the Data API is atomic: either the request succeeds and all data is deleted, or the request fails and no data is removed.
In most cases, avoid using CheckAndMutate methods to delete data. In the rare event that you require strong consistency, you might want to use this approach, but be aware that it is resource-intensive and performance might be affected.
To use MutateRows to delete data, you first send a readRows request with a filter to determine what you want to delete, and then you send the deletion request. For a list of the filters that are available, see Filters.
Samples in this section assume that you have already determined what data to delete.
Delete from a column
The following code samples demonstrate how to delete all the cells from a column in a row:
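A minimal Python sketch, assuming `table` is a `Table` object from the `google-cloud-bigtable` client library. The function name `delete_column` is hypothetical; `row`, `delete_cell`, and `commit` are the client-library calls that queue a DeleteFromColumn mutation and send it.

```python
def delete_column(table, row_key, family_id, qualifier):
    """Delete all cells in one column of one row."""
    # table.row() only builds a local row object; nothing is sent yet.
    row = table.row(row_key)
    # Queue a DeleteFromColumn mutation for family_id:qualifier.
    row.delete_cell(family_id, qualifier)
    # Send the queued mutation to Bigtable.
    row.commit()
```

For example, `delete_column(table, row_key, "cell_plan", "data_plan_01gb1")` removes the `data_plan_01gb1` column from the row identified by `row_key`.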
Delete from a column family
The following code samples demonstrate how to delete cells from a column family in a row:
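A minimal Python sketch, again assuming `table` is a `Table` object from the `google-cloud-bigtable` client library and using the hypothetical function name `delete_family`. Passing the row object's `ALL_COLUMNS` sentinel to `delete_cells` requests a DeleteFromFamily mutation instead of a per-column delete.

```python
def delete_family(table, row_key, family_id):
    """Delete every cell that one row stores in one column family."""
    row = table.row(row_key)
    # ALL_COLUMNS tells the client to remove the whole family from this row
    # rather than a list of named columns.
    row.delete_cells(family_id, row.ALL_COLUMNS)
    # Send the queued mutation to Bigtable.
    row.commit()
```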
Delete from a row
The following code snippets demonstrate how to delete all the cells from a row:
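A minimal Python sketch, assuming `table` is a `Table` object from the `google-cloud-bigtable` client library. The function name `delete_row` is hypothetical; the row object's `delete` method queues a DeleteFromRow mutation.

```python
def delete_row(table, row_key):
    """Delete all cells in one row."""
    row = table.row(row_key)
    # Queue a DeleteFromRow mutation that covers every column family.
    row.delete()
    # Send the queued mutation to Bigtable.
    row.commit()
```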
Delete by streaming and batching
Streaming and batching your delete requests is often the best way to delete large amounts of data. This strategy can be useful when you have finer-grained data retention requirements than garbage-collection policies allow.
The following code snippets start a stream of data (reading rows), batch them, and then go through the batch and delete all the cells in column data_plan_01gb1 in the cell_plan column family:
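One way to sketch this pattern in Python, assuming `table` is a `Table` object from the `google-cloud-bigtable` client library. The function name `delete_column_in_all_rows`, the optional `row_filter` parameter, and the default `flush_count` of 100 are illustrative choices, not recommended values. The sketch streams rows with `read_rows`, queues one column delete per row on a mutations batcher, and flushes whatever is still queued at the end.

```python
def delete_column_in_all_rows(table, family_id, qualifier,
                              row_filter=None, flush_count=100):
    """Stream rows and delete one column from each, in batches.

    Returns the number of rows for which a delete was queued.
    """
    # The batcher groups mutations and sends them in bulk as its
    # thresholds (here, flush_count rows) are reached.
    batcher = table.mutations_batcher(flush_count=flush_count)
    rows_queued = 0
    # read_rows streams results; row_filter=None reads every row.
    for partial_row in table.read_rows(filter_=row_filter):
        row = table.row(partial_row.row_key)
        row.delete_cell(family_id, qualifier)
        batcher.mutate(row)
        rows_queued += 1
    # Send any mutations that are still waiting in the batcher.
    batcher.flush()
    return rows_queued
```

For the columns named above, the call would be `delete_column_in_all_rows(table, "cell_plan", "data_plan_01gb1")`. Passing a filter that matches only the rows you want to change avoids queuing deletes for rows that don't need them.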
Delete data in an authorized view
You can delete table data by sending a delete request to an authorized view. You must use one of the following:
- gcloud CLI
- Bigtable client for Java
When you delete data from an authorized view, you supply the authorized view ID in addition to the table ID.
The data that you can delete from an authorized view is limited by the authorized view definition. You can only delete data that is included in the authorized view. If you attempt to delete data that is outside of the authorized view definition or is subject to the following rules, an error of PERMISSION_DENIED is returned:
- Deleting a range of rows from an authorized view using DropRowRange in the admin API is not supported.
- Deleting from a row is not supported.
- Deleting from a column is supported as long as it's for rows that are in the authorized view.
- Deleting from a column family is only permitted if the specified column family is configured to allow all column qualifier prefixes (qualifier_prefixes="") in the authorized view.
For example, if you attempt to delete from a specified row, and that row contains columns in the underlying table that are not in your authorized view, then the request fails.
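The rules in this section can be restated as a small decision function. The following Python sketch is an illustrative model only, not how Bigtable evaluates requests. The `FamilySubset` and `ViewDefinition` classes and the `delete_allowed` function are hypothetical names, and the model assumes that a view is described by row key prefixes plus, for each column family, a set of exact qualifiers and a set of qualifier prefixes. It also assumes that a column delete requires the column itself to be included in the view.

```python
from dataclasses import dataclass, field


@dataclass
class FamilySubset:
    # Hypothetical model of the columns that a view exposes in one family.
    qualifiers: set = field(default_factory=set)
    qualifier_prefixes: set = field(default_factory=set)


@dataclass
class ViewDefinition:
    # Hypothetical model: row key prefixes plus per-family subsets.
    row_prefixes: list
    family_subsets: dict  # family ID -> FamilySubset


def delete_allowed(view, kind, row_key="", family="", qualifier=""):
    """Return True if this model of the rules would permit the delete.

    kind is one of "drop_row_range", "row", "column", or "family".
    """
    if kind in ("drop_row_range", "row"):
        # Neither operation is supported on an authorized view.
        return False
    if kind not in ("column", "family"):
        raise ValueError(f"unknown delete kind: {kind!r}")
    # The row must fall inside the view.
    if not any(row_key.startswith(p) for p in view.row_prefixes):
        return False
    subset = view.family_subsets.get(family)
    if subset is None:
        return False
    if kind == "family":
        # Only allowed when the family admits every qualifier prefix.
        return "" in subset.qualifier_prefixes
    # Column delete: the qualifier must be listed or match a prefix.
    return qualifier in subset.qualifiers or any(
        qualifier.startswith(p) for p in subset.qualifier_prefixes
    )
```

In this model, a view that exposes a family with `qualifier_prefixes={""}` permits both column and family deletes for rows in the view, while a family that lists only specific qualifiers permits column deletes for those qualifiers and rejects a family-level delete.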