From time to time, we release papers, blog posts, and videos related to Cloud Data Loss Prevention. They are listed here.
Blog posts
Automatic data risk management for BigQuery using DLP
Automatic DLP continuously scans data across your entire organization to give you general awareness of what data you have and specific visibility into where sensitive data is stored and processed. This awareness is a critical first step in protecting and governing your data and acts as a key control to help improve your security, privacy, and compliance posture.
Read blog post: "Automatic data risk management for BigQuery using DLP"
Not just compliance: reimagining DLP for today's cloud-centric world
This post looks back at the history of DLP and then discusses how DLP is useful in today's environment, including compliance, security, and privacy use cases.
Read blog post: "Not just compliance: reimagining DLP for today's cloud-centric world"
Scan for sensitive data in just a few clicks
A deeper look at the Google Cloud console user interface for Cloud DLP to show how you can start to inspect your enterprise data with just a few clicks.
Read blog post: "Take charge of your data: Scan for sensitive data in just a few clicks"
How tokenization makes data usable without sacrificing privacy
Tokenization, sometimes referred to as pseudonymization or surrogate replacement, is widely used in industries like finance and healthcare to help reduce the risk of data in use, narrow compliance scope, and minimize the exposure of sensitive data to systems that do not need it. With Cloud DLP, customers can perform tokenization at scale with minimal setup.
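As a rough illustration of why tokenized data stays usable, the sketch below replaces an identifier with a deterministic surrogate, so that records can still be grouped and joined on the token. It is a minimal sketch that uses a keyed hash from the Python standard library, not the Cloud DLP API or its transformation methods; the key, the `tokenize` helper, and the sample records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real deployment would keep
# the key in a key management service, never in source code.
SECRET_KEY = b"example-key-not-for-production"


def tokenize(value: str, key: bytes = SECRET_KEY, length: int = 16) -> str:
    """Return a deterministic surrogate token for a sensitive value.

    The same input and key always yield the same token, so equality
    comparisons, joins, and aggregations still work on tokenized data.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "TOKEN_" + digest[:length]


# Hypothetical sample records containing an identifier.
records = [
    {"customer": "alice@example.com", "amount": 30},
    {"customer": "bob@example.com", "amount": 12},
    {"customer": "alice@example.com", "amount": 8},
]

# Replace the identifier with its token before handing data downstream.
tokenized = [
    {"customer": tokenize(r["customer"]), "amount": r["amount"]}
    for r in records
]

# Downstream systems can still aggregate per customer without ever
# seeing the raw email address.
totals = {}
for row in tokenized:
    totals[row["customer"]] = totals.get(row["customer"], 0) + row["amount"]
```

A keyed hash like this one is one-way. Use cases that need to recover the original value require a reversible transformation instead.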
Using Cloud DLP to de-identify and obfuscate sensitive information
The team discusses how to leverage Cloud DLP to protect data by automatically incorporating data obfuscation and minimization techniques into your workflows.
Using Cloud DLP to find and protect PII
Scott Ellis, Cloud DLP Product Manager, discusses how to leverage Cloud DLP to increase your privacy posture.
Read blog post: "Take charge of your data: Using Cloud DLP to find and protect PII"
Scanning BigQuery with Cloud DLP
The team shares how to easily scan BigQuery from the Google Cloud console.
Read blog post: "Scan BigQuery for sensitive data using Cloud DLP"
Solutions
Cloud DLP hybrid inspection for SQL databases using JDBC
This tutorial shows how to use the Cloud DLP hybrid inspection method with a JDBC driver to inspect samples of tables in a SQL database like MySQL, SQL Server, or PostgreSQL running virtually anywhere.
Read tutorial: "Cloud DLP hybrid inspection for SQL databases using JDBC"
Speech Redaction Framework using Cloud DLP
This tutorial includes a collection of components and code that you can use to redact sensitive information from audio files. Using files uploaded to Cloud Storage, it can discover and write sensitive findings or redact sensitive information from the audio file.
Additionally, a second tutorial, the Speech Analysis Framework, includes a collection of components and code that you can use to transcribe audio, create a data pipeline for analytics of transcribed audio files, and redact sensitive information from audio transcripts with Cloud DLP.
GitHub: "Speech Redaction Framework"
GitHub: "Speech Analysis Framework"
Event-driven serverless scheduling architecture with Cloud DLP
This tutorial shows a simple yet effective and scalable event-driven serverless scheduling architecture with Google Cloud services. The example included demonstrates how to work with the DLP API to inspect BigQuery data.
Read tutorial: "Event-driven serverless scheduling architecture with Cloud DLP"
Cloud DLP Filter for Envoy
Cloud DLP Filter for Envoy is a WebAssembly ("Wasm") HTTP filter for Envoy sidecar proxies inside an Istio service mesh. DLP Filter for Envoy captures proxy data plane traffic and sends it for inspection to Cloud DLP, where the payload is scanned for sensitive data, including PII.
GitHub: Cloud DLP Filter for Envoy
Anomaly detection using streaming analytics & AI
In this post, we walk through a real-time AI pattern for detecting anomalies in log files. By analyzing and extracting features from network logs, we helped a telecommunications (telco) customer build a streaming analytics pipeline to detect anomalies. We also discuss how you can adapt this pattern to meet your organization's real-time needs. This proof of concept solution uses Pub/Sub, Dataflow, BigQuery ML, and Cloud DLP.
Read blog post: "Anomaly detection using streaming analytics & AI"
Read tutorial: "Realtime Anomaly Detection Using Google Cloud Stream Analytics and AI Services"
De-identification and re-identification of PII in large-scale datasets using Cloud DLP
This solution discusses how to use Cloud DLP to create an automated data transformation pipeline to de-identify sensitive data like personally identifiable information (PII). This inspection and migration solution reads structured and unstructured data from storage systems like Amazon S3 and Cloud Storage. Data can be automatically de-identified using the DLP API and sent to BigQuery and Cloud Storage.
GitHub: Data Tokenization PoC Using Dataflow/Beam and DLP API
Automating the classification of data uploaded to Cloud Storage
This tutorial shows how to implement an automated data quarantine and classification system using Cloud Storage and other Google Cloud products.
Read tutorial: "Automating the Classification of Data Uploaded to Cloud Storage"
Relational database import to BigQuery with Dataflow
This proof of concept uses Dataflow and Cloud DLP to securely tokenize and import data from a relational database to BigQuery. The example describes how to use this pipeline with a sample SQL Server database created in Google Kubernetes Engine, and how to use a DLP template to tokenize PII before it's persisted.
GitHub: Relational Database Import to BigQuery with Dataflow and Cloud DLP
Example architecture for using a Cloud DLP proxy to query a database containing sensitive data
This proof-of-concept architecture uses a proxy to pass all queries and results through a service that parses, inspects, and then either logs the findings or de-identifies the results by using Cloud DLP. It then returns the requested data to the user. Note that if the database already stores tokenized data, this proxy concept can also be used to de-tokenize before returning the requested data.
Read tutorial: "Example architecture for using a Cloud DLP proxy to query a database containing sensitive data"
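The proxy flow described in this entry (inspect each result, then either log the findings or de-identify the value before returning it) might look roughly like the sketch below. This is an assumption-laden illustration, not the tutorial's code: a simple email regular expression stands in for Cloud DLP inspection, and the `proxy_results` function, its `mode` parameter, and the `[EMAIL]` placeholder are hypothetical names.

```python
import re

# Stand-in detector: a simple email pattern takes the place of a
# Cloud DLP inspection call in this sketch.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def inspect_value(value):
    """Return the email-like findings contained in a single value."""
    return EMAIL_RE.findall(str(value))


def proxy_results(rows, mode="deidentify"):
    """Pass query result rows through inspection before returning them.

    mode="log":        rows are returned unchanged, findings are collected.
    mode="deidentify": findings are collected and masked in the output.
    """
    findings = []
    output = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            hits = inspect_value(value)
            findings.extend((column, hit) for hit in hits)
            if hits and mode == "deidentify":
                # Mask each finding with a hypothetical placeholder.
                new_row[column] = EMAIL_RE.sub("[EMAIL]", str(value))
            else:
                new_row[column] = value
        output.append(new_row)
    return output, findings


# Hypothetical query result as it might come back from the database.
rows = [{"id": 1, "note": "contact carol@example.com"}]
masked_rows, masked_findings = proxy_results(rows, mode="deidentify")
logged_rows, logged_findings = proxy_results(rows, mode="log")
```

In both modes the caller receives the findings, so the proxy can write them to a log even when it returns the rows unmodified.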
Videos
Cloud Next '20: OnAir: Managing Sensitive Data in Hybrid Environments
Sensitive data exists in enterprise environments both on and off cloud. Properly managing this data is critical regardless of where the data resides. In this session, we will show you how Cloud DLP can help you manage data, focusing on support for inspection of content in hybrid environments like on-premises, databases running in virtual machines, files hosted on other cloud providers, data flowing inside Kubernetes, and more.
YouTube: SEC206: Managing Sensitive Data in Hybrid Environments
Read tutorial: "Cloud DLP Filter for Envoy"
Read tutorial: "Cloud DLP hybrid inspection for SQL databases using JDBC"
Cloud OnAir: Protecting sensitive datasets on Google Cloud
Data is one of your company's most valuable assets. Analytics and machine learning can help unlock valuable services for your customers and your business. These datasets can also contain sensitive data that need protection. In this webinar, you'll learn how Cloud DLP can help you discover, classify, and de-identify sensitive data as part of an overall governance strategy.
YouTube: Cloud OnAir: Protecting sensitive datasets in Google Cloud
Cloud Next 2019: Scotiabank shares their cloud-native approach to ingesting PII into Google Cloud
As a major international bank, Scotiabank discusses its security journey and cloud-native approach to ingesting PII into Google Cloud, constraining access, and carefully and selectively allowing re-identification by bank applications.
YouTube: Comprehensive Protection of PII in Google Cloud (Cloud Next '19)
Cloud Next 2019: Identify and Protect Sensitive Data in the Cloud
The team shares the latest advancements made to Cloud DLP and demos several different techniques to protect your sensitive data.