Blog posts, solutions, & videos

From time to time, we release papers, blog posts, and videos related to Sensitive Data Protection. They are listed here.

Blog posts

Protect generative AI workloads with Sensitive Data Protection

This blog post explores a data-focused approach to protecting generative AI applications with Sensitive Data Protection, and provides a Jupyter Notebook with real-life examples.

Read blog post: "How Sensitive Data Protection can help secure generative AI workloads"

Automatic data risk management for BigQuery using DLP

The sensitive data discovery service continuously scans data across your entire organization to give you general awareness of what data you have and specific visibility into where sensitive data is stored and processed. This awareness is a critical first step in protecting and governing your data and acts as a key control to help improve your security, privacy, and compliance posture.

Read blog post: "Automatic data risk management for BigQuery using DLP"

Not just compliance: reimagining DLP for today's cloud-centric world

A look back at the history of DLP before discussing how DLP is useful in today's environment, including compliance, security, and privacy use cases.

Read blog post: "Not just compliance: reimagining DLP for today's cloud-centric world"

Scan for sensitive data in just a few clicks

A deeper look at the Google Cloud console user interface for Sensitive Data Protection to show how you can start to inspect your enterprise data with just a few clicks.

Read blog post: "Take charge of your data: Scan for sensitive data in just a few clicks"

How tokenization makes data usable without sacrificing privacy

Tokenization, sometimes referred to as pseudonymization or surrogate replacement, is widely used in industries like finance and healthcare to help reduce the risk of data in use, narrow compliance scope, and minimize the exposure of sensitive data to systems that do not need it. With Sensitive Data Protection, customers can perform tokenization at scale with minimal setup.

Read blog post: "Take charge of your data: How tokenization makes data usable without sacrificing privacy"
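To make the idea of surrogate replacement concrete, here is a minimal sketch in Python. It is an illustration only and does not call Sensitive Data Protection: the `tokenize` helper, the `TOKEN(...)` surrogate format, and the sample key and records are all assumptions made for this example. It uses a keyed hash (HMAC-SHA256) from the standard library so that the same input always maps to the same token, which keeps joins and counts usable without exposing the raw value.

```python
import base64
import hashlib
import hmac


def tokenize(value: str, key: bytes, prefix: str = "TOKEN") -> str:
    """Return a deterministic surrogate for `value` (hypothetical helper).

    A keyed hash stands in for a tokenization service here; it is one-way,
    so this sketch cannot recover the original value from the token.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    # URL-safe base64 keeps the token printable; truncate it for readability.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{prefix}({encoded[:22]})"


# Illustrative key and records (assumptions, not real data).
key = b"example-key-do-not-use-in-production"
records = [
    {"account": "4111-1111-1111-1111", "amount": 25.00},
    {"account": "4111-1111-1111-1111", "amount": 13.50},
    {"account": "5500-0000-0000-0004", "amount": 99.99},
]

# Replace the sensitive field and leave the other fields untouched.
tokenized = [{**r, "account": tokenize(r["account"], key)} for r in records]

for row in tokenized:
    print(row)
```

Because the first two records share an account number, they receive the same token, so downstream systems can still group by account without ever seeing it.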

Using Sensitive Data Protection to de-identify and obfuscate sensitive information

The team discusses how to leverage Sensitive Data Protection to protect data by automatically incorporating data obfuscation and minimization techniques into your workflows.

Read blog post: "Take charge of your data: using Sensitive Data Protection to de-identify and obfuscate sensitive information"

Using Sensitive Data Protection to find and protect PII

Scott Ellis, Sensitive Data Protection Product Manager, discusses how to leverage Sensitive Data Protection to improve your privacy posture.

Read blog post: "Take charge of your data: Using Cloud DLP to find and protect PII"

Scanning BigQuery with Sensitive Data Protection

The team shares how to easily scan BigQuery from the Google Cloud console.

Read blog post: "Scan BigQuery for sensitive data using Sensitive Data Protection"

Solutions

Sensitive Data Protection hybrid inspection for SQL databases using JDBC

This tutorial shows how to use the Sensitive Data Protection hybrid inspection method with a JDBC driver to inspect samples of tables in a SQL database like MySQL, SQL Server, or PostgreSQL running virtually anywhere.

Read tutorial: "Sensitive Data Protection hybrid inspection for SQL databases using JDBC"
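The sampling idea behind this kind of inspection can be sketched as follows. This is not the tutorial's code: it uses Python's built-in `sqlite3` module as a stand-in for a JDBC connection, and a simple email regular expression as a placeholder for sending each sample to the inspection service. The function names, table names, and the `limit` parameter are assumptions made for this example.

```python
import re
import sqlite3

# Placeholder detector (assumption): a real setup would send the sample to
# the inspection service instead of matching a single regex locally.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sample_rows(conn, table, limit):
    """Read at most `limit` rows from `table` and return (columns, rows)."""
    # The table name comes from the database catalog, not from user input.
    cur = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,))
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


def inspect_samples(conn, limit=100):
    """Return sorted (table, column) pairs whose sampled values look sensitive."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
    ]
    findings = set()
    for table in tables:
        columns, rows = sample_rows(conn, table, limit)
        for row in rows:
            for column, value in zip(columns, row):
                if isinstance(value, str) and EMAIL_RE.search(value):
                    findings.add((table, column))
    return sorted(findings)


# Illustrative in-memory database (assumption, not real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, contact TEXT)")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "ana@example.com"), (2, "no contact on file")],
)
conn.executemany("INSERT INTO products VALUES (?, ?)", [(1, "widget")])

findings = inspect_samples(conn, limit=10)
print(findings)
```

Sampling a bounded number of rows per table keeps the amount of data leaving the database small while still showing which columns are likely to hold sensitive values.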

Speech Redaction Framework using Sensitive Data Protection

This tutorial includes a collection of components and code that you can use to redact sensitive information from audio files. Using files uploaded to Cloud Storage, it can discover and write sensitive findings or redact sensitive information from the audio file.

Additionally, a second tutorial, the Speech Analysis Framework, includes a collection of components and code that you can use to transcribe audio, create a data pipeline for analytics of transcribed audio files, and redact sensitive information from audio transcripts with Sensitive Data Protection.

GitHub: "Speech Redaction Framework"

GitHub: "Speech Analysis Framework"

Event-driven serverless scheduling architecture with Sensitive Data Protection

This tutorial shows a simple yet effective and scalable event-driven serverless scheduling architecture with Google Cloud services. The included example demonstrates how to work with the DLP API to inspect BigQuery data.

Read tutorial: "Event-driven serverless scheduling architecture with Sensitive Data Protection"

Sensitive Data Protection Filter for Envoy

Sensitive Data Protection Filter for Envoy is a WebAssembly ("Wasm") HTTP filter for Envoy sidecar proxies inside an Istio service mesh. Sensitive Data Protection Filter for Envoy captures proxy data plane traffic and sends it for inspection to Sensitive Data Protection, where the payload is scanned for sensitive data, including PII.

GitHub: Sensitive Data Protection Filter for Envoy

Anomaly detection using streaming analytics & AI

In this post, we walk through a real-time AI pattern for detecting anomalies in log files. By analyzing and extracting features from network logs, we helped a telecommunications (telco) customer build a streaming analytics pipeline to detect anomalies. We also discuss how you can adapt this pattern to meet your organization's real-time needs. This proof of concept solution uses Pub/Sub, Dataflow, BigQuery ML, and Sensitive Data Protection.

Read blog post: "Anomaly detection using streaming analytics & AI"

Read tutorial: "Realtime Anomaly Detection Using Google Cloud Stream Analytics and AI Services"

De-identification and re-identification of PII in large-scale datasets using Sensitive Data Protection

This solution discusses how to use Sensitive Data Protection to create an automated data transformation pipeline to de-identify sensitive data like personally identifiable information (PII). This inspection and migration solution reads structured and unstructured data from storage systems like Amazon S3 and Cloud Storage. Data can be automatically de-identified using the DLP API and sent to BigQuery and Cloud Storage.

Read tutorial: "De-identification and re-identification of PII in large-scale datasets using Sensitive Data Protection"

GitHub: Data Tokenization PoC Using Dataflow/Beam and DLP API

Automating the classification of data uploaded to Cloud Storage

This tutorial shows how to implement an automated data quarantine and classification system using Cloud Storage and other Google Cloud products.

Read tutorial: "Automating the classification of data uploaded to Cloud Storage"

Relational database import to BigQuery with Dataflow

This proof-of-concept uses Dataflow and Sensitive Data Protection to securely tokenize and import data from a relational database to BigQuery. The example describes how to use this pipeline with a sample SQL Server database created in Google Kubernetes Engine, and how to use a Sensitive Data Protection template to tokenize PII before it's persisted.

GitHub: Relational Database Import to BigQuery with Dataflow and Sensitive Data Protection

Example architecture for using a Sensitive Data Protection proxy to query a database containing sensitive data

This proof-of-concept architecture uses a proxy to pass all queries and results through a service that parses, inspects, and then either logs the findings or de-identifies the results by using Sensitive Data Protection. It then returns the requested data to the user. Note that if the database already stores tokenized data, this proxy concept can also be used to de-tokenize before returning the requested data.

Read tutorial: "Example architecture for using a Sensitive Data Protection proxy to query a database containing sensitive data"
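The flow described above (run the query, inspect the results, then either log the findings or de-identify before returning) can be sketched as follows. This is a hypothetical illustration rather than the tutorial's implementation: `proxy_query`, its `mode` parameter, the `[EMAIL]` placeholder, and the fake database are assumptions, and a regular expression stands in for the inspection and de-identification calls that the real architecture would make to Sensitive Data Protection.

```python
import re

# Placeholder detector (assumption): stands in for a call to the
# inspection service.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def inspect(text):
    """Return the sensitive matches found in `text`."""
    return EMAIL_RE.findall(text)


def deidentify(text):
    """Replace each sensitive match with a fixed placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def proxy_query(run_query, sql, mode="deidentify", log=print):
    """Run `sql` through `run_query` and post-process the result rows.

    mode="log":        return rows unchanged and report findings via `log`.
    mode="deidentify": mask sensitive values before returning the rows.
    """
    processed = []
    for row in run_query(sql):
        new_row = {}
        for column, value in row.items():
            if isinstance(value, str):
                found = inspect(value)
                if found and mode == "log":
                    log(f"finding in column {column!r}: {len(found)} match(es)")
                elif found and mode == "deidentify":
                    value = deidentify(value)
            new_row[column] = value
        processed.append(new_row)
    return processed


def fake_db(sql):
    """Stand-in for a real database client (assumption, not real data)."""
    return [{"id": 1, "note": "contact ana@example.com today"}]


masked = proxy_query(fake_db, "SELECT id, note FROM tickets")
print(masked)

messages = []
unchanged = proxy_query(
    fake_db, "SELECT id, note FROM tickets", mode="log", log=messages.append
)
print(unchanged, messages)
```

Keeping the inspection step inside the proxy means callers never need direct access to the raw rows; the same hook is where a de-tokenization step could go if the database already stores tokenized data.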

Videos

Cloud Next '20: OnAir: Managing Sensitive Data in Hybrid Environments

Sensitive data exists in enterprise environments both on and off cloud. Properly managing this data is critical regardless of where the data resides. In this session, we will show you how Sensitive Data Protection can help you manage data, focusing on support for inspection of content in hybrid environments like on-premises, databases running in virtual machines, files hosted on other cloud providers, data flowing inside Kubernetes, and more.

YouTube: SEC206: Managing Sensitive Data in Hybrid Environments

Read tutorial: "Sensitive Data Protection Filter for Envoy"

Read tutorial: "Sensitive Data Protection hybrid inspection for SQL databases using JDBC"

Cloud OnAir: Protecting sensitive datasets on Google Cloud

Data is one of your company's most valuable assets. Analytics and machine learning can help unlock valuable services for your customers and your business. These datasets can also contain sensitive data that need protection. In this webinar, you'll learn how Sensitive Data Protection can help you discover, classify, and de-identify sensitive data as part of an overall governance strategy.

YouTube: Cloud OnAir: Protecting sensitive datasets in Google Cloud

Cloud Next 2019: Scotiabank shares their cloud-native approach to ingesting PII into Google Cloud

As a major international bank, Scotiabank discusses its security journey and cloud-native approach to ingesting PII into Google Cloud, constraining access, and carefully and selectively allowing re-identification by bank applications.

YouTube: Comprehensive Protection of PII in Google Cloud (Cloud Next '19)

Cloud Next 2019: Identify and Protect Sensitive Data in the Cloud

The team shares the latest advancements made to Sensitive Data Protection and demos several different techniques to protect your sensitive data.

YouTube: Identify and Protect Sensitive Data in the Cloud: Latest Innovations in Google Cloud (Cloud Next '19)