Data privacy strategies

Data privacy is protecting data such as personally identifiable information (PII) from those who shouldn't have access to it. This page describes several approaches to data privacy that you can use to protect your PII in Cloud SQL.

You can use Cloud SQL to store your PII securely. You want to ensure that this information is processed with the highest privacy protection so that it isn't made accessible inadvertently. For example, if you store credit card information or healthcare data in your databases, then you can use Cloud SQL to hide or mask PII from unprivileged users.

Use the following strategies to help you secure your PII in Cloud SQL:

Column-level security

Column-level security lets you restrict who can see the content in specific columns of database tables. Column-level privileges are applicable for INSERT, UPDATE, SELECT, and REFERENCES statements.

For example, consider a retail website where you want to govern PII for two users: Jack and Alice.

--User: "admin"

CREATE SCHEMA secure_schema;

CREATE TABLE secure_schema.user_details(id bigint, name text, age smallint, email_id text, password text);

--For this example, passwords are stored in plain text for demonstration
--purposes only. In production, never store passwords in plain text.

INSERT INTO secure_schema.user_details VALUES(1,'jack',34,'jack@example.com','testpass');
INSERT INTO secure_schema.user_details VALUES(2,'alice',37,'alice@example.com','testpass');

GRANT USAGE ON SCHEMA secure_schema TO analyst_ro;

--Grant read permissions on specific columns only.

GRANT SELECT (id, name, age) ON secure_schema.user_details TO analyst_ro;

--User: "analyst_ro"

SELECT * FROM secure_schema.user_details;
ERROR:  permission denied for table user_details

SELECT name, age, password FROM secure_schema.user_details;
ERROR:  permission denied for table user_details

SELECT id, name,age FROM secure_schema.user_details;
 id | name  | age
----+-------+----
  1 | jack  |  34
  2 | alice |  37

If you include the restricted columns in the SELECT statement or you enter SELECT *, then an error message appears. Cloud SQL secures the PII for Jack and Alice in these columns.

You can also use a single GRANT statement to combine different privileges.

GRANT SELECT (id,name,age), UPDATE (name) ON secure_schema.user_details TO analyst_ro;

View-based approach

You can also achieve column-level security by creating a view on a table, excluding or masking columns that you want to hide from other users, and providing access to the view instead of to the table.

The following example shows how to use a view-based approach for the retail website to secure the PII for Jack and Alice:

--User: "admin"

CREATE SCHEMA analyst_ro;
CREATE VIEW analyst_ro.user_details AS SELECT id, name, age FROM secure_schema.user_details;
GRANT USAGE ON SCHEMA analyst_ro TO analyst_ro;
GRANT SELECT ON analyst_ro.user_details TO analyst_ro;

--User: "analyst_ro"

SELECT id,name,age FROM user_details;
 id | name  | age
----+-------+----
  1 | jack  |  34
  2 | alice |  37

SELECT * FROM user_details;
 id | name  | age
----+-------+----
  1 | jack  |  34
  2 | alice |  37

In this example, a separate schema is created for the view to keep its name the same as the table. With the view-based approach, you can use SELECT *.

You can also create a view and mask the columns of the database table so that unprivileged users can't see the PII that's masked.

CREATE VIEW analyst_ro.user_details AS SELECT id, name, age, 'redacted@example.com' as email_id,'*****'::text as password FROM secure_schema.user_details;

SELECT * FROM user_details;
 id | name  | age |     email_id         | password
----+-------+-----+----------------------+---------
  1 | jack  |  34 | redacted@example.com | *****
  2 | alice |  37 | redacted@example.com | *****

Row-level security

Column-level security and a view-based approach let you hide PII in columns of database tables from specific users. However, sometimes you want to filter this data and grant access to specific rows of a table. This table contains the PII that only certain users can access, based on qualifying user conditions. This is known as row-level security.

Row-level security is useful for multi-tenant applications where users have read-access and write-access privileges to their own PII only. In Cloud SQL, tables can have row-level security policies that restrict, on a per-user basis, which rows users can view by creating queries, or the rows that users can insert, update, or delete by running data modification commands.

For the retail website example, you can implement row-level security for Jack and Alice so that they can view their own PII, but they can't modify or delete it.

--User: "admin"
--Create and enable a policy for row-level security

CREATE POLICY user_details_rls_pol ON secure_schema.user_details FOR ALL TO PUBLIC USING (name=current_user);
ALTER TABLE secure_schema.user_details ENABLE ROW LEVEL SECURITY;

SELECT * FROM secure_schema.user_details;
 id | name  | age |     email_id      | password
----+-------+-----+-------------------+---------
  1 | jack  |  34 | jack@example.com  | testpass
  2 | alice |  37 | alice@example.com | testpass

--User: "jack"

SELECT * FROM secure_schema.user_details;
 id | name | age |    email_id      | password
----+------+-----+------------------+---------
  1 | jack |  34 | jack@example.com | testpass

--User: "alice"

SELECT * FROM secure_schema.user_details;
 id | name  | age |    email_id       | password
----+-------+-----+-------------------+---------
  2 | alice |  37 | alice@example.com | testpass

Users who are assigned to roles that have the BYPASSRLS attribute can bypass row-level security when they're accessing a table. Table owners can also bypass row-level security. If you want to subject a table owner to row-level security, then use the ALTER TABLE ... FORCE ROW LEVEL SECURITY command.

Sometimes, you don't want to apply row-level security to rows of a database table. For example, if you use pg_dump to take a backup of the table, then you don't want any rows to be omitted from the backup. To prevent this from occurring, for the user who takes the backup, set the row_security configuration parameter to OFF. If any rows are filtered based on row-level security, then an error message appears.

Mask and anonymize data

In addition to masking data by using a view-based approach, you can mask data by using the postgresql_anonymizer extension. This extension masks or replaces PII or commercially sensitive data from a PostgreSQL database.

Using the extension over a view-based approach provides you with the following benefits:

  • You have various masking functions such as substitution, randomization, faking, pseudonymization, partial scrambling, shuffling, noise addition, and generalization.

  • You can generate meaningful masked data that you can use for functional testing and data processing.

  • You can use the PostgreSQL Data Definition Language (DDL) to declare masking rules and specify the anonymization strategy inside the table definition.

Install and configure the postgresql_anonymizer extension

To use this extension on a Cloud SQL instance, complete the following steps:

  1. Edit the instance and then set the cloudsql.enable_anon flag to on. For information about setting flags, and to review the flags supported for the extension, see Configure database flags.

  2. Create the extension in the database by running the following command:

    --Connect to the PostgreSQL database
    
    CREATE EXTENSION IF NOT EXISTS anon CASCADE;
    SELECT anon.init();
    

After you install and configure the extension, use it on the instance to implement dynamic mask, static mask, and anonymous dump anonymization strategies.

Dynamic mask

Use dynamic masks to define masking rules for specific users. These users can't see PII. Instead, they see masked data. All other users see the unmasked data. This is useful in production environments when you don't want to alter the PII, but only hide it from certain users.

For the retail website example, you can implement dynamic masks so that the administrator can view the unmasked email addresses and passwords for Jack and Alice, but the analyst can view only masked data.

--Activate the dynamic masking engine

SELECT anon.start_dynamic_masking();

--Declare the masking user and masking rules
--analyst_ro is the masked user with select privileges on the
--user_details table

SECURITY LABEL FOR anon ON ROLE analyst_ro IS 'MASKED';

SECURITY LABEL FOR anon ON COLUMN secure_schema.user_details.email_id IS 'MASKED WITH FUNCTION anon.fake_email()';
SECURITY LABEL FOR anon ON COLUMN secure_schema.user_details.password  IS 'MASKED WITH FUNCTION anon.hash(password)';

--User: "admin" (can see all unmasked data)

SELECT * FROM secure_schema.user_details;
 id | name  | age |    email_id       | password
----+-------+-----+------------  -----+---------
  1 | jack  |  34 | jack@example.com  | testpass
  2 | alice |  37 | alice@example.com | testpass

--User:"analyst_ro" (note that the "email_id" and "password" columns are
--replaced with masked data,)
--Data in the password column is truncated for better formatting.

SELECT * FROM secure_schema.user_details;
 id | name  | age |       email_id         | password
----+-------+-----+-----------------  -----+----------------
  1 | jack  |  34 | alisontodd@example.com | 13d249f2cb4127b
  2 | alice |  37 | amanda35@example.com   | 13d249f2cb4127b

Static mask

Use static masks to remove the PII in a table, according to the criteria defined in the masking rules, and replace this information with masked data. Users can't retrieve the unmasked data. This is useful in test environments when you want to alter the PII and you don't want any users to view this information.

For the retail website example, you can implement static masks so that no users can view the unmasked email addresses and passwords for Jack and Alice. Instead, they view only masked data.

--User: "admin"

SELECT * FROM secure_schema.user_details;
 id | name  | age |    email_id       | password
----+-------+-----+--------------  ---+---------
  1 | jack  |  34 | jack@example.com  | testpass
  2 | alice |  37 | alice@example.com | testpass

--Apply earlier defined masking rules to the table permanently.
--Now all users see masked data only.

SELECT anon.anonymize_table('secure_schema.user_details');
 anonymize_table
-----------------
 t

--User: "analyst_ro"
--Data in the password column is truncated for better formatting.

select * from secure_schema.user_details;
 id | name  | age |           email_id              |  password
----+-------+-----+-------------------------  ------+---------------
  1 | jack  |  34 | christophercampbell@example.com | 13d249f2cb412c
  2 | alice |  37 | annebenitez@example.com         | 13d249f2cb4127

Anonymous dump

Use anonymous dumps to export masked data into a SQL file. For the retail website example, you can create a dump file for the masked data that's contained in the user_details table.

--Launch pg_dump_anon with the masked user to apply earlier defined --masking rules

pg_dump_anon -h HOSTIP -p 5432 -d DATABASE_NAME -U analyst_ro --table=secure_schema.user_details --file=user_details_anonysms.sql

Encrypt data

Although you can mask PII, the information is stored in the database as plain text. An administrator can view this information.

Use the pgcrypto extension to encrypt the PII before you store it. This way, only users that have a valid encryption key can decrypt the information and view it as plain text.

The pgcrypto extension has a number of hash and encrypt functions.

Hash

A hash is a one-way cryptographic function where you care only about encrypting the PII. This is useful for storing passwords in a hashed format and matching the user-entered passwords with the hashed passwords. Hashed passwords are never decrypted in plain text.

For the retail website example, you can use the pgcrypto extension to hash Jack's password before storing it in the user_details table.

--Hash passwords before storing them in the user_details table.

TRUNCATE TABLE secure_schema.user_details;
INSERT INTO secure_schema.user_details VALUES(1,'jack',34,'jack@example.com',crypt('testpassword', gen_salt('bf')));

--Match the hashed data with user entered password

SELECT id, name FROM secure_schema.user_details WHERE email_id = 'jack@example.com' AND password = crypt('testpassword', password);
 id | name
----+-----
  1 | jack

Encrypt

Use an encryption cryptographic function to encrypt PII with a key. Users then need this key to decrypt the information into plain text. This is useful for storing credit card information and bank details where applications want to retrieve the PII in a readable format.

For the retail website example, Jack's password and email address are encrypted. Users who have the encryption key can decrypt this information and view it as plain text. For all other users, an error message appears.

--"user_acc_key" is the encryption key

TRUNCATE TABLE secure_schema.user_details;
INSERT INTO secure_schema.user_details VALUES(1,'jack',34,pgp_sym_encrypt('jack@example.com','user_acc_key'),pgp_sym_encrypt('testpassword','user_acc_key'));

--User: "admin" (queries without an encryption key)
--Data in the email_id and password columns are truncated for better
--formatting.

SELECT * FROM secure_schema.user_details;
 id | name  | age |    email_id     | password
----+-------+-----+-----------------+-------------------
  1 | jack |  34 | \xc30d0407030209 | \xc30d040703028962

--User: "app_user" (queries with a valid encryption key)

SELECT name,pgp_sym_decrypt(email_id::bytea,'user_acc_key'),pgp_sym_decrypt(password::bytea,'user_acc_key') FROM secure_schema.user_details;
 name | pgp_sym_decrypt   | pgp_sym_decrypt
------+-------------------+----------------
 jack | jack@example.com  | testpassword

--If a user uses the wrong encryption key, then the following error message appears:

SELECT name,pgp_sym_decrypt(email_id::bytea,'user_bad_key'),
pgp_sym_decrypt(password::bytea,'user_bad_key') FROM secure_schema.user_details;
ERROR:  Wrong key or corrupt data

What's next

Learn about the following additional controls that you can use to protect PII from unwarranted access: