Help secure data workloads in Google Cloud

This article is the first part of a three-part series that discusses how you can use Google Cloud products to help secure common data workloads.

This series is designed to help you learn how to interact with Google Cloud APIs, such as BigQuery and Cloud Storage, in a secure way.


When you secure your data workloads, the challenge is to grant enough access so that authorized entities can perform the appropriate tasks, but not so much that unauthorized entities get access to restricted data.

In Google Cloud, you can manage data security at different levels:

  • Service access. Grant an entity access to the services storing the data.
  • Data access. Grant an entity permission to interact with specific data. Data access can include the following:
    • Granting an individual identity read-only access to a single object (that is, granular access).
    • Preventing the listing of a group of objects outside of an authorized perimeter.
    • Applying the same rules to humans and applications.
  • Data transit. Encrypt, allow, or deny data flows between two entities, networks, or organizations.
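
As an illustration of the granular data access described above, the following minimal sketch builds an IAM policy document that gives one human user and one application read-only access to a single Cloud Storage object. It only constructs the policy as a Python dictionary and doesn't call any Google Cloud API. The bucket name, object name, and member addresses are hypothetical, and the sketch assumes that the bucket uses uniform bucket-level access so that IAM Conditions apply.

```python
import json


def object_read_only_binding(member: str, bucket: str, object_name: str) -> dict:
    """Build an IAM binding that lets `member` read one Cloud Storage object.

    The condition limits the role to a single object resource name.
    """
    resource = f"projects/_/buckets/{bucket}/objects/{object_name}"
    return {
        "role": "roles/storage.objectViewer",
        "members": [member],
        "condition": {
            "title": "single-object-read",
            "description": f"Read-only access to {object_name}",
            "expression": f'resource.name == "{resource}"',
        },
    }


def policy_with_bindings(bindings: list) -> dict:
    """Wrap bindings in a policy document.

    Policies that contain conditional bindings must use policy version 3.
    """
    return {"version": 3, "bindings": bindings}


# Hypothetical identities: the same rule is applied to a human and to an app.
human = object_read_only_binding(
    "user:analyst@altostrat.example", "altostrat-reports", "2024/q1.csv"
)
app = object_read_only_binding(
    "serviceAccount:etl@altostrat-project.iam.gserviceaccount.com",
    "altostrat-reports",
    "2024/q1.csv",
)
policy = policy_with_bindings([human, app])

print(json.dumps(policy, indent=2))
```

Because the human and the application receive the same role and the same condition, the sketch also shows one way to apply rules similarly to both kinds of identity. In practice, you would merge such bindings into the bucket's existing policy rather than replace it.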

Overview of Google Cloud products

Part 2 of this series discusses the following services:

  • Storage and databases: Cloud Storage as an object store and BigQuery as a columnar database for structured data.
  • Compute: BigQuery as an analytics tool and Dataproc as a data processor.
  • Networking: Virtual Private Cloud, firewalls, and other features of the Google Cloud software-defined network that help provide security at the network level.
  • Security: Identity and Access Management (IAM) and Access Controls, which provide authentication and authorization features.

For more information about services in Google Cloud, see the Google Cloud Services section in the platform overview. For more information about Google Cloud products related to security, see Security products and capabilities.

This series focuses on key data workloads. Because these workloads rely on technologies that are common to Google Cloud services, the same approaches can be reproduced for other workloads:

  • Cloud Storage and BigQuery are services that are accessible through an API. Other API-based services can apply the same concepts to protect access.
  • Dataproc is based on Compute Engine. Other products based on Compute Engine, such as Dataflow, support the same networking features, including Virtual Private Cloud and firewalls.

Use cases and building blocks

Part 3 of this series, Help secure data workloads: use cases, uses the following terminology and concepts when addressing a use case:

  • Altostrat. A fictional company that owns some data and wants to make it available to employees, partners, and customers in a secure way.
  • Admin. An Altostrat employee who has sufficient rights to perform the required tasks. Admin and Altostrat's admin refer to the same person.
  • Identities. Users or apps of Altostrat or of Altostrat's customers or partners.
  • Apps. Entities that might need access to data, but that aren't individual users. For example, Dataproc and custom code running on Compute Engine are both considered apps.
  • Google Cloud APIs. Google Cloud services that are API-based and supported by VPC and Private Google Access. This series focuses on BigQuery and Cloud Storage as examples of Google Cloud APIs.

Part 3 explains implementation for the following use cases:

What's next

Continue to the next parts of this series: