Architecture: HIPAA-aligned Cloud Healthcare

This solution introduces the HIPAA-Aligned Cloud Healthcare Reference Infrastructure, an end-to-end architecture that encapsulates Google Cloud best practices to help you meet healthcare security, privacy, and compliance needs.

Disclaimer

  • This solution does not constitute legal advice on the proper administrative, technical, and physical safeguards you must implement in order to comply with HIPAA or any other data privacy legislation.
  • The scope of this solution is limited to implementation of a reference architecture. The implementation in this architecture is not an official Google product; it is intended as a reference implementation. The code is available open source as the Google Cloud healthcare deployment automation utility and available under the Apache License, Version 2.0. You may use the guide as a quick starter and configure it to fit your use cases. You are responsible for ensuring that the environment and applications that you build on top of Google Cloud are properly configured and secured according to HIPAA requirements.

The architecture is designed to help you transfer sensitive health information to the cloud.

HIPAA overview

HIPAA (Health Insurance Portability and Accountability Act of 1996) sets standards in the United States to protect individually identifiable health information. HIPAA applies to health plans, most healthcare providers, and healthcare clearinghouses — collectively known as "covered entities" — that manage protected health information (PHI) electronically and to persons or entities that perform certain functions on their behalf, known as "business associates."

The HIPAA Privacy Rule requires covered entities and their business associates to safeguard the privacy of PHI handled in any medium. The HIPAA Security Rule obligates them to use administrative, physical, and technical measures to protect the confidentiality, integrity, and availability of PHI that they create, receive, maintain, or transmit electronically. Covered entities and business associates also have breach notification obligations.

For definitions of HIPAA terms, more information about HIPAA rules and how individual Google Cloud products support them, see the Google Cloud HIPAA overview guide.

Architecture diagram

Together with the Setting up a HIPAA-aligned Google Cloud project tutorial and the Cloud Healthcare Data Protection Toolkit, this architecture helps you build a Google Cloud-based infrastructure in a few steps by treating the configuration as code. The following diagram illustrates how the architecture helps you meet security and compliance best practices by using reusable building blocks: a Cloud Deployment Manager configuration script and parameterized configuration templates.

Diagram showing the reusable components of a HIPAA-aligned reference architecture

This architecture is designed to be a quick start that you can customize for a specific use case or product; it doesn't produce a final product that you can simply deploy. Though you implement most of the components using Cloud Deployment Manager scripts, some of the capabilities, such as Cloud VPN, Dedicated Interconnect, and multi-factor authentication, must integrate with your existing systems and so require custom work.
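
To make the idea of a parameterized configuration template concrete, the following is a minimal sketch of a Deployment Manager template written in Python. It is not taken from the toolkit itself: the property names bucket_name and location, and the FakeContext class used to exercise the template locally, are assumptions made for illustration.

```python
# Sketch of a Deployment Manager Python template.
# The property names ("bucket_name", "location") are hypothetical.

def generate_config(context):
    """Return the resources for one versioned Cloud Storage bucket."""
    props = context.properties
    bucket = {
        "name": props["bucket_name"],
        "type": "storage.v1.bucket",
        "properties": {
            "location": props["location"],
            # Object versioning keeps earlier versions of overwritten objects.
            "versioning": {"enabled": True},
        },
    }
    return {"resources": [bucket]}


class FakeContext:
    """Local stand-in for the context object that Deployment Manager passes in."""

    def __init__(self, properties):
        self.properties = properties
        self.env = {"project": "example-project", "deployment": "example"}


config = generate_config(
    FakeContext({"bucket_name": "example-org-health-data", "location": "US"})
)
```

Because the template is an ordinary function of its parameters, the same file can be reused across projects by changing only the properties.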

Architecture components

The following table summarizes the components of the architecture, which are described in detail afterward.

Architecture component summary

Component | Purpose
1 Google Cloud project for data | A base project with a Cloud Storage bucket, a Compute Engine VM, a BigQuery dataset, and a Pub/Sub topic. The project is created under the Organization node, and a billing ID is provided. (See organization and data encryption for advanced information.)
Cloud Monitoring account | Monitoring for security and IT admin purposes.
Cloud Identity | Used to create access control groups and attach groups to users under the Organization node. (See service accounts for advanced information.)
IAM | Provides Cloud Identity groups with access to project resources and authorization to modify or use them. (See identity and access management and IAM roles for advanced information.)
2 Google Cloud projects for security operations | One project holds an instance of Forseti Security that implements a set of monitoring policies and alerts to notify security admins when Google Cloud configuration anomalies are detected. (See Forseti security monitoring and rules enforcement and BeyondCorp security approach for advanced information.) A second project holds the audit log Cloud Storage bucket and BigQuery dataset, which provide resource and data access logging, long-term storage, and availability of logs for security analysis.
1 Google Cloud project for shared networking controls | Holds Cloud VPN, Cloud Interconnect, Cloud Router, Virtual Private Cloud (VPC), subnets, and firewall rules. Install a shared VPC with a common networking project that has appropriate connectivity and networking controls.

This section goes into the details of some of the architecture's components and associated best practices, including:

  • Projects in Google Cloud
  • Identity and access management (IAM)
  • Service accounts
  • Google's worldwide network and enterprise connectivity options
  • VPC networks
  • Audit logging
  • Security monitoring and alerting policies

Projects in Google Cloud

Google Cloud organizes resources, such as Compute Engine VMs and Cloud Storage buckets, according to a resource hierarchy. The top node in the hierarchy is an Organization node, and below it might be multiple projects. A specific project might contain multiple separate apps, or conversely, a single app might include several projects. Projects can contain resources spread across multiple regions and geographies.

You can set security features, such as permissions settings, at the organization level so they are inherited throughout the resource hierarchy, or set them in a more granular manner, according to your needs.
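
Inheritance through the resource hierarchy can be modeled in a few lines. The sketch below is a simplified illustration, not a Google Cloud API: the Node class, the resource names, and the group addresses are hypothetical. It relies only on the fact that the effective IAM policy for a resource is the union of the policy set on that resource and the policies inherited from its ancestors.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical node in the resource hierarchy (organization or project)."""
    name: str
    parent: "Node | None" = None
    bindings: dict = field(default_factory=dict)  # role -> set of members


def effective_bindings(node):
    """Merge the bindings set on a node with those inherited from its ancestors."""
    merged = {}
    while node is not None:
        for role, members in node.bindings.items():
            merged.setdefault(role, set()).update(members)
        node = node.parent
    return merged


org = Node(
    "example-org",
    bindings={"roles/iam.securityReviewer": {"group:auditors@example.com"}},
)
project = Node(
    "example-health-data",
    parent=org,
    bindings={"roles/bigquery.dataViewer": {"group:analysts@example.com"}},
)
effective = effective_bindings(project)
```

Here the project ends up with both the role granted directly on it and the role granted at the organization level.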

When you create a project, a best practice is to name it in a way that conveys meaning to the people who work with it. This simple practice helps to ensure that work is performed within the correct projects and supports segregation of duties using the principle of least privilege.

This architecture provides an example of segregation of duties and least privilege: the security monitoring project (audit and Forseti resources) is owned by different users than the core data analysis project. The central networking project has its own permissions policies.

After you create a project, you grant users roles to give them access. It's a best practice to use the Groups feature for role assignments, rather than assigning the roles to individuals, as discussed in the next section.

Identity and access management

You set up your organization's users with the Identity and Access Management (IAM) service, which relies on roles that define sets of permissions. You use IAM to control access to various resources for data transfer, analysis, and security auditing. See Best practices for enterprise organizations for details.

Rather than assigning permissions directly to individual users, collect users with the same responsibilities into groups and assign IAM roles to those groups. Each role bundles a specific set of permissions. For example, the BigQuery Data Viewer role contains the permissions to list, read, and query BigQuery tables, but does not include permissions to create tables or modify existing data.

Like projects, groups that you create to assign roles are most efficient when you name them with a meaningful convention. For example, you might use something like ${org}-*readonly@${org.tld} to grant access to Cloud Storage data for read-only analysis purposes.

Here are the groups used in the reference architecture:

Group | Purpose | Permission
Project-level access
${org}-project-owner@${org.tld} | Project owners | Owner
${org}-project-auditor@${org.tld} | Security reviewer of ${org}-health-data, owner of ${org}-audit-logs-retention | Security Reviewer, Owner
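
As a rough illustration, the groups above could be expressed as IAM policy bindings like this. The helper functions are hypothetical, and the sketch assumes that the project owner group owns the ${org}-health-data project; the role IDs roles/owner and roles/iam.securityReviewer correspond to the Owner and Security Reviewer roles.

```python
def group_email(org, suffix, domain):
    """Build a group address that follows the ${org}-<suffix>@${org.tld} convention."""
    return f"{org}-{suffix}@{domain}"


def reference_policies(org, domain):
    """Map each project ID to the group-based role bindings sketched in the table."""
    owners = "group:" + group_email(org, "project-owner", domain)
    auditors = "group:" + group_email(org, "project-auditor", domain)
    return {
        # Assumption: the owner group owns the data project.
        f"{org}-health-data": [
            {"role": "roles/owner", "members": [owners]},
            {"role": "roles/iam.securityReviewer", "members": [auditors]},
        ],
        # Auditors own the log retention project, separate from the data owners.
        f"{org}-audit-logs-retention": [
            {"role": "roles/owner", "members": [auditors]},
        ],
    }


policies = reference_policies("acme", "acme.example")
```

Keeping the bindings in one function makes the segregation of duties between data owners and auditors easy to review.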

Service accounts

A service account is a special type of Google Account that represents a Google Cloud service identity or app rather than an individual user. Like users and groups, service accounts can be assigned IAM roles to grant access to specific resources. Service accounts authenticate with a key rather than a password. You can use the IAM service account API to implement key rotation. It's a best practice to use service accounts for inter-service communications, such as running Deployment Manager or exporting logs.
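
Key rotation usually starts with deciding which keys are too old. The following sketch makes that decision locally and does not call any Google API. It assumes that key metadata has already been fetched as dictionaries with name and validAfterTime fields, and the 90-day rotation period is an arbitrary example rather than a recommendation from this guide.

```python
from datetime import datetime, timedelta, timezone


def keys_due_for_rotation(keys, max_age, now):
    """Return the names of keys whose validity began more than max_age ago."""
    due = []
    for key in keys:
        # Timestamps are assumed to be RFC 3339 strings in UTC, ending in "Z".
        created = datetime.fromisoformat(key["validAfterTime"].replace("Z", "+00:00"))
        if now - created > max_age:
            due.append(key["name"])
    return due


# Hypothetical key metadata for illustration.
sample_keys = [
    {"name": "keys/old-key", "validAfterTime": "2024-01-01T00:00:00Z"},
    {"name": "keys/new-key", "validAfterTime": "2024-03-20T00:00:00Z"},
]
due = keys_due_for_rotation(
    sample_keys,
    max_age=timedelta(days=90),
    now=datetime(2024, 4, 1, tzinfo=timezone.utc),
)
```

Passing the current time in as a parameter keeps the check deterministic and easy to test.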

Cloud connectivity

You might want to connect your existing on-premises infrastructure with your Google Cloud resources. Evaluate bandwidth, latency, and SLA requirements to choose the best connection option. If you need low-latency, highly available, enterprise-grade connections that let you reliably transfer data between your on-premises and VPC networks without traversing the public internet, use Cloud Interconnect:

  • Dedicated Interconnect provides a direct physical connection between your on-premises network and Google's network.
  • Partner Interconnect provides connectivity between your on-premises and Google Cloud VPC networks through a supported service provider.

If you don't require the low latency and high availability of Cloud Interconnect, or you are just starting with the cloud, you can use Cloud VPN to set up encrypted IPsec VPN tunnels between your on-premises network and VPC.
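
The selection logic described above can be summarized as a small decision helper. This is only a sketch: the function and parameter names are hypothetical, and a real decision also weighs bandwidth, cost, and SLA requirements.

```python
def choose_connectivity(needs_low_latency_and_ha, via_service_provider=False):
    """Pick a connectivity option from the two criteria described in the text."""
    if not needs_low_latency_and_ha:
        # Encrypted IPsec tunnels; a reasonable starting point.
        return "Cloud VPN"
    if via_service_provider:
        # Connectivity through a supported service provider.
        return "Partner Interconnect"
    # Direct physical connection to Google's network.
    return "Dedicated Interconnect"


choice = choose_connectivity(needs_low_latency_and_ha=True, via_service_provider=True)
```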

Virtual Private Cloud

A VPC network comprises Google Cloud's fundamental networking technologies, including networks, subnets, IP addresses, routes, firewalls, VPN, and Cloud Router. This foundation allows you to create compute instances and containers that share a single, global VPC network, and can be added to regional subnets.

In the Shared VPC configuration that this architecture uses, network policy and control for all networking resources are centralized and easier to manage. Service project departments can configure and manage non-network resources, enabling a clear separation of responsibilities for different teams in the organization. Resources in those projects can communicate with each other more securely and efficiently across project boundaries using internal IP addresses.

In addition, resources with only a private, internal IP address can still access many Google APIs and services through Private Google Access. You can configure only an application's frontend to receive internet requests and shield its backend services from public endpoints. You can manage shared network resources—such as subnets, routes, and firewalls—from a central host project, so you can enforce consistent network policies across the projects.

Audit logging

With Cloud Logging and Cloud Monitoring, you can store, search, analyze, monitor, and alert on log data and events. In this architecture, Admin Activity logs and Data Access logs (read and write) are turned on by default. It's a best practice to export all Cloud Logging logs to Cloud Storage for long-term retention. The architecture also exports logs to a BigQuery dataset that resides in a separate project, and you can use the BigQuery UI for security analysis. Google Cloud also supports log file integration with other security analytics tools, such as Splunk. In addition, you can enable object versioning on the Cloud Storage log bucket, or place a lien on the audit log project, to prevent accidental deletion. For details, see Setting up log delivery.
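
An export to BigQuery is described by a log sink. The sketch below only builds the sink definition as a dictionary and does not create anything. The sink, project, and dataset names are hypothetical, and the filter shown is one common way to select audit logs, assumed here rather than taken from the architecture's own configuration.

```python
def audit_log_sink(sink_name, audit_project, dataset):
    """Build a log sink definition that routes audit logs to a BigQuery dataset."""
    return {
        "name": sink_name,
        # The destination dataset lives in the separate audit project.
        "destination": (
            f"bigquery.googleapis.com/projects/{audit_project}/datasets/{dataset}"
        ),
        # Substring match on the log name selects Cloud Audit Logs entries.
        "filter": 'logName:"logs/cloudaudit.googleapis.com"',
    }


sink = audit_log_sink("audit-logs-to-bigquery", "example-audit-logs", "audit_logs")
```

Keeping the destination in a different project from the data means that data project owners cannot alter or delete the exported logs.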

Security monitoring and alerting policies

The architecture relies on Forseti, an open source tool for Google Cloud, to keep track of the project resources and monitor security policies. Forseti helps you systematically monitor the Google Cloud project resources to ensure that access controls are set as intended. Forseti uses rule-based policies to codify your security policies. Then, if something changes unexpectedly, Forseti takes action: it notifies you and, in some cases, automatically reverts the change to a known state. The following table shows the Forseti rules implemented by the architecture:

Rule | Description
Audit Logging Rule | Require all audit logging enabled by default (ADMIN access, DATA read/write).
BigQuery Rule | No public, domain, or special group dataset access.
Bucket Rule | Disallow all ACL rules, only allow IAM.
Enabled API Rule | Only APIs in the whitelist are enabled.
IAM Rule | All projects must have an owner group from the domain.
Lien Rule | Require project deletion liens for all projects.
Location Rule | All datasets must be in the specified location.
Log Sink Rule | Require a BigQuery Log sink in all projects.
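
To show what rule-based checking means in practice, here is a plain Python sketch of two of the rules above, evaluated against a hypothetical resource inventory. It does not use Forseti's rule format or APIs; the inventory shape and function names are assumptions made for illustration.

```python
def check_location_rule(datasets, allowed_location):
    """Return the IDs of datasets that are outside the allowed location."""
    return [d["id"] for d in datasets if d["location"] != allowed_location]


def check_lien_rule(projects):
    """Return the IDs of projects that have no deletion lien."""
    return [p["id"] for p in projects if not p.get("liens")]


# Hypothetical inventory snapshot.
datasets = [
    {"id": "ds-us", "location": "US"},
    {"id": "ds-eu", "location": "EU"},
]
projects = [
    {"id": "example-health-data", "liens": ["resourcemanager.projects.delete"]},
    {"id": "example-scratch", "liens": []},
]

location_violations = check_location_rule(datasets, "US")
lien_violations = check_lien_rule(projects)
```

Each check returns the list of violating resources, which a monitoring tool could then turn into a notification.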

See the Forseti documentation for details.

As part of the tutorial associated with this solution, you create the following alerts:

  1. Unexpected Cloud Storage access
  2. IAM policy change alert
  3. Bucket permission change alert
  4. BigQuery update alert

Metric | Description
Unexpected data access | The designated user/group is notified when the bucket is accessed by an unexpected user.
IAM policy change count | The designated user/group is notified when IAM policies are altered.
Bucket permission change count | The designated user/group is notified when bucket/object permissions are altered.
BigQuery update alert | The designated user/group is notified when BigQuery dataset settings are altered.
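
The logic behind the unexpected data access metric can be sketched as a filter over audit log entries. This example works on entries that are already parsed into dictionaries and is not how the tutorial defines the metric. The bucket name and principals are hypothetical; the field paths follow the audit log entry structure (protoPayload.authenticationInfo.principalEmail and resource.labels.bucket_name).

```python
def unexpected_accesses(entries, bucket, expected_principals):
    """Return the principals that accessed the bucket but are not on the expected list."""
    hits = []
    for entry in entries:
        labels = entry.get("resource", {}).get("labels", {})
        auth = entry.get("protoPayload", {}).get("authenticationInfo", {})
        principal = auth.get("principalEmail")
        if labels.get("bucket_name") == bucket and principal not in expected_principals:
            hits.append(principal)
    return hits


# Hypothetical audit log entries.
entries = [
    {
        "resource": {"type": "gcs_bucket", "labels": {"bucket_name": "example-phi"}},
        "protoPayload": {"authenticationInfo": {"principalEmail": "analyst@example.com"}},
    },
    {
        "resource": {"type": "gcs_bucket", "labels": {"bucket_name": "example-phi"}},
        "protoPayload": {"authenticationInfo": {"principalEmail": "stranger@example.com"}},
    },
]
hits = unexpected_accesses(entries, "example-phi", {"analyst@example.com"})
```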

Scenarios

This section walks through a basic scenario using the architecture, and then a second scenario that expands on the architecture.

  • In scenario 1, you have a single host project, multiple service projects, and a single shared VPC.
  • In scenario 2, you have a shared services VPC, and multiple shared VPCs.

Scenario 1: Single host project, multiple service projects, single shared VPC

The architecture above implements a simple custom mode VPC topology to help set the foundation for a secure, reliable, and manageable architecture. Using Shared VPC alleviates the need for each project to replicate the same solution. For example, when you integrate a Cloud Interconnect solution into a Shared VPC, all VMs, regardless of region or service project, can access the Cloud Interconnect connection. At the same time, you delegate administrative responsibilities, such as creating and managing VM instances, to service project admins, while you maintain centralized control over network resources like subnets, routes, and firewalls.

The architecture starts with a single shared VPC that includes the following:

  • 1 project for shared networking resources, such as Cloud VPN, Cloud Interconnect, firewall rules, and Cloud Router.
  • 1 project for central audit logging and security monitoring.
  • Multiple service projects for data analysis.

With this configuration, you manage shared network resources—such as subnets, routes, and firewalls—from the central host project, so you can enforce consistent network policies across the projects. Meanwhile, service project departments can configure and manage non-network and non-audit-retention resources, enabling a clear separation of responsibilities for different teams in the organization. Resources in those projects can communicate with each other more securely and efficiently across project boundaries using internal IP addresses.

Scenario 2: Shared services VPC, multiple shared VPCs

The following diagram shows an architecture for VPC isolation, which builds on the initial architecture but separates the prod VPC from the dev/test VPC. The architecture keeps the common services in a single shared VPC, avoiding duplication of resources.

Diagram showing a separate VPC for prod and one for dev/test.

There are many reasons to consider VPC isolation, including access control and quota considerations between environments. Isolation also adds another layer of protection when the dev/test environment holds only de-identified data while production holds identified data. Using a Shared Services VPC allows you to share the services with other VPCs through VPC peering, while centralizing administration and deployment of common resources.

In addition to VPC rules, you can use the following tools to help secure and protect data:

  • For workloads involving sensitive data, use VPC Service Controls to configure service perimeters around VPC resources and Google-managed services, and control the movement of data across the perimeter boundary. Using VPC Service Controls, you can group projects and your on-premises network into a single perimeter that prevents data access through Google-managed services. Service perimeters can't contain projects from different organizations, but you can use perimeter bridges to allow projects and services in different service perimeters to communicate.
  • Integrate Google Cloud Armor with the HTTP(S) load balancer to provide DDoS protection and the ability to block or allow IP addresses at the network edge.
  • Control access to apps by using Identity-Aware Proxy (IAP) to verify user identity and the context of the request to determine if a user should be granted access.
  • Provide a single interface for security insights, anomaly detection, and vulnerability detection with Security Command Center.

Best practices

This section discusses some additional best practices you should consider implementing, which are not explicitly discussed in the architecture itself.

Synchronize existing identity platform

If the organization uses an on-premises or third-party identity platform, synchronize the existing user directory with Cloud Identity, which lets users access Google Cloud with their corporate credentials. This way, your existing identity platform remains the source of truth while Cloud Identity provides control over how your employees access Google services.

Use Google Cloud Directory Sync (GCDS) to synchronize the users and groups in the Active Directory or LDAP server to the user directory provided by Cloud Identity. GCDS provisions Google accounts corresponding to corporate users and groups. The synchronization is one-way: the data in the directory server is not modified. Passwords are not copied to Google's directory (unless explicitly configured).

Implement single-sign-on (SSO)

After integrating an existing identity management platform with Cloud Identity, you can set up single sign-on (SSO). SSO enables users to access their enterprise cloud apps by signing in one time for all services. Google Cloud supports SAML 2.0-based SSO against an existing on-premises or third-party identity provider. After you've configured SSO, users must authenticate against the primary identity provider before they can access Google Cloud.

PHI and logging

Do not include PHI in the logs, and monitor the logs for PHI. When creating or updating resources, avoid including PHI or security credentials in a resource's metadata, because that information might be captured in the logs. Audit logs never include the data contents of a resource or the results of a query, but resource metadata might be captured.
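
One lightweight safeguard is to scan resource metadata for PHI-like values before you create or update a resource. The sketch below is a simple illustration with two hypothetical regular expressions, one for values shaped like US Social Security numbers and one for email addresses. Pattern matching of this kind is far from complete PHI detection and is shown only to make the idea concrete.

```python
import re

# Hypothetical patterns; real PHI detection needs much broader coverage.
SUSPECT_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # shaped like a US SSN
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),     # shaped like an email address
]


def find_suspect_metadata(metadata):
    """Return the sorted metadata keys whose values match a suspect pattern."""
    return sorted(
        key
        for key, value in metadata.items()
        if any(pattern.search(str(value)) for pattern in SUSPECT_PATTERNS)
    )


metadata = {
    "owner": "data-team",
    "note": "patient 123-45-6789",
    "contact": "jane.doe@example.com",
}
suspects = find_suspect_metadata(metadata)
```

A check like this could run before a deployment step and block any resource whose metadata is flagged.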

Use multi-factor authentication

Multi-factor authentication (MFA) is an important tool in protecting corporate resources. Using a security key offers the strongest security among 2-Step Verification (2SV) methods. Users typically insert a physical key into a USB port on a computer. When prompted, a user touches the key, and the key generates a cryptographic signature. For details, see Enforce Uniform MFA to Company Resources.

What's next