A data mesh is an architectural and organizational framework that treats data as a product (referred to in this document as data products). In this framework, data products are developed by the teams that best understand the data, and those teams follow an organization-wide set of data governance standards. Once data products are deployed to the data mesh, distributed teams in an organization can discover and access the data that's relevant to their needs more quickly and efficiently. To achieve such a well-functioning data mesh, you must first establish the high-level architectural components and organizational roles that this document describes.
This document is part of a series that describes how to implement a data mesh on Google Cloud. It assumes that you have read and are familiar with the concepts described in Build a modern, distributed Data Mesh with Google Cloud.
The series has the following parts:
- Architecture and functions in a data mesh (this document)
- Design a self-service data platform for a data mesh
- Build data products in a data mesh
- Discover and consume data products in a data mesh
In this series, the data mesh that's described is internal to an organization. Although it's possible to extend a data mesh architecture to provide data products to third parties, that extended approach involves additional considerations beyond internal usage and is outside the scope of this series.
Architecture
The following key terms are used to define the architectural components that are described in this series:
- Data product: A data product is a logical container or grouping of one or more related data resources.
- Data resource: A data resource is a physical asset in a storage system that holds structured data or stores a query that yields structured data.
- Data attribute: A data attribute is a field or element of a data resource.
The following diagram provides an overview of the key architectural components in a data mesh implemented on Google Cloud.
The preceding diagram shows the following:
- Central services enable the creation and management of data products, including organizational policies that affect data mesh participants, access controls (through Identity and Access Management groups), and infrastructure-specific artifacts such as commitments and reservations. Examples of infrastructure that facilitates the functioning of the data mesh are described in Create platform components and solutions.
- Chief among the central services is Data Catalog, which catalogs all the data products in the data mesh and provides the discovery mechanism for potential consumers of these products.
- Data domains expose subsets of their data as data products through well-defined data consumption interfaces. A data product might be a table, view, structured file, topic, or stream; in BigQuery it might be a dataset, and in Cloud Storage a folder or bucket. Different types of interfaces can be exposed for a data product. For example, an interface might be a BigQuery view over a BigQuery table. The interface types most commonly used for analytical purposes are discussed in Build data products in a data mesh.
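For example, the following Terraform sketch shows one way that a producer team might publish a BigQuery authorized view as a consumption interface while keeping the underlying table private. This is a minimal sketch, not the reference implementation: the dataset, table, and column names are illustrative assumptions, and an `orders` table is assumed to already exist in the private dataset.

```hcl
# Private dataset owned by the producing domain; consumers never read it
# directly. An "orders" table is assumed to already exist here.
resource "google_bigquery_dataset" "domain_private" {
  dataset_id = "sales_private" # illustrative name
  location   = "US"
}

# Consumer-facing dataset that acts as the data product container.
resource "google_bigquery_dataset" "data_product" {
  dataset_id = "sales_data_product" # illustrative name
  location   = "US"
}

# The consumption interface: a view that exposes only curated attributes.
resource "google_bigquery_table" "orders_interface" {
  dataset_id = google_bigquery_dataset.data_product.dataset_id
  table_id   = "orders_v1"

  view {
    query          = "SELECT order_id, order_date, total_amount FROM `${google_bigquery_dataset.domain_private.project}.${google_bigquery_dataset.domain_private.dataset_id}.orders`"
    use_legacy_sql = false
  }
}

# Authorize the view to query the private dataset so that consumers can
# use the interface without being granted access to the underlying table.
resource "google_bigquery_dataset_access" "authorize_interface" {
  dataset_id = google_bigquery_dataset.domain_private.dataset_id

  view {
    project_id = google_bigquery_table.orders_interface.project
    dataset_id = google_bigquery_dataset.data_product.dataset_id
    table_id   = google_bigquery_table.orders_interface.table_id
  }
}
```

This split between a private dataset and a product dataset is what lets the producing team evolve the underlying table without breaking the interface contract that consumers depend on.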
Data mesh reference implementation
You can find a reference implementation of this architecture in
the data-mesh-demo
repository.
The Terraform scripts that are used in the reference implementation demonstrate
data mesh concepts and are not intended for production use. By running these
scripts, you'll learn how to do the following:
- Separate product definitions from the underlying data.
- Create Data Catalog templates for describing product interfaces.
- Tag product interfaces with these templates.
- Grant permissions to the product consumers.
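As a hedged illustration of these steps (not code copied from the repository), a Data Catalog tag template, a tag on a product interface, and a consumer permission grant might look like the following Terraform sketch. The template fields, the entry-name variable, the dataset ID, and the consumer group are all illustrative assumptions:

```hcl
# Data Catalog entry name of the interface view, looked up outside this
# sketch (for example, with the Data Catalog API).
variable "orders_interface_entry_name" {
  type = string
}

# Tag template that describes product interfaces (field IDs are illustrative).
resource "google_data_catalog_tag_template" "product_interface" {
  tag_template_id = "data_product_interface"
  region          = "us-central1"
  display_name    = "Data product interface"

  fields {
    field_id     = "data_domain"
    display_name = "Owning data domain"
    is_required  = true
    type {
      primitive_type = "STRING"
    }
  }

  fields {
    field_id     = "documentation_link"
    display_name = "Product documentation"
    type {
      primitive_type = "STRING"
    }
  }
}

# Tag the catalog entry of the product interface with the template.
resource "google_data_catalog_tag" "orders_interface_tag" {
  parent   = var.orders_interface_entry_name
  template = google_data_catalog_tag_template.product_interface.id

  fields {
    field_name   = "data_domain"
    string_value = "sales"
  }
}

# Grant the consumer group read access to the data product dataset.
resource "google_bigquery_dataset_iam_member" "product_consumers" {
  dataset_id = "sales_data_product" # product dataset from the earlier sketch
  role       = "roles/bigquery.dataViewer"
  member     = "group:orders-consumers@example.com" # illustrative group
}
```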
For the product interfaces, the reference implementation creates and uses the following interface types:
- Authorized views over BigQuery tables.
- Data streams based on Pub/Sub topics.
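A hedged sketch of the streaming interface type might look like the following; the topic name and consumer group are illustrative assumptions, not values from the repository:

```hcl
# Streaming interface of the data product, exposed as a Pub/Sub topic.
resource "google_pubsub_topic" "orders_stream" {
  name = "sales-orders-v1" # illustrative name
}

# Let the consuming group attach subscriptions to the stream.
resource "google_pubsub_topic_iam_member" "stream_consumers" {
  topic  = google_pubsub_topic.orders_stream.name
  role   = "roles/pubsub.subscriber"
  member = "group:orders-consumers@example.com" # illustrative group
}
```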
For further details, refer to the README file in the repository.
Functions in a data mesh
For a data mesh to operate well, you must define clear roles for the people who perform tasks within the data mesh. Ownership is assigned to team archetypes, or functions, which hold the core user journeys for people who work in the data mesh. To describe those user journeys clearly, this document assigns them to user roles. You can split or combine these user roles based on the circumstances of your enterprise, and you don't need to map the roles directly to employees or teams in your organization.
A data domain is aligned with a business unit (BU) or with a function within an enterprise. Common examples of business domains include the mortgage department in a bank, or the customer, distribution, finance, or HR departments of an enterprise. Conceptually, there are two domain-related functions in a data mesh: the data producer teams and the data consumer teams. It's important to understand that a single data domain is likely to serve both functions at once. A data domain team produces data products from data that it owns. The team also consumes data products for business insight and to produce derived data products for the use of other domains.
In addition to the domain-based functions, a data mesh also has a set of functions that are performed by centralized teams within the organization. These central teams enable the operation of the data mesh by providing cross-domain oversight, services, and governance. They reduce the operational burden for data domains in producing and consuming data products, and facilitate the cross-domain relationships that are required for the data mesh to operate.
This document only describes functions that have a data mesh-specific role. There are several other roles that are required in any enterprise, regardless of the architecture being employed for the platform. However, these other roles are out of scope for this document.
The four main functions in a data mesh are as follows:
- Data domain-based producer teams: Create and maintain data products over their lifecycle. These teams are often referred to as the data producers.
- Data domain-based consumer teams: Discover data products and use them in various analytic applications. These teams might consume data products to create new data products. These teams are often referred to as the data consumers.
- Central data governance team: Defines and enforces data governance policies among data producers, ensuring high data quality and data trustworthiness for consumers. This team is often referred to as the data governance team.
- Central self-service data infrastructure platform team: Provides a self-service data platform for data producers. This team also provides the tooling for central data discovery and data product observability that both data consumers and data producers use. This team is often referred to as the data platform team.
An optional extra function to consider is a Center of Excellence (COE) for the data mesh. The COE provides overall management of the data mesh and acts as the designated arbitration team that resolves conflicts raised by any of the other functions. This function is useful for helping to connect the other four functions.
Data domain-based producer team
Typically, data products are built on top of a physical repository of data (one or more data warehouses, data lakes, or streams). An organization needs traditional data platform roles to create and maintain these physical repositories. However, the people in these traditional data platform roles are not typically the ones who create the data product.
To create data products from these physical repositories, an organization needs a mix of data practitioners, such as data engineers and data architects. The following table lists all the domain-specific user roles that are needed in data producer teams.
| Role | Responsibilities | Required skills | Desired outcomes |
| --- | --- | --- | --- |
| Data product owner | | Data analytics, data architecture, product management | |
| Data product technical lead | | Data engineering, data architecture, software engineering | |
| Data product support | | Software engineering, site reliability engineering (SRE) | |
| Subject matter expert (SME) for data domain | | Data analytics, data architecture | |
| Data owner | | | |
Data domain-based consumer teams
In a data mesh, the people who consume a data product are typically data users outside of the data product's domain. These data consumers use a central data catalog to find data products that are relevant to their needs. Because more than one data product might meet their needs, data consumers can end up subscribing to multiple data products.
If data consumers are unable to find the required data product for their use case, it's their responsibility to consult directly with the data mesh COE. During that consultation, data consumers can raise their data needs and seek advice on how to get those needs met by one or more domains.
When looking for a data product, data consumers look for data that helps them achieve various use cases, such as persistent analytics dashboards and reports, individual performance reports, and other business performance metrics. Alternatively, data consumers might look for data products that can be used in artificial intelligence (AI) and machine learning (ML) use cases. To achieve these various use cases, data consumer teams require a mix of data practitioner personas, which are as follows:
| Role | Responsibilities | Required skills | Desired outcomes |
| --- | --- | --- | --- |
| Data analyst | Searches for, identifies, evaluates, and subscribes to single-domain or cross-domain data products to create a foundation for business intelligence frameworks to operate. | Analytics engineering, business analytics | |
| Application developer | Develops an application framework for consumption of data across one or more data products, either inside or outside of the domain. | Application development, data engineering | |
| Data visualization specialist | | Requirement analysis, data visualization | |
| Data scientist | | ML engineering, analytics engineering | |
Central data governance team
The data governance team enables data producers and consumers to safely share, aggregate, and compute data in a self-service manner, without introducing compliance risks to the organization.
To meet the compliance requirements of the organization, the data governance team is a mix of data practitioner personas, which are as follows:
| Role | Responsibilities | Required skills | Desired outcomes |
| --- | --- | --- | --- |
| Data governance specialist | | Legal SME, security SME, data privacy SME | |
| Data steward (sits within each domain) | | Data architecture, data stewardship | |
| Data governance engineer | | Software engineering | |
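One concrete control that a data governance team commonly standardizes on Google Cloud is column-level access through policy tags. The following is a minimal Terraform sketch under assumed names; the taxonomy, tag, and reader group are illustrative:

```hcl
# Taxonomy of policy tags that the governance team defines once and that
# every data domain applies to sensitive columns.
resource "google_data_catalog_taxonomy" "governance_controls" {
  region                 = "us"
  display_name           = "governance-controls" # illustrative name
  activated_policy_types = ["FINE_GRAINED_ACCESS_CONTROL"]
}

resource "google_data_catalog_policy_tag" "pii" {
  taxonomy     = google_data_catalog_taxonomy.governance_controls.id
  display_name = "PII"
  description  = "Columns that hold personally identifiable information."
}

# Only members with the fine-grained reader role on this tag can read
# columns that carry it, regardless of their table-level access.
resource "google_data_catalog_policy_tag_iam_member" "pii_readers" {
  policy_tag = google_data_catalog_policy_tag.pii.name
  role       = "roles/datacatalog.categoryFineGrainedReader"
  member     = "group:pii-readers@example.com" # illustrative group
}
```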
Central self-service data infrastructure platform team
The self-service data infrastructure platform team, or just the data platform team, is responsible for creating a set of data infrastructure components. Distributed data domain teams use these components to build and deploy their data products. The data platform team also promotes best practices and introduces tools and methodologies that help to reduce the cognitive load for distributed teams when they adopt new technology.
Platform infrastructure should integrate easily with operations tooling for global observability, instrumentation, and compliance automation, or at least facilitate such integration, so that distributed teams are set up for success.
The data platform team has a shared responsibility model that it uses with the distributed domain teams and the underlying infrastructure team. The model shows what responsibilities are expected from the consumers of the platform, and what platform components the data platform team supports.
As the data platform is itself an internal product, the platform doesn't support every use case. Instead, the data platform team continuously releases new services and features according to a prioritized roadmap.
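For example, the platform team might package scaffolding like the earlier sketches into a reusable Terraform module that domain teams consume self-service. The module source and variables here are hypothetical:

```hcl
# Hypothetical self-service module maintained by the data platform team.
# A domain team creates a governed data product with a few declarations.
module "sales_orders_product" {
  source = "git::https://example.com/data-platform/terraform-data-product.git" # hypothetical repo

  domain_name    = "sales"
  product_name   = "orders"
  interface_type = "bigquery_authorized_view"
  consumer_group = "orders-consumers@example.com"
}
```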
The data platform team might have a standard set of components in place and in development. However, data domain teams might choose to use a different, unique set of components if the needs of a team don't align with those provided by the data platform. If data domain teams choose a different approach, they must ensure that any platform infrastructure that they build and maintain complies with organization-wide policies and guardrails for security and data governance. For data platform infrastructure that is developed outside of the central data platform team, the data platform team might either choose to co-invest or embed their own engineers into the domain teams. Whether the data platform team chooses to co-invest or embed engineers might depend on the strategic importance of the data domain platform infrastructure to the organization. By staying involved in the development of infrastructure by data domain teams, organizations can provide the alignment and technical expertise required to repackage any new platform infrastructure components that are in development for future reuse.
You might need to limit autonomy in the early stages of building a data mesh if your initial goal is to get approval from stakeholders for scaling up the data mesh. However, limiting autonomy risks creating a bottleneck at the central data platform team, and that bottleneck can inhibit the data mesh from scaling. Therefore, make any centralization decisions carefully. For data producers, making technical choices from a limited set of available options might be preferable to evaluating and choosing from an unlimited list of options themselves. Promoting the autonomy of data producers doesn't equate to creating an ungoverned technology landscape. Instead, the goal is to drive compliance and platform adoption by striking the right balance between freedom of choice and standardization.
Finally, a good data platform team is a central source of education and best practices for the rest of the company. Some of the most impactful activities that we recommend central data platform teams undertake are as follows:
- Fostering regular architectural design reviews for new functional projects, and proposing common development practices across teams.
- Sharing knowledge and experiences, and collectively defining best practices and architectural guidelines.
- Ensuring engineers have the right tools in place to validate code and check for common pitfalls such as code issues, bugs, and performance degradations.
- Organizing internal hackathons so that development teams can surface their internal tooling requirements.
Example roles and responsibilities for the central data platform team might include the following:
| Role | Responsibilities | Required skills | Desired outcomes |
| --- | --- | --- | --- |
| Data platform product owner | | Data strategy and operations, product management, stakeholder management | |
| Data platform engineer | | Data engineering, software engineering | |
| Platform and security engineer (a representative from central IT teams, such as networking and security, who is embedded in the data platform team) | | Infrastructure engineering, software engineering | |
| Enterprise architect | | Data architecture, solution iteration and problem solving, consensus building | |
Additional considerations for a data mesh
There are multiple architectural options for an analytics data platform, and each option has different prerequisites. For every data mesh architecture, we recommend that your organization follow the best practices described in this section.
Acquire platform funding
As explained in the blog post "If you want to transform start with finance", the platform is never finished: it always operates based on a prioritized roadmap. Therefore, the platform must be funded as a product, not as a project with a fixed endpoint.
The first adopter of the data mesh bears the cost. Usually, the cost is shared between the business unit that forms the first data domain and the central technology team, which generally houses the central data platform team.
To convince finance teams to approve funding for the central platform, we recommend that you make a business case for the value that the centralized platform delivers over time. That value comes from avoiding the cost of each individual delivery team reimplementing the same components.
Define the minimum viable platform for the data mesh
To help you define the minimum viable platform for the data mesh, we recommend that you pilot and iterate with one or more business cases. For your pilot, find use cases that the business needs and for which a consumer is ready to adopt the resulting data product. The use cases should already have funding to develop the data products, but they should also require input from technical teams.
Make sure the team that is implementing the pilot understands the data mesh operating model as follows:
- The business (that is, the data producer team) owns the backlog, support, and maintenance.
- The central team defines the self-service patterns and helps the business build the data product, but passes the data product to the business to run and own when it's complete.
- The primary goal is to prove the business operating model (domains produce, domains consume). The secondary goal is to prove the technical operating model (self-service patterns developed by the central team).
- Because platform team resources are limited, use the trunk and branch teams model to pool knowledge but still allow for the development of specialized platform services and products.
We also recommend that you do the following:
- Plan roadmaps rather than letting services and features evolve organically.
- Define minimum viable platform capabilities spanning ingest, storage, processing, analysis, and ML.
- Embed data governance in every step, not as a separate workstream.
- Put in place the minimum capabilities across governance, platform, value-stream, and change management. Minimum capabilities are those that meet 80% of business cases.
Plan for the co-existence of the data mesh with an existing data platform
Many organizations that want to implement a data mesh likely already have an existing data platform, such as a data lake, data warehouse, or a combination of both. Before implementing a data mesh, these organizations must make a plan for how their existing data platform can evolve as the data mesh grows.
These organizations should consider factors such as the following:
- The data resources that are most effective on the data mesh.
- The assets that must stay within the existing data platform.
- Whether assets have to move, or whether they can be maintained on the existing platform and still participate in the data mesh.
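For example, an asset can often stay on the existing platform and still participate in the data mesh. The following hedged Terraform sketch exposes Parquet files in an existing data-lake bucket through a BigQuery external table, so the files don't have to move; the bucket and dataset names are illustrative assumptions:

```hcl
# External table that lets existing data-lake files participate in the
# data mesh without being migrated into BigQuery storage.
resource "google_bigquery_table" "lake_orders" {
  dataset_id = "sales_data_product" # illustrative product dataset
  table_id   = "lake_orders_v1"

  external_data_configuration {
    autodetect    = true
    source_format = "PARQUET"
    source_uris   = ["gs://existing-lake-bucket/orders/*.parquet"] # illustrative bucket
  }
}
```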
What's next
- To learn more about designing and operating a cloud topology, see the Google Cloud Architecture Framework.
- For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.