Build a data mesh
You can use Dataplex Universal Catalog to build a data mesh architecture. This quickstart
shows you how to use Dataplex Universal Catalog features, such as a lake, zones, and
assets, to build a data mesh.
A data mesh is an organizational and technical approach that decentralizes data
ownership among domain data owners. These owners provide the data as a product
in a standard way and facilitate communication among different parts of the
organization to distribute datasets across different locations. Learn more about
data mesh architectures.
Objectives
In this guide, you use Dataplex Universal Catalog entities to build a
data mesh architecture:
Create a Dataplex Universal Catalog lake that acts as the domain for your data
mesh.
Add zones to your lake that represent individual teams within the
domain and provide managed data contracts.
Attach assets that map to data stored in Cloud Storage.
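The three objectives above nest inside one another: a lake contains zones, and a zone contains assets. The following is a minimal sketch of that hierarchy, assuming the standard Dataplex resource-name pattern (projects/.../locations/.../lakes/.../zones/.../assets/...). The classes are illustrative only, and the project, lake, zone, asset, and bucket IDs are hypothetical placeholders rather than values generated by Dataplex Universal Catalog.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    bucket: str  # Cloud Storage bucket that the asset maps to


@dataclass
class Zone:
    zone_id: str
    zone_type: str  # "RAW" or "CURATED"
    assets: list[Asset] = field(default_factory=list)


@dataclass
class Lake:
    project: str
    location: str
    lake_id: str
    zones: list[Zone] = field(default_factory=list)

    def resource_names(self) -> list[str]:
        """Return the resource name of the lake and of everything nested in it."""
        lake_name = (
            f"projects/{self.project}/locations/{self.location}/lakes/{self.lake_id}"
        )
        names = [lake_name]
        for zone in self.zones:
            zone_name = f"{lake_name}/zones/{zone.zone_id}"
            names.append(zone_name)
            for asset in zone.assets:
                names.append(f"{zone_name}/assets/{asset.asset_id}")
        return names


# Hypothetical IDs that mirror the display names used later in this guide.
mesh = Lake(
    project="my-project",
    location="us-central1",
    lake_id="my-data-mesh",
    zones=[Zone("my-sub-domain", "RAW", [Asset("data-mesh-asset", "my-bucket")])],
)

for name in mesh.resource_names():
    print(name)
```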
Costs
In this document, you use the following billable components of Google Cloud:
Dataplex Universal Catalog
Cloud Storage
To generate a cost estimate based on your projected usage,
use the pricing calculator.
New Google Cloud users might be eligible for a free trial.
When you finish the tasks that are described in this document, you can avoid
continued billing by deleting the resources that you created. For more information, see
Clean up.
Before you begin
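In the Google Cloud console, on the project selector page,
select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a
project instead of selecting an existing project. After you finish these steps, you can
delete the project, removing all resources associated with the project.
Verify that billing is enabled for your Google Cloud project.
Enable the Dataplex API.
Create a Dataproc Metastore service.
Note: You can attach each Dataproc Metastore to only one Dataplex Universal Catalog lake.
Enable gRPC for your metastore.
Create a Cloud Storage bucket
You need a Cloud Storage bucket to store the data assets of your data
mesh.
To create a Cloud Storage bucket, follow the instructions for creating a
Cloud Storage bucket. When doing so, note the following:
Name your bucket.
For Location type, choose Region, and select us-central1 (Iowa) from the menu.
Create a domain
In the Google Cloud console, go to the Dataplex Universal Catalog page.
Navigate to the Manage view.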
Click Create to create a new lake, which acts as your data mesh.
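In the Display name field, enter My data mesh.
Note: Dataplex Universal Catalog automatically generates a lake ID.
For Region, select us-central1.
Note: The region you select for your data mesh determines the location of the data (not
including attached assets) managed by Dataplex Universal Catalog. The same region is used
when Dataplex Universal Catalog creates resources in other services, but not for data
contained within assets.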
Select the Dataproc Metastore service that you created and
configured earlier as the associated metastore.
Click Create.
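The console steps above can also be expressed as a request to the Dataplex REST API. The following is a minimal sketch that only builds the request and does not send it, assuming the lakes.create method with its lakeId query parameter and the displayName and metastore.service fields of the Lake resource. The project ID, lake ID, and metastore service name are hypothetical placeholders; authentication and the actual HTTP call are left out.

```python
import json


def lake_create_request(project, location, lake_id, display_name, metastore_service):
    """Build (but do not send) a lakes.create request for the Dataplex REST API."""
    parent = f"projects/{project}/locations/{location}"
    return {
        "method": "POST",
        "url": f"https://dataplex.googleapis.com/v1/{parent}/lakes",
        "params": {"lakeId": lake_id},
        "body": {
            "displayName": display_name,
            # Full resource name of the Dataproc Metastore service to associate.
            "metastore": {"service": metastore_service},
        },
    }


# Hypothetical values: replace them with your own project, lake ID, and metastore.
request = lake_create_request(
    project="my-project",
    location="us-central1",
    lake_id="my-data-mesh",
    display_name="My data mesh",
    metastore_service="projects/my-project/locations/us-central1/services/my-metastore",
)
print(json.dumps(request, indent=2))
```

Building the request as plain data keeps the sketch independent of any client library. To run it against a real project, you would send the body with an authorized HTTP client of your choice.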
Create zones in your lake
After you create a domain by creating a Dataplex Universal Catalog lake, you use zones
to represent individual teams and managed data contracts within the domain.
There are two types of zones:
Raw zones are typically used to store data in any format from external sources
in Cloud Storage. Raw zones are useful for data that requires further
processing before it's ready for consumption.
Curated zones are used for structured data in Cloud Storage that must
conform to certain file formats and be organized in a Hive-compatible
directory layout. They are most useful for data that's ready for consumption
and analysis.
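To make the directory layout required by curated zones concrete, the following sketch builds and parses a Hive-style object path, in which each partition is a key=value directory segment. The bucket, table, partition keys, and file name are hypothetical, and the Parquet file name is only an example of a structured format; this is an illustration of the naming convention, not code that Dataplex Universal Catalog runs.

```python
def hive_partition_path(bucket, table, partitions, filename):
    """Build a Cloud Storage URI that uses Hive-style key=value partition directories."""
    partition_dirs = "/".join(f"{key}={value}" for key, value in partitions.items())
    return f"gs://{bucket}/{table}/{partition_dirs}/{filename}"


def parse_partitions(uri):
    """Recover the partition keys and values from a Hive-style path."""
    return dict(
        segment.split("=", 1) for segment in uri.split("/") if "=" in segment
    )


# Hypothetical bucket, table, and partition values.
uri = hive_partition_path(
    bucket="my-bucket",
    table="orders",
    partitions={"year": "2025", "month": "09"},
    filename="part-0000.parquet",
)
print(uri)
print(parse_partitions(uri))
```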
Each domain (for example, sales, customers, products) should have at least
a raw zone and a curated zone.
Additional zones are used to manage data contracts between teams or to provide a
more granular breakdown for teams within a given domain, for example, inventory
management within the product domain. Data owners can manage and access the data
within their domain.
In the Google Cloud console, navigate to the Dataplex Universal Catalog
Manage view.
Click the name of the lake (My data mesh) you want to add a zone to.
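In the Zones tab, click Add Zone.
In the Display name field, enter My sub domain. Dataplex Universal Catalog
automatically generates an ID for your zone.
Note: The zone name becomes the name of a BigQuery dataset. Therefore, all zones hosted in
the same Google Cloud project must have a unique ID, even if they exist within different
lakes.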
For Type, select Raw zone.
Click Create.
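As a rough equivalent of the steps above, the following sketch builds zones.create requests for the Dataplex REST API without sending them. It assumes the zoneId query parameter and the displayName, type, and resourceSpec.locationType fields of the Zone resource; discovery settings are omitted. The second, curated zone is included only to illustrate the recommendation that each domain has both a raw zone and a curated zone, and all IDs are hypothetical.

```python
def zone_create_request(project, location, lake_id, zone_id, display_name, zone_type):
    """Build (but do not send) a zones.create request for the Dataplex REST API."""
    if zone_type not in ("RAW", "CURATED"):
        raise ValueError(f"zone_type must be RAW or CURATED, got {zone_type!r}")
    parent = f"projects/{project}/locations/{location}/lakes/{lake_id}"
    return {
        "method": "POST",
        "url": f"https://dataplex.googleapis.com/v1/{parent}/zones",
        "params": {"zoneId": zone_id},
        "body": {
            "displayName": display_name,
            "type": zone_type,
            # Assumption: the bucket used in this guide lives in a single region.
            "resourceSpec": {"locationType": "SINGLE_REGION"},
        },
    }


# Hypothetical zone IDs; the curated zone is not created in the console steps.
zone_requests = [
    zone_create_request("my-project", "us-central1", "my-data-mesh", zone_id, name, kind)
    for zone_id, name, kind in [
        ("my-sub-domain", "My sub domain", "RAW"),
        ("my-sub-domain-curated", "My sub domain (curated)", "CURATED"),
    ]
]

for req in zone_requests:
    print(req["params"]["zoneId"], req["body"]["type"])
```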
Attach assets to your zones
Attach data assets to your zone. A data asset is a storage resource that
contains your data. It can be a Cloud Storage bucket or a
BigQuery dataset. This is the final step in creating your data
mesh architecture.
In the Dataplex Universal Catalog Manage view, click the lake you created
(My data mesh).
In the Zones tab, click the zone (My sub domain) to add the asset to.
In the Assets tab, click Add assets.
Click Add an Asset.
For Type, select Cloud Storage bucket.
In the Display name field, enter Data mesh asset. Dataplex Universal Catalog
automatically generates an asset ID for you.
In the Bucket field, click Browse.
Select your bucket from the list.
Click Select.
Click Done and then click Continue.
Click Continue to accept the default Advanced settings.
Click Submit.
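The asset attachment above can be sketched as an assets.create request in the same way. This example only builds the request; it assumes the assetId query parameter and the displayName, resourceSpec.name, and resourceSpec.type fields of the Asset resource. The bucket name and all IDs are hypothetical, and whether the bucket path takes a project ID or a project number is an assumption you should check against the API reference.

```python
def asset_create_request(
    project, location, lake_id, zone_id, asset_id, display_name, bucket
):
    """Build (but do not send) an assets.create request for a Cloud Storage bucket."""
    parent = (
        f"projects/{project}/locations/{location}/lakes/{lake_id}/zones/{zone_id}"
    )
    return {
        "method": "POST",
        "url": f"https://dataplex.googleapis.com/v1/{parent}/assets",
        "params": {"assetId": asset_id},
        "body": {
            "displayName": display_name,
            "resourceSpec": {
                # Assumption: the bucket is referenced relative to its project.
                "name": f"projects/{project}/buckets/{bucket}",
                "type": "STORAGE_BUCKET",
            },
        },
    }


# Hypothetical IDs and bucket name.
asset_request = asset_create_request(
    project="my-project",
    location="us-central1",
    lake_id="my-data-mesh",
    zone_id="my-sub-domain",
    asset_id="data-mesh-asset",
    display_name="Data mesh asset",
    bucket="my-bucket",
)
print(asset_request["url"])
print(asset_request["body"]["resourceSpec"])
```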
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this
tutorial, either delete the project that contains the resources, or keep the project and
delete the individual resources.
Delete the project
Caution: Deleting a project has the following effects:
Everything in the project is deleted. If you used an existing project for the tasks in
this document, when you delete it, you also delete any other work you've done in the
project.
Custom project IDs are lost. When you created this project, you might have created a
custom project ID that you want to use in the future. To preserve the URLs that use the
project ID, such as an appspot.com URL, delete selected resources inside the project
instead of deleting the whole project.
If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects
can help you avoid exceeding project quota limits.
In the Google Cloud console, go to the Manage resources page.
In the project list, select the project that you want to delete, and then click Delete.
In the dialog, type the project ID, and then click Shut down to delete the project.
Delete your data mesh architecture
In the Google Cloud console, navigate to the Dataplex Universal Catalog
Manage view.
For the lake that you want to delete, click View more, and then click Delete.
To confirm the action, enter delete and click Delete lake.
What's next
Learn about data processing tasks
Learn about discovering data
Learn about using data quality tasks
Last updated 2025-09-03 UTC.