Hierarchical Namespace

This page provides an overview of hierarchical namespace, including its key features, common use cases, benefits, and limitations.

Overview

Hierarchical namespace is a capability offered by Cloud Storage that lets you organize objects into folders. With hierarchical namespace, you can store your data in a logical file system structure. Organizing your data in a file system structure enhances performance, ensures consistency, and simplifies the management of data-intensive and file-oriented workloads.

Folder management operations, which include creating, deleting, listing, and renaming folders, provide reliability and management capabilities. The hierarchical organization of objects simplifies data organization and streamlines data management tasks.

A folder in a bucket with hierarchical namespace enabled can contain objects, other folders, or a combination of both. The following diagram shows an example of a bucket with hierarchical namespace enabled where objects are organized in a hierarchical structure of folders.

Figure 1. Bucket hierarchy with folders and objects.

Key features

Hierarchical namespace provides the following features:

  • Higher initial queries per second (QPS): Buckets with hierarchical namespace enabled offer a higher initial QPS for read and write operations compared to buckets without hierarchical namespace enabled. The higher initial QPS makes it easier to scale data-intensive workloads and provides enhanced throughput.

  • Folders: Folders act as containers for objects and other folders, with support for operations such as creating, deleting, and getting folders.

  • Rename folders: The rename folders operation lets you atomically rename the path of a folder and its underlying folders without deleting any objects. This operation is efficient and time-saving, especially for large folders that contain many objects.

  • List folders: The list folders operation lists all folders in the bucket or underneath a specific folder, helping you to manage and understand the structure of the data stored within a bucket.
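
To illustrate why an atomic folder rename matters, the following minimal sketch contrasts a flat namespace, where a "folder" is only a shared name prefix and a rename has to touch every matching object, with a toy hierarchical model, where a folder is a real entry and a rename updates that one entry. This is a simplified in-memory illustration, not how Cloud Storage is implemented; the function names `rename_prefix_flat` and `rename_folder_tree` and the sample data are hypothetical.

```python
def rename_prefix_flat(objects, old_prefix, new_prefix):
    """Flat namespace: rewrite every object whose name starts with old_prefix.

    Returns the new mapping and the number of per-object operations needed.
    """
    renamed = {}
    touched = 0
    for name, data in objects.items():
        if name.startswith(old_prefix):
            renamed[new_prefix + name[len(old_prefix):]] = data
            touched += 1
        else:
            renamed[name] = data
    return renamed, touched


def rename_folder_tree(parent, old_name, new_name):
    """Toy hierarchical model: a folder is a dict entry in its parent.

    Renaming moves a single entry; the folder's contents are not touched.
    """
    if new_name in parent:
        raise ValueError(f"destination folder {new_name!r} already exists")
    parent[new_name] = parent.pop(old_name)
    return 1  # one operation, however many objects the folder holds


flat = {"logs/2024/a.txt": b"a", "logs/2024/b.txt": b"b", "data/c.txt": b"c"}
flat_after, flat_ops = rename_prefix_flat(flat, "logs/", "archive/")

tree = {"logs": {"2024": {"a.txt": b"a", "b.txt": b"b"}}, "data": {"c.txt": b"c"}}
tree_ops = rename_folder_tree(tree, "logs", "archive")

print(flat_ops, tree_ops)  # prints "2 1"
```

In the flat model the cost grows with the number of objects under the prefix, and a failure partway through can leave names split between the old and new prefix. In the toy hierarchical model the rename is a single step.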

You can enable hierarchical namespace for a bucket when you create the bucket. Before enabling hierarchical namespace for your bucket, you should consider the limitations of hierarchical namespace. For information about hierarchical namespace limitations, see Limitations.
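
As an illustration, a bucket-creation request body with hierarchical namespace turned on might be assembled as follows. This is a sketch under stated assumptions, not a definitive implementation. The field names `hierarchicalNamespace` and `iamConfiguration.uniformBucketLevelAccess` follow the Cloud Storage JSON API bucket resource as understood here. The need to enable uniform bucket-level access is an assumption that this page does not state, although it is consistent with object ACLs being unsupported. The helper name `build_bucket_body`, the bucket name, and the location are hypothetical. Check the current API reference before relying on any of these.

```python
import json


def build_bucket_body(name, location, enable_hns=True):
    """Build a JSON request body for creating a bucket (illustrative only)."""
    body = {"name": name, "location": location}
    if enable_hns:
        # Per this page, hierarchical namespace is enabled when the bucket is created.
        body["hierarchicalNamespace"] = {"enabled": True}
        # Assumption: uniform bucket-level access must be enabled as well.
        body["iamConfiguration"] = {
            "uniformBucketLevelAccess": {"enabled": True}
        }
    return body


body = build_bucket_body("my-hns-bucket", "us-central1")
print(json.dumps(body, indent=2))
```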

When should you enable hierarchical namespace for your bucket

You should consider enabling hierarchical namespace when using applications that expect a file system-like hierarchy and semantics. Hierarchical namespace is beneficial for data-intensive tasks like analytics, AI, and ML workloads. Here are some common scenarios where you should consider using hierarchical namespace:

  • Hadoop-based processing: Hadoop and Spark workloads traditionally expect a file system-like storage structure and time-based naming for files and folders. Hierarchical namespace integrates with the Cloud Storage connector to provide enhanced throughput and atomic folder renames, improving data integrity and consistency for many data processing pipelines.

  • File-oriented workloads processing: Workloads such as batch analytics processing, financial services, or high performance computing are structured into partitions based on a hierarchy of folders and files. Hierarchical namespace helps to manage these environments with a dedicated API for folder management. Additionally, hierarchical namespace simplifies managing folders that contain other folders and objects. With a single API command, you can swiftly rename a folder along with all its contents, saving valuable time and resources.

  • AI and ML processing: AI and ML tools such as TensorFlow, Pandas, and PyTorch expect file system-like access and semantics. Hierarchical namespace, especially when combined with Cloud Storage FUSE, delivers increased throughput and efficient data access. As a result, hierarchical namespace enhances the performance and reliability of ML model iteration.
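
The single-command folder rename mentioned above is commonly used as a commit step: a job writes its output under a staging folder and then renames that folder to its final path. The sketch below only builds the request URL for such a rename. The endpoint shape (`/b/BUCKET/folders/SOURCE/renameTo/folders/DESTINATION`, sent as a POST with folder paths URL-encoded) is an assumption based on the Cloud Storage JSON API and should be verified against the API reference. The helper name `folder_rename_url` and the bucket and folder names are hypothetical.

```python
from urllib.parse import quote

API_ROOT = "https://storage.googleapis.com/storage/v1"  # assumed JSON API root


def folder_rename_url(bucket, source_folder, destination_folder):
    """Return the assumed URL for a folder rename request (sent as a POST)."""
    # safe="" also percent-encodes "/" so each folder path stays one URL segment.
    src = quote(source_folder, safe="")
    dst = quote(destination_folder, safe="")
    return f"{API_ROOT}/b/{quote(bucket, safe='')}/folders/{src}/renameTo/folders/{dst}"


url = folder_rename_url("my-bucket", "staging/run-01/", "final/run-01/")
print(url)
```

Sending the request also requires an authorization header, which is omitted here to keep the example self-contained.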

Benefits of hierarchical namespace

When you enable hierarchical namespace for your buckets, you can do the following:

  • Optimize organization: You can organize your data into a hierarchical folder structure that helps you to manage and locate files or datasets.

  • Establish a file system-like ecosystem: Hierarchical namespace introduces file system-like features such as folders, folder renaming, and folder listing, which are beneficial for file-oriented applications, including the Hadoop ecosystem and AI and ML workloads.

  • Improve performance: By scaling data-intensive workloads to handle higher throughput, you can enhance the overall performance of your application.

Platform support

Buckets with hierarchical namespace support the following Cloud Storage platform capabilities:

  • All the Cloud Storage object APIs and widely used Cloud Storage features. For details about any unsupported features, see Limitations.

  • Data transfer from a standard bucket to a bucket with hierarchical namespace using Storage Transfer Service.

  • Integration with the following products:

Limitations

The following are the limitations of hierarchical namespace:

  • The following Cloud Storage capabilities are not supported in preview for buckets that use hierarchical namespace:

    • Soft delete
    • Autoclass
    • Object versioning
    • Object ACLs
    • Object retention lock
    • Bucket lock
  • While you can view buckets created with hierarchical namespace in the Google Cloud console, you cannot manage their folders using the Google Cloud console. We recommend using the command line, REST APIs, or client libraries for folder management.
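
Before creating a bucket with hierarchical namespace, it can help to check the features you plan to use against the list above. The following is a minimal sketch of such a check. The feature keys are hypothetical labels chosen for this example (they are not API field names), and `lifecycle_rules` is included only as a sample of a feature that is not on the unsupported list.

```python
# Features this page lists as unsupported (in preview) with hierarchical namespace.
UNSUPPORTED_WITH_HNS = {
    "soft_delete",
    "autoclass",
    "object_versioning",
    "object_acls",
    "object_retention_lock",
    "bucket_lock",
}


def find_conflicts(requested_features):
    """Return the requested features that conflict with hierarchical namespace."""
    return sorted(set(requested_features) & UNSUPPORTED_WITH_HNS)


conflicts = find_conflicts(["autoclass", "lifecycle_rules", "object_versioning"])
print(conflicts)  # prints "['autoclass', 'object_versioning']"
```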
