Parallelstore is a fully managed, low-latency distributed file system designed to meet the demands of high performance computing (HPC) and data-intensive applications.
Parallelstore is ideal for use cases where multiple clients need concurrent access to shared files while preserving data integrity.
Parallelstore supports the POSIX standard, ensuring compatibility with a wide range of existing applications and tools, simplifying migration and integration.
Parallelstore instances can be mounted to Compute Engine VMs or Google Kubernetes Engine clusters. The Parallelstore CSI driver enables customers to use Kubernetes APIs to access the file system as volumes for their stateful workloads.
Batch data transfers into and out of Cloud Storage are available from the command line and the REST API.
Specifications
Parallelstore is a "scratch" file system: it's backed by local SSD with 2+1 erasure coding, and has a mean time to data loss (MTTDL) of 2 to 16 months, depending on instance capacity. See the Performance table for details.
Usable capacity can be configured from 12 TiB to 100 TiB.
Supported in multiple regions.
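To illustrate what "2+1 erasure coding" means in general, the following is a minimal sketch that uses XOR parity: two data shards plus one parity shard, so any single lost shard can be rebuilt but losing two cannot. The encoding Parallelstore actually uses is not described here; XOR parity and the function names `encode_2_plus_1` and `recover` are assumptions chosen for illustration.

```python
from typing import Optional, Tuple


def _xor(left: bytes, right: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def encode_2_plus_1(data: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split data into two data shards and one XOR parity shard.

    Toy example: odd-length input is padded with one zero byte, and the
    padding is not stripped again on recovery.
    """
    if len(data) % 2:
        data += b"\x00"
    half = len(data) // 2
    shard_a, shard_b = data[:half], data[half:]
    return shard_a, shard_b, _xor(shard_a, shard_b)


def recover(
    shard_a: Optional[bytes],
    shard_b: Optional[bytes],
    parity: Optional[bytes],
) -> bytes:
    """Rebuild the original data when at most one shard is missing (None)."""
    missing = [s for s in (shard_a, shard_b, parity) if s is None]
    if len(missing) > 1:
        raise ValueError("2+1 coding cannot recover from two lost shards")
    if shard_a is None:
        shard_a = _xor(shard_b, parity)
    elif shard_b is None:
        shard_b = _xor(shard_a, parity)
    return shard_a + shard_b


a, b, p = encode_2_plus_1(b"scratch!")
restored = recover(None, b, p)  # simulate losing the first data shard
```

Because the parity shard adds one shard for every two data shards, raw storage in a 2+1 scheme is 1.5 times the usable capacity, and a second failure before the first is repaired results in data loss.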
Performance
Expected performance from Parallelstore is shown in the following table.
Metric | Result |
---|---|
Write throughput | 0.5 GiBps per TiB |
Read throughput | 1.15 GiBps per TiB |
Read IOPS | 30k IOPS per TiB |
Write IOPS | 10k IOPS per TiB |
4K read latency | 0.3 ms |
Number of client processes supported | 4000 |
Transfer speed (Parallelstore <> Cloud Storage) | Maximum transfer rate of 20 GiBps or 5000 files per second |
Mean time to data loss (MTTDL) | 100 TiB capacity: 2 months; 48 TiB capacity: 4 months; 12 TiB capacity: 16 months |
These numbers are measured using 256 client connections to a single instance. Latency is measured from a single client. Directory and file striping settings are optimized for each metric.
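The per-TiB figures in the table can be turned into a rough estimate for a given instance size. The sketch below assumes performance scales linearly with usable capacity and that the Cloud Storage transfer is bound by whichever of the two listed limits is hit first. Both are simplifying assumptions, and the helper names `expected_performance` and `min_transfer_seconds` are hypothetical. Real results depend on client count and striping settings, as noted above.

```python
from dataclasses import dataclass

# Per-TiB figures copied from the performance table.
WRITE_GIBPS_PER_TIB = 0.5
READ_GIBPS_PER_TIB = 1.15
READ_IOPS_PER_TIB = 30_000
WRITE_IOPS_PER_TIB = 10_000

# Maximum transfer rates between Parallelstore and Cloud Storage.
MAX_TRANSFER_GIBPS = 20
MAX_TRANSFER_FILES_PER_SEC = 5000


@dataclass
class ExpectedPerformance:
    write_gibps: float
    read_gibps: float
    read_iops: int
    write_iops: int


def expected_performance(capacity_tib: float) -> ExpectedPerformance:
    """Scale the per-TiB table figures linearly (an assumption)."""
    if not 12 <= capacity_tib <= 100:
        raise ValueError("usable capacity must be between 12 and 100 TiB")
    return ExpectedPerformance(
        write_gibps=capacity_tib * WRITE_GIBPS_PER_TIB,
        read_gibps=capacity_tib * READ_GIBPS_PER_TIB,
        read_iops=round(capacity_tib * READ_IOPS_PER_TIB),
        write_iops=round(capacity_tib * WRITE_IOPS_PER_TIB),
    )


def min_transfer_seconds(total_gib: float, file_count: int) -> float:
    """Lower bound on transfer time: the slower of the two limits applies."""
    by_bandwidth = total_gib / MAX_TRANSFER_GIBPS
    by_file_rate = file_count / MAX_TRANSFER_FILES_PER_SEC
    return max(by_bandwidth, by_file_rate)


perf = expected_performance(48)
seconds = min_transfer_seconds(total_gib=2000, file_count=1_000_000)
```

Under these assumptions, a 48 TiB instance would be expected to reach roughly 24 GiBps for writes and 55 GiBps for reads. A transfer of 2,000 GiB spread across one million files is limited by the file rate rather than by bandwidth, so it takes at least 200 seconds.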
Use Cases
High-performance computing: Parallelstore excels in HPC environments where multiple compute nodes need fast and consistent access to shared data for simulations, modeling, and analysis.
Machine learning: Parallelstore can handle the large datasets and high throughput requirements of machine learning workloads, enabling efficient training and inference.
Pricing
See the Pricing page for details.