[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["上次更新時間:2025-09-04 (世界標準時間)。"],[],[],null,["# Performance considerations\n\nThis page provides guidance on configuring your Parallelstore environment to\nobtain the best performance.\n\nGeneral recommendations\n-----------------------\n\n- Remove any alias from `ls` for improved default performance.\n On many systems, it is aliased to `ls -color=auto` which is much slower with\n the default Parallelstore configuration.\n\n- If the performance of list operations is slow, consider enabling caching for\n the dfuse mount.\n\nInterception library\n--------------------\n\nThe `libioil` library can be used to improve the performance of read and write\noperations to DFuse from applications which use libc. The library bypasses the\nkernel by intercepting POSIX read and write calls from the application so as to\nservice them directly in user-space. See\n[Interception library](/parallelstore/docs/interception-library) for\nmore details.\n\nIn most cases, we recommend using the interception library on a per-process or\nper-application invocation.\n\nSituations in which you may not want or need to use the interception library\ninclude the following:\n\n- Only applications built with libc can use the interception library.\n- If you have a workload that benefits from caching, such as accessing the same files repeatedly, we recommend not using the interception library.\n- If your workload is metadata-intensive, such as working with many small files, or a very large directory listing, the interception library likely won't improve performance.\n\nThe `LD_PRELOAD` invocation can be set as an environment variable in your\nshell environment, but doing so can sometimes cause problems. We recommend\ninstead specifying it with each command.\n\nAlternatively, it's possible to link the interception library into your\napplication at compile time with the `-lioil` flag.\n\n`dfuse` caching\n---------------\n\nCaching is enabled in `dfuse` by default.\n\nThere are two cache-related flags used by `dfuse` when mounting a\nParallelstore instance:\n\n- `--disable-wb-cache` uses write-through rather than write-back caching.\n- `--disable-caching` disables all caching.\n\nThe following suggestions apply to caching and performance:\n\n- If you're using the interception library, write-back caching is bypassed. We recommend specifying `--disable-wb-cache` when using the interception library.\n- If your workload involves reading many files once, you should disable caching.\n- For workloads that involve many clients modifying files, and the updates need to be available immediately to other clients, you must disable caching.\n- If your workload is reading the same files repeatedly, caching can improve performance. This is particularly true if the files fit into your clients' memory. `dfuse` uses the Linux page cache for its caching.\n- For workloads which consist of small I/Os to large files, in addition to\n enabling caching, increasing dfuse read-ahead may be beneficial. 
`dfuse` caching
---------------

Caching is enabled in `dfuse` by default.

There are two cache-related flags used by `dfuse` when mounting a
Parallelstore instance:

- `--disable-wb-cache` uses write-through rather than write-back caching.
- `--disable-caching` disables all caching.

The following suggestions apply to caching and performance:

- The interception library bypasses write-back caching, so we recommend specifying `--disable-wb-cache` when using the interception library.
- If your workload involves reading many files once, you should disable caching.
- For workloads where many clients modify files and the updates must be available immediately to other clients, you must disable caching.
- If your workload reads the same files repeatedly, caching can improve performance. This is particularly true if the files fit into your clients' memory. `dfuse` uses the Linux page cache for its caching.
- For workloads that consist of small I/Os to large files, in addition to
  enabling caching, increasing dfuse read-ahead may be beneficial. To increase
  dfuse read-ahead, after `dfuse` has been mounted, run the following commands:

      echo 4096 > /sys/class/bdi/$(mountpoint -d /mnt)/read_ahead_kb
      echo 100 > /sys/class/bdi/$(mountpoint -d /mnt)/max_ratio

If your workloads involve a mixture of the preceding scenarios, you can mount
the same Parallelstore instance to multiple mount points with different caching
settings.
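For example, the following commands sketch mounting one instance at two mount
points: one with caching left enabled for repeated-read workloads, and one
with caching disabled for workloads that need immediate consistency. The mount
points and the pool and container names (`default-pool`, `default-container`)
are assumptions; substitute the values for your instance.

    # Mount with caching enabled (the default) for repeated-read workloads
    dfuse --mountpoint=/mnt/parallelstore-cached --pool=default-pool --container=default-container

    # Mount the same instance with all caching disabled for immediate consistency
    dfuse --mountpoint=/mnt/parallelstore-uncached --pool=default-pool --container=default-container --disable-caching

Workloads that reread the same files can then use the cached mount point,
while workloads that must see other clients' updates immediately use the
uncached one.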
Thread count and event queue count
----------------------------------

When mounting your Parallelstore instance, we recommend the following values
for `--thread-count` and `--eq-count`:

- The thread count value should not exceed the number of vCPU cores.
- The maximum recommended thread count value is between 16 and 20. Beyond this number, there is little or no performance benefit, regardless of the number of available cores.
- The event queue value should be half of the thread count value.

If your workload involves a very high number of small file operations and heavy
metadata access, you can experiment with increasing the numbers beyond these
recommendations.

File striping setting
---------------------

File striping is a data storage technique in which a file is divided into
blocks, or stripes, and distributed across multiple storage targets. File
striping can increase performance by allowing parallel reads and writes to more
than one storage target backing the instance.

When creating your Parallelstore instance, you can specify one of three file
striping settings:

- Minimum
- Balanced
- Maximum

These settings can have a significant impact on your Parallelstore performance.
We recommend the balanced setting, which is a reasonable compromise for most
workloads. If the performance with the balanced setting is not acceptable:

- The minimum setting may improve performance for workloads with many small
  files, particularly when the average file size is less than 256 KB.

- The maximum setting may improve performance for workloads with very large
  files, generally greater than 8 GB, especially when many clients share
  access to the same files.

For advanced tuning, the `daos` tool provides per-file and per-directory
settings. Experimenting with advanced tuning comes with performance-related
risks and is generally not recommended. See
[Understanding Data Redundancy and Sharding in DAOS](https://www.intel.com/content/www/us/en/developer/articles/technical/understanding-data-redundancy-and-sharding-in-daos.html) for more
details.

Directory striping setting
--------------------------

When creating your Parallelstore instance, you can specify one of three directory
striping settings:

- Minimum
- Balanced
- Maximum

For most workloads, we recommend the maximum setting.

For workloads that involve frequent listing of large directories, the balanced
or minimum settings can result in better list performance. However, the
performance of other operations, particularly file creation, may suffer.

multi-user
----------

When using the `dfuse` tool to mount your Parallelstore instance, we recommend
specifying the `--multi-user` flag. This flag tells the kernel to make the file
system available to all users on a client, rather than only the user running
the DFuse process. DFuse then appears like a generic multi-user file
system, and the standard `chown` and `chgrp` calls are enabled. All file
system entries are owned by the user that created them, as is normal in a
POSIX file system.

When specifying the `--multi-user` flag, you must also update `/etc/fuse.conf`
as root by adding the following line:

    user_allow_other

There doesn't appear to be a performance implication to mounting your instance
as multi-user.

Erasure coding setting
----------------------

Erasure coding is set to 2+1. This setting cannot be changed. Any I/O that
doesn't use EC 2+1 is rejected.

Google Kubernetes Engine sidecar container resource allocation
--------------------------------------------------------------

In most cases, unsatisfactory performance with Google Kubernetes Engine and Parallelstore is
caused by insufficient CPU or memory allocated to the Parallelstore sidecar
container. To properly allocate resources, consider the following suggestions:

- Read the considerations highlighted in [Configure resources for the sidecar
  container](/parallelstore/docs/connect-from-kubernetes-engine#configure_resources_for_the_sidecar_container).
  It explains why you might need to increase the resource allocation, and how
  to configure the sidecar container resource allocation using Pod
  annotations.

- You can use the value `0` to turn off any resource limits or requests on
  Standard clusters. For example, setting
  `gke-parallelstore/cpu-limit: 0` and `gke-parallelstore/memory-limit: 0` leaves the
  sidecar container's CPU and memory limits empty, so the default
  requests are used. This setting is useful when you don't know how many
  resources dfuse needs for your workloads and want it to use all available
  resources on a node. Once you've determined how many resources dfuse needs
  based on your workload metrics, you can set appropriate limits. An example
  manifest follows this list.

- On Autopilot clusters, you cannot use the value `0` to unset the sidecar
  container resource limits and requests. You have to explicitly set a larger
  resource limit for the sidecar container on Autopilot
  clusters, and rely on Google Cloud metrics to decide whether increasing the
  resource limit is needed.
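As a sketch of the Standard-cluster annotations described in this section, the
following hypothetical Pod manifest unsets the sidecar limits. The annotation
keys are the ones named above; the Pod name, container, and image are
placeholders for your own workload.

    apiVersion: v1
    kind: Pod
    metadata:
      name: parallelstore-workload            # hypothetical Pod name
      annotations:
        gke-parallelstore/cpu-limit: "0"      # leave the sidecar CPU limit empty (Standard clusters only)
        gke-parallelstore/memory-limit: "0"   # leave the sidecar memory limit empty (Standard clusters only)
    spec:
      containers:
      - name: app                             # placeholder application container
        image: busybox
        command: ["sleep", "3600"]

After observing your workload's metrics, replace the `0` values with explicit
limits that match what dfuse actually consumes.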