Last updated (UTC): 2025-09-04.

Best practices for bulk data loading
====================================

This page describes best practices for bulk loading data into Firestore with MongoDB compatibility
using tools like `mongoimport`.

Firestore is a highly distributed system offering automatic
scaling to meet the needs of your business. Firestore dynamically
splits and combines your data based on the load received by the system.

Load-based splitting happens automatically without any required
pre-configuration. The Firestore load-based splitting system has some
unique characteristics compared to other document databases that
are important to keep in mind as you model your data.

Firestore's distributed nature can require changing some design choices,
particularly for workloads that were optimized for
databases where the primary replica is the bottleneck for write throughput.

Best Practices
--------------

Workloads that process large amounts of data in a single-threaded client can
create a bottleneck. With databases where the throughput of the client and
server is similarly matched, a single-threaded client might be enough to bulk load
data. A Firestore database can handle significantly more parallelism, but
this requires that you configure clients to send requests in parallel.

### `mongoimport`

When using the `mongoimport` tool, requests are made sequentially by default.
To improve the load time into Firestore,
set the number of workers with the `--numInsertionWorkers` flag.
The correct setting might require tuning based on
the size of your client, but we generally recommend starting with at least `32`.

### Async programming

When developing your own software using MongoDB compatible APIs, you can improve
parallelism in the following ways:

- *Async frameworks*: using an async framework lets you process and respond to requests in parallel. It is not necessary to develop any complex pooling or queuing when making calls to your database. Each request flow can use an independent connection and make its database calls in parallel.
- *Use parallelized compute offerings*: using services like Cloud Run, your system can scale the number of computation workers required to process data.

### Transient Failures

When working with a large distributed system like Firestore, you might encounter
transient failures such as network blips or contention on a
document.

When bulk loading large amounts of information, it's important to
maintain a retry strategy for failed writes without failing the larger bulk load
operation.
| **Note:** Firestore with MongoDB compatibility does not support `retryWrites`. We recommend using transactions to ensure your application guarantees idempotency.
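
The retry strategy described for transient failures, combined with parallel clients, can be sketched roughly as follows. This is a minimal illustration and not an official client pattern. `write_one`, `TransientWriteError`, `write_with_retry`, and `bulk_load` are hypothetical names introduced here. `write_one` stands in for whatever single-document write your driver performs and is assumed to be safe to repeat. The default of 32 workers only mirrors the starting point suggested for `--numInsertionWorkers` and is not a tuned value.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class TransientWriteError(Exception):
    """Hypothetical stand-in for whatever retryable error your driver raises."""


def write_with_retry(write_one, doc, max_attempts=5, base_delay=0.05,
                     sleep=time.sleep):
    """Try one write up to max_attempts times; return True on success.

    Assumes write_one(doc) is idempotent, so repeating it is safe.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            write_one(doc)
            return True
        except TransientWriteError:
            if attempt == max_attempts:
                return False
            # Exponential backoff with jitter so retries do not arrive together.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
    return False


def bulk_load(write_one, docs, workers=32, **retry_kwargs):
    """Write docs in parallel and return the ones that still failed.

    A failed document does not abort the rest of the load.
    """
    docs = list(docs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(
            lambda d: write_with_retry(write_one, d, **retry_kwargs), docs))
    return [d for d, ok in zip(docs, results) if not ok]
```

The list returned by `bulk_load` can be logged or fed into a later pass, so a handful of failed writes never fails the larger load.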