Write data to the Firestore database
This page describes the second stage of the migration process, in which
you set up a Dataflow pipeline that moves data from the Cloud Storage
bucket into your destination Firestore with MongoDB compatibility
database. The pipeline runs concurrently with the Datastream stream.
Start the Dataflow pipeline
The following command starts a new, uniquely named Dataflow pipeline.
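The command on this page reads a number of shell variables (`LOCATION`, `NUM_WORKERS`, `INPUT_FILE_LOCATION`, and so on) and does not check that they are set. Before running it, a pre-flight check along the following lines may help catch an empty value early. This is only a sketch: the variable names are taken from the command on this page, while the `REQUIRED_VARS` list and the `check_required_vars` helper are hypothetical names introduced here for illustration and are not part of any Google Cloud tooling.

```shell
# Variables referenced by the gcloud command on this page.
# DATAFLOW_START_TIME is omitted because the command sets it itself.
REQUIRED_VARS="LOCATION NUM_WORKERS TEMP_OUTPUT_LOCATION INPUT_FILE_LOCATION \
FIRESTORE_CONNECTION_URI FIRESTORE_DATABASE_NAME DLQ_LOCATION DATASTREAM_NAME \
STAGING_LOCATION MAX_WORKERS WORKER_TYPE"

# Hypothetical helper: reports every unset or empty variable on stderr
# and returns non-zero if at least one is missing.
check_required_vars() {
  missing=""
  for name in $REQUIRED_VARS; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing variables:$missing" >&2
    return 1
  fi
  return 0
}
```

One way to use it is to run `check_required_vars` in the same shell session and start the pipeline only if it succeeds.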
**Note:** The start timestamp of the job is captured in the `DATAFLOW_START_TIME` environment variable. Make a note of this timestamp: it will appear as part of the job name in the Dataflow console.

    DATAFLOW_START_TIME="$(date +'%Y%m%d%H%M%S')"

    gcloud dataflow flex-template run "dataflow-mongodb-to-firestore-$DATAFLOW_START_TIME" \
    --template-file-gcs-location gs://dataflow-templates-us-central1/latest/flex/Cloud_Datastream_MongoDB_to_Firestore \
    --region $LOCATION \
    --num-workers $NUM_WORKERS \
    --temp-location $TEMP_OUTPUT_LOCATION \
    --additional-user-labels "" \
    --parameters inputFilePattern=$INPUT_FILE_LOCATION,\
    inputFileFormat=avro,\
    fileReadConcurrency=10,\
    connectionUri=$FIRESTORE_CONNECTION_URI,\
    databaseName=$FIRESTORE_DATABASE_NAME,\
    shadowCollectionPrefix=shadow_,\
    batchSize=500,\
    deadLetterQueueDirectory=$DLQ_LOCATION,\
    dlqRetryMinutes=10,\
    dlqMaxRetryCount=500,\
    processBackfillFirst=false,\
    useShadowTablesForBackfill=true,\
    runMode=regular,\
    directoryWatchDurationInMinutes=20,\
    streamName=$DATASTREAM_NAME,\
    stagingLocation=$STAGING_LOCATION,\
    autoscalingAlgorithm=THROUGHPUT_BASED,\
    maxNumWorkers=$MAX_WORKERS,\
    workerMachineType=$WORKER_TYPE

For more information about monitoring the Dataflow pipeline, see
[Troubleshooting](/firestore/mongodb-compatibility/docs/migrate-troubleshooting).

What's next

Proceed to
[Migrate traffic to Firestore](/firestore/mongodb-compatibility/docs/migrate-traffic).

Last updated 2025-08-26 UTC.