[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-07-30。"],[[["Pipelines allow users to specify CPU and memory allocation for both the driver and each executor, configurable within the Cloud Data Fusion Studio pipeline settings."],["For most pipelines, the default driver configuration of 1 CPU and 2 GB of memory is sufficient, but memory may need to be increased for pipelines with many stages or large schemas, particularly those performing in-memory joins."],["While setting the number of CPUs per executor to one is usually adequate, users should focus primarily on adjusting memory, with 4 GB of executor memory being enough for most pipelines, even complex ones."],["Spark divides executor memory into sections for its internal usage, execution, and storage, with the execution and storage space being adjustable via Spark's `spark.memory.fraction` and `spark.memory.storageFraction` properties, respectively."],["The total memory YARN reserves for each executor exceeds the configured executor memory due to the `spark.executor.memoryOverhead` setting, and this YARN request is rounded up to a multiple of `yarn.scheduler.increment-allocation-mb`, which should be considered when sizing worker nodes."]]],[]]