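Spark metrics

By default, Dataproc Serverless enables collection of the available Spark metrics, unless you use the Spark metrics collection properties to disable or override the collection of one or more Spark metrics.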

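Spark metrics collection properties

You can use the properties listed in this section to disable or override the collection of one or more available Spark metrics.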

Property Description
spark.dataproc.driver.metrics Use to disable or override Spark driver metrics
spark.dataproc.executor.metrics Use to disable or override Spark executor metrics
spark.dataproc.system.metrics Use to disable Spark system metrics

gcloud CLI examples:

  • Disable Spark driver metrics collection:

    gcloud dataproc batches submit spark \
        --properties spark.dataproc.driver.metrics="" \
        --region=region \
        other args ...
    
  • Override the default Spark driver metrics collection so that only the BlockManager:disk.diskSpaceUsed_MB and DAGScheduler:stage.failedStages metrics are collected:

    gcloud dataproc batches submit spark \
        --properties=^~^spark.dataproc.driver.metrics="BlockManager:disk.diskSpaceUsed_MB,DAGScheduler:stage.failedStages" \
        --region=region \
        other args ...
    
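The override example above joins metric names with commas and prefixes the property with `^~^`, the gcloud alternate-delimiter syntax, so that the commas stay inside the single property value. The following is a minimal sketch of building that value in a shell script. The helper names `join_metrics` and `driver_metrics_property` are hypothetical and are not part of the gcloud CLI.

```shell
# Hypothetical helper: join all arguments with commas.
join_metrics() {
  local IFS=,
  printf '%s' "$*"
}

# Hypothetical helper: build the value passed to --properties.
# The ^~^ prefix switches the gcloud list delimiter to "~", so the
# commas between metric names remain part of one property value.
driver_metrics_property() {
  printf '^~^spark.dataproc.driver.metrics=%s' "$(join_metrics "$@")"
}

# Print the property value for the two metrics used in the example above.
driver_metrics_property \
  "BlockManager:disk.diskSpaceUsed_MB" \
  "DAGScheduler:stage.failedStages"
```

The printed value could then be passed as `--properties="$(driver_metrics_property ...)"` in the `gcloud dataproc batches submit spark` command shown above.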

Available Spark metrics

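Dataproc Serverless collects the Spark metrics listed in this section unless you use the Spark metrics collection properties to disable or override their collection.

In Metrics Explorer, the collected metrics are named in the following form, where METRIC_EXPLORER_NAME is the value shown in the tables in this section:

custom.googleapis.com/METRIC_EXPLORER_NAME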

Spark driver metrics

Metric Metrics Explorer name
BlockManager:disk.diskSpaceUsed_MB spark/driver/BlockManager/disk/diskSpaceUsed_MB
BlockManager:memory.maxMem_MB spark/driver/BlockManager/memory/maxMem_MB
BlockManager:memory.memUsed_MB spark/driver/BlockManager/memory/memUsed_MB
DAGScheduler:job.activeJobs spark/driver/DAGScheduler/job/activeJobs
DAGScheduler:job.allJobs spark/driver/DAGScheduler/job/allJobs
DAGScheduler:messageProcessingTime spark/driver/DAGScheduler/messageProcessingTime
DAGScheduler:stage.failedStages spark/driver/DAGScheduler/stage/failedStages
DAGScheduler:stage.runningStages spark/driver/DAGScheduler/stage/runningStages
DAGScheduler:stage.waitingStages spark/driver/DAGScheduler/stage/waitingStages
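
The rows in the driver metrics table follow a regular pattern: the Metrics Explorer name is `spark/driver/` followed by the metric name with each `:` and `.` replaced by `/`. As a sketch based only on that observed pattern and on the `custom.googleapis.com/METRIC_EXPLORER_NAME` form shown earlier in this section, a script could derive the full name as follows. The helper name `driver_metric_type` is hypothetical, and the mapping is an assumption drawn from the table rows, not a documented rule.

```shell
# Hypothetical helper: derive the Metrics Explorer metric name for a
# Spark driver metric, assuming the pattern seen in the table rows.
driver_metric_type() {
  local path
  # Replace ":" and "." in the metric name with "/".
  path=$(printf '%s' "$1" | tr ':.' '//')
  printf 'custom.googleapis.com/spark/driver/%s\n' "$path"
}

driver_metric_type "DAGScheduler:stage.failedStages"
```

This pattern applies to the driver table only. Executor and system metrics use other prefixes, so look up their names in the tables in this section rather than deriving them.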

Spark executor metrics

Metric Metrics Explorer name
ExecutorAllocationManager:executors.numberExecutorsDecommissionUnfinished spark/driver/ExecutorAllocationManager/executors/numberExecutorsDecommissionUnfinished
ExecutorAllocationManager:executors.numberExecutorsExitedUnexpectedly spark/driver/ExecutorAllocationManager/executors/numberExecutorsExitedUnexpectedly
ExecutorAllocationManager:executors.numberExecutorsGracefullyDecommissioned spark/driver/ExecutorAllocationManager/executors/numberExecutorsGracefullyDecommissioned
ExecutorAllocationManager:executors.numberExecutorsKilledByDriver spark/driver/ExecutorAllocationManager/executors/numberExecutorsKilledByDriver
LiveListenerBus:queue.executorManagement.listenerProcessingTime spark/driver/LiveListenerBus/queue/executorManagement/listenerProcessingTime
executor:bytesRead spark/executor/bytesRead
executor:bytesWritten spark/executor/bytesWritten
executor:cpuTime spark/executor/cpuTime
executor:diskBytesSpilled spark/executor/diskBytesSpilled
executor:jvmGCTime spark/executor/jvmGCTime
executor:memoryBytesSpilled spark/executor/memoryBytesSpilled
executor:recordsRead spark/executor/recordsRead
executor:recordsWritten spark/executor/recordsWritten
executor:runTime spark/executor/runTime
executor:shuffleFetchWaitTime spark/executor/shuffleFetchWaitTime
executor:shuffleRecordsRead spark/executor/shuffleRecordsRead
executor:shuffleRecordsWritten spark/executor/shuffleRecordsWritten
executor:shuffleRemoteBytesReadToDisk spark/executor/shuffleRemoteBytesReadToDisk
executor:shuffleWriteTime spark/executor/shuffleWriteTime
executor:succeededTasks spark/executor/succeededTasks
ExecutorMetrics:MajorGCTime spark/executor/ExecutorMetrics/MajorGCTime
ExecutorMetrics:MinorGCTime spark/executor/ExecutorMetrics/MinorGCTime

System metrics

Metric Metrics Explorer name
agent:uptime agent/uptime
cpu:utilization cpu/utilization
disk:bytes_used disk/bytes_used
disk:percent_used disk/percent_used
memory:bytes_used memory/bytes_used
memory:percent_used memory/percent_used
network:tcp_connections network/tcp_connections

View Spark metrics

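To view batch metrics, click a batch ID on the Dataproc Batches page in the Google Cloud console to open the Batch details page, which displays metric charts for the batch workload under the Monitoring tab.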

For more information about how to view collected metrics, see Dataproc Cloud Monitoring.