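By default, Dataproc Serverless enables the collection of the available Spark metrics, unless you use Spark metrics collection properties to disable or override the collection of one or more of them.
Spark metrics collection properties
You can use the properties listed in this section to disable or override the collection of one or more available Spark metrics.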
Property | Description |
---|---|
spark.dataproc.driver.metrics | Use to disable or override Spark driver metrics. |
spark.dataproc.executor.metrics | Use to disable or override Spark executor metrics. |
spark.dataproc.system.metrics | Use to disable Spark system metrics. |
gcloud CLI examples:
Disable Spark driver metrics collection:

    gcloud dataproc batches submit spark \
        --properties spark.dataproc.driver.metrics="" \
        --region=region \
        other args ...

Override the default Spark driver metrics collection to collect only the BlockManager:disk.diskSpaceUsed_MB and DAGScheduler:stage.failedStages metrics:

    gcloud dataproc batches submit spark \
        --properties=^~^spark.dataproc.driver.metrics="BlockManager:disk.diskSpaceUsed_MB,DAGScheduler:stage.failedStages" \
        --region=region \
        other args ...
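As a further sketch, the snippet below assembles a single `--properties` value that sets two metrics properties at once. It relies on the gcloud alternate-delimiter syntax (`^~^`, described in `gcloud topic escaping`), which makes `~` the separator between properties so that the commas inside each metric list stay part of the value. The variable names are illustrative, and it is an assumption here that `spark.dataproc.executor.metrics` accepts the same comma-separated format shown for the driver property.

```shell
# Illustrative variable names; the metric names come from the tables in this document.
DRIVER_METRICS="BlockManager:disk.diskSpaceUsed_MB,DAGScheduler:stage.failedStages"
# Assumption: the executor property takes the same comma-separated list format.
EXECUTOR_METRICS="executor:cpuTime,executor:runTime"

# "^~^" changes the gcloud list delimiter from "," to "~", so "~" separates the
# two properties and the commas remain inside each property value.
PROPERTIES="^~^spark.dataproc.driver.metrics=${DRIVER_METRICS}~spark.dataproc.executor.metrics=${EXECUTOR_METRICS}"

# Print the flag for review before adding it to a batch submit command.
printf '%s\n' "--properties=${PROPERTIES}"
```

The printed flag can then be passed to `gcloud dataproc batches submit spark` together with `--region` and the other arguments the workload needs.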
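Available Spark metrics
Dataproc Serverless collects the Spark metrics listed in this section unless you use Spark metrics collection properties to disable or override their collection.
In Metrics Explorer, each metric is named custom.googleapis.com/METRIC_EXPLORER_NAME, where METRIC_EXPLORER_NAME is the value in the "Metrics Explorer name" column of the following tables.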
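To make the naming pattern concrete, the following sketch builds the full metric type for one driver metric and wraps it in a Cloud Monitoring filter expression of the form `metric.type="..."`. The variable names are illustrative, and the example assumes the `custom.googleapis.com/` prefix is joined directly to the Metrics Explorer name listed in the tables.

```shell
# A "Metrics Explorer name" value taken from the driver metrics table.
METRIC_EXPLORER_NAME="spark/driver/DAGScheduler/stage/failedStages"

# Assumption: the full metric type is the prefix plus the Metrics Explorer name.
METRIC_TYPE="custom.googleapis.com/${METRIC_EXPLORER_NAME}"

# Cloud Monitoring filter expression that selects this metric type.
FILTER="metric.type=\"${METRIC_TYPE}\""

printf '%s\n' "${FILTER}"
```

A filter string like this can be pasted into tools that accept Cloud Monitoring filters when searching for the collected time series.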
Spark driver metrics
Metric | Metrics Explorer name |
---|---|
BlockManager:disk.diskSpaceUsed_MB | spark/driver/BlockManager/disk/diskSpaceUsed_MB |
BlockManager:memory.maxMem_MB | spark/driver/BlockManager/memory/maxMem_MB |
BlockManager:memory.memUsed_MB | spark/driver/BlockManager/memory/memUsed_MB |
DAGScheduler:job.activeJobs | spark/driver/DAGScheduler/job/activeJobs |
DAGScheduler:job.allJobs | spark/driver/DAGScheduler/job/allJobs |
DAGScheduler:messageProcessingTime | spark/driver/DAGScheduler/messageProcessingTime |
DAGScheduler:stage.failedStages | spark/driver/DAGScheduler/stage/failedStages |
DAGScheduler:stage.runningStages | spark/driver/DAGScheduler/stage/runningStages |
DAGScheduler:stage.waitingStages | spark/driver/DAGScheduler/stage/waitingStages |
Spark executor metrics
Metric | Metrics Explorer name |
---|---|
ExecutorAllocationManager:executors.numberExecutorsDecommissionUnfinished | spark/driver/ExecutorAllocationManager/executors/numberExecutorsDecommissionUnfinished |
ExecutorAllocationManager:executors.numberExecutorsExitedUnexpectedly | spark/driver/ExecutorAllocationManager/executors/numberExecutorsExitedUnexpectedly |
ExecutorAllocationManager:executors.numberExecutorsGracefullyDecommissioned | spark/driver/ExecutorAllocationManager/executors/numberExecutorsGracefullyDecommissioned |
ExecutorAllocationManager:executors.numberExecutorsKilledByDriver | spark/driver/ExecutorAllocationManager/executors/numberExecutorsKilledByDriver |
LiveListenerBus:queue.executorManagement.listenerProcessingTime | spark/driver/LiveListenerBus/queue/executorManagement/listenerProcessingTime |
executor:bytesRead | spark/executor/bytesRead |
executor:bytesWritten | spark/executor/bytesWritten |
executor:cpuTime | spark/executor/cpuTime |
executor:diskBytesSpilled | spark/executor/diskBytesSpilled |
executor:jvmGCTime | spark/executor/jvmGCTime |
executor:memoryBytesSpilled | spark/executor/memoryBytesSpilled |
executor:recordsRead | spark/executor/recordsRead |
executor:recordsWritten | spark/executor/recordsWritten |
executor:runTime | spark/executor/runTime |
executor:shuffleFetchWaitTime | spark/executor/shuffleFetchWaitTime |
executor:shuffleRecordsRead | spark/executor/shuffleRecordsRead |
executor:shuffleRecordsWritten | spark/executor/shuffleRecordsWritten |
executor:shuffleRemoteBytesReadToDisk | spark/executor/shuffleRemoteBytesReadToDisk |
executor:shuffleWriteTime | spark/executor/shuffleWriteTime |
executor:succeededTasks | spark/executor/succeededTasks |
ExecutorMetrics:MajorGCTime | spark/executor/ExecutorMetrics/MajorGCTime |
ExecutorMetrics:MinorGCTime | spark/executor/ExecutorMetrics/MinorGCTime |
System metrics
Metric | Metrics Explorer name |
---|---|
agent:uptime | agent/uptime |
cpu:utilization | cpu/utilization |
disk:bytes_used | disk/bytes_used |
disk:percent_used | disk/percent_used |
memory:bytes_used | memory/bytes_used |
memory:percent_used | memory/percent_used |
network:tcp_connections | network/tcp_connections |
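View Spark metrics
To view batch metrics, click a batch ID on the Dataproc Batches page in the Google Cloud console to open the batch Details page. The Monitoring tab on that page displays a metrics graph for the batch workload.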
![](https://cloud.google.com/static/dataproc/images/spark-batch-metrics-graph.png?hl=zh-cn)
See Dataproc Cloud Monitoring for more information on how to view the collected metrics.