Dataproc 측정항목

Cloud Monitoring은 클라우드 기반 애플리케이션의 성능, 업타임, 전반적인 상태에 관한 정보를 제공합니다. Google Cloud Observability는 Dataproc 클러스터에서 클러스터별 HDFS, YARN, 작업, 작업 측정항목을 포함한 측정항목, 이벤트, 메타데이터를 수집하여 대시보드와 차트를 통해 통계를 생성합니다(Cloud Monitoring Dataproc 측정항목 참조).

Dataproc 리소스 측정항목 수집

Cloud Monitoring은 다음 Dataproc 리소스와 관련된 측정항목을 수집합니다.

  • Cloud Dataproc 클러스터
  • Cloud Dataproc 작업
  • Cloud Dataproc 배치
  • Cloud Dataproc 세션

Dataproc 리소스 측정항목은 다음 형식(dataproc.googleapis.com/RESOURCE/METRIC)으로 수집되며 여러 OSS 측정항목 수집을 포함합니다.

Dataproc 리소스 측정항목 보기

측정항목 탐색기에서 Filter by resource or metric name 상자에 'dataproc'을 입력하고 'Cloud Dataproc' 리소스를 선택하여 Dataproc 리소스 측정항목을 선택하고 볼 수 있습니다.

커스텀 측정항목 수집

Dataproc 클러스터를 만들 때 하나 이상의 커스텀 측정항목 소스에서 측정항목 수집을 사용 설정할 수 있습니다. 측정항목 소스에서 수집할 측정항목을 지정하지 않는 한, 사용 설정된 각 측정항목 소스에서 표준 측정항목 집합이 수집됩니다(사용자 지정 측정항목을 측정항목 '재정의'라고 함).

커스텀 OSS 측정항목은 다음 형식으로 수집됩니다: custom.googleapis.com/OSS_COMPONENT/METRIC

커스텀 OSS 측정항목 예시:

custom.googleapis.com/spark/driver/DAGScheduler/job/allJobs
custom.googleapis.com/hiveserver2/memory/MaxNonHeapMemory

커스텀 측정항목 수집 사용 설정

gcloud CLI 또는 Dataproc API를 사용하여 하나 이상의 측정항목 소스에서 커스텀 측정항목 수집을 사용 설정할 수 있습니다.

gcloud CLI

커스텀 측정항목 수집

gcloud dataproc clusters create --metric-sources 플래그를 사용하여 하나 이상의 측정항목 소스에서 커스텀 측정항목 수집을 사용 설정하세요.

gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    ... other flags

참고:

측정항목 수집 재정의

원하는 경우 --metric-overrides 또는 --metric-overrides-file 플래그를 추가하여 하나 이상의 측정항목 소스에서 커스텀 측정항목을 하나 이상 수집하도록 사용 설정할 수 있습니다.

  • 모든 커스텀 측정항목과 모든 Spark 측정항목을 측정항목 재정의로 수집하도록 나열할 수 있습니다. 재정의 측정항목 값은 대소문자를 구분하며 적절한 경우 CamelCase 형식으로 제공해야 합니다.

    예를 들면 다음과 같습니다.

    • sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed
    • hiveserver2:JVM:Memory:NonHeapMemoryUsage.used
    • yarn:ResourceManager:JvmMetrics:MemHeapMaxM

  • 지정된 측정항목 소스에서 재정의된 측정항목만 수집됩니다. 예를 들어 하나 이상의 spark:executive 측정항목이 측정항목 재정의로 나열되면 다른 SPARK 측정항목이 수집되지 않습니다. 다른 측정항목 소스의 커스텀 측정항목 수집은 영향을 받지 않습니다. 예를 들어 SPARKYARN 측정항목 소스가 모두 사용 설정되고 Spark 측정항목에 대해서만 재정의가 제공되는 경우 사용 설정된 YARN 메트릭의 표준 세트가 수집됩니다.
  • 지정된 측정항목 재정의 소스를 사용 설정해야 합니다. 예를 들어 하나 이상의 spark:driver 측정항목이 측정항목 재정의로 제공되는 경우 spark 측정항목 소스를 사용 설정해야 합니다(--metric-sources=spark).

측정항목 목록 재정의

gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides=LIST_OF_METRIC_OVERRIDES \
    ... other flags

참고:

  • --metric-sources: 커스텀 측정항목 수집을 사용 설정하는 데 필요합니다. 측정항목 소스 spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, monitoring-agent-defaults 중 하나 이상을 지정하세요. 측정항목 소스 이름은 대소문자를 구분하지 않습니다(예: 'yarn' 또는 'YARN'이 허용됨).
  • --metric-overrides: 다음 형식의 측정항목 목록을 제공합니다.

    METRIC_SOURCE:INSTANCE:GROUP:METRIC

    예:--metric-overrides=sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed

    이 플래그는 --metric-overrides-file 플래그의 대안이며 함께 사용할 수 없습니다.

측정항목 파일 재정의

gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC-SOURCE(s) \
    --metric-overrides-file=METRIC_OVERRIDES_FILENAME \
    ... other flags

참고:

  • --metric-sources: 커스텀 측정항목 수집을 사용 설정하는 데 필요합니다. 측정항목 소스 spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, monitoring-agent-defaults 중 하나 이상을 지정하세요. 측정항목 소스 이름은 대소문자를 구분하지 않습니다(예: 'yarn' 또는 'YARN'이 허용됨).
  • --metric-overrides-file: 다음 형식의 측정항목이 하나 이상 포함된 로컬 또는 Cloud Storage 파일(gs://bucket/filename)을 지정합니다.

    METRIC_SOURCE:INSTANCE:GROUP:METRIC

    camelcase 형식을 적절하게 사용합니다.

    예를 들면 다음과 같습니다.

    • --metric-overrides-file=gs://my-bucket/my-filename.txt
    • --metric-overrides-file=./local-directory/local-filename.txt

      이 플래그는 --metric-overrides 플래그의 대안이며 함께 사용할 수 없습니다.

REST API

clusters.create 요청의 일부로 DataprocMetricConfig를 사용하여 커스텀 측정항목 수집을 사용 설정합니다. 참고: 운영 에이전트가 설치되어 있지 않으면 2.2 이미지 버전 클러스터에서 monitoring-agent-defaults를 사용할 수 없습니다.

커스텀 측정항목 보기

측정항목 탐색기에서 VM Instance 리소스를 선택한 다음 Custom metrics를 선택하여 Dataproc 리소스 측정항목을 선택하고 볼 수 있습니다.

커스텀 측정항목

Dataproc을 사용 설정하여 다음 테이블에 나열된 커스텀 측정항목을 수집할 수 있습니다.

  • 연결된 측정항목 소스를 사용 설정할 때 Dataproc이 측정항목을 수집하면 사용 설정된 측정항목 열이 'y'로 표시됩니다.

  • 측정항목 소스의 기본 사용 설정된 측정항목의 표준 세트 수집을 재정의하는 경우 모든 측정항목과 모든 Spark 측정항목을 수집에 사용 설정할 수 있습니다(커스텀 측정항목 수집 사용 설정 참조).

  • Dataproc은 모니터링 에이전트를 사용하여 측정항목을 수집합니다. 측정항목 소스를 사용 설정하면 에이전트 측정항목 수집이 사용 설정됩니다. 이 측정항목은 사용자에게 청구되지 않습니다. Dataproc은 이를 사용하여 측정항목 수집 문제를 진단합니다.

Hadoop 측정항목

HDFS 측정항목

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
hdfs:NameNode:FSNamesystem:CapacityTotalGB dfs/FSNamesystem/CapacityTotalGB y
hdfs:NameNode:FSNamesystem:CapacityUsedGB dfs/FSNamesystem/CapacityUsedGB y
hdfs:NameNode:FSNamesystem:CapacityRemainingGB dfs/FSNamesystem/CapacityRemainingGB y
hdfs:NameNode:FSNamesystem:FilesTotal dfs/FSNamesystem/FilesTotal y
hdfs:NameNode:FSNamesystem:MissingBlocks dfs/FSNamesystem/MissingBlocks n
hdfs:NameNode:FSNamesystem:ExpiredHeartbeats dfs/FSNamesystem/ExpiredHeartbeats n
hdfs:NameNode:FSNamesystem:TransactionsSinceLastCheckpoint dfs/FSNamesystem/TransactionsSinceLastCheckpoint n
hdfs:NameNode:FSNamesystem:TransactionsSinceLastLogRoll dfs/FSNamesystem/TransactionsSinceLastLogRoll n
hdfs:NameNode:FSNamesystem:LastWrittenTransactionId dfs/FSNamesystem/LastWrittenTransactionId n
hdfs:NameNode:FSNamesystem:CapacityTotal dfs/FSNamesystem/CapacityTotal n
hdfs:NameNode:FSNamesystem:CapacityUsed dfs/FSNamesystem/CapacityUsed n
hdfs:NameNode:FSNamesystem:CapacityRemaining dfs/FSNamesystem/CapacityRemaining n
hdfs:NameNode:FSNamesystem:CapacityUsedNonDFS dfs/FSNamesystem/CapacityUsedNonDFS n
hdfs:NameNode:FSNamesystem:TotalLoad dfs/FSNamesystem/TotalLoad n
hdfs:NameNode:FSNamesystem:SnapshottableDirectories dfs/FSNamesystem/SnapshottableDirectories n
hdfs:NameNode:FSNamesystem:Snapshots dfs/FSNamesystem/Snapshots n
hdfs:NameNode:FSNamesystem:BlocksTotal dfs/FSNamesystem/BlocksTotal n
hdfs:NameNode:FSNamesystem:PendingReplicationBlocks dfs/FSNamesystem/PendingReplicationBlocks n
hdfs:NameNode:FSNamesystem:UnderReplicatedBlocks dfs/FSNamesystem/UnderReplicatedBlocks n
hdfs:NameNode:FSNamesystem:CorruptBlocks dfs/FSNamesystem/CorruptBlocks n
hdfs:NameNode:FSNamesystem:ScheduledReplicationBlocks dfs/FSNamesystem/ScheduledReplicationBlocks n
hdfs:NameNode:FSNamesystem:PendingDeletionBlocks dfs/FSNamesystem/PendingDeletionBlocks n
hdfs:NameNode:FSNamesystem:ExcessBlocks dfs/FSNamesystem/ExcessBlocks n
hdfs:NameNode:FSNamesystem:PostponedMisreplicatedBlocks dfs/FSNamesystem/PostponedMisreplicatedBlocks n
hdfs:NameNode:FSNamesystem:PendingDataNodeMessageCourt dfs/FSNamesystem/PendingDataNodeMessageCourt n
hdfs:NameNode:FSNamesystem:MillisSinceLastLoadedEdits dfs/FSNamesystem/MillisSinceLastLoadedEdits n
hdfs:NameNode:FSNamesystem:BlockCapacity dfs/FSNamesystem/BlockCapacity n
hdfs:NameNode:FSNamesystem:StaleDataNodes dfs/FSNamesystem/StaleDataNodes n
hdfs:NameNode:FSNamesystem:TotalFiles dfs/FSNamesystem/TotalFiles n
hdfs:NameNode:JvmMetrics:MemHeapUsedM dfs/jvm/MemHeapUsedM n
hdfs:NameNode:JvmMetrics:MemHeapCommittedM dfs/jvm/MemHeapCommittedM n
hdfs:NameNode:JvmMetrics:MemHeapMaxM dfs/jvm/MemHeapMaxM n
hdfs:NameNode:JvmMetrics:MemMaxM dfs/jvm/MemMaxM n

YARN 측정항목

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
yarn:ResourceManager:ClusterMetrics:NumActiveNMs yarn/ClusterMetrics/NumActiveNMs y
yarn:ResourceManager:ClusterMetrics:NumDecommissionedNMs yarn/ClusterMetrics/NumDecommissionedNMs n
yarn:ResourceManager:ClusterMetrics:NumLostNMs yarn/ClusterMetrics/NumLostNMs n
yarn:ResourceManager:ClusterMetrics:NumUnhealthyNMs yarn/ClusterMetrics/NumUnhealthyNMs n
yarn:ResourceManager:ClusterMetrics:NumRebootedNMs yarn/ClusterMetrics/NumRebootedNMs n
yarn:ResourceManager:QueueMetrics:running_0 yarn/QueueMetrics/running_0 y
yarn:ResourceManager:QueueMetrics:running_60 yarn/QueueMetrics/running_60 y
yarn:ResourceManager:QueueMetrics:running_300 yarn/QueueMetrics/running_300 y
yarn:ResourceManager:QueueMetrics:running_1440 yarn/QueueMetrics/running_1440 y
yarn:ResourceManager:QueueMetrics:AppsSubmitted yarn/QueueMetrics/AppsSubmitted y
yarn:ResourceManager:QueueMetrics:AvailableMB yarn/QueueMetrics/AvailableMB y
yarn:ResourceManager:QueueMetrics:PendingContainers yarn/QueueMetrics/PendingContainers y
yarn:ResourceManager:QueueMetrics:AppsRunning yarn/QueueMetrics/AppsRunning n
yarn:ResourceManager:QueueMetrics:AppsPending yarn/QueueMetrics/AppsPending n
yarn:ResourceManager:QueueMetrics:AppsCompleted yarn/QueueMetrics/AppsCompleted n
yarn:ResourceManager:QueueMetrics:AppsKilled yarn/QueueMetrics/AppsKilled n
yarn:ResourceManager:QueueMetrics:AppsFailed yarn/QueueMetrics/AppsFailed n
yarn:ResourceManager:QueueMetrics:AllocatedMB yarn/QueueMetrics/AllocatedMB n
yarn:ResourceManager:QueueMetrics:AllocatedVCores yarn/QueueMetrics/AllocatedVCores n
yarn:ResourceManager:QueueMetrics:AllocatedContainers yarn/QueueMetrics/AllocatedContainers n
yarn:ResourceManager:QueueMetrics:AggregateContainersAllocated yarn/QueueMetrics/AggregateContainersAllocated n
yarn:ResourceManager:QueueMetrics:AggregateContainersReleased yarn/QueueMetrics/AggregateContainersReleased n
yarn:ResourceManager:QueueMetrics:AvailableVCores yarn/QueueMetrics/AvailableVCores n
yarn:ResourceManager:QueueMetrics:PendingMB yarn/QueueMetrics/PendingMB n
yarn:ResourceManager:QueueMetrics:PendingVCores yarn/QueueMetrics/PendingVCores n
yarn:ResourceManager:QueueMetrics:ReservedMB yarn/QueueMetrics/ReservedMB n
yarn:ResourceManager:QueueMetrics:ReservedVCores yarn/QueueMetrics/ReservedVCores n
yarn:ResourceManager:QueueMetrics:ReservedContainers yarn/QueueMetrics/ReservedContainers n
yarn:ResourceManager:QueueMetrics:ActiveUsers yarn/QueueMetrics/ActiveUsers n
yarn:ResourceManager:QueueMetrics:ActiveApplications yarn/QueueMetrics/ActiveApplications n
yarn:ResourceManager:QueueMetrics:FairShareMB yarn/QueueMetrics/FairShareMB n
yarn:ResourceManager:QueueMetrics:FairShareVCores yarn/QueueMetrics/FairShareVCores n
yarn:ResourceManager:QueueMetrics:MinShareMB yarn/QueueMetrics/MinShareMB n
yarn:ResourceManager:QueueMetrics:MinShareVCores yarn/QueueMetrics/MinShareVCores n
yarn:ResourceManager:QueueMetrics:MaxShareMB yarn/QueueMetrics/MaxShareMB n
yarn:ResourceManager:QueueMetrics:MaxShareVCores yarn/QueueMetrics/MaxShareVCores n
yarn:ResourceManager:JvmMetrics:MemHeapUsedM yarn/jvm/MemHeapUsedM n
yarn:ResourceManager:JvmMetrics:MemHeapCommittedM yarn/jvm/MemHeapCommittedM n
yarn:ResourceManager:JvmMetrics:MemHeapMaxM yarn/jvm/MemHeapMaxM n
yarn:ResourceManager:JvmMetrics:MemMaxM yarn/jvm/MemMaxM n

Spark 측정항목

Spark 드라이버 측정항목

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
spark:driver:BlockManager:disk.diskSpaceUsed_MB spark/driver/BlockManager/disk/diskSpaceUsed_MB y
spark:driver:BlockManager:memory.maxMem_MB spark/driver/BlockManager/memory/maxMem_MB y
spark:driver:BlockManager:memory.memUsed_MB spark/driver/BlockManager/memory/memUsed_MB y
spark:driver:DAGScheduler:job.allJobs spark/driver/DAGScheduler/job/allJobs y
spark:driver:DAGScheduler:stage.failedStages spark/driver/DAGScheduler/stage/failedStages y
spark:driver:DAGScheduler:stage.waitingStages spark/driver/DAGScheduler/stage/waitingStages y

Spark 실행자 측정항목

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
spark:executor:executor:bytesRead spark/executor/bytesRead y
spark:executor:executor:bytesWritten spark/executor/bytesWritten y
spark:executor:executor:cpuTime spark/executor/cpuTime y
spark:executor:executor:diskBytesSpilled spark/executor/diskBytesSpilled y
spark:executor:executor:recordsRead spark/executor/recordsRead y
spark:executor:executor:recordsWritten spark/executor/recordsWritten y
spark:executor:executor:runTime spark/executor/runTime y
spark:executor:executor:shuffleRecordsRead spark/executor/shuffleRecordsRead y
spark:executor:executor:shuffleRecordsWritten spark/executor/shuffleRecordsWritten y
측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
flink:jobmanager:numRegisteredTaskManagers flink/jobmanager/numRegisteredTaskManagers n
flink:jobmanager:numRunningJobs flink/jobmanager/numRunningJobs n
flink:jobmanager:Status.JVM.ClassLoader.ClassesLoaded flink/jobmanager/Status.JVM.ClassLoader.ClassesLoaded n
flink:jobmanager:Status.JVM.ClassLoader.ClassesUnloaded flink/jobmanager/Status.JVM.ClassLoader.ClassesUnloaded n
flink:jobmanager:Status.JVM.CPU.Load flink/jobmanager/Status.JVM.CPU.Load n
flink:jobmanager:Status.JVM.CPU.Time flink/jobmanager/Status.JVM.CPU.Time y
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count n
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time n
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Count flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Count n
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Time flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Time n
flink:jobmanager:Status.JVM.Memory.Direct.Count flink/jobmanager/Status.JVM.Memory.Direct.Count y
flink:jobmanager:Status.JVM.Memory.Direct.MemoryUsed flink/jobmanager/Status.JVM.Memory.Direct.MemoryUsed y
flink:jobmanager:Status.JVM.Memory.Direct.TotalCapacity flink/jobmanager/Status.JVM.Memory.Direct.TotalCapacity y
flink:jobmanager:Status.JVM.Memory.Heap.Committed flink/jobmanager/Status.JVM.Memory.Heap.Committed y
flink:jobmanager:Status.JVM.Memory.Heap.Max flink/jobmanager/Status.JVM.Memory.Heap.Max y
flink:jobmanager:Status.JVM.Memory.Heap.Used flink/jobmanager/Status.JVM.Memory.Heap.Used y
flink:jobmanager:Status.JVM.Memory.Mapped.Count flink/jobmanager/Status.JVM.Memory.Mapped.Count y
flink:jobmanager:Status.JVM.Memory.Mapped.MemoryUsed flink/jobmanager/Status.JVM.Memory.Mapped.MemoryUsed y
flink:jobmanager:Status.JVM.Memory.Mapped.TotalCapacity flink/jobmanager/Status.JVM.Memory.Mapped.TotalCapacity y
flink:jobmanager:Status.JVM.Memory.Metaspace.Committed flink/jobmanager/Status.JVM.Memory.Metaspace.Committed n
flink:jobmanager:Status.JVM.Memory.Metaspace.Max flink/jobmanager/Status.JVM.Memory.Metaspace.Max n
flink:jobmanager:Status.JVM.Memory.Metaspace.Used flink/jobmanager/Status.JVM.Memory.Metaspace.Used n
flink:jobmanager:Status.JVM.Memory.NonHeap.Committed flink/jobmanager/Status.JVM.Memory.NonHeap.Committed n
flink:jobmanager:Status.JVM.Memory.NonHeap.Max flink/jobmanager/Status.JVM.Memory.NonHeap.Max n
flink:jobmanager:Status.JVM.Memory.NonHeap.Used flink/jobmanager/Status.JVM.Memory.NonHeap.Used n
flink:jobmanager:Status.JVM.Threads.Count flink/jobmanager/Status.JVM.Threads.Count n
flink:jobmanager:taskSlotsAvailable flink/jobmanager/taskSlotsAvailable y
flink:jobmanager:taskSlotsTotal flink/jobmanager/taskSlotsTotal y
flink:operator:numRecordsIn flink/operator/numRecordsIn n
flink:operator:numRecordsInPerSecond.count flink/operator/numRecordsInPerSecond.count n
flink:operator:numRecordsInPerSecond.rate flink/operator/numRecordsInPerSecond.rate n
flink:operator:numRecordsOut flink/operator/numRecordsOut n
flink:operator:numRecordsOutPerSecond.count flink/operator/numRecordsOutPerSecond.count n
flink:operator:numRecordsOutPerSecond.rate flink/operator/numRecordsOutPerSecond.rate n
flink:operator:numSplitsProcessed flink/operator/numSplitsProcessed n
flink:task:buffers.inPoolUsage flink/task/buffers.inPoolUsage n
flink:task:buffers.inputExclusiveBuffersUsage flink/task/buffers.inputExclusiveBuffersUsage n
flink:task:buffers.inputFloatingBuffersUsage flink/task/buffers.inputFloatingBuffersUsage n
flink:task:buffers.inputQueueLength flink/task/buffers.inputQueueLength n
flink:task:buffers.outPoolUsage flink/task/buffers.outPoolUsage n
flink:task:buffers.outputQueueLength flink/task/buffers.outputQueueLength n
flink:task:idleTimeMsPerSecond.count flink/task/idleTimeMsPerSecond.count n
flink:task:idleTimeMsPerSecond.rate flink/task/idleTimeMsPerSecond.rate n
flink:task:numBuffersInLocal flink/task/numBuffersInLocal n
flink:task:numBuffersInLocalPerSecond.count flink/task/numBuffersInLocalPerSecond.count n
flink:task:numBuffersInLocalPerSecond.rate flink/task/numBuffersInLocalPerSecond.rate n
flink:task:numBuffersInRemote flink/task/numBuffersInRemote n
flink:task:numBuffersInRemotePerSecond.count flink/task/numBuffersInRemotePerSecond.count n
flink:task:numBuffersInRemotePerSecond.rate flink/task/numBuffersInRemotePerSecond.rate n
flink:task:numBuffersOut flink/task/numBuffersOut n
flink:task:numBuffersOutPerSecond.count flink/task/numBuffersOutPerSecond.count n
flink:task:numBuffersOutPerSecond.rate flink/task/numBuffersOutPerSecond.rate n
flink:task:numBytesIn flink/task/numBytesIn n
flink:task:numBytesInLocal flink/task/numBytesInLocal n
flink:task:numBytesInLocalPerSecond.count flink/task/numBytesInLocalPerSecond.count n
flink:task:numBytesInLocalPerSecond.rate flink/task/numBytesInLocalPerSecond.rate n
flink:task:numBytesInPerSecond.count flink/task/numBytesInPerSecond.count n
flink:task:numBytesInPerSecond.rate flink/task/numBytesInPerSecond.rate n
flink:task:numBytesInRemote flink/task/numBytesInRemote n
flink:task:numBytesInRemotePerSecond.count flink/task/numBytesInRemotePerSecond.count n
flink:task:numBytesInRemotePerSecond.rate flink/task/numBytesInRemotePerSecond.rate n
flink:task:numBytesOut flink/task/numBytesOut n
flink:task:numBytesOutPerSecond.count flink/task/numBytesOutPerSecond.count n
flink:task:numBytesOutPerSecond.rate flink/task/numBytesOutPerSecond.rate n
flink:task:numRecordsIn flink/task/numRecordsIn n
flink:task:numRecordsInPerSecond.count flink/task/numRecordsInPerSecond.count n
flink:task:numRecordsInPerSecond.rate flink/task/numRecordsInPerSecond.rate n
flink:task:numRecordsOut flink/task/numRecordsOut n
flink:task:numRecordsOutPerSecond.count flink/task/numRecordsOutPerSecond.count n
flink:task:numRecordsOutPerSecond.rate flink/task/numRecordsOutPerSecond.rate n
flink:task:Shuffle.Netty.Input.Buffers.inPoolUsage flink/task/Shuffle.Netty.Input.Buffers.inPoolUsage n
flink:task:Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage flink/task/Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage n
flink:task:Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage flink/task/Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage n
flink:task:Shuffle.Netty.Input.Buffers.inputQueueLength flink/task/Shuffle.Netty.Input.Buffers.inputQueueLength n
flink:task:Shuffle.Netty.Input.numBuffersInLocal flink/task/Shuffle.Netty.Input.numBuffersInLocal n
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.count flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.count n
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate n
flink:task:Shuffle.Netty.Input.numBuffersInRemote flink/task/Shuffle.Netty.Input.numBuffersInRemote n
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.count flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.count n
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate n
flink:task:Shuffle.Netty.Input.numBytesInLocal flink/task/Shuffle.Netty.Input.numBytesInLocal n
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.count flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.count n
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.rate flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.rate n
flink:task:Shuffle.Netty.Input.numBytesInRemote flink/task/Shuffle.Netty.Input.numBytesInRemote n
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.count flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.count n
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.rate flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.rate n
flink:task:Shuffle.Netty.Output.Buffers.outPoolUsage flink/task/Shuffle.Netty.Output.Buffers.outPoolUsage n
flink:task:Shuffle.Netty.Output.Buffers.outputQueueLength flink/task/Shuffle.Netty.Output.Buffers.outputQueueLength n
flink:taskmanager:Status.flink.Memory.Managed.Total flink/taskmanager/Status.flink.Memory.Managed.Total n
flink:taskmanager:Status.flink.Memory.Managed.Used flink/taskmanager/Status.flink.Memory.Managed.Used n
flink:taskmanager:Status.JVM.ClassLoader.ClassesLoaded flink/taskmanager/Status.JVM.ClassLoader.ClassesLoaded n
flink:taskmanager:Status.JVM.ClassLoader.ClassesUnloaded flink/taskmanager/Status.JVM.ClassLoader.ClassesUnloaded n
flink:taskmanager:Status.JVM.CPU.Load flink/taskmanager/Status.JVM.CPU.Load n
flink:taskmanager:Status.JVM.CPU.Time flink/taskmanager/Status.JVM.CPU.Time y
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count n
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time n
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Count flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Count n
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Time flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Time n
flink:taskmanager:Status.JVM.Memory.Direct.Count flink/taskmanager/Status.JVM.Memory.Direct.Count y
flink:taskmanager:Status.JVM.Memory.Direct.MemoryUsed flink/taskmanager/Status.JVM.Memory.Direct.MemoryUsed y
flink:taskmanager:Status.JVM.Memory.Direct.TotalCapacity flink/taskmanager/Status.JVM.Memory.Direct.TotalCapacity y
flink:taskmanager:Status.JVM.Memory.Heap.Committed flink/taskmanager/Status.JVM.Memory.Heap.Committed y
flink:taskmanager:Status.JVM.Memory.Heap.Max flink/taskmanager/Status.JVM.Memory.Heap.Max y
flink:taskmanager:Status.JVM.Memory.Heap.Used flink/taskmanager/Status.JVM.Memory.Heap.Used y
flink:taskmanager:Status.JVM.Memory.Mapped.Count flink/taskmanager/Status.JVM.Memory.Mapped.Count y
flink:taskmanager:Status.JVM.Memory.Mapped.MemoryUsed flink/taskmanager/Status.JVM.Memory.Mapped.MemoryUsed y
flink:taskmanager:Status.JVM.Memory.Mapped.TotalCapacity flink/taskmanager/Status.JVM.Memory.Mapped.TotalCapacity y
flink:taskmanager:Status.JVM.Memory.Metaspace.Committed flink/taskmanager/Status.JVM.Memory.Metaspace.Committed n
flink:taskmanager:Status.JVM.Memory.Metaspace.Max flink/taskmanager/Status.JVM.Memory.Metaspace.Max n
flink:taskmanager:Status.JVM.Memory.Metaspace.Used flink/taskmanager/Status.JVM.Memory.Metaspace.Used n
flink:taskmanager:Status.JVM.Memory.NonHeap.Committed flink/taskmanager/Status.JVM.Memory.NonHeap.Committed n
flink:taskmanager:Status.JVM.Memory.NonHeap.Max flink/taskmanager/Status.JVM.Memory.NonHeap.Max n
flink:taskmanager:Status.JVM.Memory.NonHeap.Used flink/taskmanager/Status.JVM.Memory.NonHeap.Used n
flink:taskmanager:Status.JVM.Threads.Count flink/taskmanager/Status.JVM.Threads.Count n
flink:taskmanager:Status.Network.AvailableMemorySegments flink/taskmanager/Status.Network.AvailableMemorySegments n
flink:taskmanager:Status.Network.TotalMemorySegments flink/taskmanager/Status.Network.TotalMemorySegments n
flink:taskmanager:Status.Shuffle.Netty.AvailableMemory flink/taskmanager/Status.Shuffle.Netty.AvailableMemory n
flink:taskmanager:Status.Shuffle.Netty.AvailableMemorySegments flink/taskmanager/Status.Shuffle.Netty.AvailableMemorySegments n
flink:taskmanager:Status.Shuffle.Netty.TotalMemory flink/taskmanager/Status.Shuffle.Netty.TotalMemory n
flink:taskmanager:Status.Shuffle.Netty.TotalMemorySegments flink/taskmanager/Status.Shuffle.Netty.TotalMemorySegments n
flink:taskmanager:Status.Shuffle.Netty.UsedMemory flink/taskmanager/Status.Shuffle.Netty.UsedMemory n
flink:taskmanager:Status.Shuffle.Netty.UsedMemorySegments flink/taskmanager/Status.Shuffle.Netty.UsedMemorySegments n

Spark 기록 서버 측정항목

Dataproc은 다음 Spark 기록 서비스 JVM 메모리 측정항목을 수집합니다.

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.committed sparkHistoryServer/memory/CommittedHeapMemory y
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.used sparkHistoryServer/memory/UsedHeapMemory y
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.max sparkHistoryServer/memory/MaxHeapMemory y
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed sparkHistoryServer/memory/CommittedNonHeapMemory y
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.used sparkHistoryServer/memory/UsedNonHeapMemory y
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.max sparkHistoryServer/memory/MaxNonHeapMemory y

HiveServer 2 측정항목

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
hiveserver2:JVM:Memory:HeapMemoryUsage.committed hiveserver2/memory/CommittedHeapMemory y
hiveserver2:JVM:Memory:HeapMemoryUsage.used hiveserver2/memory/UsedHeapMemory y
hiveserver2:JVM:Memory:HeapMemoryUsage.max hiveserver2/memory/MaxHeapMemory y
hiveserver2:JVM:Memory:NonHeapMemoryUsage.committed hiveserver2/memory/CommittedNonHeapMemory y
hiveserver2:JVM:Memory:NonHeapMemoryUsage.used hiveserver2/memory/UsedNonHeapMemory y
hiveserver2:JVM:Memory:NonHeapMemoryUsage.max hiveserver2/memory/MaxNonHeapMemory y

Hive Metastore 측정항목

측정항목 측정항목 탐색기 이름 사용 설정된 측정항목
hivemetastore:API:GetDatabase:Mean hivemetastore/get_database/mean y
hivemetastore:API:CreateDatabase:Mean hivemetastore/create_database/mean y
hivemetastore:API:DropDatabase:Mean hivemetastore/drop_database/mean y
hivemetastore:API:AlterDatabase:Mean hivemetastore/alter_database/mean y
hivemetastore:API:GetAllDatabases:Mean hivemetastore/get_all_databases/mean y
hivemetastore:API:CreateTable:Mean hivemetastore/create_table/mean y
hivemetastore:API:DropTable:Mean hivemetastore/drop_table/mean y
hivemetastore:API:AlterTable:Mean hivemetastore/alter_table/mean y
hivemetastore:API:GetTable:Mean hivemetastore/get_table/mean y
hivemetastore:API:GetAllTables:Mean hivemetastore/get_all_tables/mean y
hivemetastore:API:AddPartitionsReq:Mean hivemetastore/add_partitions_req/mean y
hivemetastore:API:DropPartition:Mean hivemetastore/drop_partition/mean y
hivemetastore:API:AlterPartition:Mean hivemetastore/alter_partition/mean y
hivemetastore:API:GetPartition:Mean hivemetastore/get_partition/mean y
hivemetastore:API:GetPartitionNames:Mean hivemetastore/get_partition_names/mean y
hivemetastore:API:GetPartitionsPs:Mean hivemetastore/get_partitions_ps/mean y
hivemetastore:API:GetPartitionsPsWithAuth:Mean hivemetastore/get_partitions_ps_with_auth/mean y

Hive Metastore 측정항목 측정

통계 척도 샘플 측정항목 샘플 측정항목 이름
최대 hivemetastore:API:GetDatabase:Max hivemetastore/get_database/max
최소 hivemetastore:API:GetDatabase:Min hivemetastore/get_database/min
평균 hivemetastore:API:GetDatabase:Mean hivemetastore/get_database/mean
개수 hivemetastore:API:GetDatabase:Count hivemetastore/get_database/count
50번째 백분위수 hivemetastore:API:GetDatabase:50thPercentile hivemetastore/get_database/median
75번째 백분위수 hivemetastore:API:GetDatabase:75thPercentile hivemetastore/get_database/75th_percentile
95번째 백분위수 hivemetastore:API:GetDatabase:95thPercentile hivemetastore/get_database/95th_percentile
98번째 백분위수 hivemetastore:API:GetDatabase:98thPercentile hivemetastore/get_database/98th_percentile
99번째 백분위수 hivemetastore:API:GetDatabase:99thPercentile hivemetastore/get_database/99th_percentile
999째 백분위수 hivemetastore:API:GetDatabase:999thPercentile hivemetastore/get_database/999th_percentile
StdDev hivemetastore:API:GetDatabase:StdDev hivemetastore/get_database/stddev
FifteenMinuteRate hivemetastore:API:GetDatabase:FifteenMinuteRate hivemetastore/get_database/15min_rate
FiveMinuteRate hivemetastore:API:GetDatabase:FiveMinuteRate hivemetastore/get_database/5min_rate
OneMinuteRate hivemetastore:API:GetDatabase:OneMinuteRate hivemetastore/get_database/1min_rate
MeanRate hivemetastore:API:GetDatabase:MeanRate hivemetastore/get_database/mean_rate

Dataproc 모니터링 에이전트 측정항목

--metric-sources=monitoring-agent-defaults를 설정할 때 Dataproc에서 다음 Dataproc 모니터링 에이전트 측정항목을 수집합니다. 이러한 측정항목은 agent.googleapis.com 프리픽스와 함께 게시됩니다.

CPU
agent.googleapis.com/cpu/load_15m
agent.googleapis.com/cpu/load_1m
agent.googleapis.com/cpu/load_5m
agent.googleapis.com/cpu/usage_time*
agent.googleapis.com/cpu/utilization*

디스크
agent.googleapis.com/disk/bytes_used
agent.googleapis.com/disk/io_time
agent.googleapis.com/disk/merged_operations
agent.googleapis.com/disk/operation_count
agent.googleapis.com/disk/operation_time
agent.googleapis.com/disk/pending_operations
agent.googleapis.com/disk/percent_used
agent.googleapis.com/disk/read_bytes_count

스왑
agent.googleapis.com/swap/bytes_used
agent.googleapis.com/swap/io
agent.googleapis.com/swap/percent_used

메모리
agent.googleapis.com/memory/bytes_used
agent.googleapis.com/memory/percent_used

프로세스 - (몇 가지 속성에 대해 약간 다른 할당량 정책 준수)
agent.googleapis.com/processes/count_by_state
agent.googleapis.com/processes/cpu_time
agent.googleapis.com/processes/disk/read_bytes_count
agent.googleapis.com/processes/disk/write_bytes_count
agent.googleapis.com/processes/fork_count
agent.googleapis.com/processes/rss_usage
agent.googleapis.com/processes/vm_usage

인터페이스
agent.googleapis.com/interface/errors
agent.googleapis.com/interface/packets
agent.googleapis.com/interface/traffic

네트워크
agent.googleapis.com/network/tcp_connections

Monitoring 대시보드 빌드

선택한 Dataproc 측정항목의 차트를 표시하는 Monitoring 대시보드를 빌드할 수 있습니다.

  1. Monitoring 대시보드 개요 페이지에서 + 대시보드 만들기를 선택합니다. 대시 보드의 이름을 입력한 다음 오른쪽 상단 메뉴에서 차트 추가를 클릭하여 차트 추가 창을 엽니다. 리소스 유형으로 'Cloud Dataproc 클러스터'를 선택합니다. 하나 이상의 측정항목과 측정항목 및 차트 속성을 선택합니다. 그런 다음 차트를 저장합니다.

  2. 추가 차트를 대시보드에 추가할 수 있습니다. 대시보드를 저장하면 해당 제목이 Monitoring 대시보드 개요 페이지에 표시됩니다. 대시보드 차트를 대시보드 표시 페이지에서 보고, 업데이트하고, 삭제할 수 있습니다.

다음 단계