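Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Google Cloud Observability collects metrics, events, and metadata from Dataproc clusters, including per-cluster HDFS, YARN, job, and operation metrics, and generates insights through dashboards and charts (see Cloud Monitoring Dataproc metrics).
To understand your costs, see Cloud Monitoring pricing.
For details on metric data retention, see Monitoring quotas and limits.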
Dataproc resource metrics collection
Cloud Monitoring collects metrics associated with the following Dataproc resources:
- Cloud Dataproc cluster
- Cloud Dataproc job
- Cloud Dataproc batch
- Cloud Dataproc session
Dataproc resource metrics are collected in the format dataproc.googleapis.com/RESOURCE/METRIC and include the collection of several OSS metrics.
Viewing Dataproc resource metrics
In Metrics Explorer, type 'dataproc' in the Filter by resource or metric name box, then select a 'Cloud Dataproc' resource to select and view its Dataproc resource metrics.
Custom metrics collection
When you create a Dataproc cluster, you can enable metric collection from one or more custom metric sources. A standard set of metrics is collected from each enabled metric source unless you specify which metrics to collect from that source (metrics you specify are called metric 'overrides').
Custom OSS metrics are collected in the following format:
custom.googleapis.com/OSS_COMPONENT/METRIC
Custom OSS metric examples:
custom.googleapis.com/spark/driver/DAGScheduler/job/allJobs
custom.googleapis.com/hiveserver2/memory/MaxNonHeapMemory
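To make the custom.googleapis.com/OSS_COMPONENT/METRIC format concrete, here is a minimal sketch that splits a custom metric type into its OSS component and metric path. The helper name `split_custom_metric` is hypothetical and is not part of any Google Cloud library.

```python
from typing import Tuple

CUSTOM_PREFIX = "custom.googleapis.com/"


def split_custom_metric(metric_type: str) -> Tuple[str, str]:
    """Split custom.googleapis.com/OSS_COMPONENT/METRIC into (component, metric)."""
    if not metric_type.startswith(CUSTOM_PREFIX):
        raise ValueError(f"not a custom metric type: {metric_type!r}")
    remainder = metric_type[len(CUSTOM_PREFIX):]
    # Everything before the first slash is the OSS component; the rest is the metric path.
    component, separator, metric = remainder.partition("/")
    if not separator or not component or not metric:
        raise ValueError(f"expected OSS_COMPONENT/METRIC after the prefix: {metric_type!r}")
    return component, metric


component, metric = split_custom_metric(
    "custom.googleapis.com/spark/driver/DAGScheduler/job/allJobs"
)
print(component, metric)
```

The metric path may itself contain slashes, so only the first slash after the prefix is treated as the component boundary.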
Enabling custom metrics collection
You can use the gcloud CLI or the Dataproc API to enable custom metric collection from one or more metric sources.
gcloud CLI
Custom metrics collection
Use the gcloud dataproc clusters create --metric-sources flag to enable custom metric collection from one or more metric sources.
gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    ... other flags
Notes:
- --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are case insensitive (for example, either 'yarn' or 'YARN' is acceptable).
- monitoring-agent-defaults is not available on 2.2 image version clusters unless the Ops Agent is installed.
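As a rough sketch of how the --metric-sources value might be assembled in a script, the following Python snippet validates source names case-insensitively and builds the argument list for the command. It only constructs and prints the command; it does not run gcloud. The function name `build_create_command` is hypothetical, and joining multiple sources with commas is an assumption based on the usual gcloud list-flag convention.

```python
import shlex

# Metric source names listed in the notes for --metric-sources.
VALID_SOURCES = {
    "spark", "flink", "hdfs", "yarn", "spark-history-server",
    "hiveserver2", "hivemetastore", "monitoring-agent-defaults",
}


def build_create_command(cluster_name, metric_sources, extra_flags=()):
    """Return the argv list for a cluster create call with metric sources enabled."""
    normalized = []
    for source in metric_sources:
        name = source.strip().lower()  # source names are case insensitive
        if name not in VALID_SOURCES:
            raise ValueError(f"unknown metric source: {source!r}")
        if name not in normalized:
            normalized.append(name)
    if not normalized:
        raise ValueError("at least one metric source is required")
    # Assumption: multiple sources are passed as one comma-separated value.
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        "--metric-sources=" + ",".join(normalized),
        *extra_flags,
    ]


command = build_create_command("cluster-name", ["YARN", "spark"])
print(shlex.join(command))
```

Building an argument list rather than a single string keeps quoting concerns out of the way if the list is later handed to a process launcher.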
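Overriding metric collection
Optionally, add the --metric-overrides or --metric-overrides-file flag to enable the collection of one or more custom metrics from one or more metric sources.
- Any of the custom metrics and all Spark metrics can be listed for collection as a metric override. Override metric values are case sensitive and must be provided in CamelCase format where appropriate.
  Examples:
  sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed
  hiveserver2:JVM:Memory:NonHeapMemoryUsage.used
  yarn:ResourceManager:JvmMetrics:MemHeapMaxM
- Only the overridden metrics are collected from a given metric source. For example, if one or more spark:executor metrics are listed as metric overrides, no other SPARK metrics are collected. Custom metric collection from other metric sources is not affected. For example, if both the SPARK and YARN metric sources are enabled and overrides are provided only for Spark metrics, the standard set of enabled YARN metrics is collected.
- The source of each specified metric override must be enabled. For example, if one or more spark:driver metrics are provided as metric overrides, the spark metric source must be enabled (--metric-sources=spark).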
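The override rules above can be sketched as a small planning function: for each enabled source it reports either the overrides that replace the standard set, or `None` when the standard set still applies, and it rejects overrides whose source is not enabled. The name `plan_collection` and the prefix-to-source mapping are illustrative assumptions; in particular, mapping the `sparkHistoryServer` override prefix to the `spark-history-server` source is inferred from the names and is not confirmed here.

```python
# Assumed mapping from the first field of an override to its metric source name.
OVERRIDE_PREFIX_TO_SOURCE = {
    "spark": "spark",
    "flink": "flink",
    "hdfs": "hdfs",
    "yarn": "yarn",
    "sparkHistoryServer": "spark-history-server",
    "hiveserver2": "hiveserver2",
    "hivemetastore": "hivemetastore",
}


def plan_collection(enabled_sources, overrides):
    """Map each enabled source to its override list, or None for the standard set."""
    enabled = {source.strip().lower() for source in enabled_sources}
    plan = {source: None for source in sorted(enabled)}
    for override in overrides:
        prefix = override.split(":", 1)[0]  # override values are case sensitive
        source = OVERRIDE_PREFIX_TO_SOURCE.get(prefix)
        if source is None:
            raise ValueError(f"unrecognized override prefix: {prefix!r}")
        if source not in enabled:
            raise ValueError(f"{override!r} requires --metric-sources={source}")
        if plan[source] is None:
            plan[source] = []  # overrides replace the standard set for this source
        plan[source].append(override)
    return plan


print(plan_collection(["SPARK", "YARN"], ["spark:driver:DAGScheduler:job.allJobs"]))
```

In this example the spark source collects only the listed override, while yarn keeps its standard set.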
Overriding with a list of metrics
gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides=LIST_OF_METRIC_OVERRIDES \
    ... other flags
Notes:
- --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are case insensitive (for example, either 'yarn' or 'YARN' is acceptable).
- --metric-overrides: Provide a list of metrics in the following format: METRIC_SOURCE:INSTANCE:GROUP:METRIC
  Example:
  --metric-overrides=sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed
  This flag is an alternative to the --metric-overrides-file flag and cannot be used together with it.
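Because --metric-overrides and --metric-overrides-file are mutually exclusive, a script that accepts both inputs needs to pick exactly one. The sketch below shows one way to do that; `override_flag` is a hypothetical helper, and joining several overrides with commas is an assumption based on the usual gcloud list-flag convention rather than something stated here.

```python
def override_flag(overrides=None, overrides_file=None):
    """Return the single override flag to pass, or None if no override is requested."""
    if overrides and overrides_file:
        raise ValueError(
            "--metric-overrides and --metric-overrides-file cannot be used together"
        )
    if overrides_file:
        return "--metric-overrides-file=" + overrides_file
    if overrides:
        for item in overrides:
            # Each override starts with METRIC_SOURCE followed by colon-separated fields.
            if ":" not in item:
                raise ValueError(f"not a colon-separated override: {item!r}")
        # Assumption: multiple overrides are passed as one comma-separated value.
        return "--metric-overrides=" + ",".join(overrides)
    return None


print(override_flag(overrides=[
    "sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed",
    "yarn:ResourceManager:JvmMetrics:MemHeapMaxM",
]))
```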
Overriding with a file of metrics
gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides-file=METRIC_OVERRIDES_FILENAME \
    ... other flags
Notes:
- --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are case insensitive (for example, either 'yarn' or 'YARN' is acceptable).
- --metric-overrides-file: Specify a local or Cloud Storage file (gs://bucket/filename) that contains one or more metrics in the following format: METRIC_SOURCE:INSTANCE:GROUP:METRIC
  Use CamelCase format where appropriate.
  Examples:
  --metric-overrides-file=gs://my-bucket/my-filename.txt
  --metric-overrides-file=./local-directory/local-filename.txt
  This flag is an alternative to the --metric-overrides flag and cannot be used together with it.
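A local overrides file can be generated from a list kept in code. The following sketch writes one override per line; that layout is an assumption, since the text above only says the file contains one or more metrics in the METRIC_SOURCE:INSTANCE:GROUP:METRIC format. The helper name `write_overrides_file` and the temporary path are illustrative.

```python
import os
import tempfile


def write_overrides_file(path, overrides):
    """Write one override per line (assumed layout) and return the path."""
    cleaned = [metric.strip() for metric in overrides if metric.strip()]
    if not cleaned:
        raise ValueError("at least one metric override is required")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(cleaned) + "\n")
    return path


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "local-filename.txt")
    write_overrides_file(target, [
        "hiveserver2:JVM:Memory:NonHeapMemoryUsage.used",
        "yarn:ResourceManager:JvmMetrics:MemHeapMaxM",
    ])
    # Read the file back to confirm what would be passed to --metric-overrides-file.
    with open(target, encoding="utf-8") as handle:
        written = handle.read().splitlines()

print(written)
```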
REST API
Use DataprocMetricConfig as part of a clusters.create request to enable custom metric collection.
Note: monitoring-agent-defaults is not available on 2.2 image version clusters unless the Ops Agent is installed.
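For orientation, here is a sketch of what a clusters.create request body carrying a DataprocMetricConfig might look like when built in Python. The field names (`dataprocMetricConfig`, `metrics`, `metricSource`, `metricOverrides`) and the upper-case enum spelling are written from memory of the Dataproc v1 API and should be treated as assumptions to verify against the API reference; `cluster_request_body` is a hypothetical helper.

```python
import json


def cluster_request_body(project_id, cluster_name, sources_to_overrides):
    """Build a request body dict; keys are metric source names, values are override lists."""
    metrics = []
    for source, overrides in sources_to_overrides.items():
        # Assumption: the API enum is the CLI source name upper-cased with underscores.
        entry = {"metricSource": source.upper().replace("-", "_")}
        if overrides:
            entry["metricOverrides"] = list(overrides)
        metrics.append(entry)
    return {
        "projectId": project_id,
        "clusterName": cluster_name,
        "config": {"dataprocMetricConfig": {"metrics": metrics}},
    }


body = cluster_request_body(
    "my-project",  # placeholder project ID
    "cluster-name",
    {
        "spark-history-server": [
            "sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed"
        ],
        "yarn": [],
    },
)
print(json.dumps(body, indent=2))
```

A source with an empty override list is emitted without `metricOverrides`, which corresponds to collecting that source's standard set.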
Viewing custom metrics
In Metrics Explorer, select the VM Instance resource, then select Custom metrics to select and view the Dataproc custom metrics.
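Custom metrics
You can enable Dataproc to collect the custom metrics listed in the following tables.
The Enabled by default column is marked 'y' if Dataproc collects the metric when you enable the associated metric source.
If you override the collection of a metric source's standard set of default-enabled metrics, you can enable any of the listed metrics and all Spark metrics for collection (see Enabling custom metrics collection).
Dataproc uses a monitoring agent to collect metrics. Enabling any metric source also enables collection of the agent's own metrics. You are not billed for these agent metrics; Dataproc uses them to diagnose metric collection issues.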
Hadoop metrics
HDFS metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hdfs:NameNode:FSNamesystem:CapacityTotalGB | dfs/FSNamesystem/CapacityTotalGB | y |
hdfs:NameNode:FSNamesystem:CapacityUsedGB | dfs/FSNamesystem/CapacityUsedGB | y |
hdfs:NameNode:FSNamesystem:CapacityRemainingGB | dfs/FSNamesystem/CapacityRemainingGB | y |
hdfs:NameNode:FSNamesystem:FilesTotal | dfs/FSNamesystem/FilesTotal | y |
hdfs:NameNode:FSNamesystem:MissingBlocks | dfs/FSNamesystem/MissingBlocks | n |
hdfs:NameNode:FSNamesystem:ExpiredHeartbeats | dfs/FSNamesystem/ExpiredHeartbeats | n |
hdfs:NameNode:FSNamesystem:TransactionsSinceLastCheckpoint | dfs/FSNamesystem/TransactionsSinceLastCheckpoint | n |
hdfs:NameNode:FSNamesystem:TransactionsSinceLastLogRoll | dfs/FSNamesystem/TransactionsSinceLastLogRoll | n |
hdfs:NameNode:FSNamesystem:LastWrittenTransactionId | dfs/FSNamesystem/LastWrittenTransactionId | n |
hdfs:NameNode:FSNamesystem:CapacityTotal | dfs/FSNamesystem/CapacityTotal | n |
hdfs:NameNode:FSNamesystem:CapacityUsed | dfs/FSNamesystem/CapacityUsed | n |
hdfs:NameNode:FSNamesystem:CapacityRemaining | dfs/FSNamesystem/CapacityRemaining | n |
hdfs:NameNode:FSNamesystem:CapacityUsedNonDFS | dfs/FSNamesystem/CapacityUsedNonDFS | n |
hdfs:NameNode:FSNamesystem:TotalLoad | dfs/FSNamesystem/TotalLoad | n |
hdfs:NameNode:FSNamesystem:SnapshottableDirectories | dfs/FSNamesystem/SnapshottableDirectories | n |
hdfs:NameNode:FSNamesystem:Snapshots | dfs/FSNamesystem/Snapshots | n |
hdfs:NameNode:FSNamesystem:BlocksTotal | dfs/FSNamesystem/BlocksTotal | n |
hdfs:NameNode:FSNamesystem:PendingReplicationBlocks | dfs/FSNamesystem/PendingReplicationBlocks | n |
hdfs:NameNode:FSNamesystem:UnderReplicatedBlocks | dfs/FSNamesystem/UnderReplicatedBlocks | n |
hdfs:NameNode:FSNamesystem:CorruptBlocks | dfs/FSNamesystem/CorruptBlocks | n |
hdfs:NameNode:FSNamesystem:ScheduledReplicationBlocks | dfs/FSNamesystem/ScheduledReplicationBlocks | n |
hdfs:NameNode:FSNamesystem:PendingDeletionBlocks | dfs/FSNamesystem/PendingDeletionBlocks | n |
hdfs:NameNode:FSNamesystem:ExcessBlocks | dfs/FSNamesystem/ExcessBlocks | n |
hdfs:NameNode:FSNamesystem:PostponedMisreplicatedBlocks | dfs/FSNamesystem/PostponedMisreplicatedBlocks | n |
hdfs:NameNode:FSNamesystem:PendingDataNodeMessageCount | dfs/FSNamesystem/PendingDataNodeMessageCount | n |
hdfs:NameNode:FSNamesystem:MillisSinceLastLoadedEdits | dfs/FSNamesystem/MillisSinceLastLoadedEdits | n |
hdfs:NameNode:FSNamesystem:BlockCapacity | dfs/FSNamesystem/BlockCapacity | n |
hdfs:NameNode:FSNamesystem:StaleDataNodes | dfs/FSNamesystem/StaleDataNodes | n |
hdfs:NameNode:FSNamesystem:TotalFiles | dfs/FSNamesystem/TotalFiles | n |
hdfs:NameNode:JvmMetrics:MemHeapUsedM | dfs/jvm/MemHeapUsedM | n |
hdfs:NameNode:JvmMetrics:MemHeapCommittedM | dfs/jvm/MemHeapCommittedM | n |
hdfs:NameNode:JvmMetrics:MemHeapMaxM | dfs/jvm/MemHeapMaxM | n |
hdfs:NameNode:JvmMetrics:MemMaxM | dfs/jvm/MemMaxM | n |
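If you keep a copy of a metrics table like the one above in text form, a short script can pull out the metrics that are collected by default and those that need an override. This is a minimal sketch; `parse_metric_rows` is a hypothetical helper and the sample text is a three-row excerpt of the HDFS table.

```python
def parse_metric_rows(table_text):
    """Parse 'override | explorer name | y/n |' rows, skipping header and separator lines."""
    rows = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 3 or cells[2] not in ("y", "n"):
            continue  # not a data row
        rows.append({
            "override": cells[0],
            "explorer_name": cells[1],
            "default": cells[2] == "y",
        })
    return rows


SAMPLE = """\
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hdfs:NameNode:FSNamesystem:CapacityTotalGB | dfs/FSNamesystem/CapacityTotalGB | y |
hdfs:NameNode:FSNamesystem:FilesTotal | dfs/FSNamesystem/FilesTotal | y |
hdfs:NameNode:FSNamesystem:MissingBlocks | dfs/FSNamesystem/MissingBlocks | n |
"""

rows = parse_metric_rows(SAMPLE)
needs_override = [row["override"] for row in rows if not row["default"]]
print(needs_override)
```

The `needs_override` list contains the names you would pass as metric overrides to collect metrics that are not in the standard set.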
YARN metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
yarn:ResourceManager:ClusterMetrics:NumActiveNMs | yarn/ClusterMetrics/NumActiveNMs | y |
yarn:ResourceManager:ClusterMetrics:NumDecommissionedNMs | yarn/ClusterMetrics/NumDecommissionedNMs | n |
yarn:ResourceManager:ClusterMetrics:NumLostNMs | yarn/ClusterMetrics/NumLostNMs | n |
yarn:ResourceManager:ClusterMetrics:NumUnhealthyNMs | yarn/ClusterMetrics/NumUnhealthyNMs | n |
yarn:ResourceManager:ClusterMetrics:NumRebootedNMs | yarn/ClusterMetrics/NumRebootedNMs | n |
yarn:ResourceManager:QueueMetrics:running_0 | yarn/QueueMetrics/running_0 | y |
yarn:ResourceManager:QueueMetrics:running_60 | yarn/QueueMetrics/running_60 | y |
yarn:ResourceManager:QueueMetrics:running_300 | yarn/QueueMetrics/running_300 | y |
yarn:ResourceManager:QueueMetrics:running_1440 | yarn/QueueMetrics/running_1440 | y |
yarn:ResourceManager:QueueMetrics:AppsSubmitted | yarn/QueueMetrics/AppsSubmitted | y |
yarn:ResourceManager:QueueMetrics:AvailableMB | yarn/QueueMetrics/AvailableMB | y |
yarn:ResourceManager:QueueMetrics:PendingContainers | yarn/QueueMetrics/PendingContainers | y |
yarn:ResourceManager:QueueMetrics:AppsRunning | yarn/QueueMetrics/AppsRunning | n |
yarn:ResourceManager:QueueMetrics:AppsPending | yarn/QueueMetrics/AppsPending | n |
yarn:ResourceManager:QueueMetrics:AppsCompleted | yarn/QueueMetrics/AppsCompleted | n |
yarn:ResourceManager:QueueMetrics:AppsKilled | yarn/QueueMetrics/AppsKilled | n |
yarn:ResourceManager:QueueMetrics:AppsFailed | yarn/QueueMetrics/AppsFailed | n |
yarn:ResourceManager:QueueMetrics:AllocatedMB | yarn/QueueMetrics/AllocatedMB | n |
yarn:ResourceManager:QueueMetrics:AllocatedVCores | yarn/QueueMetrics/AllocatedVCores | n |
yarn:ResourceManager:QueueMetrics:AllocatedContainers | yarn/QueueMetrics/AllocatedContainers | n |
yarn:ResourceManager:QueueMetrics:AggregateContainersAllocated | yarn/QueueMetrics/AggregateContainersAllocated | n |
yarn:ResourceManager:QueueMetrics:AggregateContainersReleased | yarn/QueueMetrics/AggregateContainersReleased | n |
yarn:ResourceManager:QueueMetrics:AvailableVCores | yarn/QueueMetrics/AvailableVCores | n |
yarn:ResourceManager:QueueMetrics:PendingMB | yarn/QueueMetrics/PendingMB | n |
yarn:ResourceManager:QueueMetrics:PendingVCores | yarn/QueueMetrics/PendingVCores | n |
yarn:ResourceManager:QueueMetrics:ReservedMB | yarn/QueueMetrics/ReservedMB | n |
yarn:ResourceManager:QueueMetrics:ReservedVCores | yarn/QueueMetrics/ReservedVCores | n |
yarn:ResourceManager:QueueMetrics:ReservedContainers | yarn/QueueMetrics/ReservedContainers | n |
yarn:ResourceManager:QueueMetrics:ActiveUsers | yarn/QueueMetrics/ActiveUsers | n |
yarn:ResourceManager:QueueMetrics:ActiveApplications | yarn/QueueMetrics/ActiveApplications | n |
yarn:ResourceManager:QueueMetrics:FairShareMB | yarn/QueueMetrics/FairShareMB | n |
yarn:ResourceManager:QueueMetrics:FairShareVCores | yarn/QueueMetrics/FairShareVCores | n |
yarn:ResourceManager:QueueMetrics:MinShareMB | yarn/QueueMetrics/MinShareMB | n |
yarn:ResourceManager:QueueMetrics:MinShareVCores | yarn/QueueMetrics/MinShareVCores | n |
yarn:ResourceManager:QueueMetrics:MaxShareMB | yarn/QueueMetrics/MaxShareMB | n |
yarn:ResourceManager:QueueMetrics:MaxShareVCores | yarn/QueueMetrics/MaxShareVCores | n |
yarn:ResourceManager:JvmMetrics:MemHeapUsedM | yarn/jvm/MemHeapUsedM | n |
yarn:ResourceManager:JvmMetrics:MemHeapCommittedM | yarn/jvm/MemHeapCommittedM | n |
yarn:ResourceManager:JvmMetrics:MemHeapMaxM | yarn/jvm/MemHeapMaxM | n |
yarn:ResourceManager:JvmMetrics:MemMaxM | yarn/jvm/MemMaxM | n |
Spark metrics
Spark driver metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
spark:driver:BlockManager:disk.diskSpaceUsed_MB | spark/driver/BlockManager/disk/diskSpaceUsed_MB | y |
spark:driver:BlockManager:memory.maxMem_MB | spark/driver/BlockManager/memory/maxMem_MB | y |
spark:driver:BlockManager:memory.memUsed_MB | spark/driver/BlockManager/memory/memUsed_MB | y |
spark:driver:DAGScheduler:job.allJobs | spark/driver/DAGScheduler/job/allJobs | y |
spark:driver:DAGScheduler:stage.failedStages | spark/driver/DAGScheduler/stage/failedStages | y |
spark:driver:DAGScheduler:stage.waitingStages | spark/driver/DAGScheduler/stage/waitingStages | y |
Spark executor metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
spark:executor:executor:bytesRead | spark/executor/bytesRead | y |
spark:executor:executor:bytesWritten | spark/executor/bytesWritten | y |
spark:executor:executor:cpuTime | spark/executor/cpuTime | y |
spark:executor:executor:diskBytesSpilled | spark/executor/diskBytesSpilled | y |
spark:executor:executor:recordsRead | spark/executor/recordsRead | y |
spark:executor:executor:recordsWritten | spark/executor/recordsWritten | y |
spark:executor:executor:runTime | spark/executor/runTime | y |
spark:executor:executor:shuffleRecordsRead | spark/executor/shuffleRecordsRead | y |
spark:executor:executor:shuffleRecordsWritten | spark/executor/shuffleRecordsWritten | y |
Flink metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
flink:jobmanager:numRegisteredTaskManagers | flink/jobmanager/numRegisteredTaskManagers | n |
flink:jobmanager:numRunningJobs | flink/jobmanager/numRunningJobs | n |
flink:jobmanager:Status.JVM.ClassLoader.ClassesLoaded | flink/jobmanager/Status.JVM.ClassLoader.ClassesLoaded | n |
flink:jobmanager:Status.JVM.ClassLoader.ClassesUnloaded | flink/jobmanager/Status.JVM.ClassLoader.ClassesUnloaded | n |
flink:jobmanager:Status.JVM.CPU.Load | flink/jobmanager/Status.JVM.CPU.Load | n |
flink:jobmanager:Status.JVM.CPU.Time | flink/jobmanager/Status.JVM.CPU.Time | y |
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count | flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count | n |
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time | flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time | n |
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Count | flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Count | n |
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Time | flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Time | n |
flink:jobmanager:Status.JVM.Memory.Direct.Count | flink/jobmanager/Status.JVM.Memory.Direct.Count | y |
flink:jobmanager:Status.JVM.Memory.Direct.MemoryUsed | flink/jobmanager/Status.JVM.Memory.Direct.MemoryUsed | y |
flink:jobmanager:Status.JVM.Memory.Direct.TotalCapacity | flink/jobmanager/Status.JVM.Memory.Direct.TotalCapacity | y |
flink:jobmanager:Status.JVM.Memory.Heap.Committed | flink/jobmanager/Status.JVM.Memory.Heap.Committed | y |
flink:jobmanager:Status.JVM.Memory.Heap.Max | flink/jobmanager/Status.JVM.Memory.Heap.Max | y |
flink:jobmanager:Status.JVM.Memory.Heap.Used | flink/jobmanager/Status.JVM.Memory.Heap.Used | y |
flink:jobmanager:Status.JVM.Memory.Mapped.Count | flink/jobmanager/Status.JVM.Memory.Mapped.Count | y |
flink:jobmanager:Status.JVM.Memory.Mapped.MemoryUsed | flink/jobmanager/Status.JVM.Memory.Mapped.MemoryUsed | y |
flink:jobmanager:Status.JVM.Memory.Mapped.TotalCapacity | flink/jobmanager/Status.JVM.Memory.Mapped.TotalCapacity | y |
flink:jobmanager:Status.JVM.Memory.Metaspace.Committed | flink/jobmanager/Status.JVM.Memory.Metaspace.Committed | n |
flink:jobmanager:Status.JVM.Memory.Metaspace.Max | flink/jobmanager/Status.JVM.Memory.Metaspace.Max | n |
flink:jobmanager:Status.JVM.Memory.Metaspace.Used | flink/jobmanager/Status.JVM.Memory.Metaspace.Used | n |
flink:jobmanager:Status.JVM.Memory.NonHeap.Committed | flink/jobmanager/Status.JVM.Memory.NonHeap.Committed | n |
flink:jobmanager:Status.JVM.Memory.NonHeap.Max | flink/jobmanager/Status.JVM.Memory.NonHeap.Max | n |
flink:jobmanager:Status.JVM.Memory.NonHeap.Used | flink/jobmanager/Status.JVM.Memory.NonHeap.Used | n |
flink:jobmanager:Status.JVM.Threads.Count | flink/jobmanager/Status.JVM.Threads.Count | n |
flink:jobmanager:taskSlotsAvailable | flink/jobmanager/taskSlotsAvailable | y |
flink:jobmanager:taskSlotsTotal | flink/jobmanager/taskSlotsTotal | y |
flink:operator:numRecordsIn | flink/operator/numRecordsIn | n |
flink:operator:numRecordsInPerSecond.count | flink/operator/numRecordsInPerSecond.count | n |
flink:operator:numRecordsInPerSecond.rate | flink/operator/numRecordsInPerSecond.rate | n |
flink:operator:numRecordsOut | flink/operator/numRecordsOut | n |
flink:operator:numRecordsOutPerSecond.count | flink/operator/numRecordsOutPerSecond.count | n |
flink:operator:numRecordsOutPerSecond.rate | flink/operator/numRecordsOutPerSecond.rate | n |
flink:operator:numSplitsProcessed | flink/operator/numSplitsProcessed | n |
flink:task:buffers.inPoolUsage | flink/task/buffers.inPoolUsage | n |
flink:task:buffers.inputExclusiveBuffersUsage | flink/task/buffers.inputExclusiveBuffersUsage | n |
flink:task:buffers.inputFloatingBuffersUsage | flink/task/buffers.inputFloatingBuffersUsage | n |
flink:task:buffers.inputQueueLength | flink/task/buffers.inputQueueLength | n |
flink:task:buffers.outPoolUsage | flink/task/buffers.outPoolUsage | n |
flink:task:buffers.outputQueueLength | flink/task/buffers.outputQueueLength | n |
flink:task:idleTimeMsPerSecond.count | flink/task/idleTimeMsPerSecond.count | n |
flink:task:idleTimeMsPerSecond.rate | flink/task/idleTimeMsPerSecond.rate | n |
flink:task:numBuffersInLocal | flink/task/numBuffersInLocal | n |
flink:task:numBuffersInLocalPerSecond.count | flink/task/numBuffersInLocalPerSecond.count | n |
flink:task:numBuffersInLocalPerSecond.rate | flink/task/numBuffersInLocalPerSecond.rate | n |
flink:task:numBuffersInRemote | flink/task/numBuffersInRemote | n |
flink:task:numBuffersInRemotePerSecond.count | flink/task/numBuffersInRemotePerSecond.count | n |
flink:task:numBuffersInRemotePerSecond.rate | flink/task/numBuffersInRemotePerSecond.rate | n |
flink:task:numBuffersOut | flink/task/numBuffersOut | n |
flink:task:numBuffersOutPerSecond.count | flink/task/numBuffersOutPerSecond.count | n |
flink:task:numBuffersOutPerSecond.rate | flink/task/numBuffersOutPerSecond.rate | n |
flink:task:numBytesIn | flink/task/numBytesIn | n |
flink:task:numBytesInLocal | flink/task/numBytesInLocal | n |
flink:task:numBytesInLocalPerSecond.count | flink/task/numBytesInLocalPerSecond.count | n |
flink:task:numBytesInLocalPerSecond.rate | flink/task/numBytesInLocalPerSecond.rate | n |
flink:task:numBytesInPerSecond.count | flink/task/numBytesInPerSecond.count | n |
flink:task:numBytesInPerSecond.rate | flink/task/numBytesInPerSecond.rate | n |
flink:task:numBytesInRemote | flink/task/numBytesInRemote | n |
flink:task:numBytesInRemotePerSecond.count | flink/task/numBytesInRemotePerSecond.count | n |
flink:task:numBytesInRemotePerSecond.rate | flink/task/numBytesInRemotePerSecond.rate | n |
flink:task:numBytesOut | flink/task/numBytesOut | n |
flink:task:numBytesOutPerSecond.count | flink/task/numBytesOutPerSecond.count | n |
flink:task:numBytesOutPerSecond.rate | flink/task/numBytesOutPerSecond.rate | n |
flink:task:numRecordsIn | flink/task/numRecordsIn | n |
flink:task:numRecordsInPerSecond.count | flink/task/numRecordsInPerSecond.count | n |
flink:task:numRecordsInPerSecond.rate | flink/task/numRecordsInPerSecond.rate | n |
flink:task:numRecordsOut | flink/task/numRecordsOut | n |
flink:task:numRecordsOutPerSecond.count | flink/task/numRecordsOutPerSecond.count | n |
flink:task:numRecordsOutPerSecond.rate | flink/task/numRecordsOutPerSecond.rate | n |
flink:task:Shuffle.Netty.Input.Buffers.inPoolUsage | flink/task/Shuffle.Netty.Input.Buffers.inPoolUsage | n |
flink:task:Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage | flink/task/Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage | n |
flink:task:Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage | flink/task/Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage | n |
flink:task:Shuffle.Netty.Input.Buffers.inputQueueLength | flink/task/Shuffle.Netty.Input.Buffers.inputQueueLength | n |
flink:task:Shuffle.Netty.Input.numBuffersInLocal | flink/task/Shuffle.Netty.Input.numBuffersInLocal | n |
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.count | flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate | flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate | n |
flink:task:Shuffle.Netty.Input.numBuffersInRemote | flink/task/Shuffle.Netty.Input.numBuffersInRemote | n |
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.count | flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate | flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate | n |
flink:task:Shuffle.Netty.Input.numBytesInLocal | flink/task/Shuffle.Netty.Input.numBytesInLocal | n |
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.count | flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.rate | flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.rate | n |
flink:task:Shuffle.Netty.Input.numBytesInRemote | flink/task/Shuffle.Netty.Input.numBytesInRemote | n |
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.count | flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.rate | flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.rate | n |
flink:task:Shuffle.Netty.Output.Buffers.outPoolUsage | flink/task/Shuffle.Netty.Output.Buffers.outPoolUsage | n |
flink:task:Shuffle.Netty.Output.Buffers.outputQueueLength | flink/task/Shuffle.Netty.Output.Buffers.outputQueueLength | n |
flink:taskmanager:Status.flink.Memory.Managed.Total | flink/taskmanager/Status.flink.Memory.Managed.Total | n |
flink:taskmanager:Status.flink.Memory.Managed.Used | flink/taskmanager/Status.flink.Memory.Managed.Used | n |
flink:taskmanager:Status.JVM.ClassLoader.ClassesLoaded | flink/taskmanager/Status.JVM.ClassLoader.ClassesLoaded | n |
flink:taskmanager:Status.JVM.ClassLoader.ClassesUnloaded | flink/taskmanager/Status.JVM.ClassLoader.ClassesUnloaded | n |
flink:taskmanager:Status.JVM.CPU.Load | flink/taskmanager/Status.JVM.CPU.Load | n |
flink:taskmanager:Status.JVM.CPU.Time | flink/taskmanager/Status.JVM.CPU.Time | y |
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count | flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count | n |
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time | flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time | n |
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Count | flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Count | n |
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Time | flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Time | n |
flink:taskmanager:Status.JVM.Memory.Direct.Count | flink/taskmanager/Status.JVM.Memory.Direct.Count | y |
flink:taskmanager:Status.JVM.Memory.Direct.MemoryUsed | flink/taskmanager/Status.JVM.Memory.Direct.MemoryUsed | y |
flink:taskmanager:Status.JVM.Memory.Direct.TotalCapacity | flink/taskmanager/Status.JVM.Memory.Direct.TotalCapacity | y |
flink:taskmanager:Status.JVM.Memory.Heap.Committed | flink/taskmanager/Status.JVM.Memory.Heap.Committed | y |
flink:taskmanager:Status.JVM.Memory.Heap.Max | flink/taskmanager/Status.JVM.Memory.Heap.Max | y |
flink:taskmanager:Status.JVM.Memory.Heap.Used | flink/taskmanager/Status.JVM.Memory.Heap.Used | y |
flink:taskmanager:Status.JVM.Memory.Mapped.Count | flink/taskmanager/Status.JVM.Memory.Mapped.Count | y |
flink:taskmanager:Status.JVM.Memory.Mapped.MemoryUsed | flink/taskmanager/Status.JVM.Memory.Mapped.MemoryUsed | y |
flink:taskmanager:Status.JVM.Memory.Mapped.TotalCapacity | flink/taskmanager/Status.JVM.Memory.Mapped.TotalCapacity | y |
flink:taskmanager:Status.JVM.Memory.Metaspace.Committed | flink/taskmanager/Status.JVM.Memory.Metaspace.Committed | n |
flink:taskmanager:Status.JVM.Memory.Metaspace.Max | flink/taskmanager/Status.JVM.Memory.Metaspace.Max | n |
flink:taskmanager:Status.JVM.Memory.Metaspace.Used | flink/taskmanager/Status.JVM.Memory.Metaspace.Used | n |
flink:taskmanager:Status.JVM.Memory.NonHeap.Committed | flink/taskmanager/Status.JVM.Memory.NonHeap.Committed | n |
flink:taskmanager:Status.JVM.Memory.NonHeap.Max | flink/taskmanager/Status.JVM.Memory.NonHeap.Max | n |
flink:taskmanager:Status.JVM.Memory.NonHeap.Used | flink/taskmanager/Status.JVM.Memory.NonHeap.Used | n |
flink:taskmanager:Status.JVM.Threads.Count | flink/taskmanager/Status.JVM.Threads.Count | n |
flink:taskmanager:Status.Network.AvailableMemorySegments | flink/taskmanager/Status.Network.AvailableMemorySegments | n |
flink:taskmanager:Status.Network.TotalMemorySegments | flink/taskmanager/Status.Network.TotalMemorySegments | n |
flink:taskmanager:Status.Shuffle.Netty.AvailableMemory | flink/taskmanager/Status.Shuffle.Netty.AvailableMemory | n |
flink:taskmanager:Status.Shuffle.Netty.AvailableMemorySegments | flink/taskmanager/Status.Shuffle.Netty.AvailableMemorySegments | n |
flink:taskmanager:Status.Shuffle.Netty.TotalMemory | flink/taskmanager/Status.Shuffle.Netty.TotalMemory | n |
flink:taskmanager:Status.Shuffle.Netty.TotalMemorySegments | flink/taskmanager/Status.Shuffle.Netty.TotalMemorySegments | n |
flink:taskmanager:Status.Shuffle.Netty.UsedMemory | flink/taskmanager/Status.Shuffle.Netty.UsedMemory | n |
flink:taskmanager:Status.Shuffle.Netty.UsedMemorySegments | flink/taskmanager/Status.Shuffle.Netty.UsedMemorySegments | n |
Spark History Server metrics
Dataproc collects the following Spark history service JVM memory metrics.
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.committed | sparkHistoryServer/memory/CommittedHeapMemory | y |
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.used | sparkHistoryServer/memory/UsedHeapMemory | y |
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.max | sparkHistoryServer/memory/MaxHeapMemory | y |
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed | sparkHistoryServer/memory/CommittedNonHeapMemory | y |
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.used | sparkHistoryServer/memory/UsedNonHeapMemory | y |
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.max | sparkHistoryServer/memory/MaxNonHeapMemory | y |
HiveServer 2 metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hiveserver2:JVM:Memory:HeapMemoryUsage.committed | hiveserver2/memory/CommittedHeapMemory | y |
hiveserver2:JVM:Memory:HeapMemoryUsage.used | hiveserver2/memory/UsedHeapMemory | y |
hiveserver2:JVM:Memory:HeapMemoryUsage.max | hiveserver2/memory/MaxHeapMemory | y |
hiveserver2:JVM:Memory:NonHeapMemoryUsage.committed | hiveserver2/memory/CommittedNonHeapMemory | y |
hiveserver2:JVM:Memory:NonHeapMemoryUsage.used | hiveserver2/memory/UsedNonHeapMemory | y |
hiveserver2:JVM:Memory:NonHeapMemoryUsage.max | hiveserver2/memory/MaxNonHeapMemory | y |
Hive Metastore metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hivemetastore:API:GetDatabase:Mean | hivemetastore/get_database/mean | y |
hivemetastore:API:CreateDatabase:Mean | hivemetastore/create_database/mean | y |
hivemetastore:API:DropDatabase:Mean | hivemetastore/drop_database/mean | y |
hivemetastore:API:AlterDatabase:Mean | hivemetastore/alter_database/mean | y |
hivemetastore:API:GetAllDatabases:Mean | hivemetastore/get_all_databases/mean | y |
hivemetastore:API:CreateTable:Mean | hivemetastore/create_table/mean | y |
hivemetastore:API:DropTable:Mean | hivemetastore/drop_table/mean | y |
hivemetastore:API:AlterTable:Mean | hivemetastore/alter_table/mean | y |
hivemetastore:API:GetTable:Mean | hivemetastore/get_table/mean | y |
hivemetastore:API:GetAllTables:Mean | hivemetastore/get_all_tables/mean | y |
hivemetastore:API:AddPartitionsReq:Mean | hivemetastore/add_partitions_req/mean | y |
hivemetastore:API:DropPartition:Mean | hivemetastore/drop_partition/mean | y |
hivemetastore:API:AlterPartition:Mean | hivemetastore/alter_partition/mean | y |
hivemetastore:API:GetPartition:Mean | hivemetastore/get_partition/mean | y |
hivemetastore:API:GetPartitionNames:Mean | hivemetastore/get_partition_names/mean | y |
hivemetastore:API:GetPartitionsPs:Mean | hivemetastore/get_partitions_ps/mean | y |
hivemetastore:API:GetPartitionsPsWithAuth:Mean | hivemetastore/get_partitions_ps_with_auth/mean | y |
Hive Metastore metric measures
Metric measure | Sample metric | Sample metric name |
---|---|---|
Max | hivemetastore:API:GetDatabase:Max | hivemetastore/get_database/max |
Min | hivemetastore:API:GetDatabase:Min | hivemetastore/get_database/min |
Mean | hivemetastore:API:GetDatabase:Mean | hivemetastore/get_database/mean |
Count | hivemetastore:API:GetDatabase:Count | hivemetastore/get_database/count |
50th percentile | hivemetastore:API:GetDatabase:50thPercentile | hivemetastore/get_database/median |
75th percentile | hivemetastore:API:GetDatabase:75thPercentile | hivemetastore/get_database/75th_percentile |
95th percentile | hivemetastore:API:GetDatabase:95thPercentile | hivemetastore/get_database/95th_percentile |
98th percentile | hivemetastore:API:GetDatabase:98thPercentile | hivemetastore/get_database/98th_percentile |
99th percentile | hivemetastore:API:GetDatabase:99thPercentile | hivemetastore/get_database/99th_percentile |
999th percentile | hivemetastore:API:GetDatabase:999thPercentile | hivemetastore/get_database/999th_percentile |
StdDev | hivemetastore:API:GetDatabase:StdDev | hivemetastore/get_database/stddev |
FifteenMinuteRate | hivemetastore:API:GetDatabase:FifteenMinuteRate | hivemetastore/get_database/15min_rate |
FiveMinuteRate | hivemetastore:API:GetDatabase:FiveMinuteRate | hivemetastore/get_database/5min_rate |
OneMinuteRate | hivemetastore:API:GetDatabase:OneMinuteRate | hivemetastore/get_database/1min_rate |
MeanRate | hivemetastore:API:GetDatabase:MeanRate | hivemetastore/get_database/mean_rate |
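The Hive Metastore tables suggest a regular naming pattern: the API name changes from CamelCase to snake_case, and the measure maps to a fixed suffix. The sketch below encodes that pattern as inferred from the sample rows; it is not an official conversion rule, and `hivemetastore_explorer_name` is a hypothetical helper.

```python
import re

# Measure suffixes copied from the metric measures table.
MEASURE_SUFFIX = {
    "Max": "max", "Min": "min", "Mean": "mean", "Count": "count",
    "50thPercentile": "median", "75thPercentile": "75th_percentile",
    "95thPercentile": "95th_percentile", "98thPercentile": "98th_percentile",
    "99thPercentile": "99th_percentile", "999thPercentile": "999th_percentile",
    "StdDev": "stddev", "FifteenMinuteRate": "15min_rate",
    "FiveMinuteRate": "5min_rate", "OneMinuteRate": "1min_rate",
    "MeanRate": "mean_rate",
}


def hivemetastore_explorer_name(override):
    """Convert hivemetastore:API:<ApiName>:<Measure> to its Metrics Explorer name."""
    source, group, api_name, measure = override.split(":")
    if source != "hivemetastore" or group != "API":
        raise ValueError(f"not a Hive Metastore API override: {override!r}")
    # Insert an underscore before every capital letter except the first, then lower-case.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", api_name).lower()
    return f"hivemetastore/{snake}/{MEASURE_SUFFIX[measure]}"


print(hivemetastore_explorer_name("hivemetastore:API:GetDatabase:50thPercentile"))
```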
Dataproc monitoring agent metrics
When you set --metric-sources=monitoring-agent-defaults, Dataproc collects the following Dataproc monitoring agent metrics.
These metrics are published with the agent.googleapis.com prefix.
CPU
agent.googleapis.com/cpu/load_15m
agent.googleapis.com/cpu/load_1m
agent.googleapis.com/cpu/load_5m
agent.googleapis.com/cpu/usage_time*
agent.googleapis.com/cpu/utilization*
Disk
agent.googleapis.com/disk/bytes_used
agent.googleapis.com/disk/io_time
agent.googleapis.com/disk/merged_operations
agent.googleapis.com/disk/operation_count
agent.googleapis.com/disk/operation_time
agent.googleapis.com/disk/pending_operations
agent.googleapis.com/disk/percent_used
agent.googleapis.com/disk/read_bytes_count
Swap
agent.googleapis.com/swap/bytes_used
agent.googleapis.com/swap/io
agent.googleapis.com/swap/percent_used
Memory
agent.googleapis.com/memory/bytes_used
agent.googleapis.com/memory/percent_used
Processes (these follow a slightly different quota policy for a few attributes)
agent.googleapis.com/processes/count_by_state
agent.googleapis.com/processes/cpu_time
agent.googleapis.com/processes/disk/read_bytes_count
agent.googleapis.com/processes/disk/write_bytes_count
agent.googleapis.com/processes/fork_count
agent.googleapis.com/processes/rss_usage
agent.googleapis.com/processes/vm_usage
Interface
agent.googleapis.com/interface/errors
agent.googleapis.com/interface/packets
agent.googleapis.com/interface/traffic
Network
agent.googleapis.com/network/tcp_connections
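Building a Monitoring dashboard
You can build a Monitoring dashboard that displays charts of selected Dataproc metrics.
On the Monitoring Dashboards Overview page, select + Create Dashboard. Enter a name for the dashboard, then click Add Chart in the top-right menu to open the Add Chart window. Select 'Cloud Dataproc Cluster' as the resource type. Select one or more metrics, along with the metric and chart properties. Then save the chart.
You can add more charts to the dashboard. After you save the dashboard, its title appears on the Monitoring Dashboards Overview page. You can view, update, and delete dashboard charts from the dashboard display page.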