Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. Google Cloud Observability collects and ingests metrics, events, and metadata from Dataproc clusters, including per-cluster HDFS, YARN, job, and operation metrics, to generate insights through dashboards and charts (see Cloud Monitoring Dataproc metrics).
See Cloud Monitoring pricing to understand your costs.
See Monitoring quotas and limits for information about metric data retention.
Dataproc resource metric collection
Cloud Monitoring collects metrics for the following Dataproc resources:
- Cloud Dataproc Cluster
- Cloud Dataproc Job
- Cloud Dataproc Batch
- Cloud Dataproc Session
Dataproc resource metrics are collected in the following format: dataproc.googleapis.com/RESOURCE/METRIC. They include a collection of several OSS metrics.
View Dataproc resource metrics
You can select and view Dataproc resource metrics in Metrics Explorer by typing "dataproc" in the Filter by resource or metric name box, and then selecting a "Cloud Dataproc" resource.
Custom metric collection
When you create a Dataproc cluster, you can enable the collection of metrics from one or more custom metric sources. A default set of metrics is collected from each enabled metric source unless you specify which metrics to collect from that source (user-specified metrics are called metric "overrides").
Custom OSS metrics are collected in the following format:
custom.googleapis.com/OSS_COMPONENT/METRIC
Example custom OSS metrics:
custom.googleapis.com/spark/driver/DAGScheduler/job/allJobs
custom.googleapis.com/hiveserver2/memory/MaxNonHeapMemory
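The naming format above can be illustrated with a small helper. This is only a sketch: the function name `custom_metric_type` is hypothetical and is not part of any Google Cloud library; it just joins the documented prefix, an OSS component, and a metric path.

```python
CUSTOM_PREFIX = "custom.googleapis.com"


def custom_metric_type(oss_component: str, metric: str) -> str:
    """Join an OSS component and a metric path into a custom metric type.

    Hypothetical helper for illustration; it follows the documented
    custom.googleapis.com/OSS_COMPONENT/METRIC format.
    """
    parts = [CUSTOM_PREFIX, oss_component.strip("/"), metric.strip("/")]
    return "/".join(parts)


# The two example metrics listed in the text.
print(custom_metric_type("spark", "driver/DAGScheduler/job/allJobs"))
print(custom_metric_type("hiveserver2", "memory/MaxNonHeapMemory"))
```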
Enable custom metric collection
You can use the gcloud CLI or the Dataproc API to enable the collection of custom metrics from one or more metric sources.
gcloud CLI
Custom metric collection
Use the gcloud dataproc clusters create --metric-sources flag to enable the collection of custom metrics from one or more metric sources.
gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    ... other flags
Notes:
- --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are not case-sensitive; for example, "yarn" and "YARN" are both accepted.
- monitoring-agent-defaults is not available on image version 2.2 clusters unless the Ops Agent is installed.
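As a rough sketch of how a script might prepare the --metric-sources value before calling gcloud, the helper below checks names against the list of sources given above and lowercases them, since the flag is case-insensitive. The function name `metric_sources_flag` is hypothetical, and the client-side validation is an assumption added for illustration; gcloud performs its own checks.

```python
# Metric sources listed in the text for --metric-sources.
VALID_METRIC_SOURCES = (
    "spark",
    "flink",
    "hdfs",
    "yarn",
    "spark-history-server",
    "hiveserver2",
    "hivemetastore",
    "monitoring-agent-defaults",
)


def metric_sources_flag(sources):
    """Return a --metric-sources flag string for the given source names.

    Names are lowercased (the flag is case-insensitive), de-duplicated in
    first-seen order, and checked against the documented list.
    """
    normalized = []
    for name in sources:
        key = name.strip().lower()
        if key not in VALID_METRIC_SOURCES:
            raise ValueError(f"unknown metric source: {name!r}")
        if key not in normalized:
            normalized.append(key)
    if not normalized:
        raise ValueError("at least one metric source is required")
    return "--metric-sources=" + ",".join(normalized)


print(metric_sources_flag(["YARN", "spark", "yarn"]))
```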
Override metric collection
Optionally, add the --metric-overrides or --metric-overrides-file flag to enable the collection of one or more custom metrics from one or more metric sources.
- Any of the custom metrics, and all Spark metrics, can be listed for collection as metric overrides. Metric override values are case-sensitive and must be given, where applicable, in CamelCase format.
  Examples:
  sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed
  hiveserver2:JVM:Memory:NonHeapMemoryUsage.used
  yarn:ResourceManager:JvmMetrics:MemHeapMaxM
- Only the specified override metrics are collected from a given metric source. For example, if one or more spark:executor metrics are listed as metric overrides, no other SPARK metrics are collected. The collection of custom metrics from other metric sources is unaffected. For example, if both the SPARK and YARN metric sources are enabled and overrides are provided only for Spark metrics, the default set of YARN metrics is still collected.
- The source of a specified metric override must be enabled. For example, if one or more spark:driver metrics are provided as metric overrides, the spark metric source must be enabled (--metric-sources=spark).
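The override rules above can be sketched as a small planning function: each enabled source either collects its default set or collects only the listed overrides, and an override whose source is not enabled is rejected. The function `plan_collection` and the prefix-to-source mapping are assumptions made for illustration; in particular, mapping the `sparkHistoryServer` override prefix to the `spark-history-server` source is inferred from the examples in this page and is not confirmed by it.

```python
# Assumed mapping from the source prefix used in override strings to the
# name used with --metric-sources; only sparkHistoryServer differs in spelling.
OVERRIDE_PREFIX_TO_SOURCE = {
    "spark": "spark",
    "flink": "flink",
    "hdfs": "hdfs",
    "yarn": "yarn",
    "sparkHistoryServer": "spark-history-server",
    "hiveserver2": "hiveserver2",
    "hivemetastore": "hivemetastore",
}


def plan_collection(enabled_sources, overrides):
    """Map each enabled source to its overrides, or to None for the default set."""
    # Source names are case-insensitive; override strings are case-sensitive.
    plan = {source.lower(): None for source in enabled_sources}
    for override in overrides:
        prefix = override.split(":", 1)[0]
        source = OVERRIDE_PREFIX_TO_SOURCE.get(prefix)
        if source is None:
            raise ValueError(f"unrecognized override prefix in {override!r}")
        if source not in plan:
            raise ValueError(f"{override!r} requires --metric-sources={source}")
        if plan[source] is None:
            plan[source] = []
        plan[source].append(override)
    return plan


# Spark collects only the listed override; YARN keeps its default set.
print(plan_collection(["SPARK", "YARN"], ["spark:executor:executor:bytesRead"]))
```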
Override metrics list
gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides=LIST_OF_METRIC_OVERRIDES \
    ... other flags
Notes:
- --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are not case-sensitive; for example, "yarn" and "YARN" are both accepted.
- --metric-overrides: Provide a list of metrics in the following format: METRIC_SOURCE:INSTANCE:GROUP:METRIC
  Example:
  --metric-overrides=sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed
  This flag is an alternative to, and cannot be used with, the --metric-overrides-file flag.
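A script that assembles the create command might enforce the mutual exclusion of the two override flags before invoking gcloud. The sketch below only builds an argument list and does not run anything. The function name `create_cluster_args` and the cluster name are hypothetical, and joining multiple overrides with commas is an assumption based on the usual gcloud list-flag convention.

```python
def create_cluster_args(cluster_name, metric_sources,
                        metric_overrides=(), overrides_file=None):
    """Build an argument list for `gcloud dataproc clusters create`.

    Only the flags discussed in the text are handled; other flags would be
    appended by the caller.
    """
    if metric_overrides and overrides_file:
        raise ValueError(
            "--metric-overrides and --metric-overrides-file cannot be combined"
        )
    args = [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        "--metric-sources=" + ",".join(metric_sources),
    ]
    if metric_overrides:
        # Assumption: multiple overrides are comma-separated.
        args.append("--metric-overrides=" + ",".join(metric_overrides))
    if overrides_file:
        args.append("--metric-overrides-file=" + overrides_file)
    return args


args = create_cluster_args(
    "example-cluster",  # hypothetical cluster name
    ["spark", "yarn"],
    metric_overrides=["spark:driver:DAGScheduler:job.allJobs"],
)
print(" ".join(args))
```

Returning a list rather than a single string keeps the result suitable for `subprocess.run` without shell quoting concerns.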
Override metrics file
gcloud dataproc clusters create cluster-name \
    --metric-sources=METRIC_SOURCE(s) \
    --metric-overrides-file=METRIC_OVERRIDES_FILENAME \
    ... other flags
Notes:
- --metric-sources: Required to enable custom metric collection. Specify one or more of the following metric sources: spark, flink, hdfs, yarn, spark-history-server, hiveserver2, hivemetastore, and monitoring-agent-defaults. Metric source names are not case-sensitive; for example, "yarn" and "YARN" are both accepted.
- --metric-overrides-file: Specify a local or Cloud Storage file (gs://bucket/filename) that contains one or more metrics in the following format: METRIC_SOURCE:INSTANCE:GROUP:METRIC. Use CamelCase where applicable.
  Examples:
  --metric-overrides-file=gs://my-bucket/my-filename.txt
  --metric-overrides-file=./local-directory/local-filename.txt
  This flag is an alternative to, and cannot be used with, the --metric-overrides flag.
REST API
Use DataprocMetricConfig as part of a clusters.create request to enable the collection of custom metrics. Note: monitoring-agent-defaults is not available on image version 2.2 clusters unless the Ops Agent is installed.
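As an illustration of what the DataprocMetricConfig portion of a clusters.create request body could look like, the sketch below builds it as a Python dictionary and prints it as JSON. The field names (`dataprocMetricConfig`, `metrics`, `metricSource`, `metricOverrides`) and the upper-case enum spellings reflect my understanding of the Dataproc REST reference and should be checked against it; the helper `metric_config` and the cluster name are hypothetical.

```python
import json


def metric_config(sources_with_overrides):
    """Build a DataprocMetricConfig-style dict from {source: [overrides]}.

    Assumption: enum values are the gcloud source names upper-cased with
    hyphens replaced by underscores (for example SPARK_HISTORY_SERVER).
    """
    metrics = []
    for source, overrides in sources_with_overrides.items():
        entry = {"metricSource": source.upper().replace("-", "_")}
        if overrides:
            entry["metricOverrides"] = list(overrides)
        metrics.append(entry)
    return {"metrics": metrics}


body = {
    "clusterName": "example-cluster",  # hypothetical cluster name
    "config": {
        "dataprocMetricConfig": metric_config({
            "spark": ["spark:driver:DAGScheduler:job.allJobs"],
            "yarn": [],  # no overrides: collect the default YARN set
        }),
    },
}
print(json.dumps(body, indent=2))
```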
View custom metrics
You can select and view Dataproc custom metrics in Metrics Explorer by selecting the VM Instance resource, and then selecting Custom metrics.
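The same selection can be expressed as a Cloud Monitoring filter string, for example when querying through the Monitoring API instead of the console. This is a sketch under two assumptions: that `gce_instance` is the monitored resource type behind the VM Instance label, and that the hypothetical helper `custom_metric_filter` receives the part of the metric name that follows custom.googleapis.com/.

```python
def custom_metric_filter(metric_name: str) -> str:
    """Return a Monitoring filter for one custom metric on VM instances."""
    metric_type = "custom.googleapis.com/" + metric_name.lstrip("/")
    return (
        f'metric.type = "{metric_type}" '
        'AND resource.type = "gce_instance"'
    )


print(custom_metric_filter("spark/driver/DAGScheduler/job/allJobs"))
```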
Custom metrics
You can enable Dataproc to collect the custom metrics listed in the following tables.
The Enabled by default column is marked "y" if Dataproc collects the metric when you enable the associated metric source.
Any of the metrics listed for a metric source, and all Spark metrics, can be enabled for collection if you override the default set of metrics collected for that source (see Enable custom metric collection).
Dataproc uses a monitoring agent to collect metrics. Enabling any metric source also enables the collection of agent metrics. These metrics are not billed to users; Dataproc uses them to diagnose metric collection issues.
Hadoop metrics
HDFS metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hdfs:NameNode:FSNamesystem:CapacityTotalGB | dfs/FSNamesystem/CapacityTotalGB | y |
hdfs:NameNode:FSNamesystem:CapacityUsedGB | dfs/FSNamesystem/CapacityUsedGB | y |
hdfs:NameNode:FSNamesystem:CapacityRemainingGB | dfs/FSNamesystem/CapacityRemainingGB | y |
hdfs:NameNode:FSNamesystem:FilesTotal | dfs/FSNamesystem/FilesTotal | y |
hdfs:NameNode:FSNamesystem:MissingBlocks | dfs/FSNamesystem/MissingBlocks | n |
hdfs:NameNode:FSNamesystem:ExpiredHeartbeats | dfs/FSNamesystem/ExpiredHeartbeats | n |
hdfs:NameNode:FSNamesystem:TransactionsSinceLastCheckpoint | dfs/FSNamesystem/TransactionsSinceLastCheckpoint | n |
hdfs:NameNode:FSNamesystem:TransactionsSinceLastLogRoll | dfs/FSNamesystem/TransactionsSinceLastLogRoll | n |
hdfs:NameNode:FSNamesystem:LastWrittenTransactionId | dfs/FSNamesystem/LastWrittenTransactionId | n |
hdfs:NameNode:FSNamesystem:CapacityTotal | dfs/FSNamesystem/CapacityTotal | n |
hdfs:NameNode:FSNamesystem:CapacityUsed | dfs/FSNamesystem/CapacityUsed | n |
hdfs:NameNode:FSNamesystem:CapacityRemaining | dfs/FSNamesystem/CapacityRemaining | n |
hdfs:NameNode:FSNamesystem:CapacityUsedNonDFS | dfs/FSNamesystem/CapacityUsedNonDFS | n |
hdfs:NameNode:FSNamesystem:TotalLoad | dfs/FSNamesystem/TotalLoad | n |
hdfs:NameNode:FSNamesystem:SnapshottableDirectories | dfs/FSNamesystem/SnapshottableDirectories | n |
hdfs:NameNode:FSNamesystem:Snapshots | dfs/FSNamesystem/Snapshots | n |
hdfs:NameNode:FSNamesystem:BlocksTotal | dfs/FSNamesystem/BlocksTotal | n |
hdfs:NameNode:FSNamesystem:PendingReplicationBlocks | dfs/FSNamesystem/PendingReplicationBlocks | n |
hdfs:NameNode:FSNamesystem:UnderReplicatedBlocks | dfs/FSNamesystem/UnderReplicatedBlocks | n |
hdfs:NameNode:FSNamesystem:CorruptBlocks | dfs/FSNamesystem/CorruptBlocks | n |
hdfs:NameNode:FSNamesystem:ScheduledReplicationBlocks | dfs/FSNamesystem/ScheduledReplicationBlocks | n |
hdfs:NameNode:FSNamesystem:PendingDeletionBlocks | dfs/FSNamesystem/PendingDeletionBlocks | n |
hdfs:NameNode:FSNamesystem:ExcessBlocks | dfs/FSNamesystem/ExcessBlocks | n |
hdfs:NameNode:FSNamesystem:PostponedMisreplicatedBlocks | dfs/FSNamesystem/PostponedMisreplicatedBlocks | n |
hdfs:NameNode:FSNamesystem:PendingDataNodeMessageCourt | dfs/FSNamesystem/PendingDataNodeMessageCourt | n |
hdfs:NameNode:FSNamesystem:MillisSinceLastLoadedEdits | dfs/FSNamesystem/MillisSinceLastLoadedEdits | n |
hdfs:NameNode:FSNamesystem:BlockCapacity | dfs/FSNamesystem/BlockCapacity | n |
hdfs:NameNode:FSNamesystem:StaleDataNodes | dfs/FSNamesystem/StaleDataNodes | n |
hdfs:NameNode:FSNamesystem:TotalFiles | dfs/FSNamesystem/TotalFiles | n |
hdfs:NameNode:JvmMetrics:MemHeapUsedM | dfs/jvm/MemHeapUsedM | n |
hdfs:NameNode:JvmMetrics:MemHeapCommittedM | dfs/jvm/MemHeapCommittedM | n |
hdfs:NameNode:JvmMetrics:MemHeapMaxM | dfs/jvm/MemHeapMaxM | n |
hdfs:NameNode:JvmMetrics:MemMaxM | dfs/jvm/MemMaxM | n |
YARN metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
yarn:ResourceManager:ClusterMetrics:NumActiveNMs | yarn/ClusterMetrics/NumActiveNM | y |
yarn:ResourceManager:ClusterMetrics:NumDecommissionedNMs | yarn/ClusterMetrics/NumDecommissionedNM | n |
yarn:ResourceManager:ClusterMetrics:NumLostNMs | yarn/ClusterMetrics/NumLostNM | n |
yarn:ResourceManager:ClusterMetrics:NumUnhealthyNMs | yarn/ClusterMetrics/NumUnhealthyNM | n |
yarn:ResourceManager:ClusterMetrics:NumRebootedNMs | yarn/ClusterMetrics/NumRebootedNMs | n |
yarn:ResourceManager:QueueMetrics:running_0 | yarn/QueueMetrics/running_0 | y |
yarn:ResourceManager:QueueMetrics:running_60 | yarn/QueueMetrics/running_60 | y |
yarn:ResourceManager:QueueMetrics:running_300 | yarn/QueueMetrics/running_300 | y |
yarn:ResourceManager:QueueMetrics:running_1440 | yarn/QueueMetrics/running_1440 | y |
yarn:ResourceManager:QueueMetrics:AppsSubmitted | yarn/QueueMetrics/AppsSubmitted | y |
yarn:ResourceManager:QueueMetrics:AvailableMB | yarn/QueueMetrics/AvailableMB | y |
yarn:ResourceManager:QueueMetrics:PendingContainers | yarn/QueueMetrics/PendingContainers | y |
yarn:ResourceManager:QueueMetrics:AppsRunning | yarn/QueueMetrics/AppsRunning | n |
yarn:ResourceManager:QueueMetrics:AppsPending | yarn/QueueMetrics/AppsPending | n |
yarn:ResourceManager:QueueMetrics:AppsCompleted | yarn/QueueMetrics/AppsCompleted | n |
yarn:ResourceManager:QueueMetrics:AppsKilled | yarn/QueueMetrics/AppsKilled | n |
yarn:ResourceManager:QueueMetrics:AppsFailed | yarn/QueueMetrics/AppsFailed | n |
yarn:ResourceManager:QueueMetrics:AllocatedMB | yarn/QueueMetrics/AllocatedMB | n |
yarn:ResourceManager:QueueMetrics:AllocatedVCores | yarn/QueueMetrics/AllocatedVCores | n |
yarn:ResourceManager:QueueMetrics:AllocatedContainers | yarn/QueueMetrics/AllocatedContainers | n |
yarn:ResourceManager:QueueMetrics:AggregateContainersAllocated | yarn/QueueMetrics/AggregateContainersAllocated | n |
yarn:ResourceManager:QueueMetrics:AggregateContainersReleased | yarn/QueueMetrics/AggregateContainersReleased | n |
yarn:ResourceManager:QueueMetrics:AvailableVCores | yarn/QueueMetrics/AvailableVCores | n |
yarn:ResourceManager:QueueMetrics:PendingMB | yarn/QueueMetrics/PendingMB | n |
yarn:ResourceManager:QueueMetrics:PendingVCores | yarn/QueueMetrics/PendingVCores | n |
yarn:ResourceManager:QueueMetrics:ReservedMB | yarn/QueueMetrics/ReservedMB | n |
yarn:ResourceManager:QueueMetrics:ReservedVCores | yarn/QueueMetrics/ReservedVCores | n |
yarn:ResourceManager:QueueMetrics:ReservedContainers | yarn/QueueMetrics/ReservedContainers | n |
yarn:ResourceManager:QueueMetrics:ActiveUsers | yarn/QueueMetrics/ActiveUsers | n |
yarn:ResourceManager:QueueMetrics:ActiveApplications | yarn/QueueMetrics/ActiveApplications | n |
yarn:ResourceManager:QueueMetrics:FairShareMB | yarn/QueueMetrics/FairShareMB | n |
yarn:ResourceManager:QueueMetrics:FairShareVCores | yarn/QueueMetrics/FairShareVCores | n |
yarn:ResourceManager:QueueMetrics:MinShareMB | yarn/QueueMetrics/MinShareMB | n |
yarn:ResourceManager:QueueMetrics:MinShareVCores | yarn/QueueMetrics/MinShareVCores | n |
yarn:ResourceManager:QueueMetrics:MaxShareMB | yarn/QueueMetrics/MaxShareMB | n |
yarn:ResourceManager:QueueMetrics:MaxShareVCores | yarn/QueueMetrics/MaxShareVCores | n |
yarn:ResourceManager:JvmMetrics:MemHeapUsedM | yarn/jvm/MemHeapUsedM | n |
yarn:ResourceManager:JvmMetrics:MemHeapCommittedM | yarn/jvm/MemHeapCommittedM | n |
yarn:ResourceManager:JvmMetrics:MemHeapMaxM | yarn/jvm/MemHeapMaxM | n |
yarn:ResourceManager:JvmMetrics:MemMaxM | yarn/jvm/MemMaxM | n |
Spark metrics
Spark driver metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
spark:driver:BlockManager:disk.diskSpaceUsed_MB | spark/driver/BlockManager/disk/diskSpaceUsed_MB | y |
spark:driver:BlockManager:memory.maxMem_MB | spark/driver/BlockManager/memory/maxMem_MB | y |
spark:driver:BlockManager:memory.memUsed_MB | spark/driver/BlockManager/memory/memUsed_MB | y |
spark:driver:DAGScheduler:job.allJobs | spark/driver/DAGScheduler/job/allJobs | y |
spark:driver:DAGScheduler:stage.failedStages | spark/driver/DAGScheduler/stage/failedStages | y |
spark:driver:DAGScheduler:stage.waitingStages | spark/driver/DAGScheduler/stage/waitingStages | y |
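For the Spark driver rows above, the Metrics Explorer name can be derived mechanically from the override string: the colons become slashes and the dot inside the metric part becomes a slash as well. The sketch below encodes that pattern as observed in this table only; the function name is hypothetical, and the rule does not hold for other sources (Spark executor names, for example, drop one segment).

```python
def spark_driver_explorer_name(override: str) -> str:
    """Convert a spark:driver override into its Metrics Explorer name.

    Expects the METRIC_SOURCE:INSTANCE:GROUP:METRIC form, for example
    spark:driver:BlockManager:disk.diskSpaceUsed_MB.
    """
    source, instance, group, metric = override.split(":", 3)
    if (source, instance) != ("spark", "driver"):
        raise ValueError(f"not a Spark driver metric: {override!r}")
    # Dots in the metric part become path separators.
    return "/".join([source, instance, group] + metric.split("."))


print(spark_driver_explorer_name("spark:driver:BlockManager:disk.diskSpaceUsed_MB"))
print(spark_driver_explorer_name("spark:driver:DAGScheduler:job.allJobs"))
```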
Spark executor metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
spark:executor:executor:bytesRead | spark/executor/bytesRead | y |
spark:executor:executor:bytesWritten | spark/executor/bytesWritten | y |
spark:executor:executor:cpuTime | spark/executor/cpuTime | y |
spark:executor:executor:diskBytesSpilled | spark/executor/diskBytesSpilled | y |
spark:executor:executor:recordsRead | spark/executor/recordsRead | y |
spark:executor:executor:recordsWritten | spark/executor/recordsWritten | y |
spark:executor:executor:runTime | spark/executor/runTime | y |
spark:executor:executor:shuffleRecordsRead | spark/executor/shuffleRecordsRead | y |
spark:executor:executor:shuffleRecordsWritten | spark/executor/shuffleRecordsWritten | y |
Flink metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
flink:jobmanager:numRegisteredTaskManagers | flink/jobmanager/numRegisteredTaskManagers | n |
flink:jobmanager:numRunningJobs | flink/jobmanager/numRunningJobs | n |
flink:jobmanager:Status.JVM.ClassLoader.ClassesLoaded | flink/jobmanager/Status.JVM.ClassLoader.ClassesLoaded | n |
flink:jobmanager:Status.JVM.ClassLoader.ClassesUnloaded | flink/jobmanager/Status.JVM.ClassLoader.ClassesUnloaded | n |
flink:jobmanager:Status.JVM.CPU.Load | flink/jobmanager/Status.JVM.CPU.Load | n |
flink:jobmanager:Status.JVM.CPU.Time | flink/jobmanager/Status.JVM.CPU.Time | y |
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count | flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count | n |
flink:jobmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time | flink/jobmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time | n |
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Count | flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Count | n |
flink:jobmanager:Status.JVM.GarbageCollector.PSScavenge.Time | flink/jobmanager/Status.JVM.GarbageCollector.PSScavenge.Time | n |
flink:jobmanager:Status.JVM.Memory.Direct.Count | flink/jobmanager/Status.JVM.Memory.Direct.Count | y |
flink:jobmanager:Status.JVM.Memory.Direct.MemoryUsed | flink/jobmanager/Status.JVM.Memory.Direct.MemoryUsed | y |
flink:jobmanager:Status.JVM.Memory.Direct.TotalCapacity | flink/jobmanager/Status.JVM.Memory.Direct.TotalCapacity | y |
flink:jobmanager:Status.JVM.Memory.Heap.Committed | flink/jobmanager/Status.JVM.Memory.Heap.Committed | y |
flink:jobmanager:Status.JVM.Memory.Heap.Max | flink/jobmanager/Status.JVM.Memory.Heap.Max | y |
flink:jobmanager:Status.JVM.Memory.Heap.Used | flink/jobmanager/Status.JVM.Memory.Heap.Used | y |
flink:jobmanager:Status.JVM.Memory.Mapped.Count | flink/jobmanager/Status.JVM.Memory.Mapped.Count | y |
flink:jobmanager:Status.JVM.Memory.Mapped.MemoryUsed | flink/jobmanager/Status.JVM.Memory.Mapped.MemoryUsed | y |
flink:jobmanager:Status.JVM.Memory.Mapped.TotalCapacity | flink/jobmanager/Status.JVM.Memory.Mapped.TotalCapacity | y |
flink:jobmanager:Status.JVM.Memory.Metaspace.Committed | flink/jobmanager/Status.JVM.Memory.Metaspace.Committed | n |
flink:jobmanager:Status.JVM.Memory.Metaspace.Max | flink/jobmanager/Status.JVM.Memory.Metaspace.Max | n |
flink:jobmanager:Status.JVM.Memory.Metaspace.Used | flink/jobmanager/Status.JVM.Memory.Metaspace.Used | n |
flink:jobmanager:Status.JVM.Memory.NonHeap.Committed | flink/jobmanager/Status.JVM.Memory.NonHeap.Committed | n |
flink:jobmanager:Status.JVM.Memory.NonHeap.Max | flink/jobmanager/Status.JVM.Memory.NonHeap.Max | n |
flink:jobmanager:Status.JVM.Memory.NonHeap.Used | flink/jobmanager/Status.JVM.Memory.NonHeap.Used | n |
flink:jobmanager:Status.JVM.Threads.Count | flink/jobmanager/Status.JVM.Threads.Count | n |
flink:jobmanager:taskSlotsAvailable | flink/jobmanager/taskSlotsAvailable | y |
flink:jobmanager:taskSlotsTotal | flink/jobmanager/taskSlotsTotal | y |
flink:operator:numRecordsIn | flink/operator/numRecordsIn | n |
flink:operator:numRecordsInPerSecond.count | flink/operator/numRecordsInPerSecond.count | n |
flink:operator:numRecordsInPerSecond.rate | flink/operator/numRecordsInPerSecond.rate | n |
flink:operator:numRecordsOut | flink/operator/numRecordsOut | n |
flink:operator:numRecordsOutPerSecond.count | flink/operator/numRecordsOutPerSecond.count | n |
flink:operator:numRecordsOutPerSecond.rate | flink/operator/numRecordsOutPerSecond.rate | n |
flink:operator:numSplitsProcessed | flink/operator/numSplitsProcessed | n |
flink:task:buffers.inPoolUsage | flink/task/buffers.inPoolUsage | n |
flink:task:buffers.inputExclusiveBuffersUsage | flink/task/buffers.inputExclusiveBuffersUsage | n |
flink:task:buffers.inputFloatingBuffersUsage | flink/task/buffers.inputFloatingBuffersUsage | n |
flink:task:buffers.inputQueueLength | flink/task/buffers.inputQueueLength | n |
flink:task:buffers.outPoolUsage | flink/task/buffers.outPoolUsage | n |
flink:task:buffers.outputQueueLength | flink/task/buffers.outputQueueLength | n |
flink:task:idleTimeMsPerSecond.count | flink/task/idleTimeMsPerSecond.count | n |
flink:task:idleTimeMsPerSecond.rate | flink/task/idleTimeMsPerSecond.rate | n |
flink:task:numBuffersInLocal | flink/task/numBuffersInLocal | n |
flink:task:numBuffersInLocalPerSecond.count | flink/task/numBuffersInLocalPerSecond.count | n |
flink:task:numBuffersInLocalPerSecond.rate | flink/task/numBuffersInLocalPerSecond.rate | n |
flink:task:numBuffersInRemote | flink/task/numBuffersInRemote | n |
flink:task:numBuffersInRemotePerSecond.count | flink/task/numBuffersInRemotePerSecond.count | n |
flink:task:numBuffersInRemotePerSecond.rate | flink/task/numBuffersInRemotePerSecond.rate | n |
flink:task:numBuffersOut | flink/task/numBuffersOut | n |
flink:task:numBuffersOutPerSecond.count | flink/task/numBuffersOutPerSecond.count | n |
flink:task:numBuffersOutPerSecond.rate | flink/task/numBuffersOutPerSecond.rate | n |
flink:task:numBytesIn | flink/task/numBytesIn | n |
flink:task:numBytesInLocal | flink/task/numBytesInLocal | n |
flink:task:numBytesInLocalPerSecond.count | flink/task/numBytesInLocalPerSecond.count | n |
flink:task:numBytesInLocalPerSecond.rate | flink/task/numBytesInLocalPerSecond.rate | n |
flink:task:numBytesInPerSecond.count | flink/task/numBytesInPerSecond.count | n |
flink:task:numBytesInPerSecond.rate | flink/task/numBytesInPerSecond.rate | n |
flink:task:numBytesInRemote | flink/task/numBytesInRemote | n |
flink:task:numBytesInRemotePerSecond.count | flink/task/numBytesInRemotePerSecond.count | n |
flink:task:numBytesInRemotePerSecond.rate | flink/task/numBytesInRemotePerSecond.rate | n |
flink:task:numBytesOut | flink/task/numBytesOut | n |
flink:task:numBytesOutPerSecond.count | flink/task/numBytesOutPerSecond.count | n |
flink:task:numBytesOutPerSecond.rate | flink/task/numBytesOutPerSecond.rate | n |
flink:task:numRecordsIn | flink/task/numRecordsIn | n |
flink:task:numRecordsInPerSecond.count | flink/task/numRecordsInPerSecond.count | n |
flink:task:numRecordsInPerSecond.rate | flink/task/numRecordsInPerSecond.rate | n |
flink:task:numRecordsOut | flink/task/numRecordsOut | n |
flink:task:numRecordsOutPerSecond.count | flink/task/numRecordsOutPerSecond.count | n |
flink:task:numRecordsOutPerSecond.rate | flink/task/numRecordsOutPerSecond.rate | n |
flink:task:Shuffle.Netty.Input.Buffers.inPoolUsage | flink/task/Shuffle.Netty.Input.Buffers.inPoolUsage | n |
flink:task:Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage | flink/task/Shuffle.Netty.Input.Buffers.inputExclusiveBuffersUsage | n |
flink:task:Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage | flink/task/Shuffle.Netty.Input.Buffers.inputFloatingBuffersUsage | n |
flink:task:Shuffle.Netty.Input.Buffers.inputQueueLength | flink/task/Shuffle.Netty.Input.Buffers.inputQueueLength | n |
flink:task:Shuffle.Netty.Input.numBuffersInLocal | flink/task/Shuffle.Netty.Input.numBuffersInLocal | n |
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.count | flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate | flink/task/Shuffle.Netty.Input.numBuffersInLocalPerSecond.rate | n |
flink:task:Shuffle.Netty.Input.numBuffersInRemote | flink/task/Shuffle.Netty.Input.numBuffersInRemote | n |
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.count | flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate | flink/task/Shuffle.Netty.Input.numBuffersInRemotePerSecond.rate | n |
flink:task:Shuffle.Netty.Input.numBytesInLocal | flink/task/Shuffle.Netty.Input.numBytesInLocal | n |
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.count | flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBytesInLocalPerSecond.rate | flink/task/Shuffle.Netty.Input.numBytesInLocalPerSecond.rate | n |
flink:task:Shuffle.Netty.Input.numBytesInRemote | flink/task/Shuffle.Netty.Input.numBytesInRemote | n |
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.count | flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.count | n |
flink:task:Shuffle.Netty.Input.numBytesInRemotePerSecond.rate | flink/task/Shuffle.Netty.Input.numBytesInRemotePerSecond.rate | n |
flink:task:Shuffle.Netty.Output.Buffers.outPoolUsage | flink/task/Shuffle.Netty.Output.Buffers.outPoolUsage | n |
flink:task:Shuffle.Netty.Output.Buffers.outputQueueLength | flink/task/Shuffle.Netty.Output.Buffers.outputQueueLength | n |
flink:taskmanager:Status.flink.Memory.Managed.Total | flink/taskmanager/Status.flink.Memory.Managed.Total | n |
flink:taskmanager:Status.flink.Memory.Managed.Used | flink/taskmanager/Status.flink.Memory.Managed.Used | n |
flink:taskmanager:Status.JVM.ClassLoader.ClassesLoaded | flink/taskmanager/Status.JVM.ClassLoader.ClassesLoaded | n |
flink:taskmanager:Status.JVM.ClassLoader.ClassesUnloaded | flink/taskmanager/Status.JVM.ClassLoader.ClassesUnloaded | n |
flink:taskmanager:Status.JVM.CPU.Load | flink/taskmanager/Status.JVM.CPU.Load | n |
flink:taskmanager:Status.JVM.CPU.Time | flink/taskmanager/Status.JVM.CPU.Time | y |
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Count | flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Count | n |
flink:taskmanager:Status.JVM.GarbageCollector.PSMarkSweep.Time | flink/taskmanager/Status.JVM.GarbageCollector.PSMarkSweep.Time | n |
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Count | flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Count | n |
flink:taskmanager:Status.JVM.GarbageCollector.PSScavenge.Time | flink/taskmanager/Status.JVM.GarbageCollector.PSScavenge.Time | n |
flink:taskmanager:Status.JVM.Memory.Direct.Count | flink/taskmanager/Status.JVM.Memory.Direct.Count | y |
flink:taskmanager:Status.JVM.Memory.Direct.MemoryUsed | flink/taskmanager/Status.JVM.Memory.Direct.MemoryUsed | y |
flink:taskmanager:Status.JVM.Memory.Direct.TotalCapacity | flink/taskmanager/Status.JVM.Memory.Direct.TotalCapacity | y |
flink:taskmanager:Status.JVM.Memory.Heap.Committed | flink/taskmanager/Status.JVM.Memory.Heap.Committed | y |
flink:taskmanager:Status.JVM.Memory.Heap.Max | flink/taskmanager/Status.JVM.Memory.Heap.Max | y |
flink:taskmanager:Status.JVM.Memory.Heap.Used | flink/taskmanager/Status.JVM.Memory.Heap.Used | y |
flink:taskmanager:Status.JVM.Memory.Mapped.Count | flink/taskmanager/Status.JVM.Memory.Mapped.Count | y |
flink:taskmanager:Status.JVM.Memory.Mapped.MemoryUsed | flink/taskmanager/Status.JVM.Memory.Mapped.MemoryUsed | y |
flink:taskmanager:Status.JVM.Memory.Mapped.TotalCapacity | flink/taskmanager/Status.JVM.Memory.Mapped.TotalCapacity | y |
flink:taskmanager:Status.JVM.Memory.Metaspace.Committed | flink/taskmanager/Status.JVM.Memory.Metaspace.Committed | n |
flink:taskmanager:Status.JVM.Memory.Metaspace.Max | flink/taskmanager/Status.JVM.Memory.Metaspace.Max | n |
flink:taskmanager:Status.JVM.Memory.Metaspace.Used | flink/taskmanager/Status.JVM.Memory.Metaspace.Used | n |
flink:taskmanager:Status.JVM.Memory.NonHeap.Committed | flink/taskmanager/Status.JVM.Memory.NonHeap.Committed | n |
flink:taskmanager:Status.JVM.Memory.NonHeap.Max | flink/taskmanager/Status.JVM.Memory.NonHeap.Max | n |
flink:taskmanager:Status.JVM.Memory.NonHeap.Used | flink/taskmanager/Status.JVM.Memory.NonHeap.Used | n |
flink:taskmanager:Status.JVM.Threads.Count | flink/taskmanager/Status.JVM.Threads.Count | n |
flink:taskmanager:Status.Network.AvailableMemorySegments | flink/taskmanager/Status.Network.AvailableMemorySegments | n |
flink:taskmanager:Status.Network.TotalMemorySegments | flink/taskmanager/Status.Network.TotalMemorySegments | n |
flink:taskmanager:Status.Shuffle.Netty.AvailableMemory | flink/taskmanager/Status.Shuffle.Netty.AvailableMemory | n |
flink:taskmanager:Status.Shuffle.Netty.AvailableMemorySegments | flink/taskmanager/Status.Shuffle.Netty.AvailableMemorySegments | n |
flink:taskmanager:Status.Shuffle.Netty.TotalMemory | flink/taskmanager/Status.Shuffle.Netty.TotalMemory | n |
flink:taskmanager:Status.Shuffle.Netty.TotalMemorySegments | flink/taskmanager/Status.Shuffle.Netty.TotalMemorySegments | n |
flink:taskmanager:Status.Shuffle.Netty.UsedMemory | flink/taskmanager/Status.Shuffle.Netty.UsedMemory | n |
flink:taskmanager:Status.Shuffle.Netty.UsedMemorySegments | flink/taskmanager/Status.Shuffle.Netty.UsedMemorySegments | n |
Spark History Server metrics
Dataproc collects the following Spark history service JVM memory metrics:
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.committed | sparkHistoryServer/memory/CommittedHeapMemory | y |
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.used | sparkHistoryServer/memory/UsedHeapMemory | y |
sparkHistoryServer:JVM:Memory:HeapMemoryUsage.max | sparkHistoryServer/memory/MaxHeapMemory | y |
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.committed | sparkHistoryServer/memory/CommittedNonHeapMemory | y |
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.used | sparkHistoryServer/memory/UsedNonHeapMemory | y |
sparkHistoryServer:JVM:Memory:NonHeapMemoryUsage.max | sparkHistoryServer/memory/MaxNonHeapMemory | y |
HiveServer 2 metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hiveserver2:JVM:Memory:HeapMemoryUsage.committed | hiveserver2/memory/CommittedHeapMemory | y |
hiveserver2:JVM:Memory:HeapMemoryUsage.used | hiveserver2/memory/UsedHeapMemory | y |
hiveserver2:JVM:Memory:HeapMemoryUsage.max | hiveserver2/memory/MaxHeapMemory | y |
hiveserver2:JVM:Memory:NonHeapMemoryUsage.committed | hiveserver2/memory/CommittedNonHeapMemory | y |
hiveserver2:JVM:Memory:NonHeapMemoryUsage.used | hiveserver2/memory/UsedNonHeapMemory | y |
hiveserver2:JVM:Memory:NonHeapMemoryUsage.max | hiveserver2/memory/MaxNonHeapMemory | y |
Hive Metastore metrics
Metric | Metrics Explorer name | Enabled by default |
---|---|---|
hivemetastore:API:GetDatabase:Mean | hivemetastore/get_database/mean | y |
hivemetastore:API:CreateDatabase:Mean | hivemetastore/create_database/mean | y |
hivemetastore:API:DropDatabase:Mean | hivemetastore/drop_database/mean | y |
hivemetastore:API:AlterDatabase:Mean | hivemetastore/alter_database/mean | y |
hivemetastore:API:GetAllDatabases:Mean | hivemetastore/get_all_databases/mean | y |
hivemetastore:API:CreateTable:Mean | hivemetastore/create_table/mean | y |
hivemetastore:API:DropTable:Mean | hivemetastore/drop_table/mean | y |
hivemetastore:API:AlterTable:Mean | hivemetastore/alter_table/mean | y |
hivemetastore:API:GetTable:Mean | hivemetastore/get_table/mean | y |
hivemetastore:API:GetAllTables:Mean | hivemetastore/get_all_tables/mean | y |
hivemetastore:API:AddPartitionsReq:Mean | hivemetastore/add_partitions_req/mean | y |
hivemetastore:API:DropPartition:Mean | hivemetastore/drop_partition/mean | y |
hivemetastore:API:AlterPartition:Mean | hivemetastore/alter_partition/mean | y |
hivemetastore:API:GetPartition:Mean | hivemetastore/get_partition/mean | y |
hivemetastore:API:GetPartitionNames:Mean | hivemetastore/get_partition_names/mean | y |
hivemetastore:API:GetPartitionsPs:Mean | hivemetastore/get_partitions_ps/mean | y |
hivemetastore:API:GetPartitionsPsWithAuth:Mean | hivemetastore/get_partitions_ps_with_auth/mean | y |
Hive Metastore metric measures
Statistical measure | Example metric | Example metric name |
---|---|---|
Max | hivemetastore:API:GetDatabase:Max | hivemetastore/get_database/max |
Min | hivemetastore:API:GetDatabase:Min | hivemetastore/get_database/min |
Mean | hivemetastore:API:GetDatabase:Mean | hivemetastore/get_database/mean |
Count | hivemetastore:API:GetDatabase:Count | hivemetastore/get_database/count |
50th percentile | hivemetastore:API:GetDatabase:50thPercentile | hivemetastore/get_database/median |
75th percentile | hivemetastore:API:GetDatabase:75thPercentile | hivemetastore/get_database/75th_percentile |
95th percentile | hivemetastore:API:GetDatabase:95thPercentile | hivemetastore/get_database/95th_percentile |
98th percentile | hivemetastore:API:GetDatabase:98thPercentile | hivemetastore/get_database/98th_percentile |
99th percentile | hivemetastore:API:GetDatabase:99thPercentile | hivemetastore/get_database/99th_percentile |
999th percentile | hivemetastore:API:GetDatabase:999thPercentile | hivemetastore/get_database/999th_percentile |
StdDev | hivemetastore:API:GetDatabase:StdDev | hivemetastore/get_database/stddev |
FifteenMinuteRate | hivemetastore:API:GetDatabase:FifteenMinuteRate | hivemetastore/get_database/15min_rate |
FiveMinuteRate | hivemetastore:API:GetDatabase:FiveMinuteRate | hivemetastore/get_database/5min_rate |
OneMinuteRate | hivemetastore:API:GetDatabase:OneMinuteRate | hivemetastore/get_database/1min_rate |
MeanRate | hivemetastore:API:GetDatabase:MeanRate | hivemetastore/get_database/mean_rate |
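The two Hive Metastore tables above suggest a regular mapping from an override string to its metric name: the API name is converted from CamelCase to snake_case, and the measure is translated to a fixed suffix. The sketch below encodes exactly the rows shown here. The function names are hypothetical, and whether the same pattern extends to API calls not listed in this page is an assumption.

```python
import re

# Measure suffixes as listed in the measures table.
MEASURE_SUFFIX = {
    "Max": "max",
    "Min": "min",
    "Mean": "mean",
    "Count": "count",
    "50thPercentile": "median",
    "75thPercentile": "75th_percentile",
    "95thPercentile": "95th_percentile",
    "98thPercentile": "98th_percentile",
    "99thPercentile": "99th_percentile",
    "999thPercentile": "999th_percentile",
    "StdDev": "stddev",
    "FifteenMinuteRate": "15min_rate",
    "FiveMinuteRate": "5min_rate",
    "OneMinuteRate": "1min_rate",
    "MeanRate": "mean_rate",
}


def camel_to_snake(name: str) -> str:
    """Convert CamelCase to snake_case, e.g. GetAllTables -> get_all_tables."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def hivemetastore_metric_name(override: str) -> str:
    """Convert hivemetastore:API:<Call>:<Measure> into its metric name."""
    source, kind, api_call, measure = override.split(":")
    if (source, kind) != ("hivemetastore", "API"):
        raise ValueError(f"not a Hive Metastore API metric: {override!r}")
    return f"hivemetastore/{camel_to_snake(api_call)}/{MEASURE_SUFFIX[measure]}"


print(hivemetastore_metric_name("hivemetastore:API:GetDatabase:50thPercentile"))
print(hivemetastore_metric_name("hivemetastore:API:GetPartitionsPsWithAuth:Mean"))
```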
Dataproc monitoring agent metrics
Dataproc collects the following Dataproc monitoring agent metrics when you set --metric-sources=monitoring-agent-defaults.
These metrics are published with the agent.googleapis.com prefix.
CPU
agent.googleapis.com/cpu/load_15m
agent.googleapis.com/cpu/load_1m
agent.googleapis.com/cpu/load_5m
agent.googleapis.com/cpu/usage_time*
agent.googleapis.com/cpu/utilization*
Disk
agent.googleapis.com/disk/bytes_used
Swap
agent.googleapis.com/swap/bytes_used
agent.googleapis.com/swap/io
agent.googleapis.com/swap/percent_used
Memory
agent.googleapis.com/memory/bytes_used
agent.googleapis.com/memory/percent_used
Processes (these follow a slightly different quota policy for some attributes)
agent.googleapis.com/processes/count_by_state
agent.googleapis.com/processes/cpu_time
agent.googleapis.com/processes/disk/read_bytes_count
agent.googleapis.com/processes/disk/write_bytes_count
agent.googleapis.com/processes/fork_count
Interface
agent.googleapis.com/interface/errors
agent.googleapis.com/interface/packets
agent.googleapis.com/interface/traffic
Network
agent.googleapis.com/network/tcp_connections
Build a Monitoring dashboard
You can build a Monitoring dashboard that displays charts of selected Dataproc metrics.
Select + CREATE DASHBOARD from the Monitoring Dashboards Overview page. Name the dashboard, then click Add Chart in the top-right menu to open the Add Chart window. Select "Cloud Dataproc Cluster" as the resource type. Select one or more metrics, along with metric and chart properties. Then save the chart.
You can add more charts to the dashboard. After you save the dashboard, its title appears on the Monitoring Dashboards Overview page. Dashboard charts can be viewed, updated, and deleted from the dashboard display page.
What's next
- See the Cloud Monitoring documentation
- Learn how to create Dataproc metric alerts