4.1. BookKeeper Server Metrics
These metrics are available on the BookKeeper server.
Metrics relating to daemon & service health.
|Description |Abnormalities
|The number of workers currently reporting to the master node. |Mismatch with number reported by engine (Presto, Spark, etc.)
|The number of workers reporting caching validation success. |Mismatch with live worker count (one or more workers failed validation)
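The two abnormal conditions above amount to simple equality checks. The following is a minimal sketch of such a check, not part of RubiX itself: the function name and its parameters are hypothetical, and the caller is assumed to have already read the two worker counts from the metrics and the expected count from the engine.

```python
def check_worker_health(live_workers, validated_workers, engine_workers):
    """Return a list of warnings; the list is empty when all counts agree.

    All three arguments are plain integers supplied by the caller
    (hypothetical names, not actual metric identifiers).
    """
    warnings = []
    # Live worker count should match what the engine (Presto, Spark, etc.) reports.
    if live_workers != engine_workers:
        warnings.append(
            f"live workers ({live_workers}) != engine-reported workers ({engine_workers})"
        )
    # Every live worker should have passed caching validation.
    if validated_workers != live_workers:
        warnings.append(
            f"validated workers ({validated_workers}) != live workers ({live_workers})"
        )
    return warnings
```

For example, `check_worker_health(4, 4, 4)` returns an empty list, while a cluster where one worker failed validation produces a single warning.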
Metrics relating to cache interactions.
|Description |Abnormalities
|The current size of the local cache in MB. |Cache size is bigger than configured capacity
|The current disk space available for cache in MB.
|The number of files removed from the local cache due to size constraints. |No cache evictions & cache has exceeded configured capacity
|The number of files invalidated from the local cache when the source file has been modified.
|The number of files removed from the local cache once expired.
|The percentage of cache hits for the local cache. |Cache hit rate near 0%
|The percentage of cache misses for the local cache. |Cache miss rate near 100%
|The total number of requests made to read data.
|The number of requests made to read data cached locally. |No cache requests made
|The number of requests made to read data from another node. |No non-local requests made
|The number of requests made to download data from the data store. |No remote requests made
|The total number of requests made to download data asynchronously.
|The total number of asynchronous download requests that have already been processed.
|The current number of queued asynchronous download requests. |High queue size (requests not being processed)
|The amount of data asynchronously downloaded, in MB. (If there are no cache evictions, this
|Total time spent on downloading data, in seconds.
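Several of the abnormal conditions listed for the cache metrics can be checked mechanically. The sketch below is illustrative only: the function and parameter names are hypothetical, and the thresholds (`near_zero_pct`, `max_queue`) are assumed defaults, since the source does not define what counts as "near 0%" or a "high" queue size.

```python
def check_cache_metrics(cache_size_mb, capacity_mb, evictions, hit_rate_pct,
                        queued_requests, near_zero_pct=1.0, max_queue=1000):
    """Return a list of findings; the list is empty when nothing looks abnormal.

    near_zero_pct and max_queue are assumed thresholds, not RubiX settings.
    """
    findings = []
    over_capacity = cache_size_mb > capacity_mb
    if over_capacity:
        findings.append("cache size is bigger than configured capacity")
    # Evictions should occur once the cache has exceeded its capacity.
    if over_capacity and evictions == 0:
        findings.append("no cache evictions although capacity is exceeded")
    if hit_rate_pct <= near_zero_pct:
        findings.append("cache hit rate near 0%")
    if queued_requests > max_queue:
        findings.append("high async download queue size")
    return findings
```

Tuning the two thresholds to the workload is left to the operator; a batch-heavy cluster may tolerate a much larger queue than an interactive one.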
Metrics relating to JVM statistics, supplied by the Dropwizard Metrics
metrics-jvm module (https://metrics.dropwizard.io/3.1.0/manual/jvm/).
|Metrics relating to garbage collection (GarbageCollectorMetricSet)
|Metrics relating to memory usage (MemoryUsageGaugeSet)
|Metrics relating to thread states (CachedThreadStatesGaugeSet)
4.2. Client side Metrics
These metrics are available on the client side, i.e. in the engine (Presto or Spark) where the jobs that read data are run.
Client-side metrics are divided into two groups:
- Basic stats: These stats are available under the name rubix:name=stats
- Detailed stats: These stats are available under the name rubix:name=stats,type=detailed
If Rubix is used in embedded mode, an engine-specific suffix is added to these names, e.g., Presto adds a catalog=<catalog_name> suffix.
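As a rough illustration of how these names compose, the helper below builds the name for either group and optionally appends a Presto-style catalog suffix. It is a sketch under assumptions: the helper is hypothetical, and joining the suffix with a comma (as an additional key property) is assumed rather than confirmed by the text above.

```python
BASIC_STATS = "rubix:name=stats"
DETAILED_STATS = "rubix:name=stats,type=detailed"


def stats_object_name(detailed=False, catalog=None):
    """Build the name under which client-side stats are published.

    catalog models the Presto suffix catalog=<catalog_name>; appending it
    with a comma is an assumption made for this sketch.
    """
    name = DETAILED_STATS if detailed else BASIC_STATS
    if catalog is not None:
        name += f",catalog={catalog}"
    return name
```

For instance, `stats_object_name(detailed=True, catalog="hive")` yields `rubix:name=stats,type=detailed,catalog=hive`, where `hive` stands in for whatever catalog name is configured.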
The following sections cover the metrics available under both of these types in detail.
|Data read from cache by the client jobs
|Data read from Source by the client jobs
|Cache Hit ratio, between 0 and 1
The data unit for the data-volume metrics above is MB.
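The text does not say how the hit ratio is derived. One plausible reading, assumed here purely for illustration, is the share of data served from the cache out of all data read. The function name and parameters below are hypothetical.

```python
def cache_hit_ratio(read_from_cache_mb, read_from_source_mb):
    """Share of data served from cache, between 0 and 1.

    Assumes the ratio is cache reads over total reads (both in MB);
    this formula is an assumption, not taken from the RubiX source.
    """
    total_mb = read_from_cache_mb + read_from_source_mb
    if total_mb == 0:
        # Nothing has been read yet; report 0 rather than divide by zero.
        return 0.0
    return read_from_cache_mb / total_mb
```

Under this assumption, a job that read 75 MB from cache and 25 MB from the source would report a ratio of 0.75.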