When utilizing Exadata Service on OCI or Cloud@Customer, reviewing the performance metrics of the storage servers becomes indispensable. Although we cannot SSH to the storage servers, Oracle exposes a limited set of metrics via the ExaCLI utility.
This blog discusses the metrics accessible via ExaCLI and expands on a few key ones. With over 700 metrics available, capturing them all can be overwhelming, so the blog explains how to understand the purpose of a metric. This in turn will help you identify the metrics relevant to your issue.
We will use cookies to allow for batch capture of metrics. If you want to check out how to establish a passwordless connection to ExaCLI, refer to the previous blog.
Listing Cell Version
Once you have set up the cookies, you can use ExaCLI commands to fetch general attributes of the cell. In the example below, we use the list command to check the version of the cell software.
# get the version of the cell software
# list cell attributes attribute_name;
$ exacli -c cloud_user_clustername@10.10.10.10 --cookie-jar -e list cell attributes releaseVersion
23.1.10.0.0.240208
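Because the cookie removes the password prompt, the same command can be repeated against every cell in a loop. The following is a minimal sketch, not an Oracle-provided script: the function name, the EXACLI and CELL_USER variables, and the IP addresses in the usage comment are all placeholders, and it assumes a cookie has already been created for each cell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the software version of each storage cell.
# EXACLI can be overridden to point at a different binary or a test stub.
EXACLI="${EXACLI:-exacli}"
# User name taken from the example above; replace with your own.
CELL_USER="${CELL_USER:-cloud_user_clustername}"

cell_versions() {
  local cell
  for cell in "$@"; do
    # Prefix each result with the cell address so the output stays readable.
    printf '%s: ' "$cell"
    "$EXACLI" -c "${CELL_USER}@${cell}" --cookie-jar -e "list cell attributes releaseVersion"
  done
}

# Usage (placeholder IP addresses):
#   cell_versions 10.10.10.10 10.10.10.11 10.10.10.12
```

Keeping the binary in a variable makes it easy to dry-run the loop before pointing it at real cells.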
Understanding ExaCLI commands to fetch Cell Metrics
There are two main commands that you can use to identify and fetch the metrics: LIST and DESCRIBE. Just as in SQL*Plus, the DESCRIBE command lists the attributes associated with an object type, and the LIST command gets you the metric value. The format of the commands is listed below, where:
- VERB & OBJECT_TYPE – can be found by using the command “help” on ExaCLI
- OBJECT_NAME – I could not find a direct command to list all objects when the object_type is a metric, so I have provided a list below. It is valid as of cell server version 23.1.10
- ATTRIBUTES – running DESCRIBE on an object_type generally lists the attributes associated with that object_type
<verb> <object_type> <object_name|attribute_filter> <DETAIL|ATTRIBUTES attribute_list>
## <verb> <object_type> <object_name> ATTRIBUTES <attribute_list> ;
$ list metricdefinition CL_FSUT ATTRIBUTES name,description ;
CL_FSUT "Percentage of total space on this file system that is currently used"
## <verb> <object_type> <attribute_filter> ATTRIBUTES <attribute_list> ;
$ list metricdefinition where objectType ='CELL' ATTRIBUTES name,description ;
CL_CPUT "Percentage of time over the previous minute that the system CPUs were not idle."
CL_CPUT_CS "Percentage of CPU time used by CELLSRV"
.....
## list of attributes
$ describe metricdefinition
name
description
fineGrained
metricType
objectType
persistencePolicy
streaming
unit
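The attribute names returned by DESCRIBE can be used both in the ATTRIBUTES list and in the WHERE filter. As a small sketch of how the two fit together, the helper below assembles a LIST METRICDEFINITION command for one object type. The function name and its defaults are hypothetical; only the attribute names come from the output above.

```shell
# Hypothetical helper: build a LIST METRICDEFINITION command for one object type.
# $1 = object type (for example CELL)
# $2 = comma-separated attribute names from DESCRIBE (defaults to name,description)
metricdef_query() {
  local object_type="$1"
  local attributes="${2:-name,description}"
  printf "list metricdefinition where objectType = '%s' attributes %s\n" \
    "$object_type" "$attributes"
}

# Usage, with the connection details from the earlier examples:
#   exacli -l cloud_user_clustername -c 10.10.10.10 --cookie-jar \
#     -e "$(metricdef_query FLASHCACHE name,metricType,unit,description)"
```

Asking for unit and metricType alongside the description is usually enough to tell whether a metric is a rate, a cumulative counter, or an instantaneous value.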
Although I couldn’t find a direct command to list the object types that can be used to filter metrics, I was able to compile the following list. In the above example, when you filter using the clause “where objectType = 'CELL'”, you can see the definitions for all the CELL-related metrics that are at your disposal.
###LIST of OBJECT TYPES
Object Type               Number of Metrics
------------------------  -----------------
CELL                      23
CELL_FILESYSTEM           1
CELLDISK                  38
DEVICE                    4
DISK                      12
FLASHCACHE                140
FLASHLOG                  30
GRIDDISK                  34
HOST_INTERCONNECT         11
IBPORT                    6
IORM_CATEGORY             61
IORM_CLUSTER              61
IORM_CONSUMER_GROUP       61
IORM_DATABASE             67
IORM_PLUGGABLE_DATABASE   66
NET_INTERFACE             6
NETDEV_QUEUE              3
SERVER                    37
SMARTIO                   38
XRMEMCACHE                1
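One way a list like this might be rebuilt for a different cell software version is to list the objectType attribute of every metric definition and count the rows per type. The sketch below assumes that ExaCLI prints one metric per line with whitespace-separated columns; the function name is hypothetical, and the counts on your system may differ from the ones above.

```shell
# Hypothetical helper: count metric definitions per object type.
# Reads "name objectType" pairs on stdin, one metric per line.
count_by_object_type() {
  awk 'NF >= 2 { count[$2]++ }
       END { for (t in count) printf "%-26s %d\n", t, count[t] }' | LC_ALL=C sort
}

# Usage, with the connection details from the earlier examples:
#   exacli -l cloud_user_clustername -c 10.10.10.10 --cookie-jar \
#     -e "list metricdefinition attributes name,objectType" | count_by_object_type
```

Metric definitions are small, static data, so this query is far lighter on the cell than pulling metric history.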
If you describe the METRICCURRENT and METRICHISTORY object types, you will notice another important attribute, collectionTime. This attribute lets you filter the metric values to a specific time window. It's a handy attribute, as it can further restrict the data.
Remember, pulling metrics adds overhead on the cell servers, so be mindful and pull only the metrics for the required time frame.
With that bit of advice in mind, let's now view the metrics for a cell server.
exacli -l cloud_user_clustername -c 10.10.10.10 --cookie-jar -e "list METRICHISTORY WHERE objectType = 'FLASHCACHE' AND collectionTime > '2024-04-01T11:00:09+00:00' and collectionTime < '2024-04-01T23:00:09+00:00' " > cell-01.txt
## Multiple objects
exacli -l cloud_user_clustername -c 10.10.10.10 --cookie-jar -e "list METRICHISTORY WHERE objectType like 'FLASHCACHE|CELLDISK|IORM_DATABASE|FLASHLOG|SMARTIO|IORM_CATEGORY|IORM_CONSUMER_GROUP' AND collectionTime > '2024-03-18T23:00:09+00:00' and collectionTime < '2024-04-01T11:00:09+00:00' " > cell-01.txt
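A window whose start is later than its end matches nothing, and an overly wide window loads the cell unnecessarily, so it can help to build the filter from two variables and check them first. The helper below is a sketch under the assumption that both timestamps are ISO 8601 strings with the same UTC offset, which lets them be compared as plain strings. The function name is hypothetical.

```shell
# Hypothetical helper: build a METRICHISTORY command for a bounded time window.
# $1 = object type pattern, for example 'FLASHCACHE|FLASHLOG'
# $2 = window start, $3 = window end (ISO 8601, same UTC offset)
metrichistory_query() {
  local object_types="$1"
  local start="$2"
  local end="$3"
  # Same-offset ISO 8601 timestamps sort correctly as plain strings.
  if ! LC_ALL=C expr "$start" \< "$end" >/dev/null; then
    echo "window start must be earlier than window end" >&2
    return 1
  fi
  printf "list METRICHISTORY WHERE objectType like '%s' AND collectionTime > '%s' AND collectionTime < '%s'\n" \
    "$object_types" "$start" "$end"
}

# Usage, with the connection details from the examples above:
#   query="$(metrichistory_query 'FLASHCACHE|FLASHLOG' \
#     '2024-04-01T11:00:09+00:00' '2024-04-01T23:00:09+00:00')" &&
#   exacli -l cloud_user_clustername -c 10.10.10.10 --cookie-jar -e "$query" > cell-01.txt
```

Because the function returns a non-zero status for a reversed window, chaining it with && keeps a bad query from ever reaching the cell.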
Again, a word of caution: pull only the data you need. The CELLSRV process collects the metrics and keeps them in memory, and every hour the Management Server (MS) summarizes them and flushes them to the internal disk. This great blog delves further into Exadata Storage cell metrics and how to utilize them effectively.