Cassandra Monitoring
Apache Cassandra is an open-source, distributed NoSQL database management system that is built to handle large volumes of data. Monitor Apache Cassandra to identify slowdowns and resource limitations, and resolve them instantly. Use our ready-to-install plugin integration and gain visibility into health, performance, and resource usage of the datastore.
Prerequisites
- Download and install the latest version of the Site24x7 Linux agent in the server where you intend to run the plugin.
- Install the JMXQuery module for Python using the command below:
pip install jmxquery
- Follow the steps below to set up the JMX port for Cassandra:
- Open the cassandra-env.sh file from the location "/etc/cassandra"
- By default, JMX security is disabled in Cassandra. To enable it, locate the following lines of code in the cassandra-env.sh file:
if [ "$LOCAL_JMX" = "yes" ]; then
JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.local.port=$JMX_PORT -XX:+DisableExplicitGC"
- Then, add the following lines to the "else" block of the above "if" block in the cassandra-env.sh file:
JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.remote.login.config=CassandraLogin"
JVM_OPTS="$JVM_OPTS -Djava.security.auth.login.config=$CASSANDRA_HOME/conf/cassandra-jaas.config"
JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.authorizer=org.apache.cassandra.auth.jmx.AuthorizationProxy"
- Also, comment out the following lines in the cassandra-env.sh file:
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.access.file=/etc/cassandra/jmxremote.access"
- Change the authentication in the cassandra.yaml file to PasswordAuthenticator:
authenticator: PasswordAuthenticator
- Change the authorization in the cassandra.yaml file to CassandraAuthorizer:
authorizer: CassandraAuthorizer
- Restart Cassandra once you are done.
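Once Cassandra is back up, the JMX setup can be checked from Python. The following is a minimal sketch, assuming the jmxquery package from the prerequisites, the port 7199 used in the test command on this page, and placeholder credentials. The helper names build_jmx_url and read_heap_usage are hypothetical and are not part of the plugin.

```python
def build_jmx_url(hostname: str, port: int) -> str:
    """Return the standard JMX-over-RMI service URL for a host and port."""
    return f"service:jmx:rmi:///jndi/rmi://{hostname}:{port}/jmxrmi"


def read_heap_usage(hostname, port, username=None, password=None):
    """Query the JVM heap usage over JMX (needs a running Cassandra node)."""
    # Imported here so the URL helper works even without jmxquery installed.
    from jmxquery import JMXConnection, JMXQuery

    connection = JMXConnection(build_jmx_url(hostname, port), username, password)
    # java.lang:type=Memory is a standard JVM MBean, not Cassandra-specific.
    results = connection.query([JMXQuery("java.lang:type=Memory", "HeapMemoryUsage")])
    return {r.to_query_string(): r.value for r in results}


if __name__ == "__main__":
    print(build_jmx_url("localhost", 7199))
    # Placeholder credentials: replace with a Cassandra role allowed to use JMX.
    # print(read_heap_usage("localhost", 7199, "cassandra_user", "secret"))
```

If the query fails with an authentication error, recheck the authenticator and authorizer settings in cassandra.yaml and the role's permissions.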
Plugin Installation
- Create a folder named after the plugin "cassandra_monitoring" under the Site24x7 Linux Agent plugin directory:
Linux --> /opt/site24x7/monagent/plugins/cassandra_monitoring
- Download all the files in the cassandra_monitoring folder of our GitHub repository and place them in the cassandra_monitoring directory you created.
- If you have not already installed JMXQuery as described in the prerequisites, execute the following command in your server:
pip install jmxquery
- Execute the command below with appropriate arguments to check for a valid JSON output:
python3 cassandra_monitoring.py --hostname localhost --port 7199 --logs_enabled False
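To confirm that the command's output is valid JSON, it can be parsed with Python's standard json module. This is a minimal sketch; the function name parse_plugin_output and the sample keys are hypothetical, and the real plugin's output keys may differ.

```python
import json


def parse_plugin_output(raw: str) -> dict:
    """Parse plugin stdout and confirm it is a single JSON object."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


if __name__ == "__main__":
    # Hypothetical sample; paste the plugin's actual output here instead.
    sample = '{"Pending Tasks": 0, "Disk Used": 1048576}'
    print(parse_plugin_output(sample))
```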
Configurations
Provide your Cassandra configurations in the cassandra_monitoring.cfg file as shown below:
[cassandra_1]
hostname=<HOSTNAME>
port=<PORT NUMBER>
logs_enabled=False
log_type_name=None
log_file_path=None
The agent will automatically execute the plugin within five minutes and send performance data to the Site24x7 data center.
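The configuration file uses an INI-style layout, so its values can be sanity-checked with Python's configparser before the agent picks them up. This is a sketch under assumptions: the sample values localhost and 7199 come from the test command on this page, the function name load_instances is hypothetical, and how the agent itself reads the file is not described here.

```python
import configparser

# Sample contents mirroring cassandra_monitoring.cfg with placeholders filled in.
SAMPLE_CFG = """
[cassandra_1]
hostname=localhost
port=7199
logs_enabled=False
log_type_name=None
log_file_path=None
"""


def load_instances(cfg_text: str) -> dict:
    """Return one settings dict per section (one section per Cassandra instance)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    instances = {}
    for name in parser.sections():
        section = parser[name]
        instances[name] = {
            "hostname": section.get("hostname"),
            "port": section.getint("port"),  # fails loudly if not a number
            "logs_enabled": section.getboolean("logs_enabled"),
        }
    return instances


if __name__ == "__main__":
    print(load_instances(SAMPLE_CFG))
```

A non-numeric port or an unrecognized boolean raises ValueError, which surfaces typos before the plugin runs.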
Performance Metrics
Troubleshoot your Cassandra environment with ease by keeping track of critical metrics, including:
Performance metrics | Description |
Total Latency (Read) | Read response time in microseconds. |
Total Latency (Write) | Write response time in microseconds. |
Cross Node Latency | The time period starts when a node sends a message and ends when the current node receives it. |
Total Hints | The number of hint messages written to this node since [re]start. This includes one hint per host. |
Throughput (Writes) | Write requests per second. |
Throughput (Read) | Read requests per second. |
Key cache hit rate | Rate of read requests for keys present in the cache. |
Disk Used | Disk space used on a node, in bytes. |
Completed compaction tasks | Total compaction tasks completed. |
Pending compaction tasks | Total compaction tasks in the queue. |
ParNew garbage collections (count) | The number of young-generation collections. |
ParNew garbage collections (time) | The elapsed time of young-generation collections, in milliseconds. |
CMS garbage collections (count) | The number of old-generation collections. |
CMS garbage collections (time) | Elapsed time of old-generation collections, in milliseconds. |
Exceptions | Requests for which Cassandra encountered an error. |
Timeout exceptions (write) | Requests that are not acknowledged within the set timeout window during writing. |
Timeout exceptions (read) | Requests that are not acknowledged within the set timeout window during reading. |
Unavailable exceptions (write) | Requests for which the required number of nodes was unavailable during writing. |
Unavailable exceptions (read) | Requests for which the required number of nodes was unavailable during reading. |
Pending tasks | Tasks in a queue awaiting a thread for processing. |
Dropped Mutations | The number of dropped mutations, as reported for the special "all" table. |
Pending Flushes | The estimated number of pending flush tasks, as reported for the special "all" table. |
Blocked On Allocation | The number of threads blocked by memtable allocation. |
Currently Blocked Tasks | The number of tasks currently blocked due to queue saturation; these become unblocked on retry. |
Each table in Cassandra has metrics that track its state and performance. There is also a special table called "all", which has no keyspace. It represents the aggregation of metrics across all tables and keyspaces on the node.
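The difference between per-table metrics and the "all" aggregate shows up in the JMX MBean names. The sketch below assumes the naming pattern from the Cassandra metrics documentation (type=Table with keyspace and scope keys, and neither key for the aggregate). Older Cassandra releases use a different type name, so verify the names against your own node with a JMX browser. The helper table_metric_mbean is hypothetical.

```python
from typing import Optional


def table_metric_mbean(metric: str,
                       keyspace: Optional[str] = None,
                       table: Optional[str] = None) -> str:
    """Build a table-metric MBean name; omit keyspace and table for "all"."""
    if (keyspace is None) != (table is None):
        raise ValueError("give both keyspace and table, or neither")
    base = "org.apache.cassandra.metrics:type=Table"
    if keyspace is None:
        # Aggregate across every table and keyspace on the node.
        return f"{base},name={metric}"
    return f"{base},keyspace={keyspace},scope={table},name={metric}"


if __name__ == "__main__":
    print(table_metric_mbean("PendingFlushes"))
    print(table_metric_mbean("PendingFlushes", "my_keyspace", "my_table"))
```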
Set Thresholds
Once a plugin monitor is added, associate a threshold and availability profile to help the alarms engine decide if a specific resource has to be declared critical or down. Configure Downtime Rules to reduce false alerts.
Licensing
The first Cassandra plugin added for a server monitor is free. After that, each plugin monitor counts as a basic monitor.