Amazon ElastiCache Monitoring Integration
Amazon ElastiCache is a managed in-memory data store in the cloud that speeds up queries and improves the latency and throughput of your application. With Site24x7's CloudWatch integration, you can visualize, monitor, and get alerted on important metrics for both the Redis and Memcached engines.
Setup and configuration
- If you haven't already, enable access to your AWS resources either by creating Site24x7 as an IAM user or by creating a cross-account IAM role between your AWS account and Site24x7's AWS account. Learn more.
- In the Integrate AWS Account page, make sure the check box next to the ElastiCache listing is selected. Learn more.
Policies and permissions
The following permissions are required by Site24x7 to discover your configured Redis/Memcached nodes and Memcached clusters and to collect configuration information.
- "elasticache:DescribeCacheClusters",
- "elasticache:DescribeCacheSubnetGroups",
- "elasticache:ListTagsForResource",
- "elasticache:DescribeServiceUpdates",
- "elasticache:DescribeReplicationGroups"
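The permissions above could be packaged into an IAM policy document along these lines. This is a minimal sketch, not Site24x7's official policy: the `Sid` label and the wildcard `Resource` are assumptions made for illustration, and only the action names come from the list above.

```python
import json

# Read-only ElastiCache actions listed in this section.
ELASTICACHE_ACTIONS = [
    "elasticache:DescribeCacheClusters",
    "elasticache:DescribeCacheSubnetGroups",
    "elasticache:ListTagsForResource",
    "elasticache:DescribeServiceUpdates",
    "elasticache:DescribeReplicationGroups",
]


def build_policy(actions):
    """Return an IAM policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Hypothetical statement label, chosen for this example.
                "Sid": "Site24x7ElastiCacheReadOnly",
                "Effect": "Allow",
                # De-duplicate and sort so the output is stable.
                "Action": sorted(set(actions)),
                # Assumption: all resources; narrow this if your setup allows.
                "Resource": "*",
            }
        ],
    }


policy = build_policy(ELASTICACHE_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor for the IAM user or cross-account role that Site24x7 uses.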
Polling Frequency
Site24x7 queries AWS to collect Amazon ElastiCache performance metrics according to the configured polling frequency. The polling interval is one hour by default. Learn more.
IT Automations
You can add automations for the AWS services supported by Site24x7. Log in to Site24x7 and go to Admin > IT Automation Templates (+) > Add Automation Templates. Once automations are added, you can schedule them to be executed one after the other.
You can now reboot ElastiCache clusters using Amazon ElastiCache automations.
Supported performance counters
Host-level data
The following host-level data is collected:
Attribute | Description | Statistic | Unit |
---|---|---|---|
CPU utilization | Measures the CPU utilization of the host. | Average, minimum and maximum | Percent |
Freeable memory | Measures the amount of available free memory in the host. | Average, minimum and maximum | Bytes |
Network bytes in | Measures the number of bytes the host has read from the network. | Average, minimum and maximum | Bytes |
Network bytes out | Measures the number of bytes the host has written to the network. | Average, minimum and maximum | Bytes |
Swap usage | Measures the swap used by the host. | Average, minimum and maximum | Bytes |
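To make the host-level table concrete, the sketch below builds the request parameters for CloudWatch's `GetMetricStatistics` API for each host-level metric. It is illustrative only and does not describe how Site24x7 itself queries AWS. The CloudWatch metric names (`CPUUtilization`, `FreeableMemory`, and so on) and the `AWS/ElastiCache` namespace are the standard ones; the cluster ID `my-cache-001`, the one-hour window, and the 300-second period are assumptions.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch names for the host-level attributes in the table above.
HOST_METRICS = [
    "CPUUtilization",
    "FreeableMemory",
    "NetworkBytesIn",
    "NetworkBytesOut",
    "SwapUsage",
]


def build_request(metric_name, cluster_id, end_time, minutes=60, period=300):
    """Build keyword arguments for a GetMetricStatistics call."""
    return {
        "Namespace": "AWS/ElastiCache",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "CacheClusterId", "Value": cluster_id}],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": period,  # seconds per data point (assumed value)
        "Statistics": ["Average", "Minimum", "Maximum"],
    }


# Fixed timestamp so the example is reproducible.
end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
reqs = [build_request(name, "my-cache-001", end) for name in HOST_METRICS]

for req in reqs:
    print(req["MetricName"], req["StartTime"].isoformat(), "->", req["EndTime"].isoformat())
```

With boto3 installed and credentials configured, each dictionary could be passed as `boto3.client("cloudwatch").get_metric_statistics(**req)`.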
Common cache metrics
The following data is supported for both Redis and Memcached:
Attribute | Description | Statistic | Unit |
---|---|---|---|
CurrConnections | The number of application clients connected to Redis/Memcached. | Average, sum | Count |
CurrItems | The number of keys in the in-memory database. | Average, sum | Count |
Evictions | The number of keys that have been removed due to reaching the maxmemory limit. | Average, sum | Count |
NewConnections | The total number of connections that have been accepted by the database server. | Average, sum | Count |
Metrics supported for the Redis cache engine
The following data is collected only for the Redis node:
Attribute | Description | Statistic | Unit |
---|---|---|---|
ActiveDefragHits | The number of value reallocations per minute performed by the active defragmentation process. | Average | Count |
AuthenticationFailures | The total number of failed attempts to Redis authentication using the AUTH command. | Maximum | Count |
BytesReadFromDisk | The total number of bytes read from disk per minute. | Sum | MB |
BytesUsedForCache | The number of bytes allocated by Redis. | Average | Bytes |
BytesWrittenToDisk | The total number of bytes written to disk per minute. | Sum | MB |
CacheHits | Number of successful keyspace lookups. | Sum | Count |
CacheMisses | Number of unsuccessful keyspace lookups. | Sum | Count |
CacheHitRate | Indicates the usage efficiency of the Redis instance. | Average | Percent |
CommandAuthorizationFailures | The total number of failed attempts by users to run commands they don’t have permission to call. | Maximum | Count |
CurrVolatileItems | The total number of keys in all databases that have a Time To Live (TTL) set. | Maximum | Count |
DatabaseMemoryUsagePercentage | Percentage of the memory available for the cluster that is in use. | Maximum | Percent |
DatabaseMemoryUsageCountedForEvictPercentage | Percentage of the memory available for the cluster that is in use, excluding the memory used for overhead and Client Output Buffer (COB). | Maximum | Percent |
DB0AverageTTL | Exposes avg_ttl of DB0 from the keyspace statistic. | Average | Milliseconds |
EngineCPUUtilization | Provides the CPU utilization of the Redis engine thread. | Maximum | Percent |
GetTypeCmds | The total number of Get types of commands. | Sum | Count |
GlobalDatastoreReplicationLag | The lag between the secondary region's primary node and the primary region's primary node. | Average | Seconds |
HashBasedCmds | The total number of commands that are hash-based. | Sum | Count |
HyperLogLogBasedCmds | The total number of HyperLogLog based commands. | Sum | Count |
KeyAuthorizationFailures | The total number of failed attempts by users to access keys they don’t have permission to access. | Maximum | Count |
KeyBasedCmds | The total number of commands that are key-based. | Sum | Count |
KeysTracked | The number of keys being tracked by Redis key tracking as a percentage of tracking-table-max-keys. | Maximum | Count |
ListBasedCmds | The total number of commands that are list-based. | Sum | Count |
MemoryFragmentationRatio | Indicates the efficiency in the allocation of memory of the Redis engine. | Minimum | Count |
NumItemsReadFromDisk | The total number of items retrieved from disk per minute. | Sum | Count |
NumItemsWrittenToDisk | The total number of items written to disk per minute. | Sum | Count |
Reclaimed | The total number of key expiration events. | Sum | Count |
ReplicationBytes | The total number of bytes the primary node is sending to all replicas. | Sum | Bytes |
ReplicationLag | In seconds, how far behind the read replica is in applying changes from the primary node. | Average | Seconds |
SaveInProgress | The metric is incremented whenever a background save is in progress. | Sum | Count |
SetBasedCmds | Total number of commands that are set-based. | Sum | Count |
SetTypeCmds | Total number of set types of commands. | Sum | Count |
SortedSetBasedCmds | The total number of commands that are sorted set-based. | Sum | Count |
StringBasedCmds | Total number of commands that are string-based. | Sum | Count |
ClusterBasedCmdsLatency | The latency of cluster-based commands. | Maximum | Microseconds |
EvalBasedCmdsLatency | The latency of eval-based commands. | Maximum | Microseconds |
GeoSpatialBasedCmdsLatency | The latency of geospatial-based commands. | Maximum | Microseconds |
GetTypeCmdsLatency | The latency of read commands. | Maximum | Microseconds |
HashBasedCmdsLatency | The latency of hash-based commands. | Maximum | Microseconds |
HyperLogLogBasedCmdsLatency | The latency of HyperLogLog-based commands. | Maximum | Microseconds |
JsonBasedCmdsLatency | Exposes the aggregate latency (server side CPU time) calculated as Delta[Usec]/Delta[Calls] of all commands that act upon one or more JSON document objects. | Maximum | Microseconds |
KeyBasedCmdsLatency | The latency of key-based commands. | Maximum | Microseconds |
ListBasedCmdsLatency | The latency of list-based commands. | Maximum | Microseconds |
PubSubBasedCmdsLatency | The latency of pub/sub-based commands. | Maximum | Microseconds |
SetBasedCmdsLatency | The latency of set-based commands. | Maximum | Microseconds |
SetTypeCmdsLatency | The latency of write commands. | Maximum | Microseconds |
SortedSetBasedCmdsLatency | The latency of sorted set-based commands. | Maximum | Microseconds |
StringBasedCmdsLatency | The latency of string-based commands. | Maximum | Microseconds |
StreamBasedCmdsLatency | The latency of stream-based commands. | Maximum | Microseconds |
NetworkBytesIn | The number of bytes the host has read from the network. | Sum | MB |
NetworkBytesOut | The number of bytes sent out on all network interfaces by the instance. | Sum | MB |
NetworkPacketsIn | The number of packets received on all network interfaces by the instance. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance. | Sum | Count |
NetworkPacketsOut | The number of packets sent out on all network interfaces by the instance. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance. | Sum | Count |
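The CacheHitRate entry in the table above relates directly to CacheHits and CacheMisses. As a rough sketch, assuming the rate is defined as hits divided by total lookups (the formula AWS documents for this metric), it can be reproduced from the two raw counters. The function name and the choice to return `None` when there were no lookups are assumptions made for this example.

```python
def cache_hit_rate(cache_hits, cache_misses):
    """Return the hit rate as a percentage, or None if there were no lookups."""
    total = cache_hits + cache_misses
    if total == 0:
        # No keyspace lookups in the interval, so the rate is undefined.
        return None
    return 100.0 * cache_hits / total


# Hypothetical counter values for one polling interval.
rate = cache_hit_rate(cache_hits=900, cache_misses=100)
print(rate)
```

A persistently low value usually suggests the cache is too small for the working set or that keys expire before they are reused.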
Metrics supported for the Memcached engine
Site24x7 collects the following performance data for your Memcached nodes, aggregates the values across nodes, and provides calculated metrics for your Memcached clusters.
Attribute | Description | Statistic | Unit |
---|---|---|---|
BytesReadIntoMemcached | The total number of bytes that have been read by the node from the network. | Average | Bytes |
BytesUsedForCacheItems | The total number of bytes used to store cache items. | Average | Bytes |
BytesWrittenOutFromMemcached | The total number of bytes that have been written by the node to the network. | Average | Bytes |
CasBadval | The total number of check and set requests received by the cache where Cas value did not match. | Sum | Count |
CasHits | The total number of check and set requests received by the cache where key and value both matched. | Sum | Count |
CasMisses | The total number of check and set requests received by the cache where the key was not found. | Sum | Count |
CmdFlush | Number of Flush commands received. | Sum | Count |
CmdGet | Number of Get commands received. | Sum | Count |
CmdSet | Number of Set commands received. | Sum | Count |
DecrHits | The number of decrement requests received by the cache where the key matched. | Sum | Count |
DecrMisses | The number of decrement requests received by the cache where the key was not found. | Sum | Count |
DeleteHits | The number of delete requests received by the cache where the key matched. | Sum | Count |
DeleteMisses | The number of delete requests received by the cache where the key was not found. | Sum | Count |
GetHits | The number of Get requests received by the cache where the request key was found. | Sum | Count |
GetMisses | The number of Get requests received by the cache where the key was not found. | Sum | Count |
IncrHits | The number of increment requests received by the cache where the key was found. | Sum | Count |
IncrMisses | The number of increment requests received by the cache where the key was not found. | Sum | Count |
Reclaimed | The number of expired items the cache evicted to make room for new writes. | Sum | Count |
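As an illustration of how node-level counters might roll up into cluster-level values, the sketch below sums the Sum-statistic counters across nodes. The exact aggregation Site24x7 applies is not described here, so treat the summing rule, the function name, and the sample node data as assumptions.

```python
def aggregate_cluster(node_metrics):
    """Sum each counter across all nodes of a Memcached cluster."""
    totals = {}
    for metrics in node_metrics:
        for name, value in metrics.items():
            totals[name] = totals.get(name, 0) + value
    return totals


# Hypothetical per-node counters for a two-node cluster.
nodes = [
    {"CmdGet": 250, "GetHits": 200, "GetMisses": 50},
    {"CmdGet": 150, "GetHits": 100, "GetMisses": 50},
]

cluster = aggregate_cluster(nodes)
print(cluster)
```

Metrics reported with the Average statistic, such as BytesUsedForCacheItems, may need a different roll-up (for example, an average across nodes) and are left out of this sketch.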
Add or edit a threshold profile for ElastiCache
Site24x7 supports individual threshold profiles for your Memcached cluster, Memcached node, and Redis node. To learn how to create, edit, and delete a threshold profile for your ElastiCache deployment, visit our configuration profiles page.
Forecast
Estimate future values of the following performance metrics for Amazon ElastiCache Memcached nodes, Memcached clusters, and ElastiCache Redis nodes, and make informed decisions about adding capacity or scaling your AWS infrastructure.
- CPU Utilization
- Evictions
- Reclaimed
- Connections
- CurrConnections
- Swap Usage