-->

Monitoring EC2 Instance Disk Space with AWS CloudWatch

awsdevops cloudwatch, custom_metric

I had an issue recently with an EC2 instance running out of disk space. Unfortunately free disk space is not a metric that comes out of the box with AWS CloudWatch. This post is about implementing a custom metric and getting notifications via AWs CloudWatch based on that metric.

Steps to monitor disk space with CloudWatch

Step 1: Download sample config file

AWS provides a sample JSON file at this location: https://s3.amazonaws.com/ec2-downloads-windows/CloudWatchConfig/AWS.EC2.Windows.CloudWatch.json

Download a copy of this file.

Step 2: Set IsEnabled to true

By default it comes disabled so set the value as shown below:

"IsEnabled": true

Step 3: Add the custom metric for disk usage

Add the custom metric to monitor disk space:

{
    "Id": "PerformanceCounterDisk",
    "FullName": "AWS.EC2.Windows.CloudWatch.PerformanceCounterComponent.PerformanceCounterInputComponent,AWS.EC2.Windows.CloudWatch",
    "Parameters": {
        "CategoryName": "LogicalDisk",
        "CounterName": "% Free Space",
        "InstanceName": "C:",
        "MetricName": "FreeDiskPercentage",
        "Unit": "Percent",
        "DimensionName": "InstanceId",
        "DimensionValue": "{instance_id}"
    }
}

Step 4: Add the new metric to flows

After defining the metric we need to add it to the flows so that it can be sent to CloudWatch. To achieve this update the flows section as shown below:

"Flows": {
    "Flows": 
    [
        "(ApplicationEventLog,SystemEventLog),CloudWatchLogs",
        "(PerformanceCounter,PerformanceCounterDisk),CloudWatch"
    ]
}

Step 5: Add IAM role to server

It’s a good practice to manage permissions of EC2 instances via IAM roles assigned to the machine. To enable sending logs to CloudWatch add AmazonEC2RoleForSSM policy to the machine’s role

Without this role SSM agent service gets an access denied error.

Step 6: Restart Amazon SSM Agent service

Either by using Windows Services Manager or running the following command:

Restart-Service AmazonSSMAgent

Once this is all done wait a few minutes and check CloudWatch metrics. Under All -> Windows/Default you should be able to see new metric under InstanceId group (as that’s what we are using to group the logs). And when you click the metric you should be able to see a nice time-based graph of free disk space on the server:

Notes

  • It’s useful to know where SSM Agent’s logs are stored. They can be found in this path:

    %PROGRAMDATA%\Amazon\SSM\Logs\

  • The service reports every 5 minutes. The PollInterval in the JSON file is in seconds and is different than service report interval.

Resources