Welcome to CloudAffaire, and this is Debjeet.
In the last blog post, we discussed the CloudWatch Agent and installed and configured it on an EC2 instance.
In this blog post, we are going to discuss CloudWatch Alarms. We are also going to configure a CloudWatch Alarm on EC2 instance memory utilization.
You can create a CloudWatch alarm that watches a single CloudWatch metric or the result of a math expression based on CloudWatch metrics. The alarm performs one or more actions based on the value of the metric or expression relative to a threshold over a number of time periods. The action can be an Amazon EC2 action, an Amazon EC2 Auto Scaling action, or a notification sent to an Amazon SNS topic.
You can also add alarms to CloudWatch dashboards and monitor them visually. When an alarm is on a dashboard, it turns red when it is in the ALARM state, making it easier for you to monitor its status proactively. Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state; the state must have changed and been maintained for a specified number of periods.
An alarm has the following possible states:
- OK: The metric or expression is within the defined threshold.
- ALARM: The metric or expression is outside of the defined threshold.
- INSUFFICIENT_DATA: The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.
When you create an alarm, you specify three settings to enable CloudWatch to evaluate when to change the alarm state:
- Period is the length of time to evaluate the metric or expression to create each individual data point for an alarm. It is expressed in seconds. If you choose one minute as the period, there is one data point every minute.
- Evaluation Periods is the number of the most recent periods, or data points, to evaluate when determining the alarm state.
- Datapoints to Alarm is the number of data points within the evaluation periods that must be breaching to cause the alarm to go to the ALARM state. The breaching data points do not have to be consecutive; they only have to fall within the most recent data points, as many as the value of Evaluation Periods.
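To make the three settings concrete, here is a minimal Python sketch of the "M out of N" check described above. It is a simplified model and not the actual CloudWatch evaluation logic. The function name `evaluate_alarm`, the "greater than or equal to" comparison, and the sample memory values are our own assumptions for illustration.

```python
from typing import Sequence


def evaluate_alarm(datapoints: Sequence[float], threshold: float,
                   evaluation_periods: int, datapoints_to_alarm: int) -> str:
    """Simplified model: one value per period, a value >= threshold is breaching."""
    # Not enough data points collected yet to fill the evaluation window.
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # Only the most recent `evaluation_periods` data points are considered.
    recent = datapoints[-evaluation_periods:]
    # Breaching data points are counted; they do not have to be consecutive.
    breaching = sum(1 for value in recent if value >= threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Hypothetical memory utilization values (percent), one per period.
memory_percent = [62.0, 91.5, 70.2, 93.1, 95.0]
print(evaluate_alarm(memory_percent, threshold=90.0,
                     evaluation_periods=5, datapoints_to_alarm=3))
```

With these sample values, three of the last five data points are at or above 90, so a "3 out of 5" alarm goes to ALARM even though the breaching points are not consecutive.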
Sometimes data points for a metric with an alarm are not reported to CloudWatch. For example, this can happen when a connection is lost, a server goes down, or a metric reports data only intermittently by design. CloudWatch lets you specify how to treat missing data points when evaluating an alarm. This helps you configure your alarm to go to the ALARM state only when that is appropriate for the type of data being monitored, so you can avoid false positives when missing data does not indicate a problem.
For each alarm, you can tell CloudWatch to treat missing data points in any of the following ways:
- missing: The alarm does not consider missing data points when evaluating whether to change state.
- notBreaching: Missing data points are treated as being within the threshold.
- breaching: Missing data points are treated as breaching the threshold.
- ignore: The current alarm state is maintained.
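The four options above can be compared with a small Python sketch. This is a deliberately simplified model, not the real CloudWatch algorithm: we assume `None` marks a missing data point, that a value greater than or equal to the threshold is breaching, and that `missing` yields INSUFFICIENT_DATA when the whole window is empty. The function name `evaluate_with_missing` is hypothetical.

```python
from typing import Optional, Sequence

VALID_TREATMENTS = ("missing", "notBreaching", "breaching", "ignore")


def evaluate_with_missing(datapoints: Sequence[Optional[float]], threshold: float,
                          datapoints_to_alarm: int, treat_missing_data: str,
                          current_state: str = "OK") -> str:
    """Simplified model: None is a missing data point, value >= threshold is breaching."""
    if treat_missing_data not in VALID_TREATMENTS:
        raise ValueError(f"unknown treat_missing_data: {treat_missing_data}")
    breaching = 0
    present = 0
    for value in datapoints:
        if value is None:
            # breaching: count the gap as a breach.
            # notBreaching: count the gap as within the threshold (no breach).
            # missing / ignore: skip the gap entirely.
            if treat_missing_data == "breaching":
                breaching += 1
            continue
        present += 1
        if value >= threshold:
            breaching += 1
    # Assumption: with no real data at all, "missing" cannot decide a state.
    if present == 0 and treat_missing_data == "missing":
        return "INSUFFICIENT_DATA"
    # With no real data at all, "ignore" keeps whatever state the alarm had.
    if present == 0 and treat_missing_data == "ignore":
        return current_state
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Hypothetical window in which the instance reported nothing for three periods.
window = [None, None, None]
for option in VALID_TREATMENTS:
    print(option, evaluate_with_missing(window, 90.0, 2, option, current_state="ALARM"))
```

For a metric such as memory utilization, where a silent instance may itself be a problem, `breaching` can be a reasonable choice; for metrics that report only intermittently, `notBreaching` or `ignore` usually avoids false alarms.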
Next, we are going to configure a CloudWatch Alarm on EC2 instance memory utilization that sends an email notification when memory usage is high.
Prerequisites for this demo:
- A running EC2 instance.
- An SNS topic with an email subscription.
We have already created the SNS topic with an email subscription.
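If you still need the SNS prerequisite, it can be sketched in Python as follows. This is only an illustration and not how the topic in this demo was created. The helper `create_email_topic` is hypothetical; it expects a boto3 SNS client and uses its `create_topic` and `subscribe` calls. The topic name and email address in the usage comment are placeholders.

```python
def create_email_topic(sns_client, topic_name: str, email: str) -> str:
    """Create (or reuse) an SNS topic and add an email subscription to it."""
    # create_topic returns the ARN of the topic with the given name.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # The subscriber must confirm via the email that SNS sends before
    # notifications are delivered.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn


# Usage sketch (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   arn = create_email_topic(boto3.client("sns"), "demo-alarm-topic", "you@example.com")
```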
Step 1: Log in to the AWS console and navigate to CloudWatch.
Step 2: Click ‘Create Alarm’ located under ‘Alarms’.
Step 3: Click ‘Select metrics’.
Step 4: Select the metric and click ‘Select metric’.
Step 5: Provide a name, description, condition and missing data point action for your CloudWatch Alarm.
Select the SNS topic for ‘Actions’ and click ‘Create Alarm’.
Note: Our Alarm threshold is 90%. Once the Alarm is created, it will be triggered if the memory utilization is greater than or equal to 90%.
Our CloudWatch Alarm is successfully created, and since the current memory utilization is below 90%, the alarm state is OK.
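The same alarm could also be created from code. The sketch below only builds the parameter dictionary for boto3's `put_metric_alarm`; it does not call AWS. Several values are assumptions, since the console steps above do not spell them out: the `CWAgent` namespace and `mem_used_percent` metric name (the CloudWatch Agent defaults on Linux), the single `InstanceId` dimension, the 5-minute period, and the alarm name. The helper `build_memory_alarm` is hypothetical.

```python
from typing import Any, Dict


def build_memory_alarm(instance_id: str, topic_arn: str,
                       threshold: float = 90.0) -> Dict[str, Any]:
    """Build keyword arguments for cloudwatch.put_metric_alarm (not sent here)."""
    return {
        "AlarmName": f"high-memory-{instance_id}",      # hypothetical name
        "AlarmDescription": "Memory utilization is at or above the threshold",
        "Namespace": "CWAgent",                          # assumption: agent default
        "MetricName": "mem_used_percent",                # assumption: agent default
        # Assumption: the agent publishes with exactly this dimension set.
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                                   # seconds per data point
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "missing",
        "AlarmActions": [topic_arn],                     # SNS topic to notify
    }


# Placeholder instance ID and topic ARN, for illustration only.
params = build_memory_alarm("i-0123456789abcdef0",
                            "arn:aws:sns:us-east-1:123456789012:demo-alarm-topic")
print(params["ComparisonOperator"], params["Threshold"])

# Usage sketch (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

The dimensions have to match the ones your agent actually publishes, otherwise the alarm stays in INSUFFICIENT_DATA, so check the metric in the console before reusing these values.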
Next, we will reduce the alarm threshold so that the alarm is triggered.
Step 6: Select the alarm and from ‘Actions’ click ‘Modify’.
Step 7: Change the threshold to 70 and click ‘Save Changes’.
Observe: The Alarm state is ALARM. Since the current memory utilization is above 70%, the alarm state changed from OK to ALARM. If we open the mailbox that is subscribed to the SNS topic, we will find an email from AWS sent as part of the alarm action.
Hope you have enjoyed this article. In the next blog post, we are going to discuss CloudWatch Events.
To get more details on CloudWatch, please refer to the AWS documentation.
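The state change can also be checked outside the console. Below is a small Python sketch that reads alarm states out of a response shaped like the one returned by boto3's `describe_alarms` (a `MetricAlarms` list whose entries carry `AlarmName` and `StateValue`). The helper `alarm_states` and the sample response are hypothetical and are not output captured from this demo.

```python
from typing import Any, Dict


def alarm_states(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each alarm name to its current state (OK, ALARM or INSUFFICIENT_DATA)."""
    return {alarm["AlarmName"]: alarm["StateValue"]
            for alarm in response.get("MetricAlarms", [])}


# Hand-written sample in the shape of a describe_alarms response.
sample_response = {
    "MetricAlarms": [
        {"AlarmName": "high-memory-demo", "StateValue": "ALARM", "Threshold": 70.0},
    ]
}
print(alarm_states(sample_response))

# Usage sketch (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   response = boto3.client("cloudwatch").describe_alarms(AlarmNames=["high-memory-demo"])
#   print(alarm_states(response))
```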