Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Amazon Web Services - Configuring a CloudWatch "idle" alarm

I'd like to configure a CloudWatch alarm that can adjust the desired capacity of an autoscaling group down when one of the instances in the autoscaling group is detected to be "idle". Idleness could be defined, for example, as CPU Utilization staying below 1% for a sustained period of time.

I'm using Terraform, so right now I have the alarm configured as follows.

resource "aws_cloudwatch_metric_alarm" "backend-x86_64" {
  alarm_name          = "backend-x86_64-is-idle"
  comparison_operator = "LessThanOrEqualToThreshold"
  evaluation_periods  = "4"
  period              = "120"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  statistic           = "Average"
  threshold           = "1"
  treat_missing_data  = "notBreaching"

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.backend-x86_64.name
  }

  alarm_description = "This metric monitors ec2 cpu utilization"
  alarm_actions     = [aws_autoscaling_policy.backend-x86_64-shutdown.arn]
}

However, it's not working. Below is a picture of the graph of CPU Utilization over time -- it's at 0% for 2 hours, and yet it doesn't fire. I suspect this is because I'm misunderstanding how "evaluation_periods" and "period" interact.

Can anyone explain how to configure Terraform/CloudWatch to do what I want here?

[Image: graph of CPU Utilization over time]

question from:https://stackoverflow.com/questions/65835276/configuring-a-cloudwatch-idle-alarm


1 Answer


The gaps in the graph indicate that your instance only has basic monitoring, where data is available in 5-minute periods.

Your current alarm checks for CPU utilization at or below 1% over 4 consecutive periods of 2 minutes each (8 minutes in total), so it expects one data point every 2 minutes. With data arriving only every 5 minutes, and with missing data treated as not breaching, the alarm never gets triggered.
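
If you would rather stay on basic monitoring, one possible approach is to line the alarm's period up with the 5-minute data instead. The following is only a sketch based on the resource in the question: the `period` of 300 seconds matches basic monitoring, while the choice of 3 evaluation periods (15 minutes of idleness) and the reworded description are assumptions you should adjust to your own definition of "idle".

```hcl
# Sketch only: the alarm from the question, with the period matched to
# basic monitoring (one data point every 300 seconds).
resource "aws_cloudwatch_metric_alarm" "backend-x86_64" {
  alarm_name          = "backend-x86_64-is-idle"
  comparison_operator = "LessThanOrEqualToThreshold"
  evaluation_periods  = 3   # assumption: 3 x 5 minutes = 15 minutes of idleness
  period              = 300 # seconds; same granularity as basic monitoring
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  statistic           = "Average"
  threshold           = 1
  treat_missing_data  = "notBreaching"

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.backend-x86_64.name
  }

  alarm_description = "Fires when average CPU stays at or below 1% for 15 minutes"
  alarm_actions     = [aws_autoscaling_policy.backend-x86_64-shutdown.arn]
}
```

With this layout every evaluation period should contain a data point, so the alarm no longer depends on how missing data is treated.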


Enabling Detailed Monitoring gets you data in 1-minute intervals, and with that you should be able to get this alarm working.
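
How you enable Detailed Monitoring depends on how the autoscaling group launches its instances, which the question does not show. As a minimal sketch, assuming the group uses a launch template, it could look like this. The resource name, the `backend_ami_id` variable and the instance type are hypothetical placeholders, not taken from the question.

```hcl
# Hypothetical input; the question does not show which AMI is used.
variable "backend_ami_id" {
  type = string
}

# Sketch only: a launch template with detailed (1-minute) monitoring enabled.
resource "aws_launch_template" "backend-x86_64" {
  name_prefix   = "backend-x86_64-"
  image_id      = var.backend_ami_id
  instance_type = "t3.micro" # placeholder instance type

  monitoring {
    enabled = true # turn on detailed monitoring for launched instances
  }
}
```

Note that Detailed Monitoring is billed separately, and that instances already running keep their old setting until they are replaced from the updated template.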

