EC2 Instance Status Check Failed

Question:

I am currently running a process on an EC2 server that needs to run continuously in the background. When I try to log in to the server, I keep getting a "Network Error: Connection timed out" prompt. When I check the instance, I get the following message:

Instance reachability check failed at February 22, 2020 at 11:15:00 PM UTC-5 (1 days, 13 hours and 34 minutes ago)

To troubleshoot, I have tried rebooting the server but that did not correct the problem. How do I correct this and also prevent it from happening again?

Answer:

An instance status check failure indicates a problem with the
instance, such as:

  • Failure to boot the operating system
  • Failure to mount volumes correctly
  • File system issues
  • Incompatible drivers
  • Kernel panic
  • Severe memory pressure

For troubleshooting, see the following:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/TroubleshootingInstancesStopping.html

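For future reporting and automatic recovery, you can create a CloudWatch
alarm on the instance's status check metrics. The alarm can notify you
and trigger an EC2 action, such as a reboot, when the check fails.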
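
A minimal sketch of such an alarm, assuming Python with the boto3 library. The instance ID, region, alarm name, and thresholds are hypothetical placeholders, and the reboot action is only one option (an SNS topic ARN could be used instead for notification only). The parameters are built in a separate function so they can be inspected without AWS credentials:

```python
from typing import Any, Dict


def build_status_check_alarm(instance_id: str, region: str) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm name, period, and evaluation count are illustrative
    assumptions, not values taken from the original answer.
    """
    return {
        "AlarmName": f"instance-status-check-failed-{instance_id}",
        "Namespace": "AWS/EC2",
        # Metric is 1 while the instance status check is failing, else 0.
        "MetricName": "StatusCheckFailed_Instance",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,              # seconds per evaluation period
        "EvaluationPeriods": 3,    # alarm after 3 consecutive failing periods
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in EC2 action: reboot the instance when the alarm fires.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:reboot"],
    }


def create_alarm(instance_id: str, region: str) -> None:
    """Create the alarm in AWS (requires boto3 and valid credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_status_check_alarm(instance_id, region))


if __name__ == "__main__":
    # Hypothetical instance ID and region, printed for inspection only.
    params = build_status_check_alarm("i-0123456789abcdef0", "us-east-1")
    print(params["MetricName"], params["AlarmActions"][0])
```

Since a manual reboot did not fix the instance in the question, an automatic reboot may not either. The alarm is mainly useful for catching the failure early.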

For the second part of the question (prevention):

You cannot fully prevent a status check failure from occurring. For uptime and availability, however, you can launch a second EC2 instance and put an Application Load Balancer (ALB) in front of both. The ALB checks the health of each instance and routes traffic only to healthy ones, so your users, customers, or services stay available from the second instance while the first recovers. You can add as many instances as you want for higher availability, at additional cost.
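
The health-check side of that setup could look roughly like the sketch below, again assuming Python with boto3. It assumes the service exposes an HTTP endpoint, which the question does not state. The target group name, VPC ID, health path, and thresholds are hypothetical, and the load balancer and listener themselves are not shown:

```python
from typing import Any, Dict, List


def build_target_group(name: str, vpc_id: str,
                       health_path: str = "/health") -> Dict[str, Any]:
    """Build keyword arguments for elbv2 create_target_group.

    Port, path, interval, and thresholds are illustrative assumptions.
    """
    return {
        "Name": name,
        "Protocol": "HTTP",
        "Port": 80,
        "VpcId": vpc_id,
        "TargetType": "instance",
        # The load balancer polls this path on every registered instance.
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": health_path,
        "HealthCheckIntervalSeconds": 30,
        "HealthyThresholdCount": 3,    # successes before marked healthy
        "UnhealthyThresholdCount": 2,  # failures before traffic is withheld
    }


def create_and_register(name: str, vpc_id: str,
                        instance_ids: List[str], region: str) -> str:
    """Create the target group and register instances (needs AWS access)."""
    import boto3  # third-party; imported here so the builder works without it

    elbv2 = boto3.client("elbv2", region_name=region)
    response = elbv2.create_target_group(**build_target_group(name, vpc_id))
    arn = response["TargetGroups"][0]["TargetGroupArn"]
    elbv2.register_targets(
        TargetGroupArn=arn,
        Targets=[{"Id": instance_id} for instance_id in instance_ids],
    )
    return arn


if __name__ == "__main__":
    # Hypothetical name and VPC ID, printed for inspection only.
    print(build_target_group("background-service", "vpc-0123456789abcdef0"))
```

With two or more instances registered, an instance that fails its health checks stops receiving traffic until it passes them again.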
