I've been working at my first software dev job for a few months now. I sat down at work today and, for the first time, I had to launch and configure an EC2 instance. Of course, within the first few minutes of getting started, AWS started having issues.
At Zapier we saw half the internet on AWS blip out for a bit (us too), but it seems to have been short-lived. Approximately Jul 27, 2017 13:47:45 to Jul 27, 2017 13:59:33 (UTC), as far as we could tell.
[RESOLVED] Network Connectivity 07:28 AM PDT Between 6:47 AM and 7:10 AM PDT we experienced increased launch failures for EC2 Instances, degraded EBS volume performance and connectivity issues for some instances in a single Availability Zone in the US-EAST-1 Region.
edit: looks like this message is now on the status page
There was definitely an issue. Around 25% of our servers in one availability zone of us-east-1 fell off the network for 15 minutes or so, starting around 13:47 GMT. They're back now.
During this time period, we were also unable to access the console (500 errors).
East 1 is indeed the oldest and has the most non-standard configuration of the bunch (besides China, of course). I definitely would recommend east 2 or west 2 for any new deployments.
us-west-2 (Oregon) and us-east-2 (Ohio) are the same price as us-east-1 (Virginia). At least, that's true for most resources; I didn't check the full price list.
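For what it's worth, you can spot-check the price-parity claim programmatically instead of eyeballing the price pages. Here's a rough sketch using boto3's Pricing API; the instance type and filter fields are just examples of what the price list currently exposes and may need tweaking, and as far as I know the Pricing API is only served from a couple of endpoints (us-east-1 works).

```python
import json
import boto3

# The Pricing API is only served from a few regions; us-east-1 works.
pricing = boto3.client("pricing", region_name="us-east-1")

REGIONS = {
    "us-east-1": "US East (N. Virginia)",
    "us-east-2": "US East (Ohio)",
    "us-west-2": "US West (Oregon)",
}

def on_demand_price(location, instance_type="m4.large"):
    """Return the hourly on-demand price for a shared-tenancy Linux instance."""
    resp = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "location", "Value": location},
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
            {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
            {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
        ],
        MaxResults=1,
    )
    product = json.loads(resp["PriceList"][0])  # each entry is a JSON string
    # Dig out the single on-demand price dimension; keys are opaque offer-term codes.
    term = next(iter(product["terms"]["OnDemand"].values()))
    dimension = next(iter(term["priceDimensions"].values()))
    return dimension["pricePerUnit"]["USD"]

for region, location in REGIONS.items():
    print(region, on_demand_price(location), "USD/hr")
```

If the regions really are at parity, the three printed rates should match to the cent.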
I don't know about Ohio since I don't use it, but we've had far fewer problems in us-west-2 than in us-east-1.
If I have a single-region service, I always put it in us-west-2. It's super reliable, and gets updates after us-east-1 and us-west-1, which means all the kinks are out before they hit us-west-2.
On days like today, I invariably get a message from my friend, who works at a shop where everything is in us-east-1 (multi-AZ), about how much he hates me for avoiding east like the plague.
"C", but that's meaningless because AWS scrambles the zone names for each account. (Presumably to prevent everyone from putting all their servers in "A".)
Haha - I didn't know that. Makes sense. I've got a dropdown in one of my CloudFormation scripts for AZs, and every time I get to it, I spend way more time thinking about it than I should. You've saved me some time.
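If you ever want to confirm the scrambling, newer versions of the EC2 API expose an account-independent ZoneId alongside the account-specific ZoneName, so two accounts can compare IDs to see whether their "1a" is the same physical zone. A quick boto3 sketch (assumes the ZoneId field is available in your SDK/API version):

```python
import boto3

# ZoneName ("us-east-1a") is scrambled per account; ZoneId ("use1-az2") is not.
ec2 = boto3.client("ec2", region_name="us-east-1")
for az in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(az["ZoneName"], "->", az.get("ZoneId", "n/a"), az["State"])
```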
Now getting Lambda provisioning errors in us-east-1:
LAMBDA_FAILED: ServiceException: We currently do not have sufficient capacity in the region you requested. Our system will be working on provisioning additional capacity. You can avoid getting this error by temporarily reducing your request rate.
I wonder if they had to take part of their fleet offline due to the issues.
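That "temporarily reducing your request rate" advice boils down to retrying with exponential backoff. A minimal sketch of that pattern around CreateFunction; the function name, role ARN, and zip path are placeholders, and the set of error codes treated as retryable is my own guess:

```python
import time
import boto3
from botocore.exceptions import ClientError

lam = boto3.client("lambda", region_name="us-east-1")

def create_with_backoff(max_attempts=6, **kwargs):
    """Retry CreateFunction when Lambda reports a capacity/throttling problem."""
    for attempt in range(max_attempts):
        try:
            return lam.create_function(**kwargs)
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code not in ("ServiceException", "TooManyRequestsException"):
                raise  # not a capacity issue; don't retry
            delay = 2 ** attempt  # 1s, 2s, 4s, ... between attempts
            print(f"{code}, retrying in {delay}s")
            time.sleep(delay)
    raise RuntimeError("gave up after repeated capacity errors")

# Placeholder arguments -- substitute your own function config.
create_with_backoff(
    FunctionName="my-function",
    Runtime="python3.6",
    Role="arn:aws:iam::123456789012:role/my-lambda-role",
    Handler="handler.main",
    Code={"ZipFile": open("function.zip", "rb").read()},
)
```

Adding some random jitter to the delay also helps when a lot of clients are retrying against the same constrained capacity at once.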
Everything requires a manual and has a list of quirks if you're doing something non-trivial or high-volume. Everything has trade-offs; the only question is how much you get to know up front.
Can't find a permalink, but I had this notification in our AWS console:
> Beginning at Thu, 27 Jul 2017 13:53:00 GMT, some instances are experiencing elevated packet loss in the us-east-1a Availability Zone. We are now investigating this issue.
Some of our instances weren't reachable for about 10 minutes.
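When an AZ degrades like that, the per-instance status checks usually catch it before the dashboard does. A small sketch that lists instances failing either check; the region is just an example, and note it will also flag benign non-"ok" states like "initializing":

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# IncludeAllInstances also reports stopped/pending instances, not just running ones.
paginator = ec2.get_paginator("describe_instance_status")
for page in paginator.paginate(IncludeAllInstances=True):
    for status in page["InstanceStatuses"]:
        system = status["SystemStatus"]["Status"]
        instance = status["InstanceStatus"]["Status"]
        if system != "ok" or instance != "ok":
            print(status["InstanceId"], status["AvailabilityZone"],
                  "system:", system, "instance:", instance)
```

During today's event, grouping the output by AvailabilityZone made it obvious the problem was confined to one zone.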
Interesting - I use both those services in us-east-1 and have not experienced issues. https://status.aws.amazon.com/ also shows a sea of green, although I'm not sure this page is even functional, because even when I know AWS has been having issues, it's still a sea of green.
I think they've scrapped the AWS dependencies there, which were awfully silly. But it doesn't really seem to update regardless, and when it does, it's a cute little 'i' on the green checkmark to inform you that everything is fine except for the 'actually working' part.
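If you'd rather not eyeball the dashboard, each service/region pair on that page has an RSS feed you can poll. The feed URL pattern below is an assumption based on the links I've seen (e.g. ec2-us-east-1), so double-check it against the status page itself:

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed feed naming: <service>-<region>.rss under status.aws.amazon.com/rss/.
FEED = "https://status.aws.amazon.com/rss/ec2-us-east-1.rss"

with urllib.request.urlopen(FEED) as resp:
    tree = ET.parse(resp)

# RSS 2.0 layout: rss/channel/item, newest entries first.
for item in tree.findall("./channel/item")[:5]:
    print(item.findtext("pubDate"), "-", item.findtext("title"))
```

Even then it only shows what AWS chooses to post, so it's no substitute for your own monitoring.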
I have a hard time having sympathy for someone who puts together something that critical with infrastructure meant primarily to sell to modern-day gold diggers disguised as technologists.