AWS outage reminds us that the fate of the internet lies in the hands of the few

If an app, website, or online service is down, a cloud service outage is probably to blame

The AWS outage is a wake-up call. Trust me. Back in the day, we ran websites off personal and corporate servers, usually located within our homes and offices. As the internet grew, we built server racks, co-locations and datacenters. Eventually, though, businesses and services of all sizes offloaded server efforts to third parties, or as they’re known now, cloud services.

The logic is solid. We live in homes, but do not physically build our own houses. For most companies, serving and scaling websites is not core to the service they provide. Well, it sort of is, in that without servers there is no service. But what the server actually runs are the APIs, scripts, and other algorithms and programs the company develops to deliver things like your Netflix stream, the details of your Coinbase wallet account, or the next Tinder prospect.

The ability of cloud services like Amazon Web Services (AWS) and Microsoft’s Azure to rapidly scale up (or down, as needed), if you pay enough, makes them a smart strategic decision for a business of any size. You never know, for instance, when a small business is going to balloon into a big one and need to serve 10,000 simultaneous users instead of 500.

That’s the obvious upside of cloud-based web services. The downside is what happened this week with AWS.

AWS outage

Tuesday afternoon, huge chunks of AWS crumbled. The AWS Health Dashboard provides a nice play-by-play of the nearly seven-hour outage. At the heart of it was not, at least according to Amazon, an attack, hack, or Distributed Denial of Service (DDoS) assault. It was a pair of misbehaving APIs in one sector of the massive service.

We all live in fear of a major DDoS or hack breaching these systems (really any system we rely on) and bringing them to their knees, but that’s rarely the case. When Cloudflare went down in 2019, it was initially assumed to be an attack on its system. However, we soon found out that it was just a bad software deployment, essentially human error.

Even with the AWS outage contained to what Amazon calls “US-EAST-1 Region,” the impact was significant and widespread. It was felt across consumer-facing platforms like Disney+ and, naturally, Amazon.com and some Alexa services.

When I posted the ongoing news on Twitter, I noticed how many people virtually slapped their heads and exclaimed, “That’s why [insert service] was out!”

It occurred to me that many of these users had no idea that AWS sits behind their favorite consumer and business systems. No one outside Amazon, by the way, has the exact number, but recent reports claim AWS serves millions. Microsoft’s Azure also reports millions of users and the majority of Fortune 500 companies. Google Cloud has big names like Verizon, NewsCorp and Facebook.

Does something need to change?

The widespread use of cloud services is not a bad thing, though the lack of insight can lead to confusion and finger-pointing, like the guy who couldn’t amend orders in his system and got multiple error messages blaming his own systems (and not a third-party provider like AWS).

The combination of cloud systems’ wide reach and the general lack of information and real-time feedback to affected customers is cause for some concern. The scale of any one outage is probably cause for alarm, especially as we consider the inevitable next one.

Gone are the days when someone’s server rack goes down and one website hiccups. Now we have small failures in big cloud systems like AWS, Azure and Cloudflare that trigger a tsunami of outages.

One person on Twitter asked, “What happened to scaling and load balancing?” It’s a fair question. AWS is built on hundreds of separate cloud server clusters and has tons of redundancies, scaling, and load balancing. And still, sometimes, it isn’t enough. Complex systems can misbehave and are especially vulnerable to software updates that can collide with ageing code. As powerful and distributed as all these cloud services are, AWS included, they’re still programmed, run, and serviced by fallible humans.

So how do we better inform the public and, more importantly, protect AWS, Azure, Cloudflare, and others from these kinds of errors, ones that lead not only to downed sites and services but also to the loss of millions of dollars?

It may be time to step back and look at the integrity and security of cloud systems in the same way we watch out for water systems. None of them are too big to fail, it seems, but all are too important to damage, violate, or lose.

A 38-year industry veteran and award-winning journalist, Lance has covered technology since PCs were the size of suitcases and “on line” meant “waiting.” He’s a former Lifewire Editor-in-Chief, Mashable Editor-in-Chief, and, before that, Editor-in-Chief of PCMag.com and Senior Vice President of Content for Ziff Davis, Inc. He also wrote a popular weekly tech column for Medium called The Upgrade.

Lance Ulanoff makes frequent appearances on national, international, and local news programs including Live with Kelly and Mark, the Today Show, Good Morning America, CNBC, CNN, and the BBC.