Cloudflare outage brings down half of the internet

Screenshot-2019-06-24-at-19.22.34.png#asset:10560

Couldn't load Slack or Discord yesterday? You weren't alone: Cloudflare and many large hosting companies suffered a huge outage because Verizon appears to have made a simple routing mistake.

According to Cloudflare, Verizon set the preferred path for many internet routes to a tiny company in Northern Pennsylvania:

This was the equivalent of Waze routing an entire freeway down a neighborhood street — resulting in many websites on Cloudflare, and many other providers, to be unavailable from large parts of the Internet. This should never have happened because Verizon should never have forwarded those routes to the rest of the Internet.

The short version is somewhat predictable: Border Gateway Protocol (BGP) was responsible. BGP is the protocol that joins disparate networks together and builds the 'map' of the internet for ISPs.

BGP is almost always to blame for problems like this. Just last year, a tiny ISP in Nigeria brought Google to its knees after it accidentally broadcast a new, incorrect route to the service, which was then repeated out to the wider internet via China Telecom.

If the protocol is used correctly, a leak like this usually stops at the ISP responsible for configuring it, but as Cloudflare points out, Verizon wasn't following best practices, and the routes began being repeated by other internet providers.
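
To make the 'best practices' point a little more concrete, here's a rough Python sketch of two checks an upstream ISP can apply to routes it hears from a customer: an allow-list of prefixes, and a cap on how many routes the customer may send. The prefixes, the limit, and the function name are all made up for illustration. Real networks do this in router configuration rather than Python, and this isn't Verizon's or Cloudflare's actual setup.

```python
import ipaddress

# Hypothetical values: the address space this customer may announce,
# and the most routes we ever expect to hear from them.
CUSTOMER_ALLOWED = [ipaddress.ip_network("198.51.100.0/24")]
MAX_PREFIXES = 2


def filter_announcements(announced, allowed=CUSTOMER_ALLOWED, max_prefixes=MAX_PREFIXES):
    """Return the routes an upstream would accept and pass on (illustrative only)."""
    # Prefix limit: a sudden flood of routes suggests a leak, so reject them all.
    if len(announced) > max_prefixes:
        return []

    accepted = []
    for text in announced:
        net = ipaddress.ip_network(text)
        # Keep only routes that sit inside the customer's allowed space.
        if any(net.version == a.version and net.subnet_of(a) for a in allowed):
            accepted.append(net)
    return accepted


# A normal announcement passes; a flood of unrelated routes is dropped.
normal = filter_announcements(["198.51.100.0/24"])
leak = filter_announcements(["198.51.100.0/24", "203.0.113.0/24", "192.0.2.0/24"])
print(normal)
print(leak)
```

In this toy model, either check on its own is enough to stop routes the customer has no business announcing from being repeated to anyone else.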

More than 15 percent of Cloudflare's global traffic disappeared during the incident, as pictured above. That may sound small, but it is large enough that you almost certainly ran into an affected service at least once yesterday.

What's astounding is that Verizon didn't even respond to Cloudflare during the incident. Cloudflare repeatedly tried to reach engineers at Verizon by email and phone, but none of those attempts were answered, leaving Cloudflare to do its best to route around them.

It's an important reminder of how fragile the internet really is, and how a single problem can cause catastrophic outages for services that aren't technically affected, just caught in the dragnet of someone's bad setup.


Tab Dump

Facebook now summoned to appear at a House committee hearing on July 17 as well
Two hearings, back to back, about Facebook's proposed cryptocurrency, Libra. David Marcus, lead for the project, will testify at both the Senate hearing, which will happen on July 16, and the House committee a day later. 

FYI: I wrote about Libra late last week for Medium—that magic link unlocks the paywall for you—and discussed why Facebook might be the reason for the currency's success and failure.

Quartz finally killed its 'chat with the news' app
The company bet big on chat-bots three years ago, when the craze was in full swing, and the Qz app was one of the best demonstrations of how a chat-first interface could make the news more accessible. As it turns out, it wasn't that great for consuming the news, and Quartz has gone back to the traditional news app again.

Sidewalk Labs finally publishes its master plan for a city of the future in Toronto

chrome_2019-06-25_10-11-40.jpg#asset:10559

It's been slow going, but Sidewalk Labs, an Alphabet company, is still working on building an entire neighborhood on Toronto's waterfront. It just released a document, thousands of pages long, detailing its plans, its motivation, and why it should be allowed to perform the experiment.

Researchers warn that hackers have been stealing call records from ten cell networks globally to perform targeted surveillance

The Raspberry Pi 4 is faster, supports dual 4K monitors, and has up to 4GB of RAM

Good read: Busy Being Born—the Macintosh interface wasn't designed all at once, but over the space of five years, piece by piece.