How Facebook Was Temporarily Wiped Off the Internet: What We Know So Far
Cloudflare said, “It was as if someone had ‘pulled the cables’ from their data centres all at once.”
Facebook, Instagram, and WhatsApp disappeared from the internet for nearly six hours on Monday, starting around 11:40 am Eastern Time in the US (around 9 pm IST), in what Downdetector, a site that monitors internet outages, has dubbed the biggest internet outage in history.
The global outage knocked out every corner of Mark Zuckerberg’s tech empire. A significant portion of the 3.5 billion people who use Facebook's services worldwide could not communicate with friends and family, and businesses that rely on these services were unable to reach their customers.
While such outages are not uncommon, the duration and scale of this one make it significant. The chaos was such that, according to reports, employees at various Facebook sites and offices could not access the internal communication systems and tools needed to analyse the problem, and many were even locked out of office areas because their access cards stopped working.
“Today's events are a gentle reminder that the Internet is a very complex and interdependent system of millions of systems and protocols working together,” internet infrastructure company Cloudflare said after the social media blackout on Monday.
But why did this happen? Here’s what we know so far.
Facebook and its apps, which billions of users depend on, went missing on Monday when their Domain Name System (DNS) records became unreachable.
Often thought of as the internet’s phonebook, DNS converts the domain name a user types (e.g., google.com) into the IP address where the website lives.
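To make the phonebook analogy concrete, here is a minimal Python sketch that asks the operating system's resolver for the IP addresses behind a name, using the standard library's socket.getaddrinfo. The helper name lookup is our own and does not come from any of the sources quoted here; the answer it returns depends on the network the code runs on.

```python
import socket


def lookup(hostname):
    """Return the sorted, de-duplicated IP addresses for a hostname.

    Returns an empty list when the name cannot be resolved.
    `lookup` is a hypothetical helper written for illustration.
    """
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Resolution failed: unknown name, unreachable resolver, and so on.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in results})
```

Calling lookup("google.com") on a connected machine would return that site's current addresses. During an outage like Monday's, a call for an affected name would be expected to come back empty.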
Symptom of the Problem: Unreachable DNS
According to WIRED, DNS mistakes are common and are generally the reason sites go down. But since such mistakes are considered easy to resolve, something murkier seems to have happened with Facebook.
In Facebook’s case, “It was as if someone had ‘pulled the cables’ from their data centres all at once and disconnected them from the Internet,” Cloudflare said.
Troy Mursch, chief research officer of cyber threat intelligence company Bad Packets, was reported as saying, “Facebook's outage appears to be caused by DNS; however that's just a symptom of the problem.”
On Monday, Facebook made Border Gateway Protocol (BGP) updates, or routing changes, after which its routes were withdrawn, taking Facebook’s DNS servers offline.
But How Does DNS Work?
Cloudflare explains that when someone types a website's URL in the browser (like https://facebook.com), the DNS resolver, which is responsible for translating domain names into the IP addresses to connect to, "tries to grab the answer from the domain nameservers, typically hosted by the entity that owns it."
If, however, the nameservers are unreachable or fail to respond for some other reason, a SERVFAIL error is returned.
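The behaviour described above can be sketched as a toy model in Python. This is a simplification under stated assumptions, not how a production resolver works: real resolvers also cache answers, retry, and apply timeouts. The Nameserver class, the resolve function, and the name facebook.example are hypothetical, and 192.0.2.10 is an address reserved for documentation.

```python
from dataclasses import dataclass, field


@dataclass
class Nameserver:
    """A toy authoritative nameserver holding name -> IP records."""
    records: dict = field(default_factory=dict)
    reachable: bool = True

    def query(self, name):
        if not self.reachable:
            raise ConnectionError("nameserver unreachable")
        return self.records.get(name)


def resolve(name, nameservers):
    """Toy resolver: ask each authoritative nameserver in turn.

    Returns ("NOERROR", ip), ("NXDOMAIN", None) or ("SERVFAIL", None).
    """
    for ns in nameservers:
        try:
            ip = ns.query(name)
        except ConnectionError:
            continue  # this server did not answer; try the next one
        if ip is None:
            return ("NXDOMAIN", None)  # a server answered: no such name
        return ("NOERROR", ip)
    # None of the nameservers could be reached.
    return ("SERVFAIL", None)


# Hypothetical setup: two nameservers serving the same record.
servers = [
    Nameserver({"facebook.example": "192.0.2.10"}),
    Nameserver({"facebook.example": "192.0.2.10"}),
]
before = resolve("facebook.example", servers)

# Simulate every nameserver dropping off the network at once.
for ns in servers:
    ns.reachable = False
after = resolve("facebook.example", servers)
```

In this sketch the record itself never changes. The resolver fails only because no server that holds the record can be reached, which is the distinction between a missing name and a SERVFAIL.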
Falling Off the Internet's Map: How Does BGP Come Into the Picture?
If DNS is the internet's phonebook, then BGP is its navigation system: it determines the route that data takes to reach a given IP address, including the IP addresses of Facebook's DNS nameservers.
The Internet is literally a network of networks, and it’s bound together by BGP. Basically, without BGP, the Internet routers wouldn't know what to do, and the Internet would stop working altogether.
So, in Facebook's case, it was as if the company had fallen off the internet’s map. And if you tried to reach its IP addresses, “the packets ended up in a black hole,” WIRED quoted Mursch as saying.
John Graham-Cumming, CTO of Cloudflare, said much the same: “It appears that Facebook has done something to their routers, the ones that connect the Facebook network to the rest of the internet.”
During the outage, without divulging details, Facebook said in a tweet that it was "working to get things back to normal as quickly as possible."
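The effect of a route withdrawal can be illustrated with a small Python sketch of a routing table. It is only a model of the outcome, not of BGP itself, which also involves sessions between routers, path attributes, and routing policy. The RoutingTable class is hypothetical, and the prefix 192.0.2.0/24 and the label AS64500 come from ranges reserved for documentation rather than from Facebook's real network.

```python
import ipaddress


class RoutingTable:
    """Toy routing table of prefixes announced by neighbouring networks."""

    def __init__(self):
        self.routes = {}  # ip_network -> next-hop label

    def announce(self, prefix, next_hop):
        """Record that `prefix` is reachable via `next_hop`."""
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        """Forget a previously announced prefix."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def next_hop(self, address):
        """Longest-prefix match; None means there is no route."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # nowhere to send the packet, so it is dropped
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", "AS64500")
hop_before = table.next_hop("192.0.2.53")

# The network withdraws its announcement.
table.withdraw("192.0.2.0/24")
hop_after = table.next_hop("192.0.2.53")
```

Before the withdrawal the lookup finds a next hop. Afterwards it finds none, even though the machine at that address may still be running, which is why services can vanish from the outside while remaining intact on the inside.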
Later, Facebook explained the cause of the problem saying, "Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt."
Impact on Third-party Sites?
Because Facebook's BGP routes were withdrawn, leaving its DNS servers inaccessible, its services could not be reached from third-party sites either.
Moreover, it’s considered probable that since the company’s internal network couldn’t reach the outside internet, its employees couldn’t work. Imagine having left the car keys inside a locked car.
Meanwhile, DNS resolvers like Cloudflare's, the services that convert domain names into IP addresses, saw double the usual traffic as people kept trying to reload the social media apps.
Graham-Cumming added, “It’s not so much the dramatic story of the whole internet could fall over, or some nonsense like that. It’s more that it’s an interconnected system and it stays up partly because of technical things and partly because of people who keep an eye on it day and night”, WIRED reported.
Facebook’s Loss, Twitter’s Gain
With Facebook, Instagram, and WhatsApp down, traffic on Twitter and the Signal app increased massively.
Meanwhile, Twitter reacted to the massive increase in its active users with a tweet saying "Hello literally everyone!", sparking off a wild Twitter thread that saw participation from several major brands and celebrities.
All this was happening as Zuckerberg's personal wealth fell by more than $6 billion because of the outage, a report by Bloomberg said.
The social media giant's stock also fell 4.9 percent on Monday, taking Zuckerberg's net worth down to $121.6 billion. He is down from nearly $140 billion in a matter of weeks.
Services of Facebook, Instagram, and WhatsApp started coming back online on Tuesday morning and have been restored for most users.
(With inputs from WIRED, The New York Times, and Cloudflare.)