Facebook And Related Sites Go Down, Chaos Ensues

Facebook, along with WhatsApp, Instagram, and Facebook Messenger, has gone offline, and it appears to be more than just a glitch.

Andrew Donaldson October 4, 2021

Washington Post:

Facebook and many apps in its suite of social media and chat services went dark for hours Monday in a widespread outage that appeared to affect users globally.

Facebook, Instagram, WhatsApp and Messenger were unreachable for many users, who instead saw a spinning wheel on their apps that never loaded.

Facebook’s internal communication platform, Workplace, went down altogether, said a person familiar with the matter who spoke on condition of anonymity because they weren’t authorized to speak publicly. As employees turned to third-party tools such as Slack, many found themselves locked out of even those, because Facebook’s mechanism for signing on to them was not working, said another person familiar with the matter who spoke under the same conditions.

Facebook spokesman Andy Stone tweeted that the company was aware of the issues and was “working to get things back to normal as quickly as possible, and we apologize for any inconvenience.”

Reports on Downdetector suggest users across the United States, in Egypt, in Serbia and many other places were impacted. The issues began at about 11:39 a.m. Eastern time.

“Something happened internally at Facebook that messed with their network settings on how Facebook talks to the rest of the world and accesses the Internet,” said Courtney Nash, senior research analyst at security company Verica.

The issue seems to be with Facebook’s border gateway protocol routes, or paths that allow routers to exchange information, said Doug Madory, director of Internet analysis for Kentik, a network monitoring company. Madory calls them the “underpinnings of how the Internet operates.”

Brian Krebs offers a preliminary explainer:

Doug Madory is director of internet analysis at Kentik, a San Francisco-based network monitoring company. Madory said at approximately 11:39 a.m. ET today (15:39 UTC), someone at Facebook caused an update to be made to the company’s Border Gateway Protocol (BGP) records. BGP is a mechanism by which Internet service providers of the world share information about which providers are responsible for routing Internet traffic to which specific groups of Internet addresses.

In simpler terms, sometime this morning Facebook took away the map telling the world’s computers how to find its various online properties. As a result, when one types Facebook.com into a web browser, the browser has no idea where to find Facebook.com, and so returns an error page.

In addition to stranding billions of users, the Facebook outage also has stranded its employees from communicating with one another using their internal Facebook tools. That’s because Facebook’s email and tools are all managed in house and via the same domains that are now stranded.

“Not only are Facebook’s services and apps down for the public, its internal tools and communications platforms, including Workplace, are out as well,” New York Times tech reporter Ryan Mac tweeted. “No one can do any work. Several people I’ve talked to said this is the equivalent of a ‘snow day’ at the company.”

The mass outage comes just hours after CBS’s 60 Minutes aired a much-anticipated interview with Frances Haugen, the Facebook whistleblower who recently leaked a number of internal Facebook investigations showing the company knew its products were causing mass harm, and that it prioritized profits over taking bolder steps to curtail abuse on its platform — including disinformation and hate speech.

We don’t know how or why the outages persist at Facebook and its other properties, but the changes had to have come from inside the company, as Facebook manages those records internally. Whether the changes were made maliciously or by accident is anyone’s guess at this point.
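
Krebs's "map" analogy can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the `ToyRouter` class, the `resolve` helper, the hostname `social.example`, and the `192.0.2.0/24` documentation prefix are all hypothetical and say nothing about how Facebook's network is actually built. It shows why withdrawing a route can make a name unresolvable even though the servers behind it are still running: if the nameserver's own address falls inside the withdrawn prefix, nobody can reach it to ask.

```python
import ipaddress


class ToyRouter:
    """Holds announced prefixes and answers longest-prefix-match lookups."""

    def __init__(self):
        self.routes = {}  # ip_network -> next-hop label

    def announce(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # Removing the entry is the "taking away the map" step.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination is unreachable
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


def resolve(router, nameserver_ip, records, hostname):
    """Pretend DNS lookup that fails when the nameserver is unreachable."""
    if router.lookup(nameserver_ip) is None:
        return None
    return records.get(hostname)


router = ToyRouter()
router.announce("192.0.2.0/24", "example-network")
records = {"social.example": "192.0.2.80"}

print(resolve(router, "192.0.2.53", records, "social.example"))  # 192.0.2.80
router.withdraw("192.0.2.0/24")
print(resolve(router, "192.0.2.53", records, "social.example"))  # None
```

In this toy version the records never change; only the path to the server holding them disappears, which is roughly the situation the quoted explainer describes.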

Obviously, this is a developing story and probably one that will have some twists and turns as information comes out. Until then, second look at MySpace?

35 thoughts on “Facebook And Related Sites Go Down, Chaos Ensues”

Michael Cain says:

October 4, 2021 at 5:27 pm

Come now. Haven’t we all inadvertently crushed the corporate network at least once?
1. Marchmaine in reply to Michael Cain says:
  
  October 4, 2021 at 6:21 pm
  
  I definitely did *not* click that link.
  1. Ozzzy! in reply to Marchmaine says:
    
    October 4, 2021 at 9:02 pm
    
    I reply’d all. How often do you get the chance?
  2. Oscar Gordon in reply to Marchmaine says:
    
    October 5, 2021 at 2:09 pm
    
    Oh man, FB hit with ransomware – that would be funny.
2. JS in reply to Michael Cain says:
  
  October 4, 2021 at 6:34 pm
  
  I’d be surprised if that wasn’t the case.
  
  I once attended an 8:00am Monday morning emergency meeting called “Do not mount the user folder to /tmp”. This is because, of course, every Sunday night a garbage collection routine would empty out /tmp, as it’s just there to hold temporary files, and if it wasn’t cleaned regularly it’d bloat up.
  
  Someone, who wasn’t at the meeting because he was in the middle of a flight overseas (but was going to arrive to a TON of angry emails), had, in fact, mounted /user to /tmp, and then didn’t dismount it. And then hopped a flight to Europe for a work trip.
  
  So around 2:00am, the garbage collector cleaned out /tmp. And also /user, the entire user partition with all the useful data.
  
  I wasn’t thrilled to be there at 8:00am for a problem I had nothing to do with, but it was called by the angry people who had gotten woken up at 2:00am as numerous critical processes started failing and automated “OH CRAP” alerts had fired off.
  
  It took them about 12 hours to get everything back up.
  1. Brandon Berg in reply to JS says:
    
    October 5, 2021 at 4:51 am
    
    Better to be there at 8 AM for a problem someone else caused than not to be there at 8 AM for a problem you caused.
  2. Oscar Gordon in reply to JS says:
    
    October 5, 2021 at 2:11 pm
    
    I bet that garbage collector routine now looks for /user and dismounts it before going to work.
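
A guard along the lines suggested here might look something like the following rough sketch. It is not the routine from the story, which is never shown; the function name `clean_tmp` and its return value are made up for illustration. It also swaps in a slightly different technique: rather than unmounting anything, it simply refuses to descend into any directory that is a mount point, using the standard library's `os.path.ismount`. The `is_mount` parameter is an assumption added so the check can be replaced in a test.

```python
import os


def clean_tmp(root, is_mount=os.path.ismount):
    """Delete files under root, but never descend into a mounted directory.

    Returns (removed_files, skipped_mount_points).
    """
    removed, skipped = [], []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        # Iterate over a copy so entries can be pruned from dirnames in place.
        for name in list(dirnames):
            path = os.path.join(dirpath, name)
            if is_mount(path):
                skipped.append(path)
                dirnames.remove(name)  # os.walk will not enter this directory
        for name in filenames:
            path = os.path.join(dirpath, name)
            os.remove(path)
            removed.append(path)
    return removed, skipped
```

Called as `clean_tmp("/tmp")`, a routine like this would leave a filesystem mounted under /tmp untouched and report it in the second list, which gives the Monday morning meeting something to read instead of something to restore.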
Burt Likko says:

October 4, 2021 at 6:19 pm

Well. I guess we’ll have to go back to using MySpace, Friendster, GeoCities, or even (horrors!) blogs!
Jaybird says:

October 4, 2021 at 6:37 pm

Former vice-presidential candidate points out that this looks exactly like an op:

FB, WhatsApp and IG go down the same day that a whistleblower is advocating for more government regulation of social media.

This is more than an outage.

This all looks and feels similar to the kind of destabilization the US government routinely does to opposing states.

— Spike Cohen (@RealSpikeCohen) October 4, 2021

1. Saul Degraw in reply to Jaybird says:
  
  October 4, 2021 at 6:42 pm
  
  Current nutbar posts unsubstantiated conspiracy theory without evidence to twitter because it is a bigger soapbox than the Kinko’s copier at 3:00 a.m. Wishthinker reposts to just ask questions. In other news, Franco is still dead.
  
  Yes, FB has had a lot of bad press, and the whistleblower interview last night was horrible for them, but the idea that someone did something idiotic engineering wise is still a better explanation for the shutdown.
  1. Jaybird in reply to Saul Degraw says:
    
    October 4, 2021 at 6:50 pm
    
    My theory is that they got hacked. *BAD*.
    
    The “something idiotic engineering wise” involved ignoring security best practices and they got hit by somebody or a group of somebodies like DarkSide.
    1. Saul Degraw in reply to Jaybird says:
      
      October 4, 2021 at 7:04 pm
      
      Brian Krebs makes the most sensible argument. Someone made a silly mistake; even engineers are capable of this. The timing of the event is letting people fly their rejected Shadowrun games in the open.
      1. Jaybird in reply to Saul Degraw says:
        
        October 4, 2021 at 7:22 pm
        
        Shadowrun? Where the hell did that come from?
        
        Wait, have you been playing a Shadowrun game? Without telling us?
        
        Saul Degraw in reply to Jaybird says:
        
        October 4, 2021 at 7:32 pm
        
        The simplest answer to what happened is that an engineer made a silly mistake and it took everything down for a few hours. The timing of the event is just a damned coincidence. It is not the sign of some cyberpunk dystopia adventure. A good chunk of humanity seems to hate the idea of coincidence though and is letting their freak flags fly. It isn’t harmless fun, it is conspiratorial nonsense. Spike Cohen has no authority just because he was a former candidate for VEEP.
        
        Doctor Jay in reply to Jaybird says:
        
        October 4, 2021 at 7:45 pm
        
        I’m with Jaybird on this, Saul. I had no idea until this moment that you even knew what Shadowrun was.
        
        And I think it’s awesome that you do, by the way. Also, that was an excellent use of it in conversation.
      2. JS in reply to Saul Degraw says:
        
        October 5, 2021 at 11:20 am
        
        “The timing of the event is letting people fly their rejected Shadowrun games in the open.”
        
        Like the troll with tailored pheromones, strength mods, skeletal mods, a combat computer, and a truly ridiculous amount of bio-ware due to blatant rules abuse?
        
        He was a truly loved and incredibly likeable chap, mostly due to the pheromones, and quite capable of turning you to a fine chunky mist. And I was definitely not allowed to use him in campaigns because of “rules” and “balance” and “You can’t go around using a crew mounted weapon as a hand-gun when you can’t solve problems by mind-whamming people with your pheromones”…
        
        Poor Hugbear. Strangled by the DM before he really got to fly free.
        
        North in reply to JS says:
        
        October 5, 2021 at 2:12 pm
        
        Maybe it’s just the tailored pheromones but I love, love, LOVE this comment! RIP Hugbear!
        
        Saul Degraw in reply to JS says:
        
        October 5, 2021 at 2:29 pm
        
        It was more about the inability of people to take coincidence as a thing and let their freak flags fly.
North says:

October 4, 2021 at 7:03 pm

Pity it wasn’t twitter, or better yet both.
1. Saul Degraw in reply to North says:
  
  October 4, 2021 at 7:12 pm
  
  The thing is that it wasn’t just FB, it was Instagram and WhatsApp as well. WhatsApp is actually a very useful communication tool used by billions of people across the world to maintain contact with friends and family at home or abroad. My partner uses it to call her family and friends in Singapore. A lot of small businesses depend on Instagram for sales and advertising.
  1. North in reply to Saul Degraw says:
    
    October 4, 2021 at 7:25 pm
    
    *nods* Yes, it’s serious and troubling for those businesses and people and I feel bad for them.
    … … …
    Pity it wasn’t twitter, or better yet both.
Doctor Jay says:

October 4, 2021 at 7:48 pm

Count me among those who think it is utterly unrelated to whistleblower stuff, just a wild coincidence. I have read the “misconfigured BGP records” elsewhere, and I believe it. The added tidbit was that it was definitely FB’s records misconfigured.
Jaybird says:

October 4, 2021 at 8:04 pm

I do not know if this is the official explanation, but it is a good one and fits the “dumber than I imagined” requirement:

a bunch of friends have texted me asking for a basic explanation as to what the hell happened to knock off all of Facebook so:

— alex hern (@alexhern) October 4, 2021

1. Mike Schilling in reply to Jaybird says:
  
  October 4, 2021 at 8:36 pm
  
  If a mouse deletes all of his cookies …
2. veronica d in reply to Jaybird says:
  
  October 4, 2021 at 11:07 pm
  
  Good grief.
  
  For context, BGP is the public internet routing protocol. Its purpose is so that separate companies and internet providers can locate their respective networks. It was never meant to route traffic within an organization.
  
  In other words, if you use Xfinity and want to send internet packets to someone connected to AT&T, Xfinity will use BGP to figure out how to route the packets over the public internet. By contrast, if sending a packet to another Xfinity customer, or an Xfinity server, then no BGP is needed.
  
  If FB managed to lock out their internal systems because of a BGP failure — oof.
  
  I’m actually unfamiliar with how BGP interacts with DNS. I wonder if that is a newish feature.
  1. Mike Schilling in reply to veronica d says:
    
    October 5, 2021 at 1:02 am
    
    Here is a post from Cloudflare describing how the outage affected them (to begin with, it made them wonder if their DNS server was broken). https://blog.cloudflare.com/october-2021-facebook-outage/
  2. Mike Schilling in reply to veronica d says:
    
    October 5, 2021 at 1:11 am
    
    Theory: The C++ program that creates and maintains the BGP routing tables usually gets killed via SIGTERM, but someone added a signal handler to let it exit normally, not realizing what would happen when all the destructors ran.
    1. Susara Blommetjie in reply to Mike Schilling says:
      
      October 5, 2021 at 5:02 pm
      
      I’m coding in Python now. You’re making me nostalgic…
      1. Michael Cain in reply to Susara Blommetjie says:
        
        October 5, 2021 at 5:43 pm
        
        I have a love-hate relationship with Python. I’ve written several small things with it, and I get something reasonable up and running as quickly as, and usually more quickly than, in any other language I’ve used. Then I try something more complicated and something bites me: scoping, some bizarre library interface, something.
        
        Mike Schilling in reply to Michael Cain says:
        
        October 6, 2021 at 7:18 pm
        
        The fact that Python doesn’t catch most typos until it stumbles over them makes it highly unproductive for me.
        
        Michael Cain in reply to Mike Schilling says:
        
        October 7, 2021 at 1:55 pm
        
        I include that under the broad topic of scoping.
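
The typo complaint in this exchange is easy to reproduce. In the small sketch below (the `summarize` function is invented purely for illustration), a misspelled variable sits in a branch that is rarely taken. Python resolves names at run time, so the function works until that branch actually executes, at which point it raises `NameError`.

```python
def summarize(values, verbose=False):
    """Return a short summary string for a list of numbers."""
    total = sum(values)
    if verbose:
        # Deliberate typo: 'totl' is never defined. Nothing complains
        # until this line actually runs.
        return f"count={len(values)} total={totl}"
    return f"total={total}"


print(summarize([1, 2, 3]))  # total=6

try:
    summarize([1, 2, 3], verbose=True)
except NameError as err:
    print("only caught at run time:", err)
```

A linter or static type checker run before execution would normally flag the undefined name, which is the usual answer to this objection.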
Chip Daniels says:

October 4, 2021 at 8:07 pm

I heard that the Post-it with the password to their server was inadvertently thrown out.
fillyjonk says:

October 5, 2021 at 8:12 am

I feel like this is a “never ascribe to malice that which can be explained by stupidity” moment. The conspiracies were fun for a while but it looks like someone just did something really boneheaded.

But yeah, this raises questions about the integration of everything and how few platforms run it. Can you *imagine* if something like smart-home tech went down for 8 hours and no one could adjust their thermostats or turn on lights?
1. PHilip H in reply to fillyjonk says:
  
  October 5, 2021 at 8:16 am
  
  One of the many many reasons I have refused to be part of the Internet of Things. The very slight bump in convenience does not outweigh the huge vulnerabilities.
2. North in reply to fillyjonk says:
  
  October 5, 2021 at 2:13 pm
  
  Yeah, there’s no way in heck I’d ever put most “stuff” on the internet. Fish that.