Roblox, an online games platform with over 200 million monthly players that is used by two-thirds of kids aged 9 to 12 in the US, was down for three days before returning to normal operations at 4.45 pm PST on October 31.
Beginning on October 28 at 4 pm PST, Roblox was hit by a server disruption that its status page described as an “internal system issue”. Many players blamed it on the popularity of a giveaway called the Chipotle Boorito Maze, in which players could earn daily items by navigating a maze and get a free burrito by dressing up in “a Chipotle-inspired costume”. Roblox tried to tamp down this theory, tweeting that “this outage was not related to any specific experiences or partnerships on the platform.”
From 12.50 pm PST on October 31, Roblox began allowing some users to connect, writing, “Traffic is being allowed incrementally. Some, but not all players will have access.” The outage officially ended at 4.45 pm, when normal service resumed and parents rejoiced.
A blog post by David Baszucki, Roblox founder and CEO, explained that the outage was caused by several factors in combination. “A core system in our infrastructure became overwhelmed,” he wrote, “prompted by a subtle bug in our backend service communications while under heavy load. This was not due to any peak in external traffic or any particular experience. Rather the failure was caused by the growth in the number of servers in our datacenters. The result was that most services at Roblox were unable to effectively communicate and deploy.”
More information on the cause will follow. “We will publish a post-mortem with more details once we’ve completed our analysis,” Baszucki continued, “along with the actions we’ll be taking to avoid such issues in the future. In addition, we will implement a policy to make our creator community economically whole as a result of this outage.”