Microsoft’s Azure cloud service has been experiencing significant outages overseas, leading to a torrent of angry tweets from customers there. The problems started in western and northern Europe and spread to India later in the morning. As of 10:02 a.m. Pacific Time, some customers were tweeting that the problems had been resolved, though no resolution was reflected on Microsoft’s blogs or on a status-reporting site.
Today’s issues began at roughly 8:10 a.m. (all times are Pacific Time). For some, they called into question whether the purported redundancy of having multiple regions really offers effective protection against outages.
At about 8:50 a.m., one customer observed, “Lots of problems in West EU/North EU @Azure :-(. Not very good if you split your Azure redundancy between the 2 regions…” At 9:09 a.m., another wrote, “Azure in West and North Europe is broken. Time to shovel everything into the shiny UK?”
Reliable uptime is one of the most highly valued aspects of cloud computing, and glitches like these can hurt a provider competing in the hot market for cloud services. Microsoft is number two in the market and battling with hugely dominant Amazon Web Services for a bigger slice of the pie. AWS has not been untouched by outage issues itself.
Azure’s problems may stretch back to yesterday’s issues with my.visualstudio.com, detailed in Microsoft’s Developer Service Blog. By 6 p.m., they had been “completely mitigated,” the blog said.
But at 8:16 a.m. today, another post on that blog said Microsoft was “actively investigating” issues with Visual Studio Team Services in West Europe. It warned that “customers may experience service interruptions,” said there was no workaround and promised an update before 9 a.m.
Separately, the Azure status blog this morning showed that North and West Europe regions had suffered “intermittent issues” loading site-recovery tiles in the Azure management portal between Sept. 7 at 5 a.m. and Sept. 8 at 2:37 p.m. The post suggested the issues had been resolved.
But as of 8:47 a.m. today, the site IsTheCloudUp.com showed problems with SHD Banner East US 2, SHD Banner North Europe and SHD Banner West Europe. “SHD” is the Service Health Dashboard, a service that lets Azure subscribers monitor the availability of Azure services. So saying SHD Banner is having problems doesn’t say anything about the nature of those problems.
Only a few minutes later, at 9 a.m., IsTheCloudUp.com showed the outages had widened. For the North and West regions of Europe, it showed outages in App Service, Cloud Services, SQL Database and Virtual Machines, along with Visual Studio Team Services in West Europe. By 9:36 a.m., the problems had broadened further, extending to App Service, Cloud Services, Virtual Machines and SQL Database in West and North Europe; Automation, Azure Search, Data Catalog, HDInsight, Media Services, Redis Cache, Service Bus and Visual Studio Team Services in West Europe; and Service Bus in West India.
The North Europe region is based in Ireland. The West Europe region is based in the Netherlands. The West India region is based in Mumbai. Microsoft doesn’t reveal how many data centers are located within each region. The company just opened two regions in the U.K., which don’t seem to have been affected by the troubles.
The first tweets about today’s problems began at about 8:10 a.m., indicating that Visual Studio Team Services (formerly Visual Studio Online) was down. One user wrote that he couldn’t connect to one of his servers in West Europe. Another said he couldn’t connect with his West Europe SQL database service. “All databases are down,” wrote another. “DNS seems to blowing up,” wrote yet another. One griped, “Just love the fact that I have the benefit of paying £187pm to get support when @Azure goes down, only in this world is that OK!!!”
Microsoft didn’t immediately respond to a request for comment. We’ll update this story throughout the day.
UPDATE, 10:55 a.m.: In an email, Microsoft commented, “Some customers with resources deployed in West and North Europe may be experiencing difficulties with some services, and our team is in the process of resolving this. In the meantime, customers can get updates on the Microsoft Azure Service Health Dashboard.”
UPDATE, 11:14 a.m.: One Azure-status website is reporting all services are up and running in all regions. IsTheCloudUp reports the situation is unchanged except that the India outage has been remedied, and Microsoft’s own Azure-status site is still reporting numerous outages. “@ up & running now, but clearly we will consider alternatives to @,” one customer tweeted at 11:11 a.m.
UPDATE, 11:57 a.m.: All monitoring sites agree all services have been restored except for Visual Studio Team Services in West Europe. But Microsoft remained cautious: “At this point, users can get to their VS Team Services accounts in West Europe and perform regular operations, but custom extensions will continue to face issues. Our engineers are working hard to completely resolve this issue,” according to the Microsoft Developer blog.
UPDATE, 2:10 p.m.: All European issues have been resolved. But now problems have arisen in the South Central U.S. and East U.S. Microsoft on its blog wrote, “Starting at 15:00 UTC on 09 Sep, 2016 [8 a.m. PT], customers using Visual Studio Team Services \ Build & Deployment/Build (XAML) in South Central US and East US will experience longer than usual build queue times. Engineers are investigating for a potential underlying root cause and are implementing mitigation steps.”
UPDATE, 10:00 p.m.: All issues are now resolved, and may have been for some time — the blog doesn’t say when resolution was achieved.