What are Meltdown and Spectre? GeekWire's guide to the problems with the world's computer chips

Meltdown and Spectre. You’ve probably been hearing these terms a lot. These are vulnerabilities discovered in many of the computer processors used not only in our computers and devices but also in data centers and in the cloud.

So what are they, what do they mean for the tech industry, and what should you do about them? Tom Krazit, GeekWire’s cloud and enterprise editor, offers a primer on this special edition of the GeekWire Podcast.

Listen below and continue reading for an edited summary.

What are Meltdown and Spectre?

These are two vulnerabilities that were discovered last June by researchers at Google and at several universities. It’s a very complicated system, but basically, someone figured out how to take advantage of the way that a modern processor executes instructions, to be able to access protected areas of a chip in which things like passwords or other secure information could be stored. With Meltdown that’s what it would have allowed you to do. If you managed to get an app running on someone’s computer, you could theoretically get access to some of their passwords. Meltdown is patchable and it seems like that’s actually going to be something that will blow over fairly quickly once everyone updates their systems. Spectre is a little bit different. Spectre is a similar type of vulnerability, but there’s no comprehensive patch for Spectre. There are ways to make it harder to implement, or ways to make its effects less damaging, but the only real cure will be to redesign processors to get around this vulnerability, and that will take years to do.

What happens in the meantime, particularly with Spectre, if there’s no immediate patch?

There are patches that people are employing that can just make it harder to exploit. The thing about Spectre is it’s an extremely novel approach to getting access to levels of a computer that regular applications aren’t supposed to be able to access. With it being that novel, it would take someone with a lot of expertise to be able to successfully do this. We’ve seen some proof-of-concept attacks that have come out this week. But proofs of concept and actual attacks are quite different, and it seems like a lot of security researchers are fairly confident that any short-term effect from Spectre can be handled by these patches. But the longer these things sit out there, the more criminals will be trying to figure out ways to actually implement and take advantage of this vulnerability, and that longer-term arms race is really just beginning.

They discovered this back in June. Why is it only coming out now? Why is this a surprise to many in the industry?

This is actually a pretty common way to approach major vulnerabilities among the big tech companies of the world. If the Google researchers or the university researchers had just published their findings in June without really talking to anybody about them, it would have been much easier for those bent on exploiting those vulnerabilities for criminal intent to take action. Patches wouldn’t be available without notifying the companies that run a lot of these servers as well as the big cloud providers. You would have left a lot of people open to attack. It’s actually the responsible thing to do — to disclose this to Intel and server vendors and cloud vendors under some sort of NDA so that everyone can work together on a patch for the solution.

The problem that happened this week is that it started to leak out. When you’re patching Linux, which is an open-source project, the patches are available and people start to figure out what’s going on. So the rollout could have gone a little smoother. I think everyone involved would probably agree with that, but the six-month delay actually gives companies like Amazon Web Services and Microsoft time to patch their systems to be able to handle this new reality. In fact Amazon this week said that it had patched a large percentage of its systems that might have been affected by this bug before it was even disclosed, and that the remaining ones would be completed this week. Microsoft Azure has been on a similar pace, too. They should have everything wrapped up by early next week. Google patched its cloud servers immediately because it discovered this stuff six months ago.

What should people do?

There are patches for both. It’s just that people have a lot more confidence in the patch for Meltdown. The important thing to remember about Meltdown, for anybody, is that you should just patch your computers, update your software and you will most likely be fine whether you’re an individual laptop user or somebody responsible for a major data center.

Spectre is going to be a little bit different. I think the average computer user has little to worry about with respect to Spectre. It’s not going to be used for mass consumer attacks, but if you’re working at a cloud vendor, if you’re working at a data center with a lot of sensitive information, Spectre’s something you’re going to have to watch for years.

Have we ever seen anything on this scale before in processors?

There was a major actual bug in an Intel chip, a hardware bug, about 20 years ago that forced Intel to do a massive recall, and I think it wound up costing them nearly a billion dollars. That was a big, big deal. In this case, Intel was very, very careful to describe what this was. Their stance is that this is not an error. The chip is working as designed. Basically somebody came up with a technique that nobody had ever thought of before, and it took a lot of the industry by surprise because the feature being exploited is a pretty common part of modern processor design, one that chips rely on just to keep up with the demands of modern applications. Everyone involved had just sort of rolled along for years assuming that the way they had implemented this in the chip was keeping the systems totally secure, and it turns out it wasn’t.

No one is totally sure whether or not anyone had actually exploited a system using this technique prior to last June. There’s no real way to know that right now. But operating systems were designed under the assumption that the chip was responsible for handling this piece of security. And it turns out that the chip is incapable of handling it the way that everyone thought, which requires the fix to be implemented in software, and which is why people were worried that it would be a performance hit on some of these systems, because now you’re adding additional code to the operating system and that can, in some cases, take longer to run.

Is there a case to be made here that Intel is too powerful, too strong, and therefore too vulnerable?

People have been making that case about Intel for 20 years. I do think that we’re at an interesting time, especially in the server data center market, where AMD just released a new server chip that looks to be pretty good on a performance basis. ARM, which designs cores for chips, finally has a server processor design out that a lot of people think will make some gains. Intel has something like 95 percent of the data center market, so they’re pretty dominant in that market, and any share that they lose won’t affect the bottom line too much. It will be very interesting to see if this incident opens windows for competitors that wouldn’t have necessarily seen a reason for people to buy them. Now, if you’re buying data center servers, if you’re buying servers for your own stuff, maybe you look at AMD, maybe you look at ARM just because you’re a little worried about Intel’s execution on these matters. That will take years to play out. And we don’t exactly know what will happen.

It seems similar to the era when we had all these exploits against Microsoft Windows in the headlines and people were looking at Macs on their desktop or maybe even Linux as alternatives, potentially more secure, even though there is something of a fallacy in that logic. Do you see parallels there?

The complexity involved here and the difficulty in actually exploiting a system using this vulnerability makes this a little bit different. A lot of the flaws in Windows XP and products like that were easily exploitable. You just get somebody to click on a Word doc and suddenly you own their system. This is a much more complex technique that I don’t think will have the same kind of widespread impact, especially if everybody patches. What it is, really, is a failure of imagination on the part of Intel and other processor makers. We do have to make that clear with respect to Spectre. It was a failure of imagination to understand that a novel technique for improving performance could actually be used against the system. That’s going to require changes, and Intel has admitted as much — that they will be designing processors differently in the future to account for this.

For somebody who knows very little about actual chip design, what is the technique for exploiting these vulnerabilities? Is there a way to explain it in a way that someone not well-versed in the ways of this industry sector can understand?

When a processor executes a string of instructions, when you tell the computer to do something, the operating system sends a signal down to the processor that says, ‘Hey we need to do this now,’ and the processor says, OK, and does that, and off you go. This all happens, of course, in billionths of a second. Modern processors, in order to improve performance, use a technique called speculative execution. With a lot of the stuff you do in your computer, you do the same thing all the time. You switch between applications or you launch an application. A significant portion of what you do with a computer, you do over and over and over again. Processor designers have figured out that they can guess what you’re going to do and start running the instructions for that activity ahead of time, because they know that pretty soon you’re probably going to do that thing again.
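For readers who want to see the guessing part in miniature, here is a rough sketch in Python of a textbook "two-bit" branch predictor, the kind of simple scheme often used to teach how a chip bets on what happens next. It is an illustration only, not how any particular Intel, AMD or ARM chip works, and the names `TwoBitPredictor` and `count_correct` are made up for this example.

```python
class TwoBitPredictor:
    """Toy two-bit saturating counter (a teaching model, not a real chip design).

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """

    def __init__(self):
        self.state = 0  # start out strongly predicting "not taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Nudge the counter toward what actually happened, staying within 0..3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_correct(outcomes):
    """Return how many outcomes the predictor guessed right, learning as it goes."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


# The same thing happens nine times in a row, then something different.
history = [True] * 9 + [False]
print(count_correct(history), "of", len(history))  # prints "7 of 10"
```

The predictor misses the first two times while it learns the habit, then guesses right until the pattern finally changes. Because the guesses are usually right, running ahead on them pays off.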

Basically somebody figured out how to trick that speculative execution into running instructions that touch protected data, leaving traces in the processor’s memory cache that can be read back afterward. A lot of the people I’ve seen talking this week are dazzled by the expertise of the Google researchers and others who’ve figured out how to do this because it’s just such a novel technique. A lot of people are saying, wow, that is some serious computer science nerdery right there.
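To make the idea a bit more concrete, here is a toy simulation in Python of the general shape of one publicly described Spectre variant, often called a bounds check bypass. It is a sketch under heavy assumptions, not a working exploit: nothing here touches real hardware, `ToyCPU`, `victim_read`, `leak_byte` and `leak_secret` are hypothetical names, the "cache" is just a Python set, and `is_cached` stands in for what a real attacker would do by timing memory accesses. The model also speculates on every call, whereas a real attack first has to train the processor to guess wrong.

```python
SECRET = b"hunter2"        # hypothetical data the attacker must not read
PUBLIC = b"hello"          # data the attacker is allowed to read
MEMORY = PUBLIC + SECRET   # in this toy, the secret sits right after the public data


class ToyCPU:
    """Toy model: speculation is rolled back, but its cache footprint is not."""

    def __init__(self):
        self.cached_lines = set()  # which of 256 probe lines are "in the cache"

    def flush_cache(self):
        self.cached_lines.clear()

    def victim_read(self, index):
        """A bounds-checked read, preceded by a simulated speculative read."""
        # Speculative step: the CPU runs ahead before the bounds check resolves,
        # and the value it reads selects which probe line gets cached.
        speculative_value = MEMORY[index]
        self.cached_lines.add(speculative_value)
        # Architectural step: the bounds check resolves.
        if index < len(PUBLIC):
            return speculative_value
        return None  # the result is thrown away, but the cache trace remains

    def is_cached(self, line):
        # Stand-in for timing an access: a cached line would come back fast.
        return line in self.cached_lines


def leak_byte(cpu, index):
    """Recover one out-of-bounds byte purely from the cache footprint."""
    cpu.flush_cache()
    assert cpu.victim_read(index) is None  # the program itself reveals nothing
    for guess in range(256):
        if cpu.is_cached(guess):
            return guess
    return None


def leak_secret():
    cpu = ToyCPU()
    start = len(PUBLIC)
    return bytes(leak_byte(cpu, start + i) for i in range(len(SECRET)))


print(leak_secret())  # prints b'hunter2'
```

The point of the sketch is that the bounds check does its job and the program never hands back the secret. The leak comes entirely from the side effect that speculation leaves behind, which is why this class of problem is so hard to patch away completely.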

It’s a very complex problem. I think it will require an immense amount of expertise to actually exploit. But as we’ve learned over the past couple of years there are lots of really really competent hackers working on behalf of criminal organizations and foreign governments who are no doubt spending a lot of time right now trying to figure this out.

The names “Meltdown” and “Spectre” are ominous. How much should people really be concerned about this and what are the long-term implications for end users?

Individual device users, just update your software and you’ll be fine. If you manage a large block of data center servers, patching is already part of your life, and you should of course do it in this case. But for some of these people, it’s going to come down to whether the performance hits from the patches are just too annoying to deal with. They’re going to have to make some strategic decisions about what they want to do and how they architect their data centers. The Computer Emergency Readiness Team, which is one of the premier security organizations on the planet, is actually recommending to its followers that the only real way to solve this is to rip out the hardware and put in new chips. I would expect that Amazon, Microsoft and Google have probably already started making plans to do that, accelerating the turnover to new chips as Intel rolls them out, and Intel is expected to release chips that will get around this vulnerability later this year. So it may not actually take all that long. But part of the problem with Spectre is that we don’t really know how long-term a threat it is right now. It’s only been a couple of days since most people learned about it. So security researchers will be hitting on this all year. Maybe toward the middle of the year we’ll have a better picture of what the long-term implications will be.

