Ten years ago, if you wanted to buy the best CPU for gaming or otherwise, you'd have chosen AMD's Athlon 64. My, how times have changed. While AMD has struggled to rekindle its glory days as the CPU-performance leader, Intel's CPUs have gone from strength to strength over the past decade. Today, Intel's CPUs perform best and use the least power, scaling admirably from powerhouse gaming PCs all the way down to thin and light notebooks and tablets–segments that didn't even exist a decade ago. But this return to CPU dominance might never have happened had it not been for the innovations taking place at AMD back in the first half of the 2000s, which makes the company's fall from grace all the more galling.

The 64-bit extensions of AMD's Athlon 64 meant it could run 64-bit operating systems, which could address more than 4GB of RAM, while still being able to run 32-bit games and applications at full speed–all important considerations for PC players at the time. These extensions proved so successful that Intel eventually ended up licensing them for its own compatible x86-64 implementation. Two years after the launch of the Athlon 64, AMD introduced the Athlon 64 X2, the first consumer multicore processor. Its impact on today's CPUs cannot be overstated: everything from huge gaming rigs to tiny mobile phones now uses CPUs with two or more cores. It's a change that even Intel's Gaming Ecosystem Director, Randy Stude, cited when I asked him what had the biggest impact on CPU design over the last decade.
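The 4GB ceiling follows directly from pointer width: a 32-bit address can name 2^32 distinct bytes and no more. As a rough back-of-the-envelope sketch in Python (the function name is made up for illustration, and the 64-bit figure is a theoretical upper bound, since real x86-64 chips wire up fewer than 64 address bits):

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** address_bits


GIB = 2 ** 30  # one gibibyte, in bytes

limit_32 = addressable_bytes(32) // GIB  # the 4GB limit of 32-bit systems
limit_64 = addressable_bytes(64) // GIB  # theoretical ceiling for 64-bit pointers

print(f"32-bit: {limit_32} GiB")
print(f"64-bit: {limit_64} GiB")
```
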


AMD's Athlon 64 kickstarted the 64-bit revolution. Image credit: flickr.com/naukim

"So, the answer to the question is cores," Stude tells me. "I was here at Intel through the Pentium IV days. We hit a heat issue with that part and took a big right turn and introduced a very efficient product out of Israel [the Core CPU] that helped us take over performance leadership that–for the most part–we've enjoyed for the better part of a decade. We've been able to add cores quite efficiently, and that's led to some substantial performance gains for the PC in general."

This focus on cores has dominated the last decade of CPU development. Prior to the introduction of multicore CPUs, the focus was very much on increasing clock speeds. This gave games and applications an instant performance boost, with very little effort required from developers to take advantage of it. Moore's Law–which states that the number of transistors in a dense integrated circuit doubles roughly every two years–was in full swing in the 90s and early 2000s. In the period from 1994 to 1998, CPU clock speeds rose by a massive 300 per cent. However, by the mid 2000s, improvements in both power consumption and clock speed had stalled, with Intel and AMD alike fighting the laws of physics. The solution was to introduce more cores, so that multiple tasks could be executed simultaneously by individual CPUs, thus increasing performance.

The trouble is, unlike increasing clock speed, increasing the number of cores requires developers to change the way their code is written in order to see a performance increase. And, in the case of games development, that's been a slow process.
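To make that concrete, here is a minimal Python sketch of the kind of restructuring involved. It is not drawn from any real game engine, and `update_entity`, `update_serial`, and `update_parallel` are hypothetical names. The serial loop gets faster whenever the clock does; the second version only benefits from extra cores because the work has been split into independent pieces and handed to a pool of workers. One caveat: in standard CPython, threads will not actually speed up CPU-bound work like this, so treat it as an illustration of the change in code shape rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def update_entity(position: float) -> float:
    """Hypothetical per-entity work: advance a position by one step."""
    return position + 1.0


def update_serial(positions: List[float]) -> List[float]:
    # Single-core style: one loop, one thread of execution.
    return [update_entity(p) for p in positions]


def update_parallel(positions: List[float], workers: int = 4) -> List[float]:
    # Multicore style: the independent updates are handed to a worker pool.
    # Executor.map() yields results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update_entity, positions))


positions = [0.0, 10.0, 20.0, 30.0]
print(update_parallel(positions) == update_serial(positions))
```

The split only works because each entity can be updated without looking at the others. Once tasks depend on one another, as they do in a render pipeline, dividing them up becomes much harder.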

Games like Battlefield 4 that make use of multiple CPU cores are still the exception, rather than the rule.

"[Multicore CPUs] have required that the software industry come along with us and understand the notion of threading," says Stude. "For gaming, it's been challenging. Threading on gaming is a much more difficult scenario that both us and AMD have experienced. In general, you've got one massive workload thread for everything, and up until now that's been handled by, let's say, the zero core. The rest of the workload, whatever it might be for a particular game, goes off to the other cores. Today, game engine success is a bit hit and miss. You have some games, the typical console games that come over, that don't really push performance at all, and isn't threaded or lightly threaded."

"[Multicore CPUs] have required that the software industry come along with us and understand the notion of threading. For gaming, it's been challenging." – Intel

"The nature of development work for those platforms, especially in the early years, is that you'd get your game running and publish it and you'd rely heavily on the game engines that you as a publisher own, or that you acquire from third parties like Crytek and Epic," Stude continued. "If Epic and its Unreal engine on console don't have a threaded graphics pipeline–which to date they don't–then you're looking at the same issue that you see on the PC, which is a heavily emphasised single-core performance workload, and then everything else that happens like physics and AI happens on the other cores. It's not a completely balanced scenario, because by far the biggest workload is that render pipeline."

The problem has been more pronounced for AMD. Its Bulldozer CPU architecture (modified versions of which underpin all of its recent processors) tried to do two things at once: ramp up clock speeds by lengthening the CPU's pipeline at the cost of increased latency (an approach not too dissimilar to Intel's disastrous Prescott Pentium 4), and increase the number of cores by sharing resources like the scheduler and floating point unit between them, rather than duplicating them as in a standard multicore CPU. Unfortunately for AMD, Bulldozer's high power consumption meant that clock speeds were limited, leaving the CPU dependent on software that made use of those multiple cores to reach acceptable performance. I asked Richard Huddy, AMD's Gaming Scientist and a former Intel and Nvidia employee, whether chasing more cores was the right decision. After all, to this day, Intel's Core series of CPUs consistently outperforms AMD's.

"So if you talk to games programmers–there are other markets as well–they have typically found it easy to share their work over two, four cores," says AMD's Huddy. People have changed the way they program for multi-core stuff recently over the last five years to cope with six-eight cores. They understand this number is the kind of thing they need to target. It's actually genuinely difficult to build work-maps of the kind of tasks you have with games to run on something 32 cores or more efficiently."

AMD's Richard Huddy had a hand in creating DirectX, and has had stints at ATI, Intel, and Nvidia.

"The more cores you have, the harder it gets, so there is a practical limit," continued Huddy. "If we produced 1000-core CPUs then people would find it very hard to drive those efficiently. You'll end up with a lot of idle cores at times and it's difficult. From a programmer's point of view it's super-easy to drive one core. So yeah, if we could produce a 100 GHz single-core processor, we'd have a fantastic machine on our hands. But it's mighty difficult to clock up silicon that fast, as we're up against physical laws here, which make it very difficult. There's only so much you can do that ignores the real world, and in the end you need to help programmers understand the kind of constraints they're building to."

"I'd love for us to build a single-core CPU. Truth is, if you built a single-core CPU, that just took all of the power of the CPU and scaled up in the right kind of way, then no programmer would find it difficult to program, but we have to deal with the real world."

The Death of Moore's Law?

The real world is Moore's Law, or rather, the end of it. The death of Moore's Law has been talked about on and off for years, and yet Intel and AMD have continued to see significant performance boosts across their CPU lines. But the upcoming launch of Intel's Broadwell architecture and its die shrink from 22nm to 14nm has seen several delays, prompting many to call out the death of Moore's Law once again. Certainly, both companies face a number of technical challenges when working at such small manufacturing processes. Intel, for example, developed its 3D Tri-Gate transistor technology–which essentially allows three times the surface area for electrons to travel–to deal with current leakage at 22nm and beyond.

"For the last decade–which is a strong portion of our existence, the dominant decade in terms of our revenues and unit sales–we were told Moore's law was dead and that the physics wouldn't allow us to continue to make those advances, and we've proven everyone wrong," says Intel's Stude. "I'm a futurist as a hobby, and I've learned a lot being at Intel. The day I started we had introduced the Pentium and even then the conversation was about what was possible from a die shrink perspective. I'm not ever going to believe in my mind that the pace of innovation will outstrip the human brain."

"I just don't subscribe the concept that there isn't a better way. I think that evidence of the last 50 years would argue that we've got a long way to go on silicon engineering. What we think is possible may completely be eclipsed tomorrow if we find a new element or a new process that would just flip everything on its head. I'm not going to play the Moore's Law is dead game, because I don't think it will be dead. Maybe the timeline slows down, but I just can't subscribe it dying based on what I've seen at my time at Intel."

Intel's "tick-tock" strategy has helped the company stick to Moore's Law, but just how long can it last?

AMD's Huddy shares a similar viewpoint: "Moore's Law looks alive and well, doesn't it? It's always five years from dying. For all practical purposes, I expect us to live on something very much like Moore's Law up until 2020. Our biggest problem is feeding the beast, it's getting memory bandwidth into these designs. I want the manufacturers of DRAM to just keep up with us, and give us not only the higher density–and they do a spectacular job of giving us more memory–but also make that memory work faster. That's a real problem, and if we could just get a lot of super fast memory and not pay the price of that wretched real world physics that gets in the way all the time. I blame them, it's all down to DRAM!"

Better Integrated Graphics

While cores have dominated CPU development over the last decade, both AMD and Intel have made great strides in bringing other parts of a system onto the CPU to improve performance and decrease system size, most notably with graphics. Until recently, Intel's integrated graphics were considered a poor choice for gaming, with performance that was only really good for rendering the 2D visuals of an operating system, rather than sophisticated 3D graphics. But this has changed of late. While Intel's Iris Pro integrated graphics can't compete with a separate GPU, they are able to run many games at acceptable frame rates and resolutions. There have even been some neat small-form-factor gaming systems designed around Iris Pro, such as Gigabyte's Brix Pro.

But when it comes to integrated graphics, AMD is far and away the performance leader. Its purchase of ATI in 2006–despite some integration issues at the time–has given the company quite the performance lead; AMD's APU range of CPUs with built-in Radeon graphics is the best choice for building a small gaming PC without a discrete GPU. It might be just a small win for the company on the CPU side, but it's one that has had a significant impact on the company's focus.

"We took a decision 18 months ago to focus heavily on graphics IP," says Darren Grasby, AMD's VP of EMEA. "Driving the APU first, first with Llano, and fast forward to where we are today with Kaveri. Kaveri is the most complex APU ever built, and if you look at the graphics performance within that, you're not going to get the high-end gamers with that. But if you look at mainstream and even performance gaming, an A10 Kaveri is your product to get in there. And you don't have to go spend $1500 or $2000 dollars on a very high-spec gaming rig, that quite frankly, a mainstream or performance gamer isn't going to be using to its full capability."

"If you think about it from a gaming aspect, what are gamers looking for? They're looking for the compute power from the graphics card. The CPU almost becomes secondary to it in my mind." – AMD

"So you're right on the 'halo effect' on the CPU side," continued Grasby. "Obviously we can't talk about forward-looking roadmaps, but it's leaning into where the graphics IP is, and where that broader market is, and where the real revenue opportunities sit within that. That's why, if you look at Kaveri, if you look at the mass market and gaming market you're getting right up there. Then you start to get into 295 X2, and then you're talking about where the gamers are. If you think about it from a gaming aspect, what are gamers looking for? They're looking for the compute power from the graphics card. The CPU almost becomes secondary to it in my mind."

The Growing Threat of ARM and Mobile

While Intel continues to lead on pure CPU performance and AMD leads on integrated graphics, both companies have stumbled when it comes to mobile, which is a problem as PC sales continue to decline. All-in-one system-on-a-chip designs based on technology from the UK's ARM Holdings power the vast majority of the world's mobile devices–and that doesn't just mean cellphones and tablets; Sony's PlayStation Vita is built on a quad-core ARM chip. Intel has tried to stay the course with x86, creating the Atom line of processors specifically for low-power devices like phones and tablets. They haven't exactly set the world on fire, though. Intel's Mobile and Communications group lost over $900 million earlier this year.

AMD, meanwhile, took a different path and signed an ARM license to begin developing its own ARM processors. The question is–with the vast majority of the company's experience being in x86 architecture–why?

Phones and tablets like the Nvidia Shield mostly make use of ARM processors, rather than the traditional x86-based designs that AMD and Intel produce.

"Did you see Intel's earning results yesterday? [Note: this interview took place on July 17, 2014] Just go and have a look at their losses on mobile division," says AMD's Grasby. "I would suggest at some stage their shareholders are going to have a challenge around it. I can't remember the exact number, it's on public record, but I think it was 1.1 billion dollars they lost on 80 million dollars of turnover. Our clients suggest that isn't the best strategy. I encourage them to keep doing it, because if they keep losing that amount of money, it's definitely not good…the primary reason why we signed the ARM license was because two years ago we bought a company called SeaMicro. We were basically after its Freedom Fabric [storage servers], and that's why we signed the ARM licence, to go after that dense, power server opportunity that's out there. It's a huge opportunity."

"As soon as we got the ARM 64-bit license, other opportunities opened up on the side. Think embedded, for example. Embedded from an AMD perspective had always been an X86 Play. Just to give you an idea, ARM and X86 are a nine to ten billion dollar business. Take ARM out of that it comes to around four to five billion dollars. It's to exercise the opportunity."

PC Market Decline

Despite AMD's efforts, though, its ARM strategy and planned turnaround haven't gone entirely to plan. The company posted a $36 million net loss in its recent financials, and predicted that its games console business with Sony, Microsoft, and Nintendo–which, to date, has been one of its biggest successes–would peak in September. Shares plummeted by 18 percent after the announcement. The still-declining PC market means both Intel and AMD are looking for ways to expand beyond the desktop, but the companies maintain that their CPU lineups, and in particular their CPUs aimed at gamers and overclockers, remain an important part of what they do.

Despite a decline in recent years, overclocking is still alive and well.

"The overclocker market certainly is relevant," says Intel's Stude. "Every time we come out with a part there's a fraction of a fraction of people that are the utmost enthusiasts. They care about every last aspect of that processor and they want to want to push it to the limits. They are tinkerers, they don't mind buying a handful of processors to blow 'em up just to see what they can do, and to make their own living, be it working in Taiwan for the ODMs who make motherboards, or be it in other capacities in the media to submit their opinions on Intel's top end parts."

"We love the boutique nature of it," continued Stude, "because the people in that seat typically have very interesting compute perspectives that influence the decisions that others make. So, if you're very overclockable, you have a very influential position…so we do the best we can to feed this community our best story and we'll continue to that."

While there's no doubt AMD CPUs offer excellent value for money (we used one to great effect in our budget PC build), they still lag behind Intel when it comes to outright performance and performance per watt; to stay in the PC market, AMD has a much tougher job ahead of it than its rival.

"From an engineering perspective, performance per watt becomes the limiting factor in a lot of situations so there's no doubt that we need to do a better job," says AMD"s Huddy. "It's very clear that Intel and Nvidia, and everyone that competes in the silicon market has to be more aware of this. If you go back 10 and in particular 20 years ago, performance per watt, wasn't a big issue, but it increasingly is, and we aim to do better. I have absolutely no doubt about that. There's a lot of attention being paid to that. There are limits over how much we control our own destiny, but particularly for us where we use companies such as TSMC as others do, then those companies work with the same constraints as us and we should be able to just match them."

"From an engineering perspective, performance per watt becomes the limiting factor in a lot of situations so there's no doubt that we need to do a better job." – AMD

AMD Bets On Mantle

The future for AMD may lie in more than just hardware too. Mantle–its competitor API to OpenGL and DirectX–allows console-like low-level access to the CPU and GPU, and it's clear from speaking to the company that it has a lot of hopes pinned on the technology, even if Microsoft's upcoming DirectX 12 promises to do something very similar.

"It's very clear people have seen there's an artificial limitation that really needs to be fixed, and it's not just about giving you more gigahertz on your CPU," says AMD's Huddy. "We can be extremely proud of Mantle, getting the CPU out of the way when there was an artificial bottleneck. There's no doubt that people will use the extra CPU horsepower for good stuff, and we're seeing that in the demos that we're already able to show. However, let's not get hung up on gigahertz, sometimes it's smarts that get you there, and if you're looking for the fastest throughput API on the planet, then you'd have to say it is Mantle, and you'd have to say 'okay, now I get why AMD is leading the way', don't just count the CPU gigahertz, but look at the technology innovation that we're coming up with."

"Amusingly, and I don't know how relevant it is, you can make your own decision on that, for me it's entertaining: one of the companies that approached us [about Mantle] was Intel, and we said to Intel, 'You know what, can you give us some time, to fully stabilise this because this has to be future proof, but we'll publish the API spec before the end of the year.' And if Intel want to do their own Mantle driver and want to contribute to that they can build their own. We're trying to build a better future."

For more on AMD's Mantle, and why the company thinks Nvidia is doing "something exceedingly worrisome" with GameWorks technology, check back later in the week for our look at the developing war between PC graphics' most prolific companies.