IBM Inside

By Ed Sperling, Editor in Chief -- Electronic News, 6/15/2007

Bernard Meyerson, chief technologist at IBM, sat down with Electronic News/Electronic Business to shed light on what’s happening at the bleeding edge of Moore’s Law and inside large data centers. What follows are excerpts of that conversation.
 
Q: What’s new at IBM?
Meyerson: The world is changing so rapidly that you have to adapt or die. If you think about what the issue was 10 years ago, everybody in the data center was screaming, ‘More speed!’ Now it's 'Less power'.

Q: Isn’t that something they expect, though?
Meyerson: They expect it, but they’re not screaming anymore. That’s table stakes. Now they’re screaming for a fire hose. You’re melting the ceiling tiles in the data center. What’s gating your ability to put the MIPS (millions of instructions per second) in is that you can’t deal with the thermal loads. That’s why mainframes are becoming popular again.

Q: Water cooling?
Meyerson: That actually started returning about five years ago. It’s making a very strong comeback. That’s a secondary issue, though. In an x86 data center, the typical utilization runs between 6 and 10 percent. In the same data center, 45 percent of the power is spent on IT. All the rest is on cooling and other things. Of that 45 percent, 50 percent goes to your servers. Everything else is for storage and networks. That means you take half of less than half of 6 percent, which is less than 2 percent, and that is how much of your actual investment you get back in work. Do you think we have an opportunity for improvement here? Now, if you compare that to a Z (mainframe), the typical utilization is about 80 percent. When you run at 100 percent, it hums. It’s perfectly happy. We’ve been doing virtualization for 35 or 40 years, which is why our mainframes are so efficient.
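The chain of percentages in that answer can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is made up here, and it assumes the simplification implied in the answer, namely that useful work scales directly with server utilization.

```python
def useful_fraction(it_share, server_share, utilization):
    """Fraction of total data-center power that ends up as useful server work.

    Assumes (as a simplification) that useful work is proportional to utilization.
    """
    return it_share * server_share * utilization


IT_SHARE = 0.45      # share of facility power that reaches IT equipment (from the interview)
SERVER_SHARE = 0.50  # share of IT power that goes to servers (from the interview)

for utilization in (0.06, 0.10):  # quoted x86 utilization range
    frac = useful_fraction(IT_SHARE, SERVER_SHARE, utilization)
    print(f"utilization {utilization:.0%}: {frac:.2%} of facility power does useful work")
```

At the 6 percent end of the quoted range the result is under 2 percent, which matches the figure given in the answer.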

Q: This is the old time-sharing idea in one system, right?
Meyerson: Yes, but here’s the key: It is also load-leveling, so every piece of that system is being utilized all the time. As a consequence, you can load these things to the max and have no negative implications on anything. Even at 100 percent utilization it protects the share of the machine that you need for your critical apps, as you define them, so they keep running flat out without degradation. You slow down the non-critical apps. The funny thing about a machine running tens of thousands of apps is that you can find a couple of non-critical apps you can slow to a crawl. Because you can run the machines at 100 percent efficiency, you pick up a factor of 10 in performance. If you start collecting other servers and combine the applications onto a single Z, you save 75 to 80 percent of the power for the same performance.
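The idea of protecting critical applications while slowing the rest can be sketched as a simple allocation rule. This is a toy model, not a description of how the mainframe's workload manager actually works: the function, the application names, and the proportional-sharing policy are all assumptions made for illustration.

```python
def level_load(capacity, critical, non_critical):
    """Give critical apps their full demand, then share what is left
    among non-critical apps in proportion to their demand.

    All quantities are in the same arbitrary capacity units.
    """
    critical_total = sum(critical.values())
    if critical_total > capacity:
        raise ValueError("critical demand alone exceeds machine capacity")

    remaining = capacity - critical_total
    wanted = sum(non_critical.values())
    # Non-critical apps run at full speed if they fit, otherwise they are throttled.
    scale = 1.0 if wanted <= remaining else remaining / wanted

    allocation = dict(critical)
    allocation.update({name: demand * scale for name, demand in non_critical.items()})
    return allocation


# Hypothetical workload: demand (120 units) exceeds capacity (100 units).
alloc = level_load(
    capacity=100,
    critical={"payroll": 50, "orders": 30},
    non_critical={"reports": 30, "batch": 10},
)
for name, share in alloc.items():
    print(f"{name}: {share:g}")
```

In this example the machine is fully loaded, the two critical applications keep everything they asked for, and the non-critical ones are each cut to half speed.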

Q: So are mainframe sales up because of this?
Meyerson: Yes. It’s called consolidation. We’re going through the first major shift of consolidation.

Q: This is more re-aggregation, isn’t it?
Meyerson: Yes. There was a missing link. There was a mistaken belief that everyone who uses a laptop is an IT professional. This is a rather foolish assumption by those who said that’s the way the world will go. They then take their kid’s PacMan game and load it on their laptop, which installs the seven viruses that came with it at no extra charge. Then the network starts to slow to a crawl as that virus infects all the servers on the network—or even worse, all the routers. And very quickly the thing just stops. Now you’ve got an entire building of infected machines and an entire network of infected servers, and there aren’t enough IT guys in your company to deal with that fiasco. In the end, disaggregation led to power inefficiency, management expense, and it’s so embarrassing that most people can’t admit it. Folks have invented new words like thin client to hide the fact that we need to re-aggregate to address many of the issues disaggregation created.

Q: That’s a ‘lite’ terminal that also can do Web searches.
Meyerson: Right. It’s back to the future, again. Another thing has happened, too, which is a seminal shift in this industry. It’s a move to hybrid computing. There’s no single solution that’s best. We can play the gigahertz game, but that’s been disavowed as a strategy mostly by folks who are having trouble staying on the road maps. We’ve been very good about staying on the road map. Power6 is the fastest server chip ever built. It’s also the fastest chip of any type ever built. Despite the fact that the chip is the size of a Frisbee, at least relative to most chips, it outruns everything. There have been a lot of breakthroughs that enable that, but it’s redefining what speed is. We have them cranking away in the lab, fully functional, at 6GHz. We’re selling them at 4.7GHz out of the chute. Think about a 6GHz chip that’s 340 square millimeters (about 18 millimeters, or roughly 0.7 inch, on a side for a square die).

Q: That’s the size of a large coin?
Meyerson: Bigger. And here’s the kicker. If you turn on a laser at the moment you start a clock cycle and you chop it off after one cycle, you have a beam of light that is only several inches long. Do you have any idea of the invention required to get the chip, which is half that size, to clock correctly without timing errors when a beam of light, which is a heck of a lot faster than any electrons, can only go [several inches] in a clock cycle? At 6GHz, that’s one-sixth of a nanosecond. The trick is to build it with a distributed ‘H’ clock tree. You actually have a series of mini-clocks all over the chip. One of them starts to hum and the others listen for it, but each one hums a little bit ahead of the humming it hears from its neighbors, figuring that what it is ‘hearing’ is already behind, which is correct. And they all interact until it settles into a consistent hum. Even though the ends of the chip are humming synchronously, there is no way the information can get from one end to the other until half a cycle later. That’s way too late for the next clock, which is why this is a challenge.
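The scale of the problem is easy to put into numbers. The sketch below computes the clock period and the distance light covers in vacuum during one cycle at the two frequencies mentioned, and compares that with the edge of a 340 square millimeter die. Treating the die as square is an assumption made here, and real on-chip signals travel well below the speed of light, so the actual margin is tighter than these figures suggest.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, in vacuum


def cycle_time_ns(freq_ghz):
    """Length of one clock cycle in nanoseconds."""
    return 1.0 / freq_ghz


def light_travel_mm(freq_ghz):
    """Distance light covers in vacuum during one clock cycle, in millimeters."""
    return SPEED_OF_LIGHT / (freq_ghz * 1e9) * 1000.0


DIE_AREA_MM2 = 340.0                   # die area quoted in the interview
die_side_mm = math.sqrt(DIE_AREA_MM2)  # assumes a square die

for freq_ghz in (4.7, 6.0):
    print(
        f"{freq_ghz} GHz: cycle = {cycle_time_ns(freq_ghz):.3f} ns, "
        f"light travels {light_travel_mm(freq_ghz):.1f} mm per cycle, "
        f"die edge = {die_side_mm:.1f} mm"
    )
```

At 6GHz the light path per cycle comes out at about 50 mm, or roughly two inches, against a die edge of a little over 18 mm.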

Q: So you’ve synchronized these clocks despite the significant delays across the chip?
Meyerson: Yes, there’s a smart feedback system that makes it work. But this kind of stuff is rocket science. And then you ask, ‘What’s next?’

Q: Okay, what is next?
Meyerson: You have to figure out where the problem is.

Q: By that, you mean where’s the next bottleneck?
Meyerson: Yes. If you made the transistor 20 times faster, the chip would only run twice as fast. Moore’s Law is all about density. It’s not about performance. Folks assume performance improves as one shrinks a part. That is no longer correct.
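One way to read the "20 times faster transistor, twice as fast chip" remark is through an Amdahl-style model, in which only part of each clock cycle is spent in the transistors and the rest is spent elsewhere, for example in the wiring. That model is an assumption introduced here, not something stated in the interview, and the function names are made up for the sketch.

```python
def chip_speedup(transistor_fraction, transistor_speedup):
    """Overall speedup when only the transistor share of the cycle gets faster."""
    return 1.0 / ((1.0 - transistor_fraction) + transistor_fraction / transistor_speedup)


def implied_transistor_fraction(chip_gain, transistor_speedup):
    """Solve chip_gain = 1 / ((1 - f) + f / s) for the transistor share f."""
    return (1.0 - 1.0 / chip_gain) / (1.0 - 1.0 / transistor_speedup)


# Figures from the interview: transistors 20x faster, chip only 2x faster.
f = implied_transistor_fraction(chip_gain=2.0, transistor_speedup=20.0)
print(f"implied transistor share of the cycle: {f:.1%}")
print(f"check, chip speedup at 20x transistors: {chip_speedup(f, 20.0):.2f}x")
print(f"ceiling with infinitely fast transistors: {1.0 / (1.0 - f):.2f}x")
```

Under this model, the quoted figures imply that transistors account for only a little over half of the cycle, so even infinitely fast transistors would barely more than double chip speed.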

Q: The tight linkage between performance and device scaling ended at 90nm, right?
Meyerson: Right. It actually gets worse with time. Although transistor performance has leveled off, what’s degrading much faster than ever is wiring performance. Wiring performance has fallen off a cliff. If you make those wires really skinny, the resistance goes up and the amount of current is reduced. Making them taller doesn’t work well either, as that makes them more like a large capacitor. Because they ‘talk’ to each other, when one wire is signaling, the next wire over sucks the signal strength out of the active wire by capacitive coupling through the insulator. Everyone’s been working on low-k dielectrics for insulation. What is the lowest k possible?

Q: One?
Meyerson: Correct. That’s a vacuum. It’s a perfect insulator. The problem is that a vacuum does not have really good mechanical properties, but the strategy of the industry has been to reduce k in bulk insulators. We’ve gone from k values of 4 to 3.8 and so on down to 2.9. If you want to make the big leap, you have to go all the way down to 1. This is the announcement we recently made on Airgap, and it’s a huge win.
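The wiring trade-offs described in the last two answers can be illustrated with a first-order model: wire resistance from the cross-section, and wire-to-wire capacitance from a parallel-plate approximation between neighboring sidewalls. This is a rough sketch under those assumptions. The geometry is hypothetical, the resistivity is the bulk value for copper, and fringing fields and the capacitance to layers above and below are ignored.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
RHO_COPPER = 1.68e-8          # bulk copper resistivity, ohm*m (thin wires are worse)


def wire_resistance(length, width, height, rho=RHO_COPPER):
    """Resistance of a rectangular wire: R = rho * L / (w * h)."""
    return rho * length / (width * height)


def sidewall_capacitance(k, length, height, spacing):
    """Parallel-plate estimate of coupling to one neighbor: C = k * eps0 * L * h / s."""
    return k * EPSILON_0 * length * height / spacing


def rc_delay(k, length, width, height, spacing):
    """First-order RC product of the wire and its coupling to one neighbor."""
    return wire_resistance(length, width, height) * sidewall_capacitance(
        k, length, height, spacing
    )


# Hypothetical geometry in meters, chosen only for illustration.
LENGTH, WIDTH, HEIGHT, SPACING = 1e-3, 100e-9, 200e-9, 100e-9

base = rc_delay(2.9, LENGTH, WIDTH, HEIGHT, SPACING)
half_width = rc_delay(2.9, LENGTH, WIDTH / 2, HEIGHT, SPACING)
double_height = rc_delay(2.9, LENGTH, WIDTH, 2 * HEIGHT, SPACING)
air_gap = rc_delay(1.0, LENGTH, WIDTH, HEIGHT, SPACING)

print(f"half-width wire:    {half_width / base:.2f}x baseline RC")
print(f"double-height wire: {double_height / base:.2f}x baseline RC")
print(f"k = 1 between wires: {air_gap / base:.2f}x baseline RC")
```

In this simple model, halving the width doubles the RC product, doubling the height leaves it unchanged because the lower resistance is cancelled by the larger sidewall capacitance, and dropping k from 2.9 to 1 cuts it to about a third.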

Q: Intel doesn’t think so.
Meyerson: It’s expensive if it doesn’t work when you try it. Remember that folks said the same thing about strained silicon at first. No one else knew how to do it. Airgap works. It has been used to build real microprocessors, not a little demo chip. The funny thing about it is this is not based on trying to define things lithographically, which is the most expensive imaginable way to build a chip. It is actually based on nanotechnology, or more specifically, self-assembly. Self-assembly is not well understood and many (including Intel) have no expertise in this area. It is based on materials segregation. We invented a polymer, and if you heat it up modestly, what happens is it segregates into two different materials that are chemically different. The material that segregates out of the polymer forms very uniform nodules about 100 atoms across. These nodules are of a different chemical composition than the matrix that’s left behind. These nodules repel each other, so you have this perfectly symmetrical array of them. You use an etchant that selectively etches out just the nodules to create the world’s finest Swiss cheese. The holes in it are incredibly tiny. As the insulator we use a normal fabrication process with a layer of dielectric, things like SiCOH (silicon oxycarbide), which we use now. You don’t have to invent new stuff beyond the self-assembling materials, which is why it’s actually cheaper if done right. You use conventional technology for the dielectric beneath the layer of metal, and then you build off the normal structure you have. You planarize the layer of metal so you have metal wires embedded in the dielectric, just like you normally do. But now you take the ability to make Swiss cheese and you exercise it over the top of this. You then coat this with a thin layer of polymer and etch unwanted insulator out through the fine holes you’ve created.

Q: Is there any magnetic effect causing the ordering of the nodules?
Meyerson: No. It’s all chemical. It has to do with surface tension. That’s why a bead of water forms. It turns out when you have a bunch of spheres in this mixture, they repel one another. If they get as far away from each other as possible, that’s the lowest energy state because of surface tension. The net of it is that when they form this ordered array, you remove the nodules. The Swiss cheese exists above the metal and above the dielectric. You then stick it in a plasma, which is atomic particles. That gets right through the holes. It etches holes in the underlying dielectric that separates the wires. The plasma damages the insulator so badly that you can go in and chemically etch out all the residue. You stop etching when there’s no more insulator between neighboring wires, only beneath them, and you’re doing this in a vacuum chamber so basically this is a vacuum.

Q: So how far above one is this dielectric?
Meyerson: It is one. We don’t put anything back. We actually are at the physical limit between the wires.

Q: What effect does that have on heat?
Meyerson: The wires are still connected on the top and bottom, which is where the heat has to flow. They just don’t connect left and right. That’s the trick. But you also have to then coat this Swiss cheese with goo. It seals up all the holes. You seal in the vacuum. And then you start the normal process again; vertically, you haven’t changed anything. But between the wires, which is where the gross preponderance of the effect of the dielectric constant is felt, it’s one.

Q: What does this do for your ability to extend Moore’s Law?
Meyerson: It dramatically increases it. When Airgap becomes your final limiter, we have a solution. Although transistors are required, the wiring gets worse and worse with scaling. We had to do something new to correct this.

Q: Do you gain anything in terms of performance as you go down the Moore’s Law curve?
Meyerson: It actually makes it worse. You scale for the economics. There is some improvement in scaling with each generation. But the fraction of that gain has been dropping off a cliff. When you went from 1 micron to 0.8 micron, almost all the performance gain was due to the scaling itself. You reduced the capacitance, the gate oxide got thinner, and you were nowhere near the physical limits. Now, scaling is the smallest component of your gain. It’s more bad news than good. The trouble is you make things so much worse due to wiring losses that scaling is no longer a performance win.

Q: At what point does it become economically unfeasible?
Meyerson: It’s not a discontinuous process. You can put infinite expense into rushing to market with a new lithography generation and successfully shoot yourself in the other foot. This is an asymptotic situation. Things don’t instantaneously grind to a halt, but you are already in the diminishing returns zone. What people have to think about is whether they rush to market early or whether they continue to innovate. Innovation is the majority of the gain. Is innovation of greater value than scaling? Yes. We’ve already crossed over. It’s just a question of whether people recognize that. The proof of that is Power6. P6 in the new P570 server slaughters anything out there in terms of performance, whether you talk about transaction processor capabilities or raw frequency. When you look at SPECint (integer) and SPECfp (floating point), it runs past everything out there, and it leads in transaction processing as well. Nobody has actually hit all three records simultaneously—ever. That doesn’t happen unless you optimize the designs around the technology. It takes longer to get to market with a technology node, but as scaling is the least important thing today you don’t rush to get out first with only scaling to hang your hat on. That sounds counterintuitive, but the rush to first at a technology node is living in the past.

Q: Does this mean that instead of two years you’re running substantially longer between generations?
Meyerson: No. You can’t stretch it too far because there is an economic component. You want to get more chips out of less silicon. But if you run to market first, you cheat your customer because they get an inferior product sooner. If that product is soon enough, that’s not always bad. Lower performance is okay sometimes if you can give it to someone much, much earlier. But what we’re finding is that the differentiation due to innovation along with holistic design slaughters anything you’re going to do with scaling.

Q: Will there be a market for a processor that is still built at 90nm and remains there in the long term?
Meyerson: On the processor side, we’re not at the level of design sophistication where we have that option. When you shrink the design, you still get some benefit. But there are entire industries where people recognize it is not driven by lithography, like the analog and mixed signal business.

Q: On a lot of the non-server applications, is multicore still the future?
Meyerson: It still makes sense with a relatively sparse number of cores. Cores are so powerful that it’s a delicate balancing act. If you show a person a video at 30 frames a second or 30,000 frames a second, it doesn’t matter. The chemistry in your eye can’t respond fast enough. Multicore has been a huge win for the big systems. It’s a lesser win for the smaller systems.

Q: Which is better, homogeneous or heterogeneous cores?
Meyerson: The bigger win is heterogeneous cores. If you do your design right, you can knock it off on the first pass. That opens the door for hybrid computing. You can share a single memory between two neighboring cores so you don’t have a lot of latency. You can start to envision where you can tune up a chip to where it’s not 10 percent better in a given calculation, but factors of 10.

Q: Is this all hypervisor-driven?
Meyerson: Yes. It all has to be done by the software. We realized this was such a differentiator back in the mid-1990s. We’ve also now put different voltage rails in, where we can dynamically tell part of a chip to slow down or shut down. Every chip we build requires us to coordinate software and hardware all the way back to the front of the chip design to accommodate such instructions. It’s becoming hugely interwoven. This is where our fundamental differentiation as a systems house comes into play. We are not a chip company. We make killer chips, but we are not a chip company; we’re a systems and solutions company.

Q: Compared with Intel?
Meyerson: Intel’s model is really a commodity model. We can’t leverage it at the high end because it doesn’t have a high end. If you look at Itanium, it’s cranking away at a rip-snorting 1.4GHz vs. our 4.7GHz, and 6GHz in the lab. We’re talking about being off the mark 300 and 400 percent just in frequency, and that doesn’t begin to get into the other attributes of the chip. In the best of all worlds, you might want to have another company out there to push you.

Part two of this special report features IBM’s ASIC guru Tom Reeves.

©2008 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.