Question The Death of Koomey’s Law! Also Thoughts on EPI

nicalandia · Feb 17, 2020

I've been reading up on CPU design and technology. Most people focus on Moore's law, but I think it's more important to focus on EPI (Energy Per Instruction) or performance per watt.

Koomey's Law: the number of computations per joule of energy dissipated doubled approximately every 1.57 years, but that ended in 2010.
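To get a feel for what a 1.57-year doubling period means, here is a minimal Python sketch. The function name and the sample time spans are assumptions for illustration; only the 1.57-year figure comes from the statement above.

```python
def efficiency_gain(years, doubling_period=1.57):
    """Factor by which computations per joule grow after `years`,
    assuming a constant doubling period (in years)."""
    return 2.0 ** (years / doubling_period)

# Hypothetical spans, chosen only to show how quickly the exponential compounds.
for span in (1.57, 5, 10):
    print(f"{span:>5} years -> {efficiency_gain(span):.1f}x computations per joule")
```

Any sustained slowdown shows up as a longer doubling period, which can be passed in as the second argument.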

According to this paper, the Intel i486 has roughly the same EPI (or performance per watt) as a Core Duo. Since that article is quite old, I wonder whether today's CPUs are as energy efficient as the i486.

https://www.intel.com/pressroom/kits/core2duo/pdf/epi-trends-final2.pdf

"As a result of micro-op fusion and other techniques, each core in the Core Duo processor delivers almost 8 times the scalar performance of the i486 processor while consuming only 8 times the power of the i486 processor. Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor! "
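The arithmetic behind that quote can be sketched in a few lines. Power is energy per unit time and performance is instructions per unit time, so EPI scales as power divided by performance. The function name is hypothetical; the 8x figures are the ones in the quote.

```python
def relative_epi(power_ratio, performance_ratio):
    """EPI of the newer chip relative to the older one.

    EPI = power / (instructions per second), so the ratio of EPIs is
    the power ratio divided by the performance ratio."""
    return power_ratio / performance_ratio

# Figures from the quote: ~8x the scalar performance at ~8x the power.
print(relative_epi(power_ratio=8.0, performance_ratio=8.0))  # 1.0 means equal EPI
```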

mikegg · Feb 18, 2020

"Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor!"

For some reason, this just doesn't seem right.

It's telling us that a chip from the 80s is as efficient per watt as a chip from the 2000s?

tomatosummit · Feb 18, 2020

senttoschool said:
"Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor!"

For some reason, this just doesn't seem right.

It's telling us that a chip from the 80s is as efficient per watt as a chip from the 2000s?

I don't think the paper makes a broad claim about the i486 being as efficient in anything other than this one metric. It looks to me like it only takes scalar execution energy efficiency into account. One of the things that comes up a lot today is the energy cost of moving data around, and there are plenty of examples where data buses increase power use immensely.

"The results are as if all generations of microprocessors were built on the same process technology. To realize these performance deltas in practice, older microprocessors would need to be given appropriate high-speed memory systems in newer process technologies (i.e. an L2 cache would become necessary since a main memory latency of 5 clocks at 66 MHz would become 80 clocks at 1 GHz). "

and it has this "(2.5x the IPC at 3x the frequency), "

So it might not be taking into account the reduction in energy efficiency from running a system at higher clock speeds either.
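The memory latency example in the quoted passage is easy to sanity-check: a fixed wall-clock latency costs more cycles as the clock gets faster. A minimal sketch follows, with a hypothetical function name and the 66 MHz and 1 GHz figures taken from the quote. The result lands in the same ballpark as the paper's rounded 80 clocks.

```python
def latency_in_clocks(clocks_at_old, old_hz, new_hz):
    """Convert a fixed wall-clock latency from cycles at one clock
    frequency to cycles at another."""
    latency_seconds = clocks_at_old / old_hz
    return latency_seconds * new_hz

# 5 clocks at 66 MHz, re-expressed at 1 GHz (figures from the quoted passage).
print(round(latency_in_clocks(5, 66e6, 1e9), 1))
```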

nicalandia · Feb 18, 2020

senttoschool said:
"Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor!"

For some reason, this just doesn't seem right.

It's telling us that a chip from the 80s is as efficient per watt as a chip from the 2000s?

Not only that, the paper itself calls it remarkable. The context here is that the Pentium 4 had horrendous EPI.

Quote from the Article: "Even though Core Duo is a much higher performance processor, the effective capacitance switched per instruction is roughly the same as the i486 processor. This is a remarkable achievement, one that reverses the trend towards ever-greater EPI "

The paper's question is quite simple: how much energy does a CPU consume per instruction? While that may be largely irrelevant to desktop users or end users in general, it's quite important for HPC.

Is there a way we can measure the EPI of recent processors?

rickxross · Feb 18, 2020

The analysis section of the paper is talking about Figure 2. Earlier in the paper it says "In Figure 2, both power and performance have been adjusted to factor out improvements due to process technology over time".

So no, they are not saying the i486 is as efficient as a Core Duo. What they are saying is that when you look at architecture ONLY, the Core Duo has the same effective capacitance switched per instruction as the i486. This is surprising because the Core Duo is much more complex and has many more transistors than the i486.

If you want to see a benchmark that includes process improvements, look at SPECpower_ssj2008. It is designed to measure the power efficiency of servers. Compare the results from the fourth quarter of 2007 with the first quarter 2020 results.

Fourth Quarter 2007 SPECpower_ssj 2008 Results

First Quarter 2020 SPECpower_ssj 2008 Results

In Q4 2007 the SPECpower_ssj2008 score was about 450. In Q1 2020 the AMD EPYC 7742 scores over 20,000.
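For a rough comparison with Koomey's 1.57-year figure, the two scores can be turned into an implied doubling period. This is only a sketch under loud assumptions: the scores are the approximate ones quoted above, the interval of about 12.25 years between the two result sets is my own estimate, growth is treated as steadily exponential, and the results come from different server systems rather than a controlled comparison. The function name is hypothetical.

```python
import math

def implied_doubling_period(start_score, end_score, years):
    """Doubling period (years) implied by growth from start_score to
    end_score over `years`, assuming steady exponential growth."""
    doublings = math.log2(end_score / start_score)
    return years / doublings

# Approximate scores from the post; the ~12.25-year span between
# Q4 2007 and Q1 2020 is an assumption, not a figure from SPEC.
print(round(implied_doubling_period(450, 20_000, 12.25), 2))
```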

nicalandia · Feb 18, 2020

rickxross said:
So no, they are not saying the i486 is as efficient as a Core Duo. What they are saying is that when you look at architecture ONLY, the Core Duo has the same effective capacitance switched per instruction as the i486. This is surprising because the Core Duo is much more complex and has many more transistors than the i486.

I agree, and thanks for the links.

maddie · Feb 18, 2020

nicalandia said:
Not only that, the paper itself calls it remarkable. The context here is that the Pentium 4 had horrendous EPI.

Quote from the Article: "Even though Core Duo is a much higher performance processor, the effective capacitance switched per instruction is roughly the same as the i486 processor. This is a remarkable achievement, one that reverses the trend towards ever-greater EPI "

The paper's question is quite simple: how much energy does a CPU consume per instruction? While that may be largely irrelevant to desktop users or end users in general, it's quite important for HPC.

Is there a way we can measure the EPI of recent processors?

Run the same instructions on each CPU [X passes of a given benchmark], then measure the time taken and the total energy used [average power multiplied by time]. That should be all you need for the calculation: energy divided by the number of instructions executed.
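The calculation described here can be sketched as follows. The function name and all of the sample numbers are hypothetical, not real measurements. How you obtain average power (wall meter, package sensor) and the instruction count (for example, hardware performance counters) is left open and would affect what the result actually represents.

```python
def energy_per_instruction(avg_power_watts, elapsed_seconds, instructions_retired):
    """EPI in joules: total energy (power x time) divided by the
    number of instructions executed."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    energy_joules = avg_power_watts * elapsed_seconds
    return energy_joules / instructions_retired

# Hypothetical measurements, not real benchmark data.
epi = energy_per_instruction(avg_power_watts=65.0, elapsed_seconds=10.0,
                             instructions_retired=2.0e12)
print(f"{epi * 1e9:.3f} nJ per instruction")
```

Running the same workload on two CPUs and comparing the two EPI values gives the kind of comparison asked about, with the caveat that it mixes process, voltage, and architecture effects.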

Thala · Feb 19, 2020

tomatosummit said:
So it might not be taking into account the reduction in energy efficiency from running a system at higher clock speeds either.

I haven't read the article, but you need to take the different voltages and frequencies into account when reasoning about power efficiency. Therefore you should reason about capacitance per instruction rather than energy per instruction. Energy per instruction is a pretty much useless measure of architectural efficiency and should only be used if the candidates you compare are at iso voltage and frequency.
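A small sketch of why voltage matters here, using the standard CMOS approximation that dynamic switching energy scales as C * V^2. Leakage is ignored, and the capacitance and voltage values are illustrative assumptions, not figures for any real processor. The function name is hypothetical.

```python
def dynamic_epi(cap_per_instruction_farads, voltage):
    """Dynamic energy per instruction, using the standard CMOS
    approximation E = C_eff * V^2 (leakage ignored)."""
    return cap_per_instruction_farads * voltage ** 2

# Hypothetical values: the same effective capacitance per instruction,
# run at two different supply voltages.
c_eff = 1.0e-9  # farads switched per instruction (illustrative only)
low_v = dynamic_epi(c_eff, 0.9)
high_v = dynamic_epi(c_eff, 1.3)
print(f"EPI ratio at 1.3 V vs 0.9 V: {high_v / low_v:.2f}")
```

Two designs with identical capacitance per instruction therefore show different EPI as soon as their operating voltages differ, which is the reason for comparing at iso voltage.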

rickxross · Feb 20, 2020

Incidentally, I found this link today. The article is in Japanese but the slides are in English.

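[Akira Fukuda's Semiconductor Industry Front Line] The Zen 2 processor CPU core and chiplet technology that AMD presented at ISSCC

AMD presented the CPU core technology and chiplet technology of its latest-generation microprocessor, "Zen 2", at ISSCC 2020, the international conference on semiconductor circuit technology, on February 17, 2020 (US Pacific time). The presentation was split into two talks: first the CPU core technology (talk 2.1), followed by the chiplet technology (talk 2.2).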

pc.watch.impress.co.jp

Look at the bottom of the second slide, "Zen 2 Architecture"
it says "~9% switching capacitance (CAC) improvement over previous generation, technology neutral"

So that can give you an idea of the generational (Zen -> Zen 2) switching capacitance improvement. It doesn't say per instruction, but it does say process neutral.

They go into more detail on the later slides. The "Generational Leadership Perf/Watt" slide breaks down the power improvements into process and design factors. It looks to me like the majority of the perf/watt improvement in this case is due to process.

Gideon · Feb 21, 2020

rickxross said:
Incidentally, I found this link today. The article is in Japanese but the slides are in English.

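[Akira Fukuda's Semiconductor Industry Front Line] The Zen 2 processor CPU core and chiplet technology that AMD presented at ISSCC

AMD presented the CPU core technology and chiplet technology of its latest-generation microprocessor, "Zen 2", at ISSCC 2020, the international conference on semiconductor circuit technology, on February 17, 2020 (US Pacific time). The presentation was split into two talks: first the CPU core technology (talk 2.1), followed by the chiplet technology (talk 2.2).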

pc.watch.impress.co.jp

Very interesting slides, thanks! It looks like AMD also considered an interposer, but that would have limited the CCD chiplet count to 4:

And where is that guy who claimed chiplets will be replaced by monolithic designs because packaging is so expensive (as if extra mask costs in the millions for each 7nm die were a sneeze):

Overall I really suggest checking these slides out, very informative.

maddie · Feb 21, 2020

Gideon said:
Very interesting slides, thanks! It looks like AMD also considered an interposer, but that would have limited the CCD chiplet count to 4:

And where is that guy who claimed chiplets will be replaced by monolithic designs because packaging is so expensive (as if extra mask costs in the millions for each 7nm die were a sneeze):

Overall I really suggest checking these slides out, very informative.

What struck me when I first saw the slides was the increase in cost of higher die count CPUs. It's not just that it is cheaper to use chiplets, but that it becomes cheaper to a greater extent as the core count increases.

Meaning that until Intel goes to chiplets, AMD can squeeze them at will in higher core count CPUs.

moinmoin · Feb 21, 2020

maddie said:
What struck me when I first saw the slides was the increase in cost of higher die count CPUs. It's not just that it is cheaper to use chiplets, but that it becomes cheaper to a greater extent as the core count increases.

Meaning that until Intel goes to chiplets, AMD can squeeze them at will in higher core count CPUs.

Indeed, margin increases tremendously with higher core counts. So for AMD (even ignoring the competition) it actually makes sense to push a "ridiculous" number of cores: 1) it increases their profit, 2) it increases their competitive edge, and 3) it makes software optimizations for more cores more likely to happen soon, which again feeds back into their competitive edge. The only limit is the scalability of the hardware.

nicalandia · Feb 21, 2020

rickxross said:
Look at the bottom of the second slide, "Zen 2 Architecture"
it says "~9% switching capacitance (CAC) improvement over previous generation, technology neutral"

So that can give you an idea of the generational (Zen -> Zen 2) switching capacitance improvement. It doesn't say per instruction, but it does say process neutral.

Thanks for the link. So far we know that Zen had a 15% improvement over Excavator in switching capacitance and Zen 2 added a further 9%, which is very impressive indeed.
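If both figures are read as fractional reductions in switching capacitance and each applies to the previous generation's value, they compound multiplicatively rather than adding up. That reading is an assumption on my part, since the slides may define "improvement" differently. A minimal sketch with a hypothetical function name:

```python
def cumulative_reduction(*step_reductions):
    """Total fractional reduction after applying each step's
    fractional reduction to what remains from the previous step."""
    remaining = 1.0
    for r in step_reductions:
        remaining *= (1.0 - r)
    return 1.0 - remaining

# 15% (Excavator -> Zen) and 9% (Zen -> Zen 2), as stated in the thread.
print(f"{cumulative_reduction(0.15, 0.09):.1%}")
```

Under that assumption the combined figure comes out slightly below the 24% you would get by simply adding the two percentages.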

TheELF · Feb 22, 2020

nicalandia said:
Thanks for the link. So far we know that Zen had a 15% improvement over Excavator in switching capacitance and Zen 2 added a further 9%, which is very impressive indeed.

There is even a site that does a bit of benchmarking like this:

AMD Ryzen 7 3700X Review

AMD's $330 Ryzen 7 3700X is an 8-core, 16-thread CPU that's clocked high enough to compete with Intel's offerings. Actually, its application performance matches even the more expensive Intel Core i9-9900K. Gaming performance has been increased significantly, too, thanks to the improved...

www.techpowerup.com
