
Question The Death of Koomey’s Law! Also Thoughts on EPI

nicalandia

Diamond Member
I've been reading up on CPU design and technology. Most people focus on Moore's law, but I think it's more important to focus on EPI (Energy Per Instruction) or performance per watt.

Koomey's Law: The number of computations per joule of energy dissipated doubles approximately every 1.57 years, but that ended in 2010.
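
To put the doubling period in perspective, here is a minimal Python sketch of the growth it implies. The function name `efficiency_gain` is made up for illustration, and the only input taken from the post is the 1.57-year doubling period.

```python
def efficiency_gain(years: float, doubling_period_years: float = 1.57) -> float:
    """Factor by which computations per joule grow over `years`,
    assuming a constant doubling period (1.57 years per the post)."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Print the implied efficiency multiplier for a few time spans.
    for years in (1.57, 5.0, 10.0):
        print(f"{years:>5} years -> x{efficiency_gain(years):.1f}")
```

A constant doubling period is a simplification; the post itself says the trend did not hold after 2010.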

According to this paper, the Intel i486 has the same EPI (or performance per watt) as a Core Duo. Since that article is quite old, I wonder if today's CPUs are as energy efficient as the i486.

"As a result of micro-op fusion and other techniques, each core in the Core Duo processor delivers almost 8 times the scalar performance of the i486 processor while consuming only 8 times the power of the i486 processor. Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor! "
 
"Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor!"

For some reason, this just doesn't seem right.

It's telling us that a chip from the 80s is as efficient per watt as a chip from the 2000s?
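
The arithmetic behind the quoted sentence can be sketched in a few lines of Python. This assumes scalar performance is proportional to instructions per second and rounds the paper's "almost 8 times" to 8; the function name `relative_epi` is hypothetical.

```python
def relative_epi(power_ratio: float, performance_ratio: float) -> float:
    """EPI of the newer chip relative to the older one.

    EPI = power / (instructions per second), so the ratio of two EPIs
    is the power ratio divided by the performance ratio.
    """
    return power_ratio / performance_ratio


# Core Duo core vs. i486, using the rounded figures from the quote:
# about 8x the power for about 8x the scalar performance.
print(relative_epi(power_ratio=8.0, performance_ratio=8.0))
```

A ratio of 1 means the same energy per instruction, which is all the quoted sentence claims.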
 
I don't think this paper makes a broad claim about the i486 being as efficient in anything other than one metric. It looks to me like it only takes into account the scalar execution energy efficiency. One of the things that comes up a lot today is the energy cost of moving data around, and there are plenty of examples where data buses increase power use immensely.

"The results are as if all generations of microprocessors were built on the same process technology. To realize these performance deltas in practice, older microprocessors would need to be given appropriate high-speed memory systems in newer process technologies (i.e. an L2 cache would become necessary since a main memory latency of 5 clocks at 66 MHz would become 80 clocks at 1 GHz). "

It also has this: "(2.5x the IPC at 3x the frequency)".

So it might not be taking into account the reduction in energy efficiency from running a system at higher clock speeds as well.
 
"Thus, the Core Duo processor achieves roughly the same EPI as the i486 processor!"

For some reason, this just doesn't seem right.

It's telling us that a chip from the 80s is as efficient per watt as a chip from the 2000s?
Not only that, the paper itself calls it remarkable. The context here is that the Pentium 4 had horrendous EPI.

Quote from the Article: "Even though Core Duo is a much higher performance processor, the effective capacitance switched per instruction is roughly the same as the i486 processor. This is a remarkable achievement, one that reverses the trend towards ever-greater EPI "

This paper is quite simple: it looks at how much energy a CPU consumes per instruction. While that may be largely irrelevant to desktop users or end users in general, it's quite important for HPC.


Is there a way we can measure the EPI of recent processors?
 
The analysis section of the paper is talking about Figure 2. Earlier in the paper it says "In Figure 2, both power and performance have been adjusted to factor out improvements due to process technology over time".

So no, they are not saying the i486 is as efficient as a Core Duo. What they are saying is that when you look at architecture ONLY, the Core Duo has the same effective capacitance switched per instruction as the i486. This is surprising because the Core Duo is much more complex and has many more transistors than the i486.

If you want to see a benchmark that includes process improvements, look at SPECpower_ssj2008. It is supposed to measure the power efficiency of servers. Compare the results from the fourth quarter of 2007 to the first quarter of 2020.

In Q4 2007 the SPECpower_ssj2008 score was about 450. In Q1 2020 the AMD EPYC 7742 is getting a score of over 20,000.
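
For a rough comparison with Koomey's Law, the two scores above can be turned into an implied doubling time. This is only a back-of-the-envelope sketch: it uses the rounded scores from this post, assumes about 12.25 years between Q4 2007 and Q1 2020, and compares top results from different systems rather than like-for-like hardware. The function name is hypothetical.

```python
import math


def implied_doubling_time(start_score: float, end_score: float, years: float) -> float:
    """Years per doubling if the score grew exponentially from
    start_score to end_score over the given number of years."""
    return years / math.log2(end_score / start_score)


# Rounded figures from the post; the time span is an assumption.
years_between = 12.25
doubling = implied_doubling_time(450.0, 20_000.0, years_between)
print(f"score ratio: {20_000.0 / 450.0:.1f}x")
print(f"implied doubling time: {doubling:.2f} years")
```

With these inputs the doubling time comes out above two years, which is slower than the 1.57 years quoted for Koomey's Law.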
 
I agree and thanks for the links
 
Not only that, the paper itself calls it remarkable. The context here is that the Pentium 4 had horrendous EPI.

Quote from the Article: "Even though Core Duo is a much higher performance processor, the effective capacitance switched per instruction is roughly the same as the i486 processor. This is a remarkable achievement, one that reverses the trend towards ever-greater EPI "

This paper is quite simple: it looks at how much energy a CPU consumes per instruction. While that may be largely irrelevant to desktop users or end users in general, it's quite important for HPC.


Is there a way we can measure the EPI of recent processors?
Run the same instructions on each CPU [X passes of a given benchmark], then measure the time taken and the total energy used [average power multiplied by time]. That should be all you need to do the calculation.
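
The calculation described above could look something like this in Python. It is a sketch under assumptions: the average power would come from a wall meter or an on-chip energy counter, the instruction count from a hardware performance counter, and the numbers in the example are invented purely to show the units.

```python
def energy_per_instruction(avg_power_watts: float,
                           elapsed_seconds: float,
                           instructions_retired: float) -> float:
    """Return energy per instruction in joules.

    Total energy is average power multiplied by run time; dividing by
    the number of instructions executed gives EPI.
    """
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    energy_joules = avg_power_watts * elapsed_seconds
    return energy_joules / instructions_retired


# Hypothetical measurement: 65 W average over 10 s, 2e11 instructions.
epi = energy_per_instruction(65.0, 10.0, 2.0e11)
print(f"{epi * 1e9:.2f} nJ per instruction")
```

To compare CPUs fairly, the same binary and the same number of passes should be used on each, so that the instruction counts are comparable.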
 
So it might not be taking into account the reduction in energy efficiency from running a system at higher clock speeds as well

I haven't read the article, but you need to take the different voltages and frequencies into account when reasoning about power efficiency. So for architectural efficiency you should reason about capacitance per instruction, never energy per instruction. Energy per instruction is a pretty much useless measure of architectural efficiency and should only be used if the candidates you compare are at iso voltage and frequency.
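
To illustrate the point, here is a small Python sketch that backs out switched capacitance per instruction from EPI and supply voltage. It assumes the usual first-order dynamic-energy model, E ≈ C·V², and ignores leakage and short-circuit current. Both chips and all of their numbers are hypothetical.

```python
def switched_capacitance_per_instruction(epi_joules: float,
                                         supply_volts: float) -> float:
    """Effective capacitance switched per instruction, in farads,
    under the first-order model E = C * V^2 (dynamic energy only)."""
    return epi_joules / supply_volts ** 2


# Two made-up chips: B uses far less energy per instruction than A,
# but only because it runs at a much lower voltage.
cap_a = switched_capacitance_per_instruction(epi_joules=10e-9, supply_volts=3.3)
cap_b = switched_capacitance_per_instruction(epi_joules=2e-9, supply_volts=1.2)

print(f"chip A: {cap_a * 1e9:.2f} nF per instruction")
print(f"chip B: {cap_b * 1e9:.2f} nF per instruction")
```

In this made-up case chip B wins on EPI yet switches more capacitance per instruction, so its architecture is doing more work per instruction and the voltage drop is what saves the energy.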
 
Incidentally, I found this link today. The article is in Japanese but the slides are in English.

Look at the bottom of the second slide, "Zen 2 Architecture"
it says "~9% switching capacitance (CAC) improvement over previous generation, technology neutral"

So that can give you an idea of the generational (Zen -> Zen 2) switching capacitance improvement. It doesn't say per instruction, but it does say process neutral.

They go into more detail on the later slides. The "Generational Leadership Perf/Watt" slide breaks down the power improvements based on process and design factors. It looks to me like the majority of the perf/watt improvement in this case is due to process.
 
Incidentally, I found this link today. The article is in Japanese but the slides are in English.

Very interesting slides, thanks! It looks like AMD also considered an interposer, but that would have limited the CCD chiplet count to 4:
photo021_o.jpg


And where is that guy who claimed that chiplets will be replaced by monolithic designs because packaging is so expensive (as if extra mask costs in the millions for each 7nm die were nothing):
photo026_o.jpg


photo025_o.jpg


Overall I really suggest checking these slides out, very informative.
 
What struck me when I first saw the slides was the increase in cost of higher die count CPUs. It's not just that it is cheaper to use chiplets, but that it becomes cheaper to a greater extent as the core count increases.

Meaning that until Intel goes to chiplets, AMD can squeeze them at will in higher core count CPUs.
 
Indeed, margin increases tremendously with the higher core counts. So for AMD (even ignoring the competition) it actually makes sense to push a "ridiculous" number of cores: 1) it increases their profit, 2) it increases their competitive edge, and 3) it makes software optimizations for more cores more likely to happen soon, which again feeds back into their competitive edge. The only limit is the scalability of the hardware.
 
Look at the bottom of the second slide, "Zen 2 Architecture"
it says "~9% switching capacitance (CAC) improvement over previous generation, technology neutral"

So that can give you an idea of the generational (Zen -> Zen 2) switching capacitance improvement. It doesn't say per instruction, but it does say process neutral.
Thanks for the link. So far we know that Zen had a 15% improvement over Excavator in switching capacitance and Zen 2 added an additional 9%, which is very impressive indeed.
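
For what it's worth, the two generational figures can be combined like this. The sketch assumes each percentage is a reduction in switched capacitance relative to the previous generation and that the reductions compound multiplicatively; neither assumption is confirmed by the slides as quoted here, and the function name is made up.

```python
def combined_reduction(*reductions: float) -> float:
    """Total fractional reduction after applying several successive
    fractional reductions, each relative to the previous step."""
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    return 1.0 - remaining


# Excavator -> Zen (15%) followed by Zen -> Zen 2 (9%).
total = combined_reduction(0.15, 0.09)
print(f"combined reduction vs. Excavator: {total:.2%}")
```

Under those assumptions the total is a little under the 24% you would get by simply adding the two numbers.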
 
There is even a site that does a bit of benchmarking like this
efficiency-multithread.png
 