
Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

BTW, if it is true and the L3 die is below, then why not make the SRAM amount > 64 MB? There would be room for more on the die.

Increase in hit time may not be worth the added capacity for most apps. There are already a lot that don't gain anything from the v-cache and increasing the latency hurts the performance for all of those.
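The trade-off above can be put in numbers with the standard average-memory-access-time (AMAT) formula; the latencies and miss rates below are hypothetical, chosen only to illustrate why apps whose working set already fits in the smaller cache lose from a slower, larger one:

```python
# AMAT = hit_time + miss_rate * miss_penalty (all illustrative numbers)
def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    return hit_time_ns + miss_rate * miss_penalty_ns

# 32 MB L3 at ~10 ns vs a bigger stacked L3 at ~12 ns; ~70 ns DRAM penalty.
base     = amat(10.0, 0.20, 70.0)  # cache-hungry app, 20% L3 miss rate -> 24.0 ns
big      = amat(12.0, 0.17, 70.0)  # extra capacity trims misses        -> 23.9 ns
fits     = amat(10.0, 0.02, 70.0)  # working set fits in 32 MB          -> 11.4 ns
fits_big = amat(12.0, 0.02, 70.0)  # same app just pays the hit time    -> 13.4 ns
```

With these made-up numbers, the cache-hungry app roughly breaks even while the app that already fits gets strictly slower, which matches the point about v-cache hurting latency-sensitive workloads.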

So 5.2 GHz max boost is pretty much confirmed. If the thermal constraint was lifted, why is boost still 0.3 GHz down from the non-V-Cache model?

It may still be voltage constrained, limiting the clock speed.

Another possibility is binning/market segmentation. If you want the faster boost you'll have to shell out for a 9900X3D or a 9950X3D.
 
9800X3D Blender Open Data entry. 11% faster than 9700X. OC maybe? Or maybe it's due to its 120W TDP vs the original 65W TDP of the 9700X. It's massively faster than the 7800X3D.


 
Roughly 5.4 ns for M1 [= 12 MB shared L2$] vs 2.4 ns for 7950X [= 1 MB private L2$].
Or 3.8 ns for Telum [= 32 MB private L2$¹]
But that's when neither die area nor power consumption are of immediate concern.²

________
¹) of which parts can be dynamically repurposed into shared virtual L3$ (12 ns on average) or shared virtual L4$ even (which is off-chip cache).
²) almost a square inch of 7nm Samsung silicon for an 8-core chiplet, with 200 W power budget — but this is a real and honest way to obtain a ticket to La La Land. ;-)
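Those latencies can also be expressed in core cycles; the clock speeds below are assumed for illustration (roughly 3.2 GHz for M1 P-cores, 5.7 GHz boost for the 7950X, ~5.2 GHz for Telum):

```python
# Convert a cache latency in ns to core clock cycles at a given frequency.
def ns_to_cycles(latency_ns, freq_ghz):
    return latency_ns * freq_ghz

m1    = ns_to_cycles(5.4, 3.2)  # ~17 cycles for M1's shared 12 MB L2
zen4  = ns_to_cycles(2.4, 5.7)  # ~14 cycles for 7950X's private 1 MB L2
telum = ns_to_cycles(3.8, 5.2)  # ~20 cycles for Telum's 32 MB private L2
```

In cycle terms the gap narrows a lot: the big shared/virtual designs pay only a handful of extra cycles despite an order of magnitude more capacity.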
 
It's possible that they may decouple it into an L4 cache to allow the first 32MB of L3 to have a lower latency, at the expense of a few extra cycles of RAM latency.
In client (where you can probably only afford design with modification for both mobile and desktop) I'd much rather have an SLC instead of L4.

Each layer of cache adds extra complications, more tags to keep track of, etc.

That's one of the reasons Apple and Qualcomm forgo L3. Unless you can afford to make the L3 big enough (say 24 GB - 32 GB+) you might be better off with bigger shared L2 caches and an SLC, which also benefits the GPU, NPU ...
 
That's a giant L3 you're not gonna see in a long time...
 
Yeah, that's why you're not gonna see L3 on qualcomm / apple SoCs.

At least until they are 90% mobile focused. Apple's rumored server SKUs might actually have L3 and it might trickle down to higher end desktop / M Max SKUs
 
Well, yeah, that and how big gigabyte-sized L3s would be. It's absurd.
 
If Zen continues to get wider and slower, then I can see them using the reduced clock-speed targets to double the size of the L2 while keeping the same number of cycles of latency. That should help with throughput a bit.
From my layman PoV, investing in a larger L2 does not yield much in terms of general-purpose IPC. Zen 4 doubled Zen 3's relatively small 512 kB L2, yet it trailed the other "major IPC contributors" at under 2% of IPC gain. Intel went the same odd L2-growing route since Willow Cove... 512 kB -> 1.25 MB -> 2 MB -> 2.5/3 MB.
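The "same cycles at a lower clock" idea comes down to the wall-clock budget per cycle: dropping the frequency target stretches the time available per cycle, which could absorb a larger (slower in absolute terms) L2 array at unchanged cycle latency. The frequencies and the 14-cycle figure below are illustrative assumptions:

```python
# Wall-clock time available for an N-cycle cache access at a given clock.
def cycle_budget_ns(cycles, freq_ghz):
    return cycles / freq_ghz

high_clock = cycle_budget_ns(14, 5.7)  # ~2.46 ns of array/wire time
low_clock  = cycle_budget_ns(14, 5.0)  # 2.80 ns, ~14% more slack
```

That extra ~0.3 ns of slack per access is the headroom a wider-but-slower design could spend on a bigger L2 without adding load-to-use cycles.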
 
More optimized V/F curve.
X3D parts have always had a better one, but it has been capped by temperature and voltage limits in the past. (Look at the 7950X3D vs the 7950X at lower PPT limits (sub-160 W) and compare the efficiency.)

Now that Z5X3D is unhindered by temperature and nearly all voltage limits are lifted, the new V/F curve finally gets its time to shine... You will see the Z5X3D models beat vanilla Zen 5 in pretty much all MT workloads at stock PPT limits.
The only remaining place where regular Zen 5 wins is in light ST workloads, since the v-cache can't handle much more than 5.7 GHz (when the silicon is pushed to the limit).
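A "better V/F curve" translates to efficiency through the usual first-order model of dynamic CMOS power, P ∝ f · V²: a part that hits the same frequency at lower voltage draws noticeably less power. The voltages below are hypothetical, just to show the scale of the effect:

```python
# First-order dynamic power model: relative power ~ f * V^2
# (ignores static/leakage power; voltages are made-up examples).
def rel_power(freq_ghz, volts):
    return freq_ghz * volts ** 2

vanilla = rel_power(5.0, 1.25)  # non-X3D part at 5.0 GHz
x3d     = rel_power(5.0, 1.10)  # same clock reached at lower voltage
savings = 1 - x3d / vanilla     # ~23% less dynamic power at the same clock
```

Even a ~0.15 V advantage at iso-frequency compounds into a large MT-efficiency lead once both parts run against the same PPT limit.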
 
If it's priced higher than $450, it's getting negative reviews for sure.
The 7800X3D was selling in large numbers at $350 just a year or so after its launch. I think AMD can just price it at $450 and give smaller discounts later; it's better long-term.
But it's AMD, they like to miss.
 