
Discussion: Nvidia Blackwell in Q1-2025

I know we've been over this before, but in light of this release, and Blackwell being rather derivative, and arguably Ada and Ampere as well... What does NVidia have coming up over the next, say, two generations? Do we just iterate? Are both node and architecture stagnating? Will that make it easier for others to catch up?

By Architecture, do you mean big free gains in Raster performance per Shader core? I wouldn't expect that at all.

It's clear the push from AMD and NVidia will probably be more on AI and RT improvements.

Raster performance will likely come slowly from node improvements increasing the transistor budget.
 
Now we’re talking! 🤣


Nvidia is the new Intel.
Yes, but they have an even bigger problem than Intel because the nodes are now bumping up against the laws of physics. Intel sat on the same lithography node for like a decade. nVidia or anyone else interested in shrinking down below 4nm is kinda in a no-win situation. Or at least there are no more easy performance gains to be had without extra cost being absorbed or passed on to the consumer.
 
I know we've been over this before, but in light of this release, and Blackwell being rather derivative, and arguably Ada and Ampere as well... What does NVidia have coming up over the next, say, two generations? Do we just iterate? Are both node and architecture stagnating? Will that make it easier for others to catch up?
Seeing as how Blackwell already maxes out the die size and TDP on N4P, I suspect Rubin flagship will likely do the same, so whatever density improvements N3E gives over N4P, that'll be the upper limit on the raw performance improvement. They'll do their usual doubling of RT intersection performance and possibly 2x tensor units (not another reduction in precision), but that'll be it besides a whole dollop of software features.
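If the flagship really is die-size- and TDP-limited, the density-scaling ceiling described above can be sketched as a quick back-of-envelope calculation. Every number here is an illustrative assumption, not a vendor figure:

```python
# Back-of-envelope: if the flagship is already at the die-size and TDP
# limit, raw throughput roughly tracks the transistor budget, i.e. density.
# All numbers are illustrative assumptions, not TSMC/Nvidia figures.

N3E_LOGIC_DENSITY_GAIN = 1.3   # assumed logic-density gain of N3E over N4P
SRAM_DENSITY_GAIN = 1.05       # SRAM barely shrinks on N3-class nodes
LOGIC_FRACTION = 0.7           # assumed share of the die that is logic

def density_limited_uplift(logic_gain, sram_gain, logic_frac):
    """Weighted density gain for a fixed-size die, mixing logic and SRAM."""
    return logic_frac * logic_gain + (1 - logic_frac) * sram_gain

uplift = density_limited_uplift(N3E_LOGIC_DENSITY_GAIN, SRAM_DENSITY_GAIN,
                                LOGIC_FRACTION)
print(f"max raw uplift at fixed die size: {uplift:.2f}x")
```

With these assumed numbers the ceiling lands in the low-1.2x range, which is why a maxed-out die on the new node wouldn't buy a dramatic raw-performance jump.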
 
The evidence doesn't really seem to bear that out though.

In lower-end 50-series parts, die size is about the same as 40-series for about the same shader count as last generation, and the 5090 expanded proportionally to the SM count increase, so NVidia does not seem to have any die-size bloat from the technology update.

Plus, this generation AMD is supposed to be adding real AI tensor cores and beefing up its RT capability. So AMD may actually have bloat while NVidia is showing none.

So again, there's no evidence that AMD has any advantage to exploit, so it would need a die just as monstrous as the 5090's to compete... which I still argue is a complete non-starter.

- OTOH the 4090 is 60% larger than the 4080 for ~30% additional gaming performance. The 5090 is scaling proportionately from the 4090, but the 4090 doesn't scale proportionately from the 4080.

I know bigger dies are generally more inefficient because physics and all that, but I wonder if the bloat is already baked into the big die arch...
 
Zen is way different. Zen does not use a huge, complex monolith for their consumer line and is cheap to build. AMD has just been delivering at a consistent cadence for their CPU division while keeping power consumption relatively low, which is what helped it gain consumer confidence. It was not the Threadrippers that did it.

Going Halo for Nvidia makes more sense since they have a professional visualization market which provides guaranteed income at high margins.

AMD doesn't have much for this market. As we can see with the possible performance flop of the RTX 5090, going big carries more risk, as more things can go wrong. The GTX 480 and Fury X are examples of this. The RTX 5090 is clocked awfully low for the amount of power it uses. This might be a Fermi-like chip which misses targets in the gaming segments but recovers sales in the professional markets.

Also, chip design is more expensive than that. Both companies' R&D expenditures seem to line up with this chart.

[attached chart: Fig2.png]


Here is a newer chart.

[attached chart: 3aFC7AoYe4XUfReErwVNm7-1200-80.jpeg]


I was looking into the software part and it's ridiculously expensive. Someone on reddit says a license for a single piece of software can be a million dollars per year, and they do indeed use the software XPEA mentioned (Cadence and Synopsys), with hundreds of these licenses, each good for only one year. A single big chip can use hundreds of licenses in its design.

So would AMD rather use these resources on something like Instinct or a halo discrete graphics card? I think we all know the answer. AMD is just following the money.

Spending hundreds of millions on a single halo just doesn't make sense when that money can be spent on datacenter where sales are in billions, not hundreds of millions.

The licensing structure for these tools is decently complex and yes, very expensive. You need a suite of tools that range in price and are licensed by the core count you will use to run the tool and (on the digital side) the transistor count of your design. It also only gets more expensive as you move to more advanced nodes, as the same tools require new, more expensive licenses to support smaller nodes. The license costs for these latest nodes are ridiculously expensive, and the compute power needed to simulate/verify them has gone up dramatically as well.
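To make that scaling concrete, here is a toy cost model. Every number and tool name below is hypothetical; real Cadence/Synopsys pricing is negotiated and confidential:

```python
# Toy model of annual EDA license spend. Costs scale with the licensed
# core count and an advanced-node surcharge, mirroring the structure
# described above. All figures are made up for illustration.

def annual_license_cost(tools, cores_licensed, node_multiplier):
    """tools: list of (tool_name, base_cost_per_core_usd).
    Total spend scales with cores licensed and a node surcharge."""
    return sum(base * cores_licensed * node_multiplier
               for _name, base in tools)

tools = [
    ("synthesis",       50_000),   # hypothetical per-core base prices
    ("place_and_route", 80_000),
    ("signoff",         60_000),
]

cost = annual_license_cost(tools, cores_licensed=100, node_multiplier=2.0)
print(f"annual spend (toy numbers): ${cost:,.0f}")
```

Even with invented prices, a few tools licensed across a large compute farm at an advanced-node surcharge lands in the tens of millions per year, which matches the "hundreds of licenses, renewed annually" picture.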
 
What are the odds that Nvidia will go with an MCM solution for gaming Rubin? Just seems fishy that GB203 is half of GB202, as if they already have a split L2 solution working for GB202 and just need a silicon bridge to make MCM work next gen.
 
What are the odds that Nvidia will go with an MCM solution for gaming Rubin? Just seems fishy that GB203 is half of GB202, as if they already have a split L2 solution working for GB202 and just need a silicon bridge to make MCM work next gen.
unlikely
 
I’m more interested in those N1X Blackwell SoCs. If those work well then maybe the generation after Rubin will be MCM
 
The FE cooler is very impressive if it can dissipate that much heat in just 2 slots. But all that heat goes right into the case, unlike an AIO.

Yes, excellent job from the Nvidia engineers; I hope to see this rub off on AIBs too. One still needs to carefully plan what to do with the heat building up in the case, but ultimately that is a problem for all builds - one either fixes the heat flow or they get meh results.
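A rough steady-state estimate shows how much case airflow a flow-through cooler demands. The card wattage and allowed air-temperature rise below are assumptions for illustration:

```python
# Steady-state heat removal by airflow: P = rho * cp * Q * dT,
# solved for the volumetric flow Q. Simple model, ignores recirculation.

RHO_AIR = 1.2      # kg/m^3, air density at roomish conditions
CP_AIR = 1005.0    # J/(kg*K), specific heat of air

def required_airflow_cfm(power_w, delta_t_k):
    """Airflow (CFM) needed to remove power_w with a delta_t_k air rise."""
    q_m3s = power_w / (RHO_AIR * CP_AIR * delta_t_k)
    return q_m3s * 2118.88  # m^3/s -> cubic feet per minute

# e.g. an assumed ~575 W card, keeping the case-air rise to 10 K:
print(f"{required_airflow_cfm(575, 10):.0f} CFM")
```

On the order of 100 CFM of through-case airflow for those assumptions - achievable, but exactly the kind of "plan the heat flow or get meh results" budgeting the post is talking about.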
 
I get the feeling that dual issue isn't properly "fixed" - all cores being FP32/INT now does not make them equal. Perhaps they did not have the transistor budget to fix other bottlenecks and left it for N3 work.
 
I get the feeling that dual issue isn't properly "fixed" - all cores being FP32/INT now does not make them equal. Perhaps they did not have the transistor budget to fix other bottlenecks and left it for N3 work.
Budget was reallocated to the RT core. Might even call it a level 4 RT core with the features.

Interesting how they basically went back to Maxwell. Why is that? The overall INT throughput is doubled, but the capability to concurrently issue INT that Turing brought is seemingly lost. I guess just wait and see.

Seems to me, watching the DF deep dive, that older games will just get a boring architectural uplift, while newer games that use SER, neural rendering, new RT acceleration structures, RT mega geometry, etc. will run faster on Blackwell as the software renderer gets accelerated by Blackwell hardware. Actually similar to the 'promise of Turing'. I guess we'll see.

I don't know what they saw in their simulators, but from the looks of things, they needed GB103 with 96SM.
 
Just got this & it's a bit disturbing! Except for the top die, just look at the die size, transistor density & most importantly the transistor count. Kinda disillusioned & feeling a bit nauseated. :flushed:

View attachment 115021

This, ladies & gentlemen, is clear evidence of cheating. I think I'll be skipping the 50 series.
If this is what the future is going to look like from now on - not speaking strictly about Nvidia and GPUs, but broadly about industries/society - it pretty much means technological progress will be available only to "rich" people going forward.
 
If this is what the future is going to look like from now on - not speaking strictly about Nvidia and GPUs, but broadly about industries/society - it pretty much means technological progress will be available only to "rich" people going forward.
The GPU market and the gaming industry should become more commodified for them to remain hot, interesting, and price-competitive.
The fascinating bit for me is Radeon vs Adreno! What a bizarre turn of events that those estranged cousins are duking it out in the same market.
 