
Discussion: Nvidia Blackwell in Q1-2025

I know we've been over this before, but in light of this release, and Blackwell being rather derivative, and arguably Ada and Ampere as well... What does NVidia have coming up over the next, say, two generations? Do we just iterate? Are both node and architecture stagnating? Will that make it easier for others to catch up?

By Architecture, do you mean big free gains in Raster performance per Shader core? I wouldn't expect that at all.

It's clear the push from AMD and NVidia will probably be more on AI and RT improvements.

Raster performance will likely come slowly from node improvements increasing the transistor budget.
 
Now we’re talking! 🤣


Nvidia is the new Intel.
Yes, but they have an even bigger problem than Intel because the nodes are now bumping up against the laws of physics. Intel sat on the same lithography node for like a decade. nVidia or anyone else interested in shrinking down below 4nm is kinda in a no-win situation. Or at least there are no more easy performance gains to be had without extra cost being absorbed or passed on to the consumer.
 
I know we've been over this before, but in light of this release, and Blackwell being rather derivative, and arguably Ada and Ampere as well... What does NVidia have coming up over the next, say, two generations? Do we just iterate? Are both node and architecture stagnating? Will that make it easier for others to catch up?
Seeing as how Blackwell already maxes out the die size and TDP on N4P, I suspect Rubin flagship will likely do the same, so whatever density improvements N3E gives over N4P, that'll be the upper limit on the raw performance improvement. They'll do their usual doubling of RT intersection performance and possibly 2x tensor units (not another reduction in precision), but that'll be it besides a whole dollop of software features.
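If the flagship really is die-size- and TDP-limited, the density-scaling ceiling described above can be sketched as a quick back-of-envelope calculation. Every number here is an illustrative assumption, not a vendor figure:

```python
# Back-of-envelope: if the flagship is already at the die-size and TDP
# limit, raw throughput roughly tracks the transistor budget, i.e. density.
# All numbers are illustrative assumptions, not TSMC/Nvidia figures.

N3E_LOGIC_DENSITY_GAIN = 1.3   # assumed logic-density gain of N3E over N4P
SRAM_DENSITY_GAIN = 1.05       # SRAM barely shrinks on N3-class nodes
LOGIC_FRACTION = 0.7           # assumed share of the die that is logic

def density_limited_uplift(logic_gain, sram_gain, logic_frac):
    """Weighted density gain for a fixed-size die, mixing logic and SRAM."""
    return logic_frac * logic_gain + (1 - logic_frac) * sram_gain

uplift = density_limited_uplift(N3E_LOGIC_DENSITY_GAIN, SRAM_DENSITY_GAIN,
                                LOGIC_FRACTION)
print(f"max raw uplift at fixed die size: {uplift:.2f}x")
```

With these assumed numbers the ceiling lands in the low-1.2x range, which is why a maxed-out die on the new node wouldn't buy a dramatic raw-performance jump.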
 
The evidence doesn't really seem to bear that out though.

In lower-end 50-series parts, die size is about the same as 40-series for about the same shader count as last generation, and the 5090 expanded proportionally to the SM count increase, so NVidia does not seem to have any die-size bloat from the technology update.

Plus, this generation AMD is supposed to be adding real AI tensor cores and beefing up its RT capability. So AMD may actually have bloat while NVidia is showing none.

So again, there's no evidence that AMD has any advantage to exploit, so it would need a die just as monstrous as the 5090's to compete... which I still argue is a complete non-starter.

- OTOH the 4090 is 60% larger than the 4080 for ~30% additional gaming performance. The 5090 is scaling proportionately from the 4090, but the 4090 doesn't scale proportionately from the 4080.

I know bigger dies are generally more inefficient because physics and all that, but I wonder if the bloat is already baked into the big die arch...
 
Zen is way different. Zen does not use a huge, complex monolith for their consumer line and is cheap to build. AMD has just been delivering at a consistent cadence for their CPU division while keeping power consumption relatively low, which is what helped it gain consumer confidence. It was not the Threadrippers that did it.

Going Halo for Nvidia makes more sense since they have a professional visualization market which provides guaranteed income at high margins.

AMD doesn't have much for this market. As we can see with the possible performance flop of the RTX 5090, going big carries more risk, as more things can go wrong. The GTX 480 and Fury X are examples of this. The RTX 5090 is clocked awfully low for the amount of power it uses. This might be a Fermi-like chip which misses targets in the gaming segments but recovers sales in the professional markets.

Also, chip design is more expensive than that. Both companies' R&D expenditures seem to line up with this chart.

[attached chart: Fig2.png]


Here is a newer chart.

[attached chart: 3aFC7AoYe4XUfReErwVNm7-1200-80.jpeg]


I was looking into the software part and it's ridiculously expensive. Someone on reddit says a license for a single piece of software can be a million dollars per year, and they do indeed use the software XPEA mentioned (Cadence and Synopsys), with hundreds of these licenses, each good for only one year. A single big chip can use hundreds of licenses in its design.

So would AMD rather use these resources on something like Instinct or a halo discrete graphics card? I think we all know the answer. AMD is just following the money.

Spending hundreds of millions on a single halo just doesn't make sense when that money can be spent on datacenter where sales are in billions, not hundreds of millions.

The licensing structure for these tools is decently complex and yes, very expensive. You need a suite of tools that range in price and are licensed by the core count you will use to run the tool and (on the digital side) the transistor count of your design. It also only gets more expensive as you move to more advanced nodes, as the same tools require new, more expensive licenses to support smaller nodes. The license costs for these latest nodes are ridiculously expensive, and the compute power needed to simulate/verify them has gone up dramatically as well.
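To make that scaling concrete, here is a toy cost model. Every number and tool name below is hypothetical; real Cadence/Synopsys pricing is negotiated and confidential:

```python
# Toy model of annual EDA license spend. Costs scale with the licensed
# core count and an advanced-node surcharge, mirroring the structure
# described above. All figures are made up for illustration.

def annual_license_cost(tools, cores_licensed, node_multiplier):
    """tools: list of (tool_name, base_cost_per_core_usd).
    Total spend scales with cores licensed and a node surcharge."""
    return sum(base * cores_licensed * node_multiplier
               for _name, base in tools)

tools = [
    ("synthesis",       50_000),   # hypothetical per-core base prices
    ("place_and_route", 80_000),
    ("signoff",         60_000),
]

cost = annual_license_cost(tools, cores_licensed=100, node_multiplier=2.0)
print(f"annual spend (toy numbers): ${cost:,.0f}")
```

Even with invented prices, a few tools licensed across a large compute farm at an advanced-node surcharge lands in the tens of millions per year, which matches the "hundreds of licenses, renewed annually" picture.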
 
What are the odds that Nvidia will go with an MCM solution for gaming Rubin? Just seems fishy that GB203 is half of GB202, as if they already have a split L2 solution working for GB202 and just need a silicon bridge to make MCM work next gen.
 
What are the odds that Nvidia will go with an MCM solution for gaming Rubin? Just seems fishy that GB203 is half of GB202, as if they already have a split L2 solution working for GB202 and just need a silicon bridge to make MCM work next gen.
unlikely
 
I’m more interested in those N1X Blackwell SoCs. If those work well then maybe the generation after Rubin will be MCM
 
The FE cooler is very impressive if it can dissipate that much heat in just 2 slots. But all that heat goes right into the case, unlike an AIO.

Yes, excellent job from the Nvidia engineers; I hope to see this rub off on AIBs too. One still needs to carefully plan what to do with the heat building up in the case, but ultimately that is a problem for all builds - one either fixes the heat flow or they get meh results.
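A rough steady-state estimate shows how much case airflow a flow-through cooler demands. The card wattage and allowed air-temperature rise below are assumptions for illustration:

```python
# Steady-state heat removal by airflow: P = rho * cp * Q * dT,
# solved for the volumetric flow Q. Simple model, ignores recirculation.

RHO_AIR = 1.2      # kg/m^3, air density at roomish conditions
CP_AIR = 1005.0    # J/(kg*K), specific heat of air

def required_airflow_cfm(power_w, delta_t_k):
    """Airflow (CFM) needed to remove power_w with a delta_t_k air rise."""
    q_m3s = power_w / (RHO_AIR * CP_AIR * delta_t_k)
    return q_m3s * 2118.88  # m^3/s -> cubic feet per minute

# e.g. an assumed ~575 W card, keeping the case-air rise to 10 K:
print(f"{required_airflow_cfm(575, 10):.0f} CFM")
```

On the order of 100 CFM of through-case airflow for those assumptions - achievable, but exactly the kind of "plan the heat flow or get meh results" budgeting the post is talking about.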
 
I get the feeling that dual issue isn't properly "fixed" - all cores being FP32/INT now does not make them equal. Perhaps they did not have the transistor budget to fix other bottlenecks and left it for N3 work.
 
I get the feeling that dual issue isn't properly "fixed" - all cores being FP32/INT now does not make them equal. Perhaps they did not have the transistor budget to fix other bottlenecks and left it for N3 work.
Budget was reallocated to the RT core. Might even call it a level 4 RT core with the features.

Interesting how they basically went back to Maxwell. Why is that? The overall INT throughput is doubled, but the capability to concurrently issue INT that Turing brought is seemingly lost. I guess just wait and see.

Seems to me, watching the DF deep dive, that older games will just get a boring architectural uplift, while newer games that use SER, neural rendering, new RT acceleration structures, RT mega geometry, etc. will run faster on Blackwell as the software renderer gets accelerated by Blackwell hardware. Actually similar to the 'promise of Turing'. I guess we'll see.

I don't know what they saw in their simulators, but from the looks of things, they needed GB103 with 96SM.
 
Just got this & it's a bit disturbing! Except for the top die, just look at the die size, transistor density & most importantly the transistor count. Kinda disillusioned & feeling a bit nauseated. :flushed:

View attachment 115021

This, ladies & gentlemen, is clear evidence of cheating. I think I'll be skipping the 50 series.
If this is what the future is going to look like from now on - not speaking strictly about Nvidia and GPUs, but broadly about industries/society - it pretty much means technological progress will be available only to "rich" people going forward.
 
If this is what the future is going to look like from now on - not speaking strictly about Nvidia and GPUs, but broadly about industries/society - it pretty much means technological progress will be available only to "rich" people going forward.
The GPU market and the gaming industry should become more commodified for them to remain hot, interesting, and price-competitive.
The fascinating bit for me is Radeon vs Adreno! What a bizarre turn of events that those estranged cousins are duking it out in the same market.
 