That's indeed a good blog, and I noticed the stuff about occupancy. Isn't having no barriers supposed to be a major win, especially for divergent workloads? As soon as the producer is finished, there's no more waiting around for the last SIMD32 thread to complete?
I've prob spent too much time looking at this stuff xD This article introduces it:
Our primer on GPU Work Graphs introduces this exciting new paradigm for graphics developers, which enables a live shader kernel to dispatch new workloads on-demand without needing to circle back around to the CPU first.
gpuopen.com
So the big wins are:
#1 API flexibility
#2 Cache/memory efficiency boost
#3 No more fixed pipeline limitations where one weak link can stall the entire execution
#4 Very much the same as #3, but due to additional factors
The SIGGRAPH 2025 PDF notes are a gold mine:
Page 253 illustrates why running multiple material shaders at once within a SIMD unit is a terrible idea. But by having a material node for each unique material, this can be avoided altogether, ensuring only coherent material shaders are ever run.
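To make that concrete, here's a toy model of my own (the wave size, material count, and cost model are all made up, not from the notes): a divergent SIMD wave pays roughly one serial pass per distinct material shader it holds, while rays binned per material node pay close to one pass per wave.

```python
import random

WAVE = 32          # SIMD32 lanes per wave
N_MATERIALS = 8    # hypothetical number of unique material shaders
N_RAYS = 32 * 256  # 256 waves' worth of rays

random.seed(0)
rays = [random.randrange(N_MATERIALS) for _ in range(N_RAYS)]

def wave_passes(ray_materials):
    """Toy cost: each wave serially executes one pass per distinct material."""
    total = 0
    for i in range(0, len(ray_materials), WAVE):
        total += len(set(ray_materials[i:i + WAVE]))
    return total

divergent = wave_passes(rays)         # rays in arrival order
coherent = wave_passes(sorted(rays))  # rays binned per material "node"
print(divergent, coherent)  # coherent stays near the 256-wave minimum
```

With random arrival order, nearly every wave touches almost all 8 materials, so the divergent cost approaches 8x the coherent one; sorting by material first collapses it to roughly one pass per wave.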
From a hardware angle, to properly benefit from this we would need to defer the any-hit shader evaluations, since a global (SE/GPC-level) payload sort is required to ensure there are plenty of any-hit requests to choose from.
This is to ensure only one material node at a time is being executed for the material shaders. That might sound like a serialization bottleneck, but it's misleading: producer results can accumulate via the global payload sorter until they can be sent to the consumers, with no expensive writes to global memory. This should allow the GPU to achieve very high occupancy.
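Here's a sketch of what I mean by accumulation (the `PayloadSorter` class, the `push`/`flush` API, and the wave-granularity launch policy are all hypothetical, just to make the idea concrete): producers keep pushing hit payloads into per-material bins, and a consumer wave is launched only once a full coherent batch exists.

```python
from collections import defaultdict, deque

WAVE = 32  # dispatch granularity: one full SIMD32 wave per material

class PayloadSorter:
    """Toy model of a global payload sorter: producers push hit payloads,
    and a consumer is launched only once a full, single-material wave
    has accumulated (so consumers always run coherently)."""
    def __init__(self):
        self.bins = defaultdict(deque)   # material id -> pending payloads
        self.launched = []               # (material id, batch) consumer launches

    def push(self, material, payload):
        q = self.bins[material]
        q.append(payload)
        if len(q) == WAVE:               # enough coherent work: launch a wave
            self.launched.append((material, [q.popleft() for _ in range(WAVE)]))

    def flush(self):
        """End of frame: launch whatever partial waves remain."""
        for material, q in self.bins.items():
            if q:
                self.launched.append((material, list(q)))
                q.clear()

sorter = PayloadSorter()
for i in range(100):
    sorter.push(i % 3, f"hit-{i}")       # producers emit hits for 3 materials
sorter.flush()
print(len(sorter.launched))  # 3 full waves + 3 end-of-frame partials = 6
```

The point of the toy: no consumer ever sees a mixed-material batch, and nothing in this path touches global memory in the model, which is the occupancy argument above.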
Because only one (or a few) material nodes at a time are executed on the compute units, we can achieve very high occupancy. If there's more than one node, a simple partial sort should be enough to ensure coherent shader execution. Thus extremely high coherence could be expected, likely eclipsing what SER in DXR 1.2 currently affords. So no more thread divergence, and no more low occupancy either. Since path tracing time is largely shading, not traversal, the potential improvement here is significant.
By moving the entire RT pipeline into a work graph, not just the compute shader passes, additional efficiencies can be exploited, yielding a pipeline that's even more coherent and has even higher occupancy, thus eliminating most if not all bubbles, barriers, and empty launches.
Regular gaming workloads can benefit too, but this will also unlock new possibilities for GPU-driven procedural content generation, complex on-GPU systems (AI and physics), neural shading, and, as already mentioned, ray and path tracing.
For a HW architecture that is hard coded (you know which one) to match this capability across the entire stack, which includes building a robust cache/memory foundation, I suspect we could see massive benefits, in the best case mirroring or even exceeding the occupancy/coherence of the pixel shader pass in Chips and Cheese's coverage. Going from 30-45% occupancy/coherence to ~90% is a 2-3x improvement, which is a big deal. I know the math in Chips and Cheese's SER article includes the traversal step as well, but even if these benefits only applied to shading it would still be a complete game changer.
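Since I'm asking for a sanity check anyway, here's my own back-of-envelope on those numbers: the 2-3x is just proportions, and Amdahl's law covers the "only shading benefits" caveat (the 80% shading share below is my assumption, not a figure from the article):

```python
def overall_speedup(shading_fraction, shading_speedup):
    """Amdahl's law: only the shading portion of the frame gets faster."""
    return 1.0 / ((1.0 - shading_fraction) + shading_fraction / shading_speedup)

# Going from 30-45% occupancy/coherence to ~90% is a 2-3x factor on shading:
low, high, target = 0.30, 0.45, 0.90
print(target / high, target / low)  # 2.0 3.0

# If shading is ~80% of the PT frame (my assumption) and gets 3x faster:
print(round(overall_speedup(0.80, 3.0), 2))  # 2.14
```

So even under the conservative reading where traversal sees no benefit at all, a 3x shading win would still land around 2x overall, which matches the "still a game changer" intuition.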
All this sounds too good to be true so can anyone please provide a sanity check or shoot it down if it's misleading?