I went back and looked at the Navi 4C designs. Why jump straight to something that complex and expensive? Assuming RDNA 5 uses chiplets, why not go for a simpler layout? This is what I’m thinking:
GCD - 40 CU (80-100 mm²)
MCD - 128-bit bus + 32 MB Infinity Cache (~70 mm²)
3D-stack the GCD on the MCD and you have a low-end product. No expensive/complex interposer or bridge packaging needed. Just two dies, like Zen X3D. Then combine two or three of these stacks for the mid-range and high-end products. Those would require some sort of interposer or active bridge. The product stack would look like this:
Peasants - 40 CU, 128-bit bus, 32 MB cache, 2 dies (150-170 mm²)
Plebeians - 80 CU, 256-bit bus, 64 MB cache, 4 dies (300-340 mm²)
Whales - 120 CU, 384-bit bus, 96 MB cache, 6 dies (450-510 mm²)
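
For anyone who wants to check the math, here is a quick Python sketch that builds the three tiers from the single GCD + MCD stack. The per-die numbers are just my rough guesses from above, the names (`Product`, `LINEUP`) are made up for illustration, and the area is a plain sum of GCD and MCD silicon that ignores any interposer or bridge.

```python
from dataclasses import dataclass

# Per-die figures from the post. The areas are rough guesses, not real data.
GCD_CUS = 40
GCD_AREA_MM2 = (80, 100)  # (low, high) estimate for one GCD
MCD_BUS_BITS = 128
MCD_CACHE_MB = 32
MCD_AREA_MM2 = 70         # estimate for one MCD


@dataclass(frozen=True)
class Product:
    """One tier, built from `stacks` copies of a GCD stacked on an MCD."""
    name: str
    stacks: int

    @property
    def cus(self) -> int:
        return GCD_CUS * self.stacks

    @property
    def bus_bits(self) -> int:
        return MCD_BUS_BITS * self.stacks

    @property
    def cache_mb(self) -> int:
        return MCD_CACHE_MB * self.stacks

    @property
    def dies(self) -> int:
        # Each stack is exactly two dies: one GCD on top of one MCD.
        return 2 * self.stacks

    @property
    def silicon_mm2(self) -> tuple[int, int]:
        # Summed GCD + MCD area only; packaging silicon is not counted.
        low, high = GCD_AREA_MM2
        return ((low + MCD_AREA_MM2) * self.stacks,
                (high + MCD_AREA_MM2) * self.stacks)


LINEUP = [Product("Peasants", 1), Product("Plebeians", 2), Product("Whales", 3)]

if __name__ == "__main__":
    for p in LINEUP:
        low, high = p.silicon_mm2
        print(f"{p.name}: {p.cus} CU, {p.bus_bits}-bit bus, "
              f"{p.cache_mb} MB cache, {p.dies} dies ({low}-{high} mm2)")
```

Change `GCD_CUS` or the area guesses to see how a 60 CU GCD or a different node estimate would shift the whole stack.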
That’s pretty conservative in terms of silicon and covers most of the product stack with just two die designs and less advanced packaging than what they were working on. Assuming N2X, the low-end version might approach a 9070 XT, while the high-end would be at least twice as fast. They could maybe fit 60 CU in the GCD, but it’s hard to say without knowing how much bigger RDNA 5 CUs might be or whether they use N3P or N2X.