Nvidia RTX 4090 Could Have 75% More Cores Than RTX 3090, With Much Higher Performance

A leaker by the name of @davideneco25320 on Twitter has shared some very specific details about Nvidia’s next-generation Ada (aka Lovelace) GPUs, including SM counts and the name of each new die. If the data is accurate (and given the recent Nvidia hack, it very well could be), Ada will be a massive upgrade over Ampere (the RTX 30-series), especially for the flagship GPU. As this is leaked data and cannot be completely trusted, take these details with a grain of salt.

The leak shows that Nvidia will not be changing its nomenclature for the Ada generation, keeping the same two-letter prefix and three-digit number system as the Ampere generation. AD102 denotes the flagship GPU, likely for an RTX 4090 or Titan-class card, with AD103 following as the next most powerful die (perhaps for a potential RTX 4080). AD104 and AD106 will be the midrange dies (e.g. RTX 4070 and RTX 4060), and AD107 will fill out the entry-level market for Nvidia’s Ada GPUs (e.g. something like an RTX 4050).


I made a little chart pic.twitter.com/zilwXgi0va

— La Frite David 🇫🇷 (@davideneco25320) March 1, 2022

Note also that the codenames suggest Nvidia will be using the Ada codename and not the previously rumored Lovelace codename, so that’s how we’ll refer to the future GPUs for now.

New Nvidia RTX 4090 Will Give Greater Performance Than RTX 3090

One thing that has changed significantly is the number of SMs in Ada. The flagship AD102 die will supposedly tip the scales with a whopping 144 SMs in a single die. By way of comparison, Ampere’s GA102 only has 84 SMs, so this is a 71% increase in SM count, which should likewise apply to GPU cores, RT cores, TMUs, and other elements. This will be one of the largest jumps we’ve ever seen in a single generation.

If Nvidia keeps the number of CUDA cores per SM the same on Ada (128, as on Ampere), this means we could be looking at 18,432 CUDA cores for the flagship card. Nvidia’s upcoming RTX 3090 Ti ‘only’ has 10,752 CUDA cores, using the full GA102 chip. Of course, we’ll also see lesser variants that use partially harvested AD102 chips, and while 144 SMs may be the maximum, we wouldn’t be surprised to see 10–20% of the SMs disabled for some graphics card models.
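The core-count arithmetic above is easy to check. The short Python sketch below assumes Ada keeps Ampere's 128 CUDA cores per SM, which the leak does not confirm. The function name and the harvested-chip examples are purely illustrative.

```python
# Assumption: Ada keeps Ampere's 128 CUDA (FP32) cores per SM.
# The leak only lists SM counts, so this figure is unconfirmed.
CUDA_CORES_PER_SM = 128


def cuda_cores(sm_count: int, cores_per_sm: int = CUDA_CORES_PER_SM) -> int:
    """Return the total CUDA core count for a given number of enabled SMs."""
    return sm_count * cores_per_sm


if __name__ == "__main__":
    print("Full AD102 (144 SMs):", cuda_cores(144))  # 18432
    print("Full GA102 (84 SMs): ", cuda_cores(84))   # 10752

    # Hypothetical harvested AD102 parts with about 10% and 20% of SMs disabled
    # (SM counts rounded down; real products may use different counts).
    for disabled_fraction in (0.10, 0.20):
        enabled_sms = int(144 * (1 - disabled_fraction))
        print(f"{disabled_fraction:.0%} disabled -> {enabled_sms} SMs, "
              f"{cuda_cores(enabled_sms)} CUDA cores")
```

Changing `cores_per_sm` shows how sensitive the headline number is to that one unconfirmed assumption.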

The number of SMs in the other chips isn’t nearly as high, though the numbers are still very respectable. AD103 will supposedly have 84 SMs, the same as the full GA102 and a 40% jump from GA103. AD104 will follow suit with 60 SMs, the same as GA103 and 25% more than GA104. AD106 is a bit closer to GA106, with 36 SMs, a 20% uplift. Finally, AD107 will supposedly feature just 24 SMs, again a respectable 20% jump in SM count compared to GA107.
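For readers who want to verify the generational uplifts, here is a minimal Python sketch. The Ada figures are the leaked ones. The GA102 and GA103 counts are quoted in the text, while the GA104, GA106, and GA107 counts are back-calculated from the stated percentages, so treat those as assumptions.

```python
# Leaked Ada SM counts paired with the Ampere die they are compared against.
# Each value is (ada_sms, ampere_sms). The GA104/GA106/GA107 counts are
# inferred from the 25%/20%/20% uplifts quoted in the article.
SM_COUNTS = {
    "AD102 vs GA102": (144, 84),
    "AD103 vs GA103": (84, 60),
    "AD104 vs GA104": (60, 48),
    "AD106 vs GA106": (36, 30),
    "AD107 vs GA107": (24, 20),
}


def uplift_percent(new_sms: int, old_sms: int) -> float:
    """Percentage increase in SM count from the old die to the new die."""
    return (new_sms - old_sms) / old_sms * 100


if __name__ == "__main__":
    for pair, (ada, ampere) in SM_COUNTS.items():
        print(f"{pair}: {ampere} -> {ada} SMs "
              f"(+{uplift_percent(ada, ampere):.0f}%)")
```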

If these leaks and rumors prove accurate, we can expect flagship cards like a future RTX 4090 and RTX 4080 to pack some incredible performance improvements over the current RTX 30-series. It’s certainly a larger jump than Ampere made over Turing, at least in some respects. The RTX 3080, for example, has the same 68 SMs as the RTX 2080 Ti, though there were plenty of other changes.

The above doesn’t account for any additional performance improvements coming from the Ada architecture itself, which could bring further benefits. It has been rumored for some time that Nvidia will move Ada from Samsung back to TSMC, using TSMC’s latest N5 (5nm) node. That alone should provide some significant improvements in efficiency and transistor count over Ampere, and may also unlock higher clock speeds.

Power consumption could also increase for Ada GPUs with the arrival of the new 16-pin power connectors that are being developed and produced right now for future PCIe 5.0 graphics cards. With a maximum power output of 600W from a single plug, the connector would give Nvidia a ton of headroom to boost performance on Ada GPUs.

Ada may also be the first PCIe 5.0 compliant graphics solution, and while the increase in PCIe bandwidth might not matter too much, it certainly won’t hurt performance. What we don’t know is how much Nvidia plans to change the fundamental building blocks in Ada. For example, Turing had 64 FP32 cores and 64 INT32 cores per SM, which were able to run concurrently on different data. Ampere altered things so that the INT32 cores became INT32 or FP32 cores, potentially doubling the FP32 performance.


Ampere also features 3rd generation Tensor cores and 2nd generation RT cores for ray tracing. Ada will likely use 4th generation Tensor cores and 3rd generation RT cores. What will that mean? We don’t have exact details, but Ada will almost certainly deliver far more performance than the current Ampere GPUs. There might be more CUDA, Tensor, and/or RT cores per SM, or the internal pipelines may simply be revamped to improve throughput.

Memory is another big factor in GPU performance, and it could play an even bigger role in improving frame rates considering how many SMs Ada may have. GDDR6+ and GDDR7 are already on Samsung’s roadmap, featuring substantial bandwidth improvements over GDDR6X, and Nvidia will likely use one or both of these new standards if they’re ready in time for Ada production. After all, the more cores you have, the more memory bandwidth you need to feed them all.

Generally speaking, Nvidia has improved performance on its fastest GPUs by around 30% with previous architectures, but with the change in process node and massively increased core counts, plus a potentially higher power limit, it’s not unrealistic to expect even bigger improvements from Ada.

Will the RTX 4090 (or whatever it ends up being called) end up delivering twice the performance of the RTX 3090? That’s ambitious but certainly not out of reach. 75% more cores with higher clock speeds and/or a more efficient architecture would do the trick. We’ll find out more later this year, as Ada is expected to launch in the September timeframe.
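As a rough sanity check on the 2x question, the Python sketch below assumes performance scales linearly with both core count and per-core throughput (clock speed times architectural efficiency). Real GPUs rarely scale that cleanly, and memory bandwidth and power limits also matter, so this is only a back-of-envelope figure. The function name is illustrative.

```python
def required_per_core_gain(target_speedup: float, core_ratio: float) -> float:
    """Per-core throughput multiplier needed to hit a target overall speedup.

    Assumes an idealized model: performance = cores x per-core throughput.
    """
    return target_speedup / core_ratio


if __name__ == "__main__":
    # 75% more cores is a core ratio of 1.75; the target is 2x the RTX 3090.
    gain = required_per_core_gain(target_speedup=2.0, core_ratio=1.75)
    print(f"Needed per-core gain: {gain - 1:.1%}")  # about 14%
```

Under that idealized model, a clock speed or efficiency gain of about 14% on top of the extra cores would be enough to double performance.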