Nvidia RTX 5090 Graphics Card Review — Get Neural Or Get Left Behind


The Nvidia RTX 5090 with its new two-slot cooler design

Anshel Sag

When Nvidia first announced the RTX 5000 series of graphics cards at CES 2025, it was clear that the company would be leaning even further into AI with these products. As anyone who follows enterprise technology knows, Nvidia is a major player — maybe the single most important player — in AI, so it’s no surprise that it has made significant headway in using AI to accelerate gaming.

While a lot of people will focus on the CUDA cores and Tensor Cores inside the GPU, there are also a lot of improvements that come from the new DLSS 4 software, which includes new transformer models and 4x frame generation. The RTX 50 series is the only product family from Nvidia capable of both features at that level, although the predecessor RTX 40 series does manage 2x frame generation (one generated frame for each rendered frame) and will take advantage of the transformer models.

The Nvidia GB202 GPU

At the heart of the RTX 5090 is the GB202 GPU, based on Nvidia’s new Blackwell architecture. The full GB202 GPU has 24,576 CUDA cores, 192 ray tracing cores, 768 Tensor Cores and 768 texture units. The version of the GB202 die used in the RTX 5090, built on TSMC’s 4N process node, has 21,760 CUDA cores, 680 fifth-gen Tensor Cores and 170 fourth-gen ray tracing cores enabled, plus a 512-bit memory interface. Down the road, as yields improve, it is possible we could see an RTX 5090 Ti with all cores enabled. For memory, the RTX 5090 has 32GB of GDDR7 using PAM3 pulse-amplitude modulation signaling for better frequency and voltage. This has resulted in 28 Gbps memory with 1,792 GB/s of memory bandwidth.
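As a quick sanity check on those memory figures, peak bandwidth is simply the per-pin data rate multiplied by the bus width, divided by eight to convert bits to bytes. The minimal sketch below uses only the numbers quoted in this review; the function name is my own and purely illustrative.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s from per-pin rate (Gbps) and bus width (bits)."""
    return data_rate_gbps * bus_width_bits / 8


# Figures quoted in this review for the RTX 5090: 28 Gbps GDDR7 on a 512-bit bus.
rtx_5090_bandwidth = memory_bandwidth_gb_s(28, 512)
print(f"{rtx_5090_bandwidth:,.0f} GB/s")  # 1,792 GB/s
```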

Compared to the 4090, the 5090 has next-generation CUDA, Tensor Cores and ray tracing cores. It also has almost 33% more CUDA cores than the RTX 4090, with similar upgrades for Tensor Cores, RT cores and RT performance. The 5090 also has a 575-watt GPU versus the 450-watt GPU in the 4090, all while using the same TSMC 4nm 4N process node and an upgraded PCIe Gen 5 interface. Nvidia has added FP4 support to the Tensor Cores, which it claims delivers twice the throughput of FP8 on the Ada Tensor Cores inside the 4090. The RT cores also saw a huge bump in ray-triangle intersection testing performance, along with a ton of other ray tracing and path tracing features.

A block diagram of how the AMP works in the Nvidia GPU

Nvidia

One new addition to the RTX 5090’s GPU is the AI management processor, or AMP, which is a fully programmable context scheduler. This is designed to reduce the overhead on the GPU for scheduling tasks to the different cores; it acts like a traffic cop for all the different workloads running on the GPU concurrently. The AMP is a dedicated RISC-V processor located at the front of the GPU pipeline, which results in much lower-latency decision making than CPU-driven methodologies. It is also compatible with Microsoft’s hardware-accelerated GPU scheduling introduced in Windows 10, so it shouldn’t create any new challenges for developers and should improve CPU utilization and latency when performing multiple tasks on the GPU simultaneously. All in all, this means a better experience when running graphics and AI workloads at the same time, which is increasingly common on Nvidia GPUs as features like DLSS become so important.

In addition to AI and graphics, the new GPU on the RTX 5090 brings enhanced 4:2:2 H.264 and HEVC video encoding capabilities. This improves encoding capability for 4K content that is shot in 4:2:2 color, which is becoming more standard for content creators. Nvidia’s ninth-gen NVENC encoder also adds a new AV1 ultra-high-quality mode, which improves quality even further than standard AV1 quality. Nvidia also added a third encoder to the 5090, which can reduce encoding times by as much as 50% compared to the (already incredibly fast) 4090. Nvidia even says that it’s as much as 4x faster than the RTX 3090; I never got to test that model, so we’ll have to rely on other reviewers for that.

The display pipeline for the GB202 GPU supports DisplayPort 2.1b, which allows up to 80 Gbps of bandwidth utilizing UHBR 20. This translates to running up to 16K at 60 hertz, 8K at 120 hertz and 4K at up to 240 hertz.
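To put those modes in perspective, the sketch below estimates the uncompressed pixel data rate for each one and compares it with the 80 Gbps link. It assumes 10-bit RGB (30 bits per pixel) and ignores blanking intervals and link-encoding overhead, none of which is specified in this review, so treat the results as rough lower bounds rather than exact requirements.

```python
LINK_GBPS = 80  # UHBR 20 raw link rate quoted above

# Assumption: 10 bits per color channel, RGB, so 30 bits per pixel.
BITS_PER_PIXEL = 30


def uncompressed_gbps(width: int, height: int, refresh_hz: int,
                      bits_per_pixel: int = BITS_PER_PIXEL) -> float:
    """Active-pixel data rate in Gbps, ignoring blanking and protocol overhead."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


modes = {
    "4K @ 240 Hz": (3840, 2160, 240),
    "8K @ 120 Hz": (7680, 4320, 120),
    "16K @ 60 Hz": (15360, 8640, 60),
}
for name, (w, h, hz) in modes.items():
    rate = uncompressed_gbps(w, h, hz)
    verdict = "fits uncompressed" if rate <= LINK_GBPS else "needs compression or subsampling"
    print(f"{name}: {rate:.1f} Gbps ({verdict})")
```

Under those assumptions only the 4K mode fits within the raw link rate; the higher-resolution modes would rely on some form of compression or chroma subsampling.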

An Nvidia flowchart of how DLSS Multi-Frame Generation works

Nvidia

DLSS 4 Multi-Frame Generation

The most consequential feature of the RTX 50 series is DLSS 4’s Multi-Frame Generation. This is the feature that effectively turns one rendered pixel into up to 16 total pixels. While the 40 series offers frame generation, it is capped at 2x (one generated frame per rendered frame), which improves graphical performance, but nowhere near the level that 4x does. The chart below lists DLSS capabilities as they pertain to the different series of Nvidia graphics cards; the company has said that it may offer frame generation on the RTX 30 series down the road, but that’s not certain.
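The 16-to-1 figure is easier to see as arithmetic. The sketch below assumes (my assumption, not something spelled out in this section) that DLSS upscaling renders one-quarter of the output pixels, as in a 1080p-to-4K upscale, and that 4x frame generation means one rendered frame out of every four displayed. The function and variable names are illustrative only.

```python
from fractions import Fraction


def rendered_pixel_share(upscale_pixel_ratio: int, frame_gen_factor: int) -> Fraction:
    """Fraction of displayed pixels that are traditionally rendered.

    upscale_pixel_ratio: output pixels per rendered pixel within a frame
        (assumed 4 here, e.g. 1920x1080 upscaled to 3840x2160).
    frame_gen_factor: displayed frames per rendered frame (4 for 4x mode).
    """
    return Fraction(1, upscale_pixel_ratio * frame_gen_factor)


share_4x = rendered_pixel_share(4, 4)          # Multi-Frame Generation at 4x
share_single_gen = rendered_pixel_share(4, 2)  # one generated frame per rendered frame
print(share_4x, share_single_gen)  # 1/16 1/8
```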

DLSS features by generation

Nvidia

DLSS Multi-Frame Generation boosts performance in concert with a new transformer model, which it uses to improve image quality while upscaling. This is the first time in five years that Nvidia has changed the type of model it uses for DLSS, having previously used a convolutional neural network model. One of the fundamental capabilities of DLSS is to render a game at lower resolution to achieve higher frame rate, then upscale it to the playable resolution, which can affect image quality. By using a transformer model, Nvidia improves the upscaling quality and arguably makes DLSS feel lossless, even if it isn’t literally so. Nvidia also uses the transformer model for its Ray Reconstruction function, which results in similar image quality improvements.

Image quality with the new Nvidia transformer

The Test Bench And Methodology

To test the RTX 5090, I built a new test bench using an AMD Ryzen 7 9800X3D processor paired with an ASUS X870E Hero gaming motherboard, sent to me by AMD, and cooled by a 360mm ASUS ROG RYUO III CPU cooler. This was paired with 64GB of Patriot Viper DDR5 6000 MT/s RAM sent to me by Patriot Memory, a 2TB Crucial T705 Gen 5 SSD and a Corsair 7000X case with a 1-kilowatt Corsair RM1000x power supply sent to me by Corsair. The monitor was an Alienware AW3225QF, which I reviewed last year; this monitor is capable of 4K 240-hertz gaming, which is where the RTX 5090 is designed to shine. All of these components were used in service of getting the best possible benchmark numbers for the RTX 5090 against the 4090.

The author's test rig with the RTX 5090 installed

Anshel Sag

Since AMD isn’t necessarily competing at the high end of the market against Nvidia for this generation, it seemed much easier to compare the 5090 against the 4090; this approach also accommodated the amount of time I had to build the system and test the new card, which was about three days. Needing to limit myself to just a few benchmarks, I chose Blender, 3DMark and three relevant games that could show DLSS 4 with frame generation in action. Those three games were Marvel Rivals, Star Wars: Outlaws and Cyberpunk 2077. Nvidia says that it will have 75 games supported for DLSS 4 when retail availability starts on January 30. Cyberpunk 2077 is used to address this generation’s version of the “But can it run Crysis?” test. Nvidia and Cyberpunk maker CD Projekt RED have invested lots of time and money into making the game look really good.

Benchmark Results Against RTX 4090

First up is Blender, which has become one of the most popular creative tools for 3-D artists. The latest version, 4.3, is available in the Blender Benchmark, which is what I used to compare the 5090 to the 4090. It is made up of three different tests that compare the raw 3-D rendering power of the two cards.

The Nvidia RTX 5090 outperformed the RTX 4090 across the board on Blender 4.3 benchmarks.

Anshel Sag

As shown in the chart above, the RTX 5090 is a clear winner in all three tests; it improved upon the RTX 4090 by 33% in Monster, 45% in Junkshop and 31% in Classroom. This is a respectable increase that would be appreciated by any 3-D creator, especially if they were coming from an even older card.

Next up, 3DMark is a synthetic benchmark with two DX12 tests that don’t take advantage of the GPU’s AI capabilities and mostly focus on rasterization. Steel Nomad is a 4K benchmark with DX12 and HDR that uses advanced rendering techniques, while Speed Way uses a lower resolution (1440p) with ray tracing to offer a bit of both ray tracing and rasterization, although still without anything like DLSS. 3DMark continues to be an industry-standard benchmark and a good way to test theoretical performance.

The RTX 5090 significantly outperformed the RTX 4090 on both 3DMark benchmarks.

Anshel Sag

As these benchmarks show, the RTX 5090 is again considerably more performant than the 4090. Specifically, the RTX 5090 is 42% faster in Speed Way and 50% faster in Steel Nomad, which makes sense if you consider the increased CUDA and RT cores. That said, I would probably consider these the best-case scenario in games that don’t use DLSS; it’s likely that many games would deliver less than these numbers in the real world without AI.

Real-World Game Benchmarks

All three games tested were titles that Nvidia identified on its DLSS 4 early-access list. I chose these three because they also represent a good diversity of games and a great way to understand how much Nvidia is improving with AI. Again, Nvidia says that 75 games will support DLSS 4 at retail availability, but 700 games already support DLSS in some capacity, and with DLSS override in the Nvidia App, we could see even more games support DLSS 4. Nvidia has the market share to get the industry to adopt DLSS, and with popular games such as Marvel Rivals adopting it at launch and supporting DLSS 4, we can expect it to make a significant impact from the get-go.

The Marvel Rivals game performed better on the RTX 5090 than the RTX 4090.

Anshel Sag

For Marvel Rivals, I turned on frame generation with both graphics cards, which automatically turns on low latency and DLSS. The game was also set to run at 4K while using Nvidia FrameView to track the frame rate. For those who aren’t familiar, 1% lows are the lowest frame rates experienced 1% of the time while playing a game; it’s a worse-case (not quite worst-case) scenario that can give more context to a simple average frame rate.
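For readers who want to compute this themselves, here is a minimal sketch of one common way to derive average FPS and 1% lows from a log of per-frame render times. The frame times below are made up for illustration, and tools differ in the exact definition they use (FrameView’s may not match this one); this version averages the slowest 1% of frames.

```python
def average_fps(frame_times_ms: list[float]) -> float:
    """Average FPS over the run: total frames divided by total seconds."""
    return 1000 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms: list[float]) -> float:
    """FPS implied by the mean frame time of the slowest 1% of frames."""
    slowest_first = sorted(frame_times_ms, reverse=True)
    count = max(1, len(slowest_first) // 100)  # always keep at least one frame
    worst = slowest_first[:count]
    return 1000 * len(worst) / sum(worst)


# Hypothetical capture: 990 smooth frames at 4 ms plus 10 stutters at 10 ms.
frame_times = [4.0] * 990 + [10.0] * 10
print(round(average_fps(frame_times), 1))       # 246.3
print(round(one_percent_low_fps(frame_times)))  # 100
```

In this made-up capture the average looks excellent, while the 1% low shows that the occasional stutter drops to the equivalent of 100 FPS, which is exactly the extra context the metric is meant to provide.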

When I ran this test, 1% lows on the RTX 4090 were a still-very-playable 84 FPS, with an average frame rate of 160, while PC latency was 31 ms. However, the RTX 5090 absolutely blew past my expectations with a 1% low of 115 FPS and an average of a whopping 258 FPS, which is actually beyond the refresh rate of my 240-hertz monitor and great for such a competitive game. The PC latency was also reduced by roughly a third, down to 21 ms, which can make all the difference in a competitive title like Marvel Rivals.
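The relative gains can be worked out directly from the figures in the preceding paragraph. The helper below is just an illustrative sketch of that arithmetic; the function name is my own.

```python
def percent_change(old: float, new: float) -> float:
    """Signed percentage change going from old to new."""
    return (new - old) / old * 100


# Marvel Rivals figures reported above (RTX 4090 -> RTX 5090).
results = {
    "average FPS": (160, 258),
    "1% low FPS": (84, 115),
    "PC latency (ms)": (31, 21),
}
for metric, (rtx_4090, rtx_5090) in results.items():
    print(f"{metric}: {percent_change(rtx_4090, rtx_5090):+.1f}%")
```

That works out to roughly 61% higher average FPS, about 37% higher 1% lows and around 32% lower PC latency.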

A comparison of benchmark performance for Star Wars: Outlaws on the RTX 5090 and the RTX 4090

Anshel Sag

For Star Wars: Outlaws I was provided by Nvidia with an early-access build to test DLSS 4. However, before I did that, I played the game on the RTX 5090 in 4K with ray tracing fully enabled but without frame generation enabled; in that configuration, the average frame rate was only about 40 to 50 FPS. Turning on DLSS with 4x frame generation boosted my frame rate to an average of 180, which made the game play entirely differently while still looking just as visually stunning. The RTX 4090 also has frame generation available, with an average of 96 FPS, but that is basically half the performance of the RTX 5090.

A performance comparison of the game Cyberpunk 2077 on the RTX 5090 and the RTX 4090

Anshel Sag

Finally, I turned on pretty much everything in both cards for Cyberpunk 2077, including ray tracing, frame generation and Ray Reconstruction. I did this while running the Cyberpunk in-game benchmark in 4K, which shows low, average and high FPS. The RTX 5090 performance was more than double the RTX 4090 in all three scenarios, which once again shows how much faster the latest AI makes the RTX 5090.

Power And Thermals

The RTX 5090 is a power-hungry beast. It is quoted on paper as having a 575-watt GPU, versus the RTX 4090’s 450 watts. Nvidia recommends a 1-kilowatt PSU for the 5090, which is an upgrade for many people. During my testing, the GPU monitoring app GPU-Z reported that the GPU reached 555 watts of thermal design power and 523 watts of peak board power; these numbers may not be 100% accurate, but they do indicate that the card is reaching nearly its full potential.

Thermal image of the Nvidia RTX 5090 generated by the FLIR One Pro Gen 3 thermal camera

Anshel Sag

Thermally, this card is fantastic, especially when you consider that the RTX 4090 is a triple-slot card that has a colossal cooler on it. The RTX 5090 is considerably smaller, with an even smaller PCB to enable a pass-through. The card reached 77 degrees Celsius at the peak of my testing and externally never went beyond 62.1 degrees Celsius, according to my FLIR thermal camera. The card did not get loud at any point when inside my case and truthfully seemed to handle the thermal load better than I expected, considering that it’s a two-slot card design with a smaller cooler than the 4090, all while having over 100 watts more TDP.

Our New Neural Rendering Era

To sum it up, when the latest AI capabilities are enabled, the RTX 5090 is without a doubt the fastest graphics card in the world. Even without the AI capabilities, it is still the fastest by a good margin. While I didn’t have the time to test creative applications such as Premiere Pro or DaVinci Resolve, I am excited by the addition of a third encoder and improved HEVC and AV1 encoding. While these encoders are great for creators and will speed things up considerably, they can and will likely further reduce the overhead on the GPU while streaming as well. The RTX 5090 is a powerhouse gaming, content creation and AI GPU with incredible performance across all three.

That said, this is still a $2,000 graphics card and will likely be in very tight supply for a while. We may unfortunately see the return of GPU scalping, even at a $2,000 list price. Note that this is a higher initial price than the RTX 4090, which debuted at $1,600, although it does leave room for a $1,500 RTX 5080 Ti as a nearly-as-good spec, given that the less-powerful RTX 5080 will retail for $999. I believe the pricing of the rest of the RTX 50 series is designed to squeeze AMD, while the RTX 5090 itself looks like an unapologetic halo. And while I do believe that the RTX 5090 is worth $2,000 for those with the budget and an appetite for top performance, I would also say that lots of people will likely benefit from the RTX 5080 with frame generation turned on.

The RTX 5090 brings together the best of the best technology across CUDA, RT and Tensor Cores and a management chip to ensure better latency and maximize efficiency. I am interested to see how these performance improvements translate to the RTX 5080 and RTX 5070, and how those will compare to Nvidia’s RTX 4080 and 4070, and possibly even AMD’s upcoming Radeon RX 9070. In that connection, I am curious to see how AMD can respond to features like Multi-Frame Generation and Nvidia’s continued improvements to ray tracing performance. It will be important to test those capabilities across titles that support both Nvidia’s technologies and AMD’s, such as FidelityFX Super Resolution. I worry that if AMD doesn’t offer something competitive to what Nvidia is doing with DLSS 4, we might not ever see AMD catch up.

Moor Insights & Strategy provides or has provided paid services to technology companies, like all tech industry research and analyst firms. These services include research, analysis, advising, consulting, benchmarking, acquisition matchmaking and video and speaking sponsorships. Of the companies mentioned in this article, Moor Insights & Strategy currently has (or has had) a paid business relationship with Alienware (Dell), AMD and Nvidia.
