Titan V GPU Core vs. Memory Overclocking Benchmarks

**Titan V in Fire Strike Ultra: Core vs. Memory Overclocking**

Our latest experiment tested how performance scales with overclocking on the NVIDIA Titan V, a Volta-based compute card, using the 3DMark Fire Strike Ultra benchmark. Core-only overclocks generally provided the greater uplift: 8686 points for the core-only overclock versus 8434 points for the memory-only overclock. That is a difference of about 3% in favor of the core, which suggests that core clock matters somewhat more than HBM2 clock in this benchmark.
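The roughly 3% figure can be reproduced from the two Fire Strike Ultra scores quoted in the source transcript. The following is a minimal sketch; the helper name `percent_uplift` is hypothetical and introduced only for illustration.

```python
def percent_uplift(new_score: float, baseline_score: float) -> float:
    """Return the relative gain of new_score over baseline_score, in percent."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (new_score / baseline_score - 1.0) * 100.0


# Fire Strike Ultra scores quoted in the source transcript.
CORE_ONLY_SCORE = 8686
MEMORY_ONLY_SCORE = 8434

core_vs_memory = percent_uplift(CORE_ONLY_SCORE, MEMORY_ONLY_SCORE)
print(f"Core-only over memory-only: {core_vs_memory:.1f}%")  # prints 3.0%
```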

However, returns diminish beyond a certain point. Overclocking both the core and the HBM2 with a 200 MHz offset reached 9026 points, which shows diminishing returns compared with the 175 MHz and even the 150 MHz HBM2 offsets. It would appear that the final 25 MHz of HBM2 clock does little, or that the card becomes bound somewhere else in the architecture, possibly by core clock. We saw similar behavior on Vega, where at some point memory clocks stop helping as much as they did at the lower end.
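To put the combined overclock in context, the sketch below compares the 9026-point result (both clocks at a 200 MHz offset, per the source transcript) against the two single-component results. The transcript gives no stock score and no scores for the 150 MHz and 175 MHz HBM2 offsets, so only these three quoted numbers are used.

```python
# Fire Strike Ultra scores quoted in the source transcript.
scores = {
    "core only": 8686,
    "memory only": 8434,
    "core + memory (+200 MHz each)": 9026,
}

combined = scores["core + memory (+200 MHz each)"]

# Extra gain, in percent, from adding the second overclock on top of each single one.
extra_gain = {
    name: (combined / score - 1.0) * 100.0
    for name, score in scores.items()
    if score != combined
}

for name, gain in extra_gain.items():
    print(f"Combined overclock vs {name}: +{gain:.1f}%")
```

With these inputs the combined overclock comes out about 3.9% ahead of core only and about 7.0% ahead of memory only.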

We also tested Superposition, a second synthetic benchmark. Here too, the core-only overclock gained more than the memory-only overclock, but not by much: the difference is about 1.5%. Unlike Fire Strike Ultra, Superposition did not run into diminishing returns as hard. The 175 MHz, 150 MHz, and even 100 MHz HBM2 offsets all showed somewhat comparable scaling. A 100 MHz core offset combined with a 200 MHz HBM2 offset also showed a slight gain over core-only or HBM2-only overclocking, indicating a bit more headroom in this application.

One notable observation concerns efficiency. Volta appears inefficient in terms of power-to-performance in gaming, in part because the Titan V carries FP64 units and Tensor Cores that do nothing for games except take up space on the die. Separately, at some point the card becomes bound elsewhere in the architecture, so higher memory clocks stop improving performance. We would expect the efficiency aspect to improve in whatever gaming architecture follows Volta.

Overall, our analysis shows how the Titan V behaves under different overclocking configurations. Fire Strike Ultra hits diminishing returns relatively early, while Superposition shows less pronounced diminishing returns and a bit more headroom. The results also hint at where future architectures could improve, particularly in gaming efficiency.

The next step is a hybrid mod to see how much more we can get out of the clocks, since the card is currently running into both its thermal limit and its power limit. A shunt mod may follow to address the power limit.

In conclusion, our analysis has shed light on how the Titan V behaves in Fire Strike Ultra and Superposition under different overclocking configurations. Understanding this behavior helps us form a picture of where NVIDIA's architecture is heading.

**The Importance of Architectural Evolution**

As we continue to push the limits of performance on NVIDIA's graphics cards, it's essential to consider the implications of architectural evolution. The introduction of new technologies like Tensor Cores has significant implications for power management and efficiency.

In our analysis, Volta appeared inefficient in terms of power-to-performance in gaming, partly because its Tensor Cores and FP64 units sit idle in games. We expect a gaming-oriented architecture to address this, although it is unclear whether that will be Volta with those units stripped out or a separate architecture developed in tandem.

This highlights the importance of understanding the underlying architecture and its implications for performance optimization. By studying these dynamics, we can develop strategies to unlock further gains in performance without sacrificing efficiency or power management.

**Academic Exercises**

While our analysis may seem academic in nature, it serves a critical purpose in understanding the behavior of NVIDIA's graphics cards. By exploring different overclocking configurations and observing their impact on performance, we gain valuable insights into the strengths and weaknesses of each approach.

This knowledge helps us form a picture of where NVIDIA is going with future architectures and which aspects, such as gaming efficiency, are most likely to improve.

**The Future of NVIDIA's Graphics Cards**

As we look ahead to future experiments, it's clear that NVIDIA's graphics cards will continue to evolve. Our testing has shown how the Titan V behaves under different overclocking configurations, and the game results, Sniper Elite 4 in particular, suggest that NVIDIA is targeting asynchronous compute and lower-level API optimization.

We expect that future graphics cards will build upon the lessons learned from Volta, incorporating new technologies and techniques to unlock further gains in performance while maintaining efficiency and power management. By studying the intricacies of NVIDIA's architecture, we can develop strategies to optimize performance and identify areas for improvement.

In conclusion, a core overclock generally helps the Titan V a bit more than an HBM2 overclock in the synthetic tests, and at worst the two do about the same. In practice, stability, thermal, and power limits tend to be reached before the choice between core and memory overclocking becomes the deciding factor.

"WEBVTTKind: captionsLanguage: enas we work towards our hybrid content for the Titan V Volta card we're now looking at clock scaling so how does the game and synthetic performance change as we scale the HBM versus the core clock basically answering the question of which one matters more for these couple of applications we're testing today and although the Titan V which runs on Volta may not have the next-gen gaming architecture in its final form because it is ultimately a compute card for scientific workloads and things like that it can still teach us about Nvidia's direction which thus far with this card we've learned appears to be targeting async compute and lower level API optimization before we get into that this content is brought to you by the Thermaltake flow RGB closed-loop liquid cooler which is a 360 millimetre radiator plus 3 120 fans that are RGB illuminated the if then we'll take it ring fans at that this is a 4.5 done a stack pump which is one of the faster pumps you can learn more at the link in the description below so this test is pretty straightforward all we're doing is testing multiple different core and hpm clock combinations we're doing that with EVGA precision which works actually really well at this card even though they didn't even release a build for it and we're running with the fan at 100% for all tests just to eliminate that overheating variable from the equation this is also done it with 120% power offset so these tests we have the most data for with synthetics that's precisely what synthetics are built for 3d mark and superposition primarily and then we also have sniper elite ashes of the singularity and destiny to as real gaming benchmarks the reasons we chose these are because for destiny 2 we saw very little scaling compared to the Titan XP in some instances and we think that's because we're becoming Rob's bound as the Rob's on this card are the same count as on the previous 10 series cards at the high end for ashes of the 
singularity we saw a middle step of performance it was 10% over the Titan XP and for sniper we saw massive gains over the previous architecture because it's asynchronous compute enabled and it's a lower level DirectX 12 API which Ashes is as well but it behaves differently so that gives us a full suite of what we can expect to see from the card under various can and if you want the full details on testing you can click the links in the description below to the article but let's just dive in with sniper and towards the end we'll have a lot more data for the synthetic workloads so we're just starting with the polar opposites for the games as we know games will interest most of you more than synthetics and also games will have the least change as opposed to more sensitive synthetics for Sniper Elite 4 we observed a slightly more beneficial impact from just HBM overclocking indicated by one twenty nine point six FPS average and marginally increased lows versus one twenty five point two FPS average this shows a three point five percent increase from doing just HBM to overclocking versus just the core overclocked overclocking either one stand alone is still getting us a noteworthy jump over the stock Titan V minimally eight point seven percent but overclocking both has the most impressive gains jumping up to 142 FPS average it's almost as if the core and HBM overclocks stack so to speak in this particular title and that makes sense remember that Sniper Elite 4 again uses asynchronous compute low-level API s and leverages components more heavily especially extra shaders this we think is an indicator of where Nvidia is going with its future gaming architecture whatever that may be it will likely be a voltage derivative but won't be volta in its current form in ashes of the singularity with DirectX 12 we're observing clock scallion at HB m scaling almost equally the increased core clock helps a bit more in frame time consistency but doesn't move the average in a meaningful 
way versus just the increased HBM clock these are functionally the same again overclocking both provides a noteworthy gain about 5% over the individual component over clocks but doing one or the other shows what we're seeing here destiny 2 showed some of the least scaling in our original test which is a mix of its DirectX 11 API and more importantly a potential ROPS limitation with destiny 2 we observed marginally higher performance with just an HBM overclock at two point six percent boosted over the core only overclocked overclocking both the core and HBM gave us another 8% over the core only OC and here's what we observed in firestrike ultra for this one we saw just the core over clocks generally provide Greater uplift with a change of 86 86 points for the core only to 84 34 points for the memory only the difference is a boost of about 3% for core over memory only over clogging and fire strike ultra overclocking both to 200 megahertz offset gets us to 90 26 points where we're observing diminishing returns versus the one seventy-five megahertz hpm to offset and even one 50 megahertz HBM to offset it would appear that the final 25 megahertz of HBM clock isn't really doing a lot for us or that we're becoming bound to somewhere else in the architecture that may be core clock but this is behavior we also saw with Vega where at some point you start becoming bound elsewhere and the memory clocks don't do as much as they did at the lower end finally superposition shows more gains from the core then the memory only over clocks but not by much the difference is about 1.5 percent we don't run into diminishing returns as hard with this one as we did with fire strike as the one seventy-five megahertz offset the one fifty megahertz offset and even the one hundred megahertz HBM to offset all show somewhat comparable scaling and performance our 100 megahertz core and 200 Hertz HBM to offsets also show a slight gain over core only or hpm only overclocking and further illustrates 
that there's a bit more Headroom to boost performance in this particular application generally speaking it looks like a core overclock minimally helps a bit more than hpm overclocking and some of these tests and it kind of worst case they do about the same as each other doing both does get us sort of stacks performance in Sniper Elite 4 where we gain almost equal amounts from each aspect of the overclock which really speaks to how the game works as opposed to a lot of the other games on the market for destiny 2 it's not quite as exciting and superposition we have a bit more Headroom and less of the diminishing returns than we saw with fire strike so they all have slightly different behaviors but not that different and fortunately with this card you're not too often in a position where you're wondering should I overclock a versus B because you run to other stability or thermal limitations before you run into those types of limitations so again this is all sort of academic exercises it's a look at this new architecture Volta and how it performs under different conditions with clocks thermals annoys all that stuff so we can start to form a picture of where nvidia is going for the future now it's our present understanding that what we're seeing here not even just the tensor core is wise but what we're seeing here as a whole in Volta is probably going to be at least somewhat changed a name for the future architectures for gaming now how much it changes architectural II how much it changes underneath whether these are two different architectures developed in tandem or whether they're taking Volta stripping out the tensor cores and then providing a gaming architecture with greater efficiency we're not sure the sort of expectation would be that Volta where it appears inefficient which would be in terms of power to performance and gaming should improve when it becomes whatever it becomes for the gaming cards because there's all these items in this card floating point sixty 
four vectors there's tensorflow processors that do nothing for games except sit there and take up space on the die and that's not great for efficiency so we would expect that that particular aspect would improve for the gaming generation cards as far as the rest of the behavior though this gives us some idea as to what's going on it is an async future it looks like from what we're seeing from the data thus far and that'll be the first time anybody is making a big push there so we'll see what happens but the next thing to do is probably the hybrid mod see how much more we can get out of the clocks because right now we're the thermal limit and power limit colliding at this point so hopefully we can fix some of that maybe a shunt mod we'll see thank you for watching as always keep an eye out for builds woods content coming up soon on this channel which will be a PCB and vrm analysis you can go to store gamers nexus dotnet slash mod matte if you want to pick up a mod matte like this one so that we just spent a lot of time working on this it has a plug that connects to a common ground point which connects to your wrist and then to a ground pin in the wall so everything is anti-static free pretty high quality materials and it has a quick reference cheat sheets printed on it or you can just subscribe for more thank you for watching I'll see you all next timeas we work towards our hybrid content for the Titan V Volta card we're now looking at clock scaling so how does the game and synthetic performance change as we scale the HBM versus the core clock basically answering the question of which one matters more for these couple of applications we're testing today and although the Titan V which runs on Volta may not have the next-gen gaming architecture in its final form because it is ultimately a compute card for scientific workloads and things like that it can still teach us about Nvidia's direction which thus far with this card we've learned appears to be targeting async 
compute and lower level API optimization before we get into that this content is brought to you by the Thermaltake flow RGB closed-loop liquid cooler which is a 360 millimetre radiator plus 3 120 fans that are RGB illuminated the if then we'll take it ring fans at that this is a 4.5 done a stack pump which is one of the faster pumps you can learn more at the link in the description below so this test is pretty straightforward all we're doing is testing multiple different core and hpm clock combinations we're doing that with EVGA precision which works actually really well at this card even though they didn't even release a build for it and we're running with the fan at 100% for all tests just to eliminate that overheating variable from the equation this is also done it with 120% power offset so these tests we have the most data for with synthetics that's precisely what synthetics are built for 3d mark and superposition primarily and then we also have sniper elite ashes of the singularity and destiny to as real gaming benchmarks the reasons we chose these are because for destiny 2 we saw very little scaling compared to the Titan XP in some instances and we think that's because we're becoming Rob's bound as the Rob's on this card are the same count as on the previous 10 series cards at the high end for ashes of the singularity we saw a middle step of performance it was 10% over the Titan XP and for sniper we saw massive gains over the previous architecture because it's asynchronous compute enabled and it's a lower level DirectX 12 API which Ashes is as well but it behaves differently so that gives us a full suite of what we can expect to see from the card under various can and if you want the full details on testing you can click the links in the description below to the article but let's just dive in with sniper and towards the end we'll have a lot more data for the synthetic workloads so we're just starting with the polar opposites for the games as we know games 
will interest most of you more than synthetics and also games will have the least change as opposed to more sensitive synthetics for Sniper Elite 4 we observed a slightly more beneficial impact from just HBM overclocking indicated by one twenty nine point six FPS average and marginally increased lows versus one twenty five point two FPS average this shows a three point five percent increase from doing just HBM to overclocking versus just the core overclocked overclocking either one stand alone is still getting us a noteworthy jump over the stock Titan V minimally eight point seven percent but overclocking both has the most impressive gains jumping up to 142 FPS average it's almost as if the core and HBM overclocks stack so to speak in this particular title and that makes sense remember that Sniper Elite 4 again uses asynchronous compute low-level API s and leverages components more heavily especially extra shaders this we think is an indicator of where Nvidia is going with its future gaming architecture whatever that may be it will likely be a voltage derivative but won't be volta in its current form in ashes of the singularity with DirectX 12 we're observing clock scallion at HB m scaling almost equally the increased core clock helps a bit more in frame time consistency but doesn't move the average in a meaningful way versus just the increased HBM clock these are functionally the same again overclocking both provides a noteworthy gain about 5% over the individual component over clocks but doing one or the other shows what we're seeing here destiny 2 showed some of the least scaling in our original test which is a mix of its DirectX 11 API and more importantly a potential ROPS limitation with destiny 2 we observed marginally higher performance with just an HBM overclock at two point six percent boosted over the core only overclocked overclocking both the core and HBM gave us another 8% over the core only OC and here's what we observed in firestrike ultra for this 
one we saw just the core over clocks generally provide Greater uplift with a change of 86 86 points for the core only to 84 34 points for the memory only the difference is a boost of about 3% for core over memory only over clogging and fire strike ultra overclocking both to 200 megahertz offset gets us to 90 26 points where we're observing diminishing returns versus the one seventy-five megahertz hpm to offset and even one 50 megahertz HBM to offset it would appear that the final 25 megahertz of HBM clock isn't really doing a lot for us or that we're becoming bound to somewhere else in the architecture that may be core clock but this is behavior we also saw with Vega where at some point you start becoming bound elsewhere and the memory clocks don't do as much as they did at the lower end finally superposition shows more gains from the core then the memory only over clocks but not by much the difference is about 1.5 percent we don't run into diminishing returns as hard with this one as we did with fire strike as the one seventy-five megahertz offset the one fifty megahertz offset and even the one hundred megahertz HBM to offset all show somewhat comparable scaling and performance our 100 megahertz core and 200 Hertz HBM to offsets also show a slight gain over core only or hpm only overclocking and further illustrates that there's a bit more Headroom to boost performance in this particular application generally speaking it looks like a core overclock minimally helps a bit more than hpm overclocking and some of these tests and it kind of worst case they do about the same as each other doing both does get us sort of stacks performance in Sniper Elite 4 where we gain almost equal amounts from each aspect of the overclock which really speaks to how the game works as opposed to a lot of the other games on the market for destiny 2 it's not quite as exciting and superposition we have a bit more Headroom and less of the diminishing returns than we saw with fire strike so 
they all have slightly different behaviors but not that different and fortunately with this card you're not too often in a position where you're wondering should I overclock a versus B because you run to other stability or thermal limitations before you run into those types of limitations so again this is all sort of academic exercises it's a look at this new architecture Volta and how it performs under different conditions with clocks thermals annoys all that stuff so we can start to form a picture of where nvidia is going for the future now it's our present understanding that what we're seeing here not even just the tensor core is wise but what we're seeing here as a whole in Volta is probably going to be at least somewhat changed a name for the future architectures for gaming now how much it changes architectural II how much it changes underneath whether these are two different architectures developed in tandem or whether they're taking Volta stripping out the tensor cores and then providing a gaming architecture with greater efficiency we're not sure the sort of expectation would be that Volta where it appears inefficient which would be in terms of power to performance and gaming should improve when it becomes whatever it becomes for the gaming cards because there's all these items in this card floating point sixty four vectors there's tensorflow processors that do nothing for games except sit there and take up space on the die and that's not great for efficiency so we would expect that that particular aspect would improve for the gaming generation cards as far as the rest of the behavior though this gives us some idea as to what's going on it is an async future it looks like from what we're seeing from the data thus far and that'll be the first time anybody is making a big push there so we'll see what happens but the next thing to do is probably the hybrid mod see how much more we can get out of the clocks because right now we're the thermal limit and power 
limit colliding at this point so hopefully we can fix some of that maybe a shunt mod we'll see thank you for watching as always keep an eye out for builds woods content coming up soon on this channel which will be a PCB and vrm analysis you can go to store gamers nexus dotnet slash mod matte if you want to pick up a mod matte like this one so that we just spent a lot of time working on this it has a plug that connects to a common ground point which connects to your wrist and then to a ground pin in the wall so everything is anti-static free pretty high quality materials and it has a quick reference cheat sheets printed on it or you can just subscribe for more thank you for watching I'll see you all next time\n"
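The remark that the core and HBM overclocks "stack" in Sniper Elite 4 can be checked roughly against the quoted averages. The sketch below rests on one assumption: that the "minimally 8.7%" gain over stock refers to the slower of the two single overclocks (core only). The stock FPS is therefore back-calculated rather than quoted, and the helper `gain` is hypothetical.

```python
# Sniper Elite 4 average FPS quoted in the transcript.
CORE_ONLY_FPS = 125.2
HBM_ONLY_FPS = 129.6
BOTH_FPS = 142.0  # quoted as "142 FPS", so treat it as rounded

# ASSUMPTION: the "minimally 8.7%" gain over stock belongs to the slower
# single overclock (core only). The stock FPS is inferred, not quoted.
MIN_GAIN_OVER_STOCK = 0.087
stock_fps = CORE_ONLY_FPS / (1.0 + MIN_GAIN_OVER_STOCK)


def gain(fps: float) -> float:
    """Fractional gain of fps over the inferred stock result."""
    return fps / stock_fps - 1.0


core_gain = gain(CORE_ONLY_FPS)
hbm_gain = gain(HBM_ONLY_FPS)
both_gain = gain(BOTH_FPS)

# If the two overclocks were fully independent, their speedups would multiply.
predicted_both = (1.0 + core_gain) * (1.0 + hbm_gain) - 1.0

print(f"core only: {core_gain:.1%}, HBM only: {hbm_gain:.1%}")
print(f"both: observed {both_gain:.1%}, independent prediction {predicted_both:.1%}")
```

Under that assumption, the observed combined gain lands close to the independent prediction, which is consistent with the description of the two overclocks stacking in this title.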