Dual GIGABYTE EPYC Server - The World Record Breaker!

**Converting to Modern Hardware: The Need for Action**

In today's fast-paced digital landscape, companies need to adapt and evolve to stay ahead of the curve. One area that requires significant attention is server infrastructure. Outdated hardware can lead to inefficiencies, increased costs, and a lack of scalability. It's time to take action and convert to modern hardware that can handle the demands of today's applications.

**The Cost of Inaction**

Consider the cost of not upgrading to modern hardware. A server like the one showcased in this video, a dual EPYC 7742 configuration, is not cheap: expect to spend somewhat less than $40,000 once the processors, memory, and storage are included. However, the equivalent horsepower on Amazon EC2 costs roughly $5,000 to $6,000 per month, so renting overtakes the purchase price within months. Stepping down from the 64-core CPUs gives up some density but can save a lot of money on the overall cost of the machine.
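As a rough sketch of that arithmetic, the snippet below estimates how many months of cloud rental it takes to equal the purchase price. It uses the figures quoted in the video (under $40,000 for the server, $5,000 to $6,000 per month on EC2). The function name and the `onprem_monthly` parameter (power, hosting, support) are assumptions added for illustration and default to zero, which flatters the on-premises case.

```python
def break_even_months(server_cost: float,
                      cloud_monthly: float,
                      onprem_monthly: float = 0.0) -> float:
    """Return the number of months after which buying beats renting.

    onprem_monthly is a hypothetical running cost (power, hosting, support);
    the video does not quote a figure for it.
    """
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        raise ValueError("no break-even: renting is not more expensive per month")
    return server_cost / monthly_saving


if __name__ == "__main__":
    # Figures quoted in the video: < $40,000 to buy, $5,000-6,000/month to rent.
    for cloud_monthly in (5_000, 6_000):
        months = break_even_months(40_000, cloud_monthly)
        print(f"${cloud_monthly}/month in the cloud: break-even after {months:.1f} months")
```

Passing a non-zero `onprem_monthly` pushes the break-even point out, which is worth doing before treating the comparison as a budget figure.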

**The Availability of AMD EPYC CPUs**

AMD is selling a lot of EPYC CPUs and making a lot of them, and chiplet production appears to be keeping up with demand. As a result, there is little to no availability problem with these processors, and the hardware can be bought right now. This makes it an ideal time to consider upgrading to modern hardware.

**Why Upgrade?**

There are several reasons why companies should upgrade to modern hardware like the AMD EPYC CPUs. Firstly, it enables better scalability and density. With more cores and threads available, applications can take advantage of parallel processing, leading to improved performance and efficiency. Secondly, modern hardware is designed with power efficiency in mind, which means less energy consumption and reduced heat generation.

**The Importance of Continuous Monitoring**

Another reason to upgrade is visibility. Old servers are often left running for years without anyone logging in: nobody is sure whether they have all their patches or are updating themselves, and when one randomly dies it can be hard to tell which machine it even was. Converting those machines to virtual machines on modern hardware is a chance to document them and get the infrastructure back under control.

**Gigabyte's Contribution**

A big thank you goes out to Gigabyte for loaning us this server, allowing us to demonstrate the capabilities of the AMD EPYC CPUs in this video. We're grateful for their support and contribution to the industry.

**Benchmarks: A Way to Measure Performance**

To measure the performance of our servers, we use the Phoronix Test Suite; the full set of results is linked in the video description. The tests range from scientific applications (e.g., GROMACS) to PHP web servers. Some of these benchmarks are single-threaded or lightly threaded and mostly reflect clock speed, while others, such as GROMACS with OpenMPI set up, use all the cores available.
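The difference between a single-threaded benchmark and one that scales with core count can be illustrated with Amdahl's law. The sketch below is not part of the benchmark suite used in the video. The parallel fractions are illustrative assumptions, and the core counts (24 and 64) are the two CPU sizes compared in the video.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when `parallel_fraction` of the work can be spread across `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workloads: the fractions are assumptions, not measurements.
    workloads = {
        "single-threaded test": 0.0,
        "mostly parallel test": 0.99,
    }
    for name, fraction in workloads.items():
        for cores in (24, 64):
            print(f"{name}, {cores} cores: {amdahl_speedup(fraction, cores):.1f}x")
```

A workload with no parallel portion gains nothing from the extra cores, which is why a single-process score says little about how good the server is overall.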

**The 128-Core Monster Server**

The server shown here has 128 cores and 256 threads across two 64-core EPYC 7742 CPUs, in a 2U Gigabyte R282-Z93 chassis. This machine is truly massive in terms of compute density, making it an ideal choice for consolidating racks of older servers. In the video, just 32 of its cores running under VMware were already about 25% faster than all of the retired machines combined.

**Multi-Node Servers: A Density Solution**

For companies that need even higher density, multi-node designs pack several server nodes into a single 2U chassis. The trade-off is that each node gives up storage, memory channels, and connectivity such as PCI Express expansion slots. However, when it comes to pure density, these servers are hard to beat.

**The Importance of Connectivity and Expansion**

When choosing a server configuration, weigh the connectivity and expansion options. Some companies need multiple PCI Express slots for GPUs, flash add-in cards, or other specialized hardware. In such cases, a dense multi-node server with fewer slots may not be the best choice, and a standard 2U chassis like the one shown here, with nine physical slots, is a better fit.

**Benchmarks: A Way to Compare Performance**

To compare different server configurations, we run the same benchmarks on each one and pay attention to what each test actually measures. A benchmark that runs on a single thread, such as the PHP test, will score about the same per process on a 24-core EPYC 7402P as on a 64-core EPYC 7742, because the clock speeds are roughly the same. A whole-machine benchmark such as GROMACS uses every core available. Knowing which kind of benchmark you are looking at helps companies make informed decisions when selecting hardware.

**Transcript**

It's quite loud in here, sorry. We'll go back outside in a minute. But for my next magic trick, I'm gonna make the equipment in all these racks disappear. This is what I wanted to show you: this is the kind of consolidation that is enabled by 128 cores in 2U. Most IT organizations, most people that have servers, are not on the bleeding edge. They've got a lot of systems. We've got PowerEdge 710, 720, Cisco UCS M4. I mean, this is everything from Supermicro, Dell, Cisco, and HP, and we've already removed a lot of systems. I mean, the first rack, this is an older NetShelter rack. All of the systems in here have already been consolidated, and we're not even at half capacity on just one 128-core machine. We're using VMware as the hypervisor, but you could just as easily do this with Proxmox. But you really can consolidate all of these older servers into just a couple of 128-core monsters, and that's what we're using the Gigabyte chassis for.

So here we are at the loading dock with the equipment ready to go out. We've got our TL 2035, probably should have been retired a long time ago, and a whole bunch more Cisco servers. Now, the ones on top, they're all older DDR3, but we've got the 1U DDR4 stuff on the bottom. And, you know, some of these have 72 gigabytes of RAM, 64 gigabytes of RAM, 128 gigabytes of RAM. But when you're consolidating all of these into virtual machines, I mean, the operating systems are also Windows and Linux, typically. Even things like SQL Server, which normally it's like, "Oh, let's put that on bare metal." Now these new servers are so fast, with the NVMe storage plus spinning rust plus the caching plus things in VMware like vSAN, all this can go. Just using 32 cores in VMware, we're already 25% faster than all of these machines combined. It's just completely insane. It's not just the clock speed. It's not just clock for clock on modern hardware, it's clock for clock on older hardware, because, you know, businesses want to squeeze every last bit of value out of equipment they can, and at this point these things aren't worth the electricity it takes to run them.

This world record breaker: 128 cores, 256 threads, 4 terabytes of memory. Yes, this is an AMD EPYC Rome server. We've got to put it together. Some assembly required, a little bit DIY, a little bit IKEA, if you will. This is the Gigabyte R282-Z93. Look at this. Look at this. Just raw, unadulterated horsepower. This is an incredible, incredible machine. Let's see what it's got under the hood.

64 cores, 128 threads each. 128 PCIe lanes each, but most of the PCIe lanes, well, half the PCIe lanes from the two CPUs are used for inter-CPU communication. 64 cores means that Windows doesn't know what to do. Each CPU is gonna be two NUMA nodes. It's gonna be two near NUMA nodes, and then the other CPU is going to be a far NUMA node, well, two far NUMA nodes, which actually works pretty well in Windows. That's pretty similar to the topology that has been most debugged on Windows. Theoretically, Windows supports all topologies, but as we learned with the Threadripper 2990, Microsoft can't be bothered to put a lot of work into anything that's slightly outside the norm of CPU topologies. These CPUs take that into account. They're really well designed and work perfectly. Windows, Linux, doesn't matter. Of course, on Linux these things absolutely shred, as we've seen from our benchmarks and other stuff. But we can actually put 64 cores to the test in this chassis. My goodness, what a beautiful chassis.

Now, the R282-Z93 as configured has 12 three-and-a-half-inch SATA drives, so this is all about your bulk storage. And it's also available, you know, if you want full frontal NVMe, 24 NVMe bays, no problem. If you're still mixed, you know, legacy SAS and NVMe flash or something like that, you're gonna have mixed NVMe and SAS flash for the front as well. The backplane that we're working with in this particular model is SATA only, so if you wanted to upgrade to SAS, the connectors for that are not present on this particular backplane. So this thing is designed for, you know, those 10, 20 terabyte three-and-a-half-inch mechanical drives. And you've got one M.2 here. You can, of course, break out your x16 slots into four M.2 connections, so if you wanted to use M.2 storage, that's fine, or PCI Express flash add-in cards. You know, they make the fifteen-terabyte add-in flash cards that are PCI Express.

This thing's got nine physical slots, although in this configuration it's really set up for a GPU-type workload, so you can have three dual-height GPUs. Comes with your power cables, your breakout cables. This is the 6+2 power connection, as opposed to the 8-pin power connection. If you're running something like the Tesla V100, you can get a different cable, or the V100s come with an adapter for the EPS 12-volt. Either one this chassis will support. You can specify what you need from your vendor, your OEM that you're working with. But if you get a bare-bones system like this, it's really not too hard to, you know, DIY and put it together. A lot of people are all about the four-hour, you know, incident response, like the four-hour support. You just have to order from a good reseller or work with a good partner that's going to support the hardware like that.

Stop me if you've heard this one before. The IT manager or the senior developer or whoever, you're sort of connecting to and remoting into and managing or developing on ten-year-old hardware. Maybe it's Sandy Bridge. Maybe it's even older, it might be, it's like Nehalem. Maybe it's a Xeon box that has been there a while, was installed by somebody that left the company like three years ago, and everybody's not really sure what's going on. Does it have all its patches? Is it updating itself? I don't know. Every now and again we have to reboot it.

AMD EPYC is changing the game with regard to consolidation and putting all these machines together. Now, you might be thinking that's crazy, because you've got all that equipment in a server room or in a rack, you know, all those different individual servers, and the reliability, that should be pretty good, because, you know, if something fails, or something fails catastrophically, it's just gonna be the one server that fails. But when you've got an entire rack of equipment, all that electricity usage and everything else, versus consolidation down to a single 2U server, is that even possible? Yeah, I've done it. Now, I've actually done it a fair bit. You can consolidate all of those old machines down into a single EPYC 7742 dual-processor server: 128 cores, 256 threads, you know, up to 8 terabytes of memory, although that can get a little pricey. Don't really need that much. You can always upgrade into it.

"This is the new box that's gonna replace Bob's box. Let's actually document it this time." And you can run both of those in parallel with basically no performance penalty, because you've got so much computational horsepower at your disposal. I mean, 256 threads, and oversubscribing that: AMD's got a ton of hardware in these new EPYC Rome CPUs to be able to oversubscribe those virtual machines. So it's difficult for me to come up with words to explain to you how much of a game changer this is in the enterprise.

But now, at this point, with our memory and processors installed, before we install any add-in peripherals like the Tesla V100s, we need to do a power-on test and make sure that it works. So you grab a monitor and keyboard and mouse, basically get this thing set up, plug it in, turn it on, see what happens.

It works really well. It uses the shadow copy mechanism to create a disk snapshot of a running system, so you can run that, create a VHD, and then you can convert the VHD to whatever you need, you know, VMware or Hyper-V or Proxmox or a mix of whatever it is that you're running. And then that physical machine becomes a virtual machine. That is a really handy tool.

In terms of Linux migration, Linux is super portable and super easy. You can convert your Linux installation from a physical machine to a virtual machine, or a virtual machine to a physical machine, or move containers around, or convert from the old Proxmox containers to the new Proxmox containers fairly, you know, the LXC stuff, like LXC, LXD. It's usually not super hard. You just move one thing at a time. And even if your virtual machines are unruly in terms of how big they are and how much data they have, that's maybe a clean-up opportunity. You can mount some NFS shares or create some network file systems or create some iSCSI targets and migrate your data and sort of get control of your infrastructure at the same time.

Do you know how much Amazon, like, if you're getting EC2 space for this dual 7742, this dual AMD EPYC server, you're gonna be spending five, six thousand dollars a month for the equivalent horsepower of this machine. But you're probably gonna spend less than forty thousand dollars on a server like this, and, you know, with the processors, the memory, the storage, the whole nine yards. In fact, it may even be less than that. And if you drop down from the 64-core monster CPUs, you're gonna, you know, not have as good a density, but you can save a ton of money on the cost of the machine overall. And you can get it right now. There's not really too much of an availability problem with the AMD EPYC CPUs, because AMD is selling a ton of them and they're making a ton of them, and the chiplet production, at least as far as AMD EPYC CPUs, seems to be pretty good.

So hopefully I've given you some ideas. Big thanks to Gigabyte for loaning me this server so I could show you in this video what we were doing, you know, sort of in the data center. And this is something that you should take on. This is not something that you should be afraid of. These are the projects that your company needs you to do, because nobody else is gonna do it, and it's gonna turn into a big mess later when one of those servers in the rack randomly dies. You don't even know which one it is, because no one has logged into it in five years. It just needs to be on, doing its thing, and no one thinking about it. And if you want it to continue to be on, doing its thing with no one thinking about it, you need to convert it into modern hardware and deploy it on EPYC, because it looks like these EPYC chassis and this EPYC ecosystem is gonna be around for a while. I'm Wendell, this is Level1. If you want some how-tos done, like how to do something specifically with your server, let me know.

There's a full suite of benchmarks in the description, everything from scientific applications, things like GROMACS, to things like PHP web servers, stuff like that. And some of these benchmarks are just running on a single thread. So something like PHP, which is really super context-dependent, that's a single-thread benchmark, so you're really just testing, you know, how it's going to run in the context of that PHP process, not how good the server is overall. So if you get like that 7402P, the performance is not gonna be a lot different than this 7742, because the clock speeds are roughly the same, except that you've got 24 cores versus 64 cores. So you'll get that scaling, because PHP scales linearly. It's like Cinebench, but it's the server version of Cinebench. Just depends on what the benchmark is. Whereas GROMACS will just use all the computational horsepower that you have available, assuming that you get OpenMPI set up, and so that is a whole-machine benchmark.

We use the Phoronix Test Suite. I wish that it had more of a distinction between "this benchmark will scale with the number of cores that you have" and "this is more of like a context-switching benchmark." So the more cores you have, the better the result from this benchmark will be, but this benchmark is really just testing something that's single-threaded or lightly threaded. Technically, the PHP benchmark is a couple of threads, because you've got the Apache process and then the PHP handoff. And so if you want to be like super pedantic about it, it's not just what fits on a single thread, but it does scale pretty linearly with the number of cores that you have.

I'm Wendell, this is Level1. If you want a how-to on how to do something with a 128-core monster server, because that server, it's not that it should be doing double duty, it should be doing like quintuple duty or even beyond that, because it is a ridiculous amount of horsepower in 2U. And you can even get higher-density solutions. You can get four nodes in a 2U server if you really want density, but you're gonna give up storage and memory channels and some other things like that, connectivity. I like having all the PCI Express expansion available and physically in the chassis, so I think a 2U server like this is just about right, probably the NVMe configuration instead of the three-and-a-half-inch configuration that we saw in the video, like one of our earlier chassis configurations. But yeah, a 128-core CPU is a monster in a way that is difficult to explain, but hopefully I showed you what it enables so that you have a better understanding.
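The consolidation described in the transcript can be sanity-checked with a small capacity calculation before any machines are migrated. The sketch below is a minimal illustration, not the process used in the video. The inventory entries, their vCPU counts, and the function and class names are hypothetical. The RAM sizes (72, 64, and 128 GB) and the host figures (4 TB of memory, 256 threads) are the ones mentioned in the transcript.

```python
from dataclasses import dataclass


@dataclass
class LegacyServer:
    name: str     # hypothetical label for an old machine
    ram_gb: int   # installed memory
    vcpus: int    # vCPUs the replacement VM would be given (assumed)


def consolidation_summary(servers: list[LegacyServer],
                          host_ram_gb: int,
                          host_threads: int) -> dict:
    """Summarize how a set of old servers would load a single new host."""
    total_ram = sum(s.ram_gb for s in servers)
    total_vcpus = sum(s.vcpus for s in servers)
    return {
        "total_ram_gb": total_ram,
        "ram_used_fraction": total_ram / host_ram_gb,
        # Values above 1.0 mean the host's threads are oversubscribed.
        "vcpus_per_thread": total_vcpus / host_threads,
        "fits_in_ram": total_ram <= host_ram_gb,
    }


if __name__ == "__main__":
    # Hypothetical inventory; RAM sizes follow the ones mentioned in the video.
    inventory = [
        LegacyServer("old-ddr3-1", ram_gb=72, vcpus=16),
        LegacyServer("old-ddr3-2", ram_gb=64, vcpus=16),
        LegacyServer("old-ddr4-1", ram_gb=128, vcpus=32),
    ]
    summary = consolidation_summary(inventory, host_ram_gb=4096, host_threads=256)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Memory is usually the harder limit, since CPU threads can be oversubscribed across virtual machines while RAM generally cannot be stretched the same way.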