Additional Processors - Computerphile

The Development of Dedicated Chips and Graphics Processing Units

Some Atari ST models and most Amigas included a blitter chip: a dedicated chip that implemented the bit block transfer routine, often called bit blit. The routine was developed at Xerox PARC in the 1970s (on the Alto, as the speaker recalls) as a generalized algorithm for copying blocks of memory around, particularly in the ways raster graphics require. On the Atari it was a separate chip on the motherboard. When the main CPU wanted to copy a block of memory, it set up the chip, which appeared to it as an I/O device, and told it to copy from one address to another, combining the data with what was already at the destination.

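The chip loaded a value from the source, loaded the value already at the destination, combined the two, and wrote the result back out to memory. A software implementation does the same work, but the CPU must also fetch from memory every instruction that tells it to load, combine, and store. The blitter needed only a few hardware counters, so every clock cycle could do useful work instead of some cycles being spent fetching instructions. The CPU still had to wait for the blitter to finish, but the result was faster than doing the copy in software for all but the smallest operations.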
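
The load, load, combine, write-back loop described above can be sketched in a few lines of Python. This is only an illustration, assuming a flat list of byte-sized words as memory and non-overlapping source and destination ranges; the names `bit_blit`, `rop_copy`, `rop_or`, and `rop_xor` are hypothetical and do not correspond to the registers or modes of the Atari or Amiga hardware.

```python
from typing import Callable, List


# Hypothetical combination functions ("raster operations"): each one merges
# a source word with the word that is already at the destination.
def rop_copy(src: int, dst: int) -> int:
    return src


def rop_or(src: int, dst: int) -> int:
    return src | dst


def rop_xor(src: int, dst: int) -> int:
    return src ^ dst


def bit_blit(memory: List[int], src: int, dst: int, count: int,
             op: Callable[[int, int], int] = rop_copy) -> None:
    """Copy `count` words from address `src` to address `dst`, combining
    each source word with the existing destination word using `op`.

    Assumes the two ranges do not overlap and that words are 8 bits wide.
    """
    for i in range(count):
        source_word = memory[src + i]   # load the source value
        dest_word = memory[dst + i]     # load what is already there
        memory[dst + i] = op(source_word, dest_word) & 0xFF  # combine, write back


# Usage: XOR two source words onto two destination words.
memory = [0b11110000, 0b00001111, 0, 0, 0b10101010, 0b10101010]
bit_blit(memory, src=0, dst=4, count=2, op=rop_xor)
```

In software, every line of that loop costs instruction fetches as well as data accesses; the point of the dedicated chip is that the loop itself lives in hardware counters, so only the data moves over the bus.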

The Amiga took the same approach. It had a blitter chip alongside a couple of other dedicated chips, which together gave it its advantages in graphics. As with other support chips, the blitter did nothing that the main CPU could not do; it simply did the job faster.

The blitter's advantage depended on the CPU being slow by comparison. As the CPU gets faster, the blitter has to get faster too; once the CPU is much faster than the blitter, there is no point in having the blitter at all.

Atari ran into exactly this problem. The Atari TT, for example, had a 68030 CPU running at 32 megahertz with a cache, compared with the 8 megahertz 68000 before it. It ran so much faster that the instructions of a copy loop would sit in the instruction cache, so the CPU could fetch the data from memory at full speed and write back the combined result without needing the blitter. As a result, the blitter became redundant in some Ataris soon after it was introduced, although a faster version did go into the Falcon, where it can be useful in certain modes.

Not every designer went down this route. When Acorn built the Archimedes, at roughly the same time as the Commodore Amiga and Atari ST had their blitter chips, it decided that its CPU was fast enough that a blitter offered no advantage. The idea of offloading work did return in PC graphics cards, which added hardware support for basic 2D operations such as copying memory about or drawing a straight line, and later in graphics processing units (GPUs): effectively general-purpose processing units, configured to be well suited to 3D, and also 2D, graphics.

In modern computers, the balance is between software on the CPU, which is easy to change and update, and dedicated hardware, which is faster. Graphics and floating-point arithmetic have largely been taken over by dedicated hardware: almost all CPUs in desktop machines and laptops now have a floating-point unit built in. For other work, the generality of software makes more sense. The video compares the choice to cleaning the house yourself or employing a cleaner to do it on your behalf.

In recent years, this trend has led to general-purpose computing on graphics processing units (GPGPU). GPUs such as those in NVIDIA's GeForce and AMD's Radeon series handle work well beyond 3D graphics, including large numerical calculations and machine learning workloads.

In summary, dedicated chips and graphics processing units have played a significant role in the development of computers. From the blitter chips of the Atari ST and Amiga to modern GPUs, these specialized components do nothing the CPU could not do in software, but they do it faster and leave the CPU free for other work. Whether they are worth including depends on how fast the CPU is by comparison.

"So, I think what we're talking about, when we're talking about multiprocessor systems, is systems where you have multiple CPUs to write your software for. As people said, things like the keyboard have a CPU in them, on machines like the Atari or the Amiga or even some of the original PCs, to process the keyboard and then just send the data to the computer. But that's not usually a CPU that you're going to program yourself, or write part of your application to run on. In the same way, on most modern Intel motherboards the Management Engine runs on a CPU embedded into the chipset, and providing no one hacks it, you're not gonna be running your code on those. It's a CPU, it's doing a task, but it's not one we're going to program.

If it's not the central processing unit, it's an additional processing unit. Is it still a CPU?

It's the central processing unit of that subsystem. When we talk about the CPU, we're talking about the central processing unit, the bit at the heart of the system doing things. So people are rightly pointing out that we have CPUs, but we're not gonna write our software to run on them. You can, and if you look at some weird bits of software, they would sometimes shove things down onto this keyboard CPU; I remember there's one virus on the Atari ST that did that to stay in memory and things. But we're really talking about the real central processing, where we're gonna write the software that's going to run to do the tasks we need.

But there is a class of, effectively, processing units that you do find in machines that are running part of our software. The classic example of that would be something like a floating-point unit. These days our CPUs, whether it's an Intel chip or an ARM chip or something, often have built-in support for doing floating-point maths, i.e. adding 3.1415926 to some other random number, and so on.
They've got built-in support for that, but they didn't originally. If you look at the original IBM PC, I've got the XT behind me, so let's take the lid off this while we're doing that. So, got a screwdriver. Built like a tank.

So, inside the original PC, if we look just down here next to the power supply, we'll see that if I take this card out (it's really only the serial port at the moment) we can see what's going on. HT 250 serial card, eight bit, I suppose, though it wasn't called that then. So if we look down here, we'll see we've got on the motherboard an Intel 8088 CPU. That's the main CPU in the original PCs, which is what all the PCs you're using now derive from. Whether it's a Macintosh or a Windows machine and so on, they all have a chip which pretty much still boots up like the Intel 8088 did from 1983.

But if you look next to it, you'll see there's an empty socket with nothing plugged into it. And that was deliberately done, for two reasons. One: when they designed the 8086 or the 8088, they deliberately built it in a way knowing that they might want to add support for what's called a floating-point unit, or FPU, which is what goes into that spare hole there. The FPU was a chip that you could buy and put into the PC, or put into any machine that had support for an FPU, to handle the floating-point mathematics.

So if you wanted to add floating-point numbers together, you could either do it by writing the software to add the two numbers together, or you could do it by having a floating-point chip inside your computer, and the floating-point chip would then add the numbers together. The advantage of that was you could design the floating-point hardware to run a lot faster than you could write the software to do the same thing. So you can build dedicated hardware to add floating-point numbers together in the same way
you can build dedicated hardware to add integer numbers together, which is what's inside your CPU. So when they designed the 8088 chip and the 8086 chip, they left support in for this. There were some instructions which, if the opcode for that instruction started with a specific binary pattern, would allow a floating-point unit in the system to take control, reading the values that were being accessed from memory. So the CPU would work with the floating-point unit to load the value from memory, and then the floating-point unit would go off and do the floating-point calculation in parallel, at the same time as the CPU kept on doing integer stuff. So you did get some sort of parallel processing, but it wasn't a general-purpose processor at this time. All it could do was floating-point stuff.

If it needed to access memory, it was the CPU that started things off. There was support in that to actually fetch further values and so on using direct memory access, but in general it was reliant on the CPU to do a lot of the things. So it supported the CPU, but gave you a significant speed boost compared to doing it all in software.

The nice thing was that the CPU, if you ran those instructions, would throw an exception, so you could write software to catch them and implement those instructions in software. But if the FPU was present, it let the hardware do it in parallel. So you could get the best of both worlds: you could write the software to use floating-point maths and not worry about the floating-point unit being there.
If it is there, you could then take advantage of it.

Now, the reason why the machines shipped with an empty socket is because the empty socket probably cost a few cents, a few pennies, to put in, but the FPU cost about a hundred dollars, I think, minimum, if not more than that, to put in the machine. So if you were just using something like word processing on the machine, you don't need a floating-point unit for most of the stuff you're doing, so you wouldn't bother. But if you're doing something that did heavy number crunching, then you could drop the floating-point unit into the machine when it was built, and you'd have one that would work a lot better to do those sorts of operations.

Now, of course, as the machines developed through the 80186, the 286, the 386 and the 486, the software developed, and by the time the 486 was popular we had things like fonts being drawn from outline descriptions, things which actually involve doing floating-point calculations. And so eventually the floating-point unit got built into the CPUs themselves.

So you can have sort of processing units which aren't there as general processing units, but they support the operations of the CPU. They take some of the operations that run on the CPU in software. They don't add any new functionality; there's nothing that you can't do without them. But by doing it in hardware you get a speed boost. In the case of the floating-point unit on the original PC (and you could get the same for the sort of Motorola machines) you could perhaps get a hundredfold increase in speed for those floating-point operations. Everything else ran at the same speed, but the floating-point operations ran about 100 times faster, which of course meant that if you've got a lot of them, your programs ran about 100 times faster. If you weren't using them, you didn't get any benefit at all.
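
The point that a hundredfold speed-up only pays off for programs heavy on floating point can be made concrete with Amdahl's law, which the video does not name. A rough Python sketch follows, assuming `fp_fraction` is the share of run time a program spends in floating-point operations and using the speaker's estimate of 100 as the default factor; the function and parameter names are illustrative.

```python
def overall_speedup(fp_fraction: float, fp_speedup: float = 100.0) -> float:
    """Amdahl's law: only the floating-point share of the run time shrinks;
    everything else runs at the same speed as before."""
    if not 0.0 <= fp_fraction <= 1.0:
        raise ValueError("fp_fraction must be between 0 and 1")
    remaining_time = (1.0 - fp_fraction) + fp_fraction / fp_speedup
    return 1.0 / remaining_time


# Usage: compare programs with different amounts of floating-point work.
for fraction in (0.0, 0.5, 0.99, 1.0):
    print(f"{fraction:.2f} of run time in floating point -> "
          f"{overall_speedup(fraction):.2f}x overall")
```

Under these assumptions, a program that spends none of its time in floating point gains nothing, one that spends half its time there runs just under twice as fast, and only a program that is almost entirely floating point approaches the full factor of 100.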
Another one, which is quite interesting, which you'd find in things like some Atari STs and in most Amigas, is what was called the blitter chip. The blitter chip basically implemented what was called the bit block transfer routine, or bit blit as it's often referred to. This was developed at Xerox PARC in the 70s, on the Alto machine I think it was, and it was basically a generalized algorithm for copying blocks of memory around, particularly in the way that you want to do for raster graphics.

One of the things that both the Amiga and the Atari had, and the only reason I'm looking at the Atari here is because it had it as a separate chip on the motherboard, which is annoyingly under the network card: Atari developed another chip which implemented the bit block transfer algorithm. The advantage was that when your main CPU wanted to copy a bit of memory, it would set up this chip, which would just appear as an I/O peripheral, and say to it: copy this bit of memory from this address to this address, combined with what's already there, using this sort of combination. And you could do some very fancy graphics effects.

But the advantage was, whereas to implement it in software you had to read the instruction from memory that said load this value from memory, and then load it into a register, and then read the next instruction that says load that value from memory, and load it into a register, and then do the combination and write the value back out, this could just have a few hardware counters directly inside it that loaded the value from memory, loaded the next value from memory, combined them, and wrote them back out into memory. So you could do it a lot faster. You could run every clock cycle doing some useful work, as opposed to some of them having to fetch the instructions that were doing what we needed to be doing.

So by offloading it onto a dedicated support chip, again, just like with the FPU, you're not doing anything
you can't do with the main CPU, but you're offloading it there, you're having it run slightly faster, and then the CPU can get on with other things. Although in this case it had to wait for the blitter chip to finish, it was still faster than doing it manually for all but the smallest of operations. The Amiga did the same thing: you had a blitter chip along with a couple of other things, which gave it the advantages in terms of graphics.

Of course, the problem it had, and which Atari found, is that as your CPU gets faster, you also have to make the blitter chip faster, or your CPU becomes much faster than the blitter chip and there's no point in having it on there. So the Atari TT, for example, which is a 32 megahertz 68030 as opposed to an 8 megahertz 68000, and that 68030 also had a cache, ran so much faster that the instructions would be in the instruction cache. So the CPU could just fetch the data at full speed from memory and write it back with the combination done, without the need for the blitter chip. And so very quickly, as your CPU gets faster, you've got a more general architecture and you can do whatever you need to do there. So actually the blitter chip became redundant in some of the Ataris pretty soon after it was implemented. Well, they did put a faster version into the Falcon, which can be useful in certain modes.

Acorn, on the other hand, when they built the Archimedes, at roughly the same sort of time as the Commodore Amiga and Atari ST had the blitter chips, they said: well, actually, our CPU's so fast that we don't need it.
There's no advantage to it.

Now, the same approach was taken with graphics cards in computers: you would end up with the graphics card having support to do some of the basic 2D operations that Windows is wanting to do, and so on. So rather than doing them on the actual CPU, you would hand that control off to the graphics card to do things like copying the memory about, or perhaps drawing a straight line, which you can very easily implement in hardware. And so you've got this balance: do you get your CPU to do this as a general piece of software, which you can change and update easily, or do you hand this over to a dedicated piece of hardware, where you can implement it in a way that's faster, and so on?

So some things, like the graphics things and floating point, have very much now been taken over by dedicated bits of hardware. [unclear] More and more we get GPUs, which are effectively general-purpose processing units, but they're very much configured and designed in a way that makes them very amenable to doing particularly 3D graphics processing, but also 2D graphics processing as well. And almost all CPUs these days, in desktop machines and laptops and so on, have a floating-point unit built in, because we're doing stuff that needs floating point. So it makes sense in some cases to have hardware to do that; in other cases it makes more sense to have it in the generalized nature of software.

You could be responsible for cleaning it yourself, you could be responsible for managing the house and things, or you can employ someone else to do it on your behalf. So you can employ a cleaner or someone to clean the house, and then you don't have to do it yourself, as some people might choose to do. It's the same with the computer."