The Art of Bit Blitting: A Technique for Efficient Graphics Rendering
The art of bit blitting is a technique used to efficiently render graphics on computers, particularly in the early days of gaming and graphical applications. This method involves using clever hardware and software combinations to copy data from one location to another, allowing for rapid rendering of images on screen.
When it comes to copying regions of memory, the bit blit algorithm proves to be an effective solution. By shifting the input data by a certain amount, we can achieve the same effect without having to actually shift the input itself. This is because everything we're going to copy is shifted by the same amount, so rather than shifting the input, we just shift where we read the output to the left and get the same effect.
However, this leaves us with two bits leftover, which need to be handled. To solve this problem, we can use a clever approach involving the first byte of memory. We shift it by shifting our output things, effectively copying the leftover bits into their correct positions. This is achieved through wiring up the hardware in such a way that when we come to read the next byte, these bits get copied from the original location into the desired position.
The result is that we can build this functionality into the literature of computer graphics programming, whether it's for the Amiga or any other machine. By using clever hardware and selecting different positions, we can design this into the program itself to achieve rapid rendering of images on screen. The BlitzChip can do this much quicker than software alone, allowing for optimized performance.
The bit blit algorithm has a rich history, dating back to its development on Xerox Alto machines in the 1970s. It was later popularized in SmallTalk programming languages and became widely used in the computer graphics industry during the 1980s and 1990s. This technique proved to be particularly useful for tasks such as scrolling text or rendering windows, where large amounts of data needed to be copied quickly.
One notable example of the use of bit blitting was in the Amiga machine, which utilized this technique to render graphics efficiently. The Amiga's BlitzChip processor could perform bit blit operations rapidly, allowing for smooth and fast rendering of images on screen. This was achieved by setting up the BlitzChip to perform the operation while the main CPU calculated where the next object would be placed.
However, with the advent of color graphics and continuous tone images, the need for bit blitting decreased. In these cases, using alpha channels to combine images became more practical. Alpha channels are used to represent the transparency or opacity of an image, allowing for smoother blending of colors. By multiplying the source image by its alpha channel and adding it to the destination image multiplied by one minus the alpha channel, we can achieve a similar effect to bit blitting.
In conclusion, the art of bit blitting is a powerful technique that has been used in computer graphics programming for decades. Its use in early gaming machines such as the Amiga demonstrates its effectiveness in rendering images quickly and efficiently. While its popularity may have waned with the advent of newer techniques, it remains an important part of computer graphics history and can still be useful in certain situations.
"WEBVTTKind: captionsLanguage: enone of the things that's always fascinated me is bits or bit blitz or the blit operation and things that it's very referred to depending on which computer system you're using which operating system what time of day it is that sort of thing this was an algorithm that was developed at xerox park back in the 70s by diana mary and dan ingles and a few others and it's an algorithm that was used right through the 70s through the 80s and even into the 90s on computer graphics hardware across all sorts of different computers and interpreting in software and things and so i thought it'd be fun to have a look at what this algorithm is used for why it's sometimes implemented in software why it's even faster if you implement it in hardware or can be faster if you implement it in hardware and things are just why it's so useful in computer graphics operations if you bought a computer in the 80s you may well see that listed on the features was that it had a blitter chip in there famously with the commodore amigos one of the things that they claimed made it so good and then it was added by atari into the atari st line a bit later on as well and that is just a hardware implementation of the same algorithm that xerox park and others have been using right back to the 70s so it's not something new that came from commodore or amiga or any of those things it's just implementing a standard algorithm in hardware because it works faster if you do it in hardware then if you do it in software so what does this algorithm do well it's very very simple it copies blocks of memory from one location to another but it has the ability to combine those two blocks or three blocks depending on the way you implement it in specific ways and depending on how you combine those things you can get interesting patterns appear on screen so it's mainly used for graphics operations although you can use it for other things the atari falcon for example uses it to speed up hard disk access because your hard disk appears as a location in memory and so you can copy from that location to other bits of memory very very quickly if you use the blitter chip to do it but in general it's used for heart in general it's used for graphics operations so let's remind ourselves about how a computer would represent a graphics display in its computer's memory and we're going to think about a very very simple one we're going to sort of go down to the bare bones we're going to have a set of pixels well let's just do with four by four so i don't have to draw so many squares so one two three four rows and four columns of pixels so we can reference any pixel on that by giving it a number so we've got 0 1 2 3 for the rows and 0 1 2 3 for the columns and then we can start to draw shapes in there so if i want to sort of square i can fill in the first two pixels on the first row filling the first two pixels on the second row we're going to keep this simple we're going to have pixels that are either black or white on off that's fine we can draw things and we can start to draw interesting patterns let's do the same here but we need a way to encode that into the computer's memory and the way that we do that is that we can convert these pixels into bits into the bits of the computer's memory and we can do that so okay a black pixel is going to be one because it's set and a white pixel is going to be zero because it's not set so we could take the first row and so we're going to represent that as a one bit because we've set it a one bit because we've set it a zero bit because we haven't set it in a zero bit because we haven't set it next line is exactly the same and then we can do the same again for this line not set not set so zero zero set that's a one not set that's a zero we've got an interesting pattern i don't know what it is and then on the last line not set not set not set set and so we can convert the patterns that we've created into a binary number and then we could store that in the computer's memory so this would be the equivalent of size 12 12 2 and 1 and we could store them in the computer's memory if we want more pixels so people say go to have 8 pixels we'd use 8 bits if we wanted 16 pixels we'd use 16 bits 2 bytes and so on to store that and what we do is we'd store this going starting from the top of the screen with the first line we store all the bits that may represent that so let's say we've got a 640 by 480 screen we'd have 640 bits 80 bytes worth of information 80 times 80 to 648 bits in the byte 8 pixels in a byte 80 bytes would represent that line then the next 80 bytes in memory would be the next line of pixels and that will continue for 480 lines and 32k of memory later you've got the data for your screen now that works very nicely you can store it in memory the other thing is it's actually very easy to build the display hardware to display that on screen because you can take a byte or a word of memory at a time load it into the sort of same shift registers that mike was talking about last week when you're talking about linear feedback shift registers you can hardly implementation that you load the byte in and then you clock out each of the bits one by one out of the shift register and that's the signal that you can send to your tv or whatever monitor device you're using to display things on screen so this approach works really nicely for storing data in a form that we can manipulate because we can just change the bits to change the pattern so for example if we wanted to invert this pattern here so rather than having black black white white we had white white black black we could take the bits run them for a not gate and we'd invert it so we get 0 0 1 1 0 zero one one one one zero one one one one zero and if we draw uh square out again so there we are so we can manipulate those bits to change the pattern that we've got on screen and what we find is that you can use various operations to alter the bits as they're stored in memory to create different graphical effects so for example if we wanted to draw a character on screen let's say we wanted to draw a character let's go for a letter a and i'm going to draw an eight by eight this time so we'll go a bit bigger so we can represent this and we've got a letter a as an 8-bit character and the same way we can convert this into a series of bytes that we can store in memory so we've got zero across the top because none of them are set i'm speeding things up we've got 60 in there if we convert it to binary 64 plus 2 that's 66 66 that's 126. we'll go with that 66 66 and we've got zero and we can fast forward through that as well and we've got a series of bytes that can represent a character a and we can display that character on screen just by going to some location and writing those bytes into the memory that's representing our screen one row after another so we have to write it at the right location the right byte in memory then add on the length of the screen line as much talked about in some of the videos on images write the next bite in and then we go down and that would appear on screen is this like is this what's known as the screen buffer yeah exactly this is the sort of the way that the screen buffer is it's just literally a series of bits that represent in memory and actually one of the things you find is when people started to add color rather than say using two pixels next to each other which you could do or three pixels four pixels to represent they use the same idea of having a bitmap display but then they put another one behind it and you read the color by looking at the first plane of bits and the second plane of bits behind it so this might have a zero that might have a one in the same location for a pixel so that's the color of the pixel is zero one color two so we can write those bytes into memory at that location and that would display them on the screen except what we'd end up doing is we would replace what was on the screen beforehand so we'd always get a letter a but it would all be on a white or black background depending on what way you've got the colors but what we could do is we could use a different way of storing them so we could actually say and i'm going to be clever here just to try and illustrate this if i can figure it out let's say we wanted to draw a line to show that this had been crossed out and we could do that by having a line and we'd represent that just by a series of pixels one after the other going in a diagonal pattern and we could work out the bit pound for that and start off with 128 64 32 i'm not going to bother going through that so what we want to do is to overlay that on top if i just take the numbers and write them we don't overwrite what was here but what i can do is i can take the number i've got here and combine it with the bits that i've got already on the screen so what i can do is i can take the number here which is 128 combine it with the number we've got here using the or operation and basically all i'm doing here is oring every bit here with the equivalent bit in the other one so that would then become 0 or 128 is 128 and then that would set that pixel there leaving the rest things do the same with the other one so we get 64 here 60 or 64. becomes 124 in that case we now do the next one we would have 32 and we'd all that in there 60 or 32 is going to be 96 and so on and we can work out the numbers and we could do the same thing so just to check so the all operation there is saying if it's not already a one make it a what yeah so it's exactly the same boolean logic that you've seen in computer science before you take two bits in the same position you run them through an or gate if either bit is set or both bits then the output is one otherwise the output is zero if both bits are zero so you could do that you could also do the same using and and you'd get interesting effects effects where you'd mask off some pixels so you only have the pixels set where both the original image and the new image had a pixel set there and if you draw the right shape there you could say have one image it was a circle another image which was something and them together and you'd end up with whatever the first image was now in a circle so what we find is is that by we can write software which can combine the things and to draw things on the screen if we want to copy a value one and just replace it we can just store the values in there if we want to overlay them we can or them if we want to sort of mask them we can use an and to do that and various other things we can invert them by running them through a not gate so various boolean operations boolean logic operations can have interesting effects as to what appears on screen now you can write the software to do that but what actually turns out is is that most of these operations end up looking the same let's say i want to copy one this is my screen an image over here let's say we've got a picture of a dog that's my image and i want to copy it to appear down here now how would i do that well i would start off reading in the bytes and make up this dog in the corner look at the number poppy zero there and i would copy the same bytes there then do it for the next one along the next one along the next one along the next one along next one in fact you do it bit by bit because you're copying the bits and things and then eventually you'd get to the bits that sort of make up the dog i don't know if that looks like a dog let's have a little bit more it's vaguely like grommet isn't it it's looking straight and on if i have a couple of eyes it might be more you're testing my graphical skills here there we are so that's a dog and we copy it by basically copying all the bits from one location to you and to do that we can basically we find out where in memory this starts and we find out how many bits it's taking up and we copy those bits here to this new location where we're putting them well we find out where this where this starts in memory and we copy them to these things and we start writing the bits there now the interesting thing is that you can have two rectangles here two regions and they might be different sizes so you might say copy this rectangle here to this rectangle here which is slightly smaller and in that case you want to work out okay this one's smaller i'll stop when i get to the edge of this one then move on to the next line see you can work out we'll keep them the same size just to simplify things so we can do that in a for loop we go across the top line copying all the bits one by one and to speed it up we probably copy the bytes but the problem with that is is that if this that's fine if the they line up on a multiple of eight or a multiple of 16 if your computer works in short words and things and so on but if this is say position eight we can read it easily we want to put it at a position let's say this is a 322 shall we say that doesn't line up to a multiple of eight the nearest multiple of eight is 320. so 320 bytes in from the middle of the screen but actually we want the bits to start two pixels further on so actually what we need to do is take the eight bits from here move them here we don't want to write them exactly we sort of shift them right uh by two which means two bits fall out the edge and we have to keep track of them and then we write that byte into memory and then we've got two bits left over from the previous by which we need to get to now be in the top two bar bits of that byte and we copy the next bit and then we so we end up we can write the software to do this it's relatively straightforward code and we just step through this copy them across and then we go on to the next line we update our pointer here to get back to the position here we update the pointer from the rectangle here to get to the next line and we do exactly the same thing and if you want to cop write a character on the screen you do this but you just store the values if you want to overlay them you do the same thing because this source rectangle doesn't actually have to be on the screen it just has to be somewhere in memory and the same operation applies it could be a different shape screen you just add different values on and so on so what the bit blit algorithm does is it basically is a way of automatically taking one source rectangle so we can take all the pixels from here and we describe where it is with us things from where it is in memory because memory's memory the way we make it into an image is the way that we describe that memory and treat it in our software and then the hardware that displays it so this doesn't have to be the same location in memory as long as we know where it is we know how long the width of the screen is with that block is in pixels and bytes and things we can describe that and as we know where this is going they can overlap and things we can then copy it if we just want to copy it from one location to the other we read a byte right at the right position if we want to overlay them we read a byte read a byte from the source image all them together write them back having shifted things if we need to to get them in the right place if we want to mask them we do the same with an end the bitly algorithm just generalizes this so you can then use this algorithm to copy from one region somewhere in the computer's memory to another uh applying whatever operation whatever combination of logical operations you want to do and as we said there's only a small number that actually makes sense there's only 15 that you can do so there's only 16 that you can do two of which set things to all be zero or all be one which you don't really need to do because you can just copy from a black bit or a white bit of the screen and so on so yeah there's a limited number that actually makes sense but there's only 16 that you could possibly do anyway so that's the bit algorithm it's just a way of copying things about and combining them to create fancy images now why did this all get implemented in hardware well the problem is is that when you start to write software to do this you actually have to write a program that's run by the cpu that does this thing you could think about a very trivial operation where you start okay get the first pixel work out where that is on screen work out which bit is representing that read that bit move that so it's in the right place to set the bit on the screen at the other thing combine it with the bit that's already there based on the operation you want to do and things if you do that there's a famous paper by rob pike describing how this works and they implemented it on a 6 8 000 back in the 80s it took 8 minutes to copy a screen which is a bit slow so but you can do it but there's ways you can speed it up as we said you could read a bite at a time and shift that bite to the right place and then write a bite at a time you'll get eight times faster because you're reading eight bytes or you can do it in sixteens if you've got 16 bit machine and so on and get 16 times faster roughly but if you actually look at what's happening if you were to write a program that was do it the way the cpu works is the cpu is going to read the instruction it's going to execute particularly on the cpus of the time these days you've got caches as we talked about in there which will actually sort of save it having to read that from memory again but the way this operation works does cause some interesting issues with the cache uh and operations if you don't have separate instruction and data caches because you're reading lots of data and so you're going all over memory and things so it's not cash friendly so you read an instruction which says load a byte from memory you load it you manipulate it and you store it but each of those operations requires another cpu instruction so actually what end up happening is that your cpu is spending most time is fetching the instructions to execute rather than fetching the graphics images and manipulating them so what people did because this is actually a very structured algorithm it's read by from memory read byte from memory combine bytes write result back to memory increment pointers repeat until you get to the end of the line increment pointers in a slightly different way rinse and repeat for the next line and so on it's actually very easy to build this in hardware and so what the blitter chips were in the amiga what the blizzard ships was in the atari sc and other computers were hardware implementations of the bit blit algorithm and by implementing it in hardware you can get a huge speed increase because that chip is all it does is bit bit operations so it doesn't have to fetch the instruction to say load the bike from memory then load the bike it just knows that it's loading it from memory loads that bright loads this bike from the destination combines and writes it back out to the destination gets on with the next one it's it's it's a fixed processor it does one thing and it does it very very quickly but also if we think about we talked about how if this is at pixel 8 and we're moving it to pixel 322 we have to shift it about and then we've got the bits left over and things you can start to do clever things in your design let's draw out uh a register just a bit memory inside the chip with eight one two three four five six seven eight what we can do is read in each bit from that byte into memory now we could have a shift register where we actually shift it about but there's actually a clever way that we can do this which is what some of the bitter chips did is that we can have another eight bits and i'm going to draw these slightly smaller one two three four five six seven eight there which line up next to it and when we start at the beginning of the line we set all these to be zero either all blank or all ones depending on how you're doing it to the matter is we'll just say zero and then when we come to actually we load in these bits with the ones for the pixels let's say we're copying that letter a which we had over there we're on this line so we're going to load in these bits um starting from this so we're going to have a blank one we're going gonna have a one in it and one one one one one zero close enough close enough and things in there we've got zero there now let's say we want to shift this two places to the right what we actually do is we don't do any shifting at all we just say well okay we'll read the most significant bit bit seven from there so that becomes bit seven that we use we use bit six from here then we can read bit five from here bit four from there bit three two bit one and bit zero from there okay no need to shift it we can just select those bits and hardware and send them to the output and what we do is we could have we want to shift it by three where we start here and read it and we just have a basically a switch inside so imagine the sort of rotary switches you have the digital equivalent of that which selects where it wants to read them from to sort of shift this across and just moves it so we shift the answer back because everything we're going to copy is going to be shifted by the same amount so rather than shifting the input we just shift where we read the output to the left and we get the same effect what do we do with those left over now this is where it gets clever so we've got these two bits left over so we read the first byte we shift it by shifting our output things and then we wired this up so when we come to read the next byte as it loads in those bits there all these bits get copied zero one one one one one one into these bits over here because now when we read in from the same place bit seven is what got what was bit one over there in there in the top bit bit zeros got into what was bit six those things are copied in there so again we can build the hardware to enable us to copy these things very very quickly so we can design this into the literature whether it's the amigas whether it's your ties whichever one it is by using clever hardware and just selecting different positions and so on and then you can write them out there's the bytes or the words that you're going to write out at the screen so the blitzchip can do this much quicker than you could do it in software and in fact you can optimize it even quicker some of them will say well actually you've selected a mode where i'm not actually combining anything if i'm just writing the value in there i don't need to read the destinations value first i can just use that things and then you can combine it so that's bit blip it's an algorithm that was developed on the xerox alto machines for copying regions it was made popular in small talk as well back in the early 80s and then the computers of the 80s and into the 90s were using this type of operation to sort of composite all their graphics together if you wanted to copy some text on screen you'd use a bit blit except you might not because one of the problems with it is it's great when the regions are big because the setup cost is quite small you have to set up i want to copy from this block of memory it's got this size and dimensions things to this block of memory it's got this size of dimensions and things that takes time if you're just copying a single character eight bytes it's often quicker to just do that manually on your cpu because it's the cost is similar to do it manually as it would be to set it up but if you want to copy a window from one side of the screen to another or you're scrolling a whole line of text in microsoft word or something where it's a lot of text it's much quicker to do it using the bit blit algorithm and i'm assuming you know judging by the machines you've talked about that this was used a lot for gaming and adding kind of characters onto backgrounds and things like that yes exactly i mean one of the things that the amiga did to make to act as a good games machine was that the blitter operations could copy blitter objects bobs as they were called around the screen as they needed them to draw them on screen very very quickly it would set up the blitz to do it and often this could run in parallel to the main cpu so the blitz could be copying data while you calculated where the next one was going to go and then put it on screen in some form there's limitations because you can't have two things accessing memory at the same time easily but you get some small form of parallelism in there and so on to do it the problem these days why it's perhaps not used as much is that as we said when you've got continuous tone images sort of gray scales 24-bit 32-bit color you don't want to do doing bit operations to combine things you have an eiffel channel which is a sort of value between zero and one mapped into a 8-bit or 10-bit or 16-bit value depending on your image system and you would then do similar things in that you'd take the source image and you would multiply it by its alpha channel you'd take the destination image and you'd multiply that by one minus the alpha channel and then you would add them together to get the same sort of effect so you can sort of think about some of these operations being done in boolean logic as being sort of alpha channels where they're forced to be either one or zero and then we can do it with logical operations do not try this at home absolutely only try this at home oh yeah that's a very good try this at home but not anywhere else at all in some sense remember when we talked about 3d images you could view rgb as in some sense 3d so the first plane is r g and b or vice versaone of the things that's always fascinated me is bits or bit blitz or the blit operation and things that it's very referred to depending on which computer system you're using which operating system what time of day it is that sort of thing this was an algorithm that was developed at xerox park back in the 70s by diana mary and dan ingles and a few others and it's an algorithm that was used right through the 70s through the 80s and even into the 90s on computer graphics hardware across all sorts of different computers and interpreting in software and things and so i thought it'd be fun to have a look at what this algorithm is used for why it's sometimes implemented in software why it's even faster if you implement it in hardware or can be faster if you implement it in hardware and things are just why it's so useful in computer graphics operations if you bought a computer in the 80s you may well see that listed on the features was that it had a blitter chip in there famously with the commodore amigos one of the things that they claimed made it so good and then it was added by atari into the atari st line a bit later on as well and that is just a hardware implementation of the same algorithm that xerox park and others have been using right back to the 70s so it's not something new that came from commodore or amiga or any of those things it's just implementing a standard algorithm in hardware because it works faster if you do it in hardware then if you do it in software so what does this algorithm do well it's very very simple it copies blocks of memory from one location to another but it has the ability to combine those two blocks or three blocks depending on the way you implement it in specific ways and depending on how you combine those things you can get interesting patterns appear on screen so it's mainly used for graphics operations although you can use it for other things the atari falcon for example uses it to speed up hard disk access because your hard disk appears as a location in memory and so you can copy from that location to other bits of memory very very quickly if you use the blitter chip to do it but in general it's used for heart in general it's used for graphics operations so let's remind ourselves about how a computer would represent a graphics display in its computer's memory and we're going to think about a very very simple one we're going to sort of go down to the bare bones we're going to have a set of pixels well let's just do with four by four so i don't have to draw so many squares so one two three four rows and four columns of pixels so we can reference any pixel on that by giving it a number so we've got 0 1 2 3 for the rows and 0 1 2 3 for the columns and then we can start to draw shapes in there so if i want to sort of square i can fill in the first two pixels on the first row filling the first two pixels on the second row we're going to keep this simple we're going to have pixels that are either black or white on off that's fine we can draw things and we can start to draw interesting patterns let's do the same here but we need a way to encode that into the computer's memory and the way that we do that is that we can convert these pixels into bits into the bits of the computer's memory and we can do that so okay a black pixel is going to be one because it's set and a white pixel is going to be zero because it's not set so we could take the first row and so we're going to represent that as a one bit because we've set it a one bit because we've set it a zero bit because we haven't set it in a zero bit because we haven't set it next line is exactly the same and then we can do the same again for this line not set not set so zero zero set that's a one not set that's a zero we've got an interesting pattern i don't know what it is and then on the last line not set not set not set set and so we can convert the patterns that we've created into a binary number and then we could store that in the computer's memory so this would be the equivalent of size 12 12 2 and 1 and we could store them in the computer's memory if we want more pixels so people say go to have 8 pixels we'd use 8 bits if we wanted 16 pixels we'd use 16 bits 2 bytes and so on to store that and what we do is we'd store this going starting from the top of the screen with the first line we store all the bits that may represent that so let's say we've got a 640 by 480 screen we'd have 640 bits 80 bytes worth of information 80 times 80 to 648 bits in the byte 8 pixels in a byte 80 bytes would represent that line then the next 80 bytes in memory would be the next line of pixels and that will continue for 480 lines and 32k of memory later you've got the data for your screen now that works very nicely you can store it in memory the other thing is it's actually very easy to build the display hardware to display that on screen because you can take a byte or a word of memory at a time load it into the sort of same shift registers that mike was talking about last week when you're talking about linear feedback shift registers you can hardly implementation that you load the byte in and then you clock out each of the bits one by one out of the shift register and that's the signal that you can send to your tv or whatever monitor device you're using to display things on screen so this approach works really nicely for storing data in a form that we can manipulate because we can just change the bits to change the pattern so for example if we wanted to invert this pattern here so rather than having black black white white we had white white black black we could take the bits run them for a not gate and we'd invert it so we get 0 0 1 1 0 zero one one one one zero one one one one zero and if we draw uh square out again so there we are so we can manipulate those bits to change the pattern that we've got on screen and what we find is that you can use various operations to alter the bits as they're stored in memory to create different graphical effects so for example if we wanted to draw a character on screen let's say we wanted to draw a character let's go for a letter a and i'm going to draw an eight by eight this time so we'll go a bit bigger so we can represent this and we've got a letter a as an 8-bit character and the same way we can convert this into a series of bytes that we can store in memory so we've got zero across the top because none of them are set i'm speeding things up we've got 60 in there if we convert it to binary 64 plus 2 that's 66 66 that's 126. we'll go with that 66 66 and we've got zero and we can fast forward through that as well and we've got a series of bytes that can represent a character a and we can display that character on screen just by going to some location and writing those bytes into the memory that's representing our screen one row after another so we have to write it at the right location the right byte in memory then add on the length of the screen line as much talked about in some of the videos on images write the next bite in and then we go down and that would appear on screen is this like is this what's known as the screen buffer yeah exactly this is the sort of the way that the screen buffer is it's just literally a series of bits that represent in memory and actually one of the things you find is when people started to add color rather than say using two pixels next to each other which you could do or three pixels four pixels to represent they use the same idea of having a bitmap display but then they put another one behind it and you read the color by looking at the first plane of bits and the second plane of bits behind it so this might have a zero that might have a one in the same location for a pixel so that's the color of the pixel is zero one color two so we can write those bytes into memory at that location and that would display them on the screen except what we'd end up doing is we would replace what was on the screen beforehand so we'd always get a letter a but it would all be on a white or black background depending on what way you've got the colors but what we could do is we could use a different way of storing them so we could actually say and i'm going to be clever here just to try and illustrate this if i can figure it out let's say we wanted to draw a line to show that this had been crossed out and we could do that by having a line and we'd represent that just by a series of pixels one after the other going in a diagonal pattern and we could work out the bit pound for that and start off with 128 64 32 i'm not going to bother going through that so what we want to do is to overlay that on top if i just take the numbers and write them we don't overwrite what was here but what i can do is i can take the number i've got here and combine it with the bits that i've got already on the screen so what i can do is i can take the number here which is 128 combine it with the number we've got here using the or operation and basically all i'm doing here is oring every bit here with the equivalent bit in the other one so that would then become 0 or 128 is 128 and then that would set that pixel there leaving the rest things do the same with the other one so we get 64 here 60 or 64. becomes 124 in that case we now do the next one we would have 32 and we'd all that in there 60 or 32 is going to be 96 and so on and we can work out the numbers and we could do the same thing so just to check so the all operation there is saying if it's not already a one make it a what yeah so it's exactly the same boolean logic that you've seen in computer science before you take two bits in the same position you run them through an or gate if either bit is set or both bits then the output is one otherwise the output is zero if both bits are zero so you could do that you could also do the same using and and you'd get interesting effects effects where you'd mask off some pixels so you only have the pixels set where both the original image and the new image had a pixel set there and if you draw the right shape there you could say have one image it was a circle another image which was something and them together and you'd end up with whatever the first image was now in a circle so what we find is is that by we can write software which can combine the things and to draw things on the screen if we want to copy a value one and just replace it we can just store the values in there if we want to overlay them we can or them if we want to sort of mask them we can use an and to do that and various other things we can invert them by running them through a not gate so various boolean operations boolean logic operations can have interesting effects as to what appears on screen now you can write the software to do that but what actually turns out is is that most of these operations end up looking the same let's say i want to copy one this is my screen an image over here let's say we've got a picture of a dog that's my image and i want to copy it to appear down here now how would i do that well i would start off reading in the bytes and make up this dog in the corner look at the number poppy zero there and i would copy the same bytes there then do it for the next one along the next one along the next one along the next one along next one in fact you do it bit by bit because you're copying the bits and things and then eventually you'd get to the bits that sort of make up the dog i don't know if that looks like a dog let's have a little bit more it's vaguely like grommet isn't it it's looking straight and on if i have a couple of eyes it might be more you're testing my graphical skills here there we are so that's a dog and we copy it by basically copying all the bits from one location to you and to do that we can basically we find out where in memory this starts and we find out how many bits it's taking up and we copy those bits here to this new location where we're putting them well we find out where this where this starts in memory and we copy them to these things and we start writing the bits there now the interesting thing is that you can have two rectangles here two regions and they might be different sizes so you might say copy this rectangle here to this rectangle here which is slightly smaller and in that case you want to work out okay this one's smaller i'll stop when i get to the edge of this one then move on to the next line see you can work out we'll keep them the same size just to simplify things so we can do that in a for loop we go across the top line copying all the bits one by one and to speed it up we probably copy the bytes but the problem with that is is that if this that's fine if the they line up on a multiple of eight or a multiple of 16 if your computer works in short words and things and so on but if this is say position eight we can read it easily we want to put it at a position let's say this is a 322 shall we say that doesn't line up to a multiple of eight the nearest multiple of eight is 320. so 320 bytes in from the middle of the screen but actually we want the bits to start two pixels further on so actually what we need to do is take the eight bits from here move them here we don't want to write them exactly we sort of shift them right uh by two which means two bits fall out the edge and we have to keep track of them and then we write that byte into memory and then we've got two bits left over from the previous by which we need to get to now be in the top two bar bits of that byte and we copy the next bit and then we so we end up we can write the software to do this it's relatively straightforward code and we just step through this copy them across and then we go on to the next line we update our pointer here to get back to the position here we update the pointer from the rectangle here to get to the next line and we do exactly the same thing and if you want to cop write a character on the screen you do this but you just store the values if you want to overlay them you do the same thing because this source rectangle doesn't actually have to be on the screen it just has to be somewhere in memory and the same operation applies it could be a different shape screen you just add different values on and so on so what the bit blit algorithm does is it basically is a way of automatically taking one source rectangle so we can take all the pixels from here and we describe where it is with us things from where it is in memory because memory's memory the way we make it into an image is the way that we describe that memory and treat it in our software and then the hardware that displays it so this doesn't have to be the same location in memory as long as we know where it is we know how long the width of the screen is with that block is in pixels and bytes and things we can describe that and as we know where this is going they can overlap and things we can then copy it if we just want to copy it from one location to the other we read a byte right at the right position if we want to overlay them we read a byte read a byte from the source image all them together write them back having shifted things if we need to to get them in the right place if we want to mask them we do the same with an end the bitly algorithm just generalizes this so you can then use this algorithm to copy from one region somewhere in the computer's memory to another uh applying whatever operation whatever combination of logical operations you want to do and as we said there's only a small number that actually makes sense there's only 15 that you can do so there's only 16 that you can do two of which set things to all be zero or all be one which you don't really need to do because you can just copy from a black bit or a white bit of the screen and so on so yeah there's a limited number that actually makes sense but there's only 16 that you could possibly do anyway so that's the bit algorithm it's just a way of copying things about and combining them to create fancy images now why did this all get implemented in hardware well the problem is is that when you start to write software to do this you actually have to write a program that's run by the cpu that does this thing you could think about a very trivial operation where you start okay get the first pixel work out where that is on screen work out which bit is representing that read that bit move that so it's in the right place to set the bit on the screen at the other thing combine it with the bit that's already there based on the operation you want to do and things if you do that there's a famous paper by rob pike describing how this works and they implemented it on a 6 8 000 back in the 80s it took 8 minutes to copy a screen which is a bit slow so but you can do it but there's ways you can speed it up as we said you could read a bite at a time and shift that bite to the right place and then write a bite at a time you'll get eight times faster because you're reading eight bytes or you can do it in sixteens if you've got 16 bit machine and so on and get 16 times faster roughly but if you actually look at what's happening if you were to write a program that was do it the way the cpu works is the cpu is going to read the instruction it's going to execute particularly on the cpus of the time these days you've got caches as we talked about in there which will actually sort of save it having to read that from memory again but the way this operation works does cause some interesting issues with the cache uh and operations if you don't have separate instruction and data caches because you're reading lots of data and so you're going all over memory and things so it's not cash friendly so you read an instruction which says load a byte from memory you load it you manipulate it and you store it but each of those operations requires another cpu instruction so actually what end up happening is that your cpu is spending most time is fetching the instructions to execute rather than fetching the graphics images and manipulating them so what people did because this is actually a very structured algorithm it's read by from memory read byte from memory combine bytes write result back to memory increment pointers repeat until you get to the end of the line increment pointers in a slightly different way rinse and repeat for the next line and so on it's actually very easy to build this in hardware and so what the blitter chips were in the amiga what the blizzard ships was in the atari sc and other computers were hardware implementations of the bit blit algorithm and by implementing it in hardware you can get a huge speed increase because that chip is all it does is bit bit operations so it doesn't have to fetch the instruction to say load the bike from memory then load the bike it just knows that it's loading it from memory loads that bright loads this bike from the destination combines and writes it back out to the destination gets on with the next one it's it's it's a fixed processor it does one thing and it does it very very quickly but also if we think about we talked about how if this is at pixel 8 and we're moving it to pixel 322 we have to shift it about and then we've got the bits left over and things you can start to do clever things in your design let's draw out uh a register just a bit memory inside the chip with eight one two three four five six seven eight what we can do is read in each bit from that byte into memory now we could have a shift register where we actually shift it about but there's actually a clever way that we can do this which is what some of the bitter chips did is that we can have another eight bits and i'm going to draw these slightly smaller one two three four five six seven eight there which line up next to it and when we start at the beginning of the line we set all these to be zero either all blank or all ones depending on how you're doing it to the matter is we'll just say zero and then when we come to actually we load in these bits with the ones for the pixels let's say we're copying that letter a which we had over there we're on this line so we're going to load in these bits um starting from this so we're going to have a blank one we're going gonna have a one in it and one one one one one zero close enough close enough and things in there we've got zero there now let's say we want to shift this two places to the right what we actually do is we don't do any shifting at all we just say well okay we'll read the most significant bit bit seven from there so that becomes bit seven that we use we use bit six from here then we can read bit five from here bit four from there bit three two bit one and bit zero from there okay no need to shift it we can just select those bits and hardware and send them to the output and what we do is we could have we want to shift it by three where we start here and read it and we just have a basically a switch inside so imagine the sort of rotary switches you have the digital equivalent of that which selects where it wants to read them from to sort of shift this across and just moves it so we shift the answer back because everything we're going to copy is going to be shifted by the same amount so rather than shifting the input we just shift where we read the output to the left and we get the same effect what do we do with those left over now this is where it gets clever so we've got these two bits left over so we read the first byte we shift it by shifting our output things and then we wired this up so when we come to read the next byte as it loads in those bits there all these bits get copied zero one one one one one one into these bits over here because now when we read in from the same place bit seven is what got what was bit one over there in there in the top bit bit zeros got into what was bit six those things are copied in there so again we can build the hardware to enable us to copy these things very very quickly so we can design this into the literature whether it's the amigas whether it's your ties whichever one it is by using clever hardware and just selecting different positions and so on and then you can write them out there's the bytes or the words that you're going to write out at the screen so the blitzchip can do this much quicker than you could do it in software and in fact you can optimize it even quicker some of them will say well actually you've selected a mode where i'm not actually combining anything if i'm just writing the value in there i don't need to read the destinations value first i can just use that things and then you can combine it so that's bit blip it's an algorithm that was developed on the xerox alto machines for copying regions it was made popular in small talk as well back in the early 80s and then the computers of the 80s and into the 90s were using this type of operation to sort of composite all their graphics together if you wanted to copy some text on screen you'd use a bit blit except you might not because one of the problems with it is it's great when the regions are big because the setup cost is quite small you have to set up i want to copy from this block of memory it's got this size and dimensions things to this block of memory it's got this size of dimensions and things that takes time if you're just copying a single character eight bytes it's often quicker to just do that manually on your cpu because it's the cost is similar to do it manually as it would be to set it up but if you want to copy a window from one side of the screen to another or you're scrolling a whole line of text in microsoft word or something where it's a lot of text it's much quicker to do it using the bit blit algorithm and i'm assuming you know judging by the machines you've talked about that this was used a lot for gaming and adding kind of characters onto backgrounds and things like that yes exactly i mean one of the things that the amiga did to make to act as a good games machine was that the blitter operations could copy blitter objects bobs as they were called around the screen as they needed them to draw them on screen very very quickly it would set up the blitz to do it and often this could run in parallel to the main cpu so the blitz could be copying data while you calculated where the next one was going to go and then put it on screen in some form there's limitations because you can't have two things accessing memory at the same time easily but you get some small form of parallelism in there and so on to do it the problem these days why it's perhaps not used as much is that as we said when you've got continuous tone images sort of gray scales 24-bit 32-bit color you don't want to do doing bit operations to combine things you have an eiffel channel which is a sort of value between zero and one mapped into a 8-bit or 10-bit or 16-bit value depending on your image system and you would then do similar things in that you'd take the source image and you would multiply it by its alpha channel you'd take the destination image and you'd multiply that by one minus the alpha channel and then you would add them together to get the same sort of effect so you can sort of think about some of these operations being done in boolean logic as being sort of alpha channels where they're forced to be either one or zero and then we can do it with logical operations do not try this at home absolutely only try this at home oh yeah that's a very good try this at home but not anywhere else at all in some sense remember when we talked about 3d images you could view rgb as in some sense 3d so the first plane is r g and b or vice versa\n"