Why Files Become Bigger in Emails - Computerphile

Encoding Binary Data into ASCII Characters

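When it comes to sending binary data over email or other systems that only support ASCII characters, we need a way to convert our binary data into a character-based form. The technique that has become standard is base64, which regroups the bits of the data into six-bit chunks and maps each chunk to one of 64 printable characters. When the data is not a multiple of three bytes long, the final group is padded with zero bits, and one or two equals characters are written to indicate how many bytes of that group should be read: one equals character means two bytes, and two equals characters mean one byte.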
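A minimal sketch of the padding rule, using Python's standard base64 module. The helper name show_padding is made up for illustration, and the sample bytes are the 42, 13, 10 from the video's walkthrough.

```python
import base64


def show_padding(data: bytes) -> str:
    """Return the base64 text for data so the trailing '=' padding is visible."""
    return base64.b64encode(data).decode("ascii")


print(show_padding(bytes([42, 13, 10])))  # three bytes: no padding needed
print(show_padding(bytes([42, 13])))      # two bytes: one '=' at the end
print(show_padding(bytes([42])))          # one byte: two '=' at the end
```

The three calls give Kg0K, Kg0= and Kg== respectively, so the number of equals characters tells the decoder how many bytes of the last group are real.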

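For example, let's say we want to encode a message that consists of three bytes. We concatenate those three bytes into a single 24-bit value, with the first byte as the most significant bits, and partition that value into four six-bit chunks. Each chunk is a number from 0 to 63, which a table maps to a character: A to Z for 0 to 25, a to z for 26 to 51, the digits 0 to 9 for 52 to 61, and + and / for 62 and 63. Every three bytes therefore become four characters. For the bytes 42, 13 and 10, the chunks are 10, 32, 52 and 10, which encode as Kg0K.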
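The three-bytes-to-four-characters step can be sketched by hand as follows. This is an illustration rather than a full encoder: the names ALPHABET and encode_group are invented here, and padding of a short final group is not handled.

```python
# The 64-character table described in the video: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(b1: int, b2: int, b3: int) -> str:
    """Encode exactly three bytes as four base64 characters."""
    # Pack the bytes into one 24-bit number, first byte most significant.
    value = (b1 << 16) | (b2 << 8) | b3
    # Take four 6-bit chunks, starting from the most significant end.
    chunks = [(value >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[c] for c in chunks)


print(encode_group(42, 13, 10))  # Kg0K
```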

At the receiving end we reverse the process. We start with the first character, 'K', which is the value 10 in our table (counting from A as 0), and write down its six-bit value, 001010. We do the same for the next two characters, 'g' (32, or 100000) and '0' (52, or 110100), and for the last character, 'K' again, which gives 001010. Together these form a string of 24 bits.

Taking this bit string, we split it into eight-bit chunks. The most significant eight bits, 00101010, give us the first byte of the original message, 42. The next eight bits, 00001101, give us the second byte, 13, and the final eight bits, 00001010, give us the third byte, 10.
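The reverse direction can be sketched the same way. Again this is only an illustration under stated assumptions: ALPHABET and decode_group are hypothetical names, and the function expects exactly four unpadded characters.

```python
# The 64-character table described in the video: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_group(chars: str) -> bytes:
    """Decode exactly four unpadded base64 characters into three bytes."""
    value = 0
    for ch in chars:
        # Look up each character's 6-bit value and append it to the 24-bit number.
        value = (value << 6) | ALPHABET.index(ch)
    # Split the 24-bit number back into three 8-bit bytes, most significant first.
    return bytes([(value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF])


print(list(decode_group("Kg0K")))  # [42, 13, 10]
```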

This technique allows us to take any binary file and convert it into a series of characters that can be sent over email or other systems that only support ASCII characters. Because every three bytes become four characters, it increases the size of the file by about a third, but it is an effective way to transmit binary data in this format.
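The size increase can be checked with a short sketch using Python's standard base64 module. The helper name encoded_length is an assumption for illustration, and the input is just a run of zero bytes, since only the length matters here.

```python
import base64


def encoded_length(n: int) -> int:
    """Number of base64 characters needed for n bytes, ignoring line breaks."""
    return len(base64.b64encode(bytes(n)))


for n in (3, 300, 3000, 3001):
    size = encoded_length(n)
    print(f"{n} bytes -> {size} characters ({size / n:.2f}x)")
```

Each group of three bytes, including a padded final group, costs four characters. In an actual email the encoded text is also broken into lines, so the message ends up slightly larger still.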

Another system used for encoding binary data is ASCII85, which was designed for PDF so that files could be sent over email without further encoding. It works on the same principle as base64 but is built around powers of 85 rather than 64, so the encoded data grows by less. The ASCII85 alphabet uses lowercase and uppercase letters, numbers, and a few other symbols, but it has some limitations. For example, not all systems that transmitted email in the early 1990s used ASCII: some ran on IBM systems using EBCDIC, which did not handle certain of the 85 characters properly, so files sent through those systems could be corrupted.
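As a rough sketch of the powers-of-85 idea: four bytes are treated as one 32-bit number and written as five base-85 digits, each offset by 33 so that it lands on a printable ASCII character. The helper a85_group is hypothetical, and the offset of 33 is the convention followed by Python's standard base64.a85encode, which is used here only as a cross-check; neither detail is stated in the video.

```python
import base64


def a85_group(data: bytes) -> str:
    """Encode exactly four bytes as five Ascii85 characters (no special cases)."""
    value = int.from_bytes(data, "big")
    digits = []
    for _ in range(5):
        # Repeatedly divide by 85; the remainders are the base-85 digits.
        value, digit = divmod(value, 85)
        digits.append(chr(digit + 33))
    # Remainders come out least significant first, so reverse them.
    return "".join(reversed(digits))


sample = bytes([42, 13, 10, 56])
by_hand = a85_group(sample)
by_library = base64.a85encode(sample).decode("ascii")
print(by_hand, by_library, len(by_hand))
```

Four bytes become five characters, an increase of a quarter rather than the third that base64 costs.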

Because of these limitations, ASCII85 fell out of fashion in PDF: people ended up uuencoding their files, or MIME-encoding them with base64, anyway. Today, base64 encoding is the standard method for encoding binary data into ASCII characters.

Base64 is the encoding used by MIME (Multipurpose Internet Mail Extensions). Before MIME, the uuencode and uudecode programs did a similar job, but as a manual process: the recipient had to find the encoded text, cut it out, and feed it into the decoder. MIME lets an email have different sections that can be encoded in different ways, and each section states how it is encoded and what type of file it contains, such as an image or a particular kind of audio file. The binary data itself still has to be converted into a form that fits the original email standards.
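To see what a MIME message with a binary section looks like, here is a small sketch using Python's standard email.message.EmailMessage class. The function name build_message, the addresses, the subject and the filename data.bin are all placeholders invented for this example.

```python
from email.message import EmailMessage


def build_message(payload: bytes) -> EmailMessage:
    """Build a message with a text body and one binary attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "receiver@example.com"      # placeholder address
    msg["Subject"] = "Binary attachment"
    msg.set_content("The attachment below is carried as ASCII text.")
    # Raw bytes are attached as their own section, with a declared type.
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="octet-stream",
        filename="data.bin",                # hypothetical filename
    )
    return msg


text = build_message(bytes([42, 13, 10, 56])).as_string()
print(text)
```

The printed message has two sections separated by a boundary line. The attachment section carries headers naming its type and its transfer encoding (base64), followed by the four sample bytes encoded as Kg0KOA==.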

In conclusion, there are several techniques used to encode binary data into ASCII characters, including uuencoding, ASCII85, and the base64 encoding used by MIME. While these techniques make the data larger, they allow us to transmit binary data over email or other systems that only support ASCII characters.

I thought we could revisit the topic of email. I know Tom Rodden, way back at the start of Computerphile, did a video overview of how email worked, but I want to focus in on one bit which I find quite interesting, because I'm that sort of strange type of character, and that is: how do we transmit things that aren't text through email? I mean, email as written at the time was created, I think originally in the 70s (the sort of standards we're using now), to send what was predominantly going to be ASCII text: small amounts of text sent between two users on a Unix system. These days we'll email programs, zip files, and I want to look at how we encode things that aren't text to be sent over email.

If you look in the specifications for email, you can go and find the RFC (RFC 822, for those who are interested; it's been updated by RFC 2822). If you look in that, it describes email as being a series of lines of text, and by lines of text it means a run of ASCII characters, ideally no more than 80 characters or so, they say (I think it's 78 they actually specify), followed by a carriage return and a line feed. But it does stipulate it cannot be more than 990 characters, so you've got to keep the lines relatively small; 80 columns is effectively what they're aiming for. So it's a series of lines of text that are sent between two computer systems. It also says that those characters can only be ASCII codes between 1 and 127, so it's 7-bit ASCII, and you can't use carriage return or line feed except to mean the end of the line.

So you can already see that we're limited in what we can send by email. We can send text; basically, email is designed to send text. If we want to send anything else, we've got a problem. But of course these days we all email (if we use email, that is) all sorts of things: we might send pictures, music files, programs, PDFs, all sorts of other things. So I want to look at how we can take any arbitrary binary file and encode it so we can send it
in an email.

So it sounds like some serious limitations, if there's a 990-character limit? You know, it sounds more like an SMS message or a tweet.

Well, the 990 was a limit on any one line, so you can have as many lines as you like, so you could have as long a message as you like. But you're right, it's very much of the computer systems of the day. You may well be reading this on a teletype, where it's being sent to a line printer, and then you get to the edge of the lined paper and you can't print any more characters anyway, so you're going to have to go onto another line. Or you're reading it on a small display which can't display any more than about 80 characters anyway. So in some ways it wasn't that much of a limitation, because you just have multiple lines, one after the other, for each line of the message you're wanting to send. And the machines that they were using couldn't display graphics that well either; most of them were purely text-based machines, or they encoded graphics using text characters, slashes and things, and other escape codes to display things. So it was a limitation, but it wasn't that much of a limitation given the computers that they were using at the time.

Of course, though, now we want to send any sort of file. And it wasn't that long afterwards, as Tom says, that there was a program called uuencode, and the corresponding one called uudecode, which would take any arbitrary binary file and convert it into a series of characters that would fit the spec and could be sent over email. uuencode worked, but it was very much a manual process: you'd have to find the ASCII soup, cut it out, and feed it into the decoder program. So what happened in the late 90s is there was another standard proposed called MIME, Multipurpose Internet Mail Extensions, which came up with a slightly different way of doing it. It enables you to have different sections in the email that can be encoded in different ways. It'll
explain how they're encoded; it'll explain what type of file it is (is it an image, is it an audio file, what type of audio file, that sort of thing). But it still had to do the same sort of thing: it still had to take the binary data that represents the actual image, or whatever it is we're trying to send (we'll use an image as an example), and it has to convert it into a form that could still fit the old email standards. We're going to have to take a series of bytes, one after the other, and effectively convert them into a series of characters that we can decode at the other end to produce the original file again.

Let's suppose we want to send an image. So the first four bytes that we're going to want to send of that image are, well, obviously 42, 13, 10, and let's pick another number, oh, let's go for 56. And that's only the beginning of a whole load more bytes. But we said that we've only got the values 1 to 127 that we can send as part of an email; that's what the specification says. And also some of those bytes have special meaning. So for example 13, the number 13, actually means, if you think about an old typewriter, move the carriage from one end right back to the beginning so you can type from the left again. Line feed, which has the number 10 in the ASCII character set, means move down onto the next line. So carriage return, line feed: go on to the next line. So these values we can't just send; we need some way of taking these bytes and converting them into characters that we can send.

Now the trick that's used is to not think about them as bytes, as decimal numbers, but to actually think about them as a series of eight bits, one after the other. So the number 42, represented in binary, is 00101010 (I'm doing this from memory, so I could well make a mistake, at which point Sean will correct me as I go). So 13 is 00001101, 10 is 00001010, and 56 is 00111000. We get the
picture. Rather than thinking about it as a series of bytes, eight-bit values, we just think about it as a stream of bits, and what we're going to do is cut this stream of bits up into smaller chunks. So rather than cutting it up into the eight-bit chunks that we started with, we'll cut it up into, say, five-bit chunks or six-bit chunks, or whatever it is we decide is the best one to do.

So let's think about why that would work. Well, let's think about the characters that we've got to encode values. What characters have we got that we can actually use? Well, we've got the letters a, b, c through to z, and that gives us 26 possible things. Could we use those 26 characters to encode things? Well, if we took five bits, that would require 32 characters; that's more than we can get in a to z. But if we took four bits, we'd only need sixteen possible values (two to the power of four is sixteen), so we've got sixteen possible values in there, zero through to fifteen, and we could encode each of those values using a different letter: give 0 a, 1 b, 2 c, and so on, until we've encoded all 16 possible values. So we could just do that. We could say, OK, let's take the first four bits that are here, 0010; that's two, so we would write that down as c, and we could do the same for each of the subsequent chunks of four bits.

But there's a problem with doing that. If you think about it, each byte of the message was eight bits long, and so what we'd end up doing is converting it to two characters, and so the file that we'd end up with would double in size. So doing it with just four bits, while we can map that into the letters that we can send, ends up producing a file that doubles in size. So perhaps not the best way to do it.

So let's rethink things a bit. What if we were to use five-bit chunks? That has 32 possible values, from zero through to 31, but we've only got 26
letters in the alphabet. We've got a, b, c and so on through to z, so that's not enough letters to encode the actual values. But we've still got other characters that we could use. So for example we could follow the z with 0, and then 1, 2, 3, 4 and 5, and that would give us 32 possible values. So now we could go through the same thing, and we would take the first five bits, which in this case would be 00101, which has the value of five, and we would take the fifth letter of the alphabet, which would be e. That's the first one of the message. So we do that, and then we do the next five bits, 01000, which is, what's that, eight, so we take the eighth letter of the alphabet, which is, Sean? H. H. And so now we've encoded ten bits, which takes us into two characters, and we could then do the same for the next five bits, and convert the whole message.

And the advantage of doing that is that it will take up... well, it will take up more space. All these encoding systems are going to take up more space to send the message than if we could just send the raw eight-bit bytes across, because we're having to encode them into fewer values, so we're going to end up using more characters. We can't help that. So we could do it with five bits, but it's still a bit wasteful. We know we can't go up to seven bits, because with seven bits we would have to use all the possible values we can send, and we know that some of them, like zero, we can't use because of the specification, and 10 has to follow 13 according to the specification, so we couldn't send that. So actually we haven't got enough with seven.

So what they actually chose was to use six bits, which gives us 64 possible values. So the encoding system that was developed was called base64, because we take the bytes we want to send and encode them from eight bits per symbol into six bits per symbol, in a way that we can then convert them back. Now how do we do it? Well, it's exactly the same technique
as what we've looked at already (I'm just going to undo this). We're going to take chunks from the bytes that make up the message, six bits at a time, to form the characters we're going to need. So we're going to need to define 64 characters that we can use. The way we do that: we'll start off using capital A through to capital Z, which gives us 26. We can also use lowercase a through to lowercase z, which gives us 52. And then we can use 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9, which gives us now 62 possible things. We need a couple of extra symbols that we can type, so when they developed the standard they used plus and the forward slash as well, and that gives us 64 possible values. So if we take six-bit chunks of the bits that make up the message, when we've concatenated them one after the other, we can then encode them into these symbols to form the characters that we can send over.

So let's have a look at our message again and work out how we do that. Now, because we've chosen six bits, three bytes (that's three times eight, 24 bits) will encode as four of these six-bit symbols we're using. So every three bytes of our message will become four bytes of the encoded version. So we can look at the three bytes that we've got at the beginning here, and we've got 42, 13 and 10 as the ones we're going to send. We first of all need to group them in the same order at both the sending point and the receiving point, so we get the numbers the right way round. The rule that's been set for base64 encoding is that the first byte becomes the most significant bits, so we'd write that one down first: 00101010. The next byte becomes the middle significant bits, for want of a better way of putting it, so that would be 00001101 in our message. And then finally byte three becomes the least significant bits. So we're mapping these into a 24-bit number, and now
we start breaking that down into six-bit chunks, and then we can convert them into the characters in what's effectively our lookup table that we generated before: A, B, C, D, E and so on to capital Z, then a, b, c lowercase through to lowercase z, the numbers, and so on. So let's take the first six bits; we always start from the most significant end. 001010 is ten, so we want the tenth letter of our sequence. Now, we start with A being zero, so the tenth symbol would be K, I think. So we take the first six bits and we map them to the capital K. We now do the same for the next six bits: the next one is 32, which is the lowercase g. We now take the next six; that's going to be 52, so the 52nd should be 0. And then we do it for the final set of six bits, which happen to be 10 again, so that becomes the capital K that we had at the beginning. So the first three bytes that we encode, which were 42, 13, 10 (they were the numbers that made up those bytes in the file that we created), would encode as capital K, lowercase g, 0, capital K. We could then do the same for the next three bytes, and that would become another series of four characters; the next three bytes after that give us another series of four characters; and we could keep going like that until we'd come to the end of the binary file that we're wanting to send.

But there is a slight wrinkle here, because our binary file might not be an exact multiple of three bytes long. It might be four bytes, or it might be 902 bytes long, so it might not be a multiple of three, and so we need some way to encode that. The way that we do that is that we use an extra symbol. So at the end, if we are encoding, say, two bytes, we encode those two bytes using the same thing, we pad with zeros at the end, and then we write an equals character to say only use two bytes of this. And if we're only encoding one extra byte at the end, we use two equals characters to say we're only encoding one extra byte
and so by this technique we can take the message and convert every three bytes of it into four characters that we can send over email and actually this technique has been used in lots of other places where you'll need to encode binary data into a character based form that can be that then sent and at the other end you just do the reverse our message here we'd have k was the first set of six bits so we would look that up in the table we'd see that that's the tenth value so we'd write down the six bit value for k for ten which is zero zero one zero one zero we do the same for g which is 32 we do the same for zero and of course i can see the values already on screen here and we would do the same for 10 of course which we did at the beginning so we get 0 0 1 0 1 0. so we get that string of bits making of our 24 bits and then we would partition that up into eight we take the top most significant eight bits and we get the binary value which of course is our original message 42 and then we take the next eight bits which would give us the binary value for and we take the final eight bits of that 24 bit number the least significant bits which would give us the third byte which would be the binary value of turn so whenever we send a binary file what your computer has to do is to take the bytes of that binary file split it up into a series of bits in this case we're using six bits and then map those six bits to a particular character in the encoding so that we can then send it it makes the file slightly bigger in this case it's going to increase by about the third in size as you do it each time you encode it but it doesn't mean that we can send it over the network send it over as an email or any other system which makes use of this people have tried other systems which don't increase the size of the messenger much pdf for example initially they tried to design that in the way that it could just be sent over email without needing to be encoded or certainly not needing uun 
coding at all um so they use what they called ascii 85 and what ascii 85 does it shows 85 characters that was from the symbols that would supposedly be transmittable over email without needing to be converted so they used a through z lowercase and uppercase as before they would use the numbers and a few other symbols so they could get up to the 85 and you would do exactly the same thing although this time it was done around powers of 85 i think so you raise numbers and things it was slightly more complicated system uh it worked and you could encode things but the trouble they found was that not all systems that were transmitting email at that time in the early 90s used ascii to encode things and i think there was a problem that it was sent through some email systems that were running on ibm systems using ebcdic if i remember right and certainly the characters wouldn't be encoded properly in the ebcdic version or whatever it was that it was used so as it was sent through those email systems the file will get corrupted and so in the end people just were starting to uu encode it or mime encode it using base64 anyway and so it sort of fell out of fashion being used in pdf so there's various other techniques you can use but base64 has very much become the sort of standard now if you want to take some binary data and encode it into a form that it can be sent over something that only supports ascii base64 encoding isn't the way that you do it as well so they developed a system called main which is a multimedia extension and main then signals that inside this we've got all our five songs in and then we've got five clips of just the song recorded through a microphonei thought we could revisit the topic of email i know tom rodden way back at the start of computer file did a video overview of how email worked but i want to focus in on one bit which i find quite interesting because i'm that sort of strange type of character and that is how do we transmit things that aren't 
text through email i mean emails written at the time was created i think originally in the 70s the sort of standards we're using now to send what was predominantly going to be ascii text small amounts of text sent between two users on a unix system these days we'll email programs zip files and i want to look at how we encode things that aren't text to be sent over email if you look in the specifications for email you can go and find the rfc rfc822 for those who are interested in been updated by rfc2822 if you look in that it describes email as being a series of lines of text and by lines of text it means a run of ascii characters ideally no more than 80 characters or so they say i think it's 78 they actually specify followed by a carriage return and a line feed but it does stipulate it cannot be more than 990 characters so you've got to keep the lines relatively small 80 columns is effectively what they're aiming for so it's a series of lines of text that are sent between two computer systems it also says that those characters can only be ascii codes between 1 and 127 so it's 7 bit ascii and you can't use carriage return or line feed except to mean the end of the line so you can already see that we're limited in what we can send between email we can send text basically email is designed to send text if we want to send anything else we've got a problem but of course these days we all email if we use email that is all sorts of things we might send pictures music files programs pdfs all sorts of other things we might send in email so i want to look at how we can sort of take any arbitrary binary file and encode it so we can send it in an email so it sounds like some serious limitations if there's a 990 character limit you know it sounds more like an sms message or a tweet well the 990 was a limit on any one line so you can have as many lines as you like so you could have a as long a message as you like but you're right it's sort of it's very much in the sort of 
computer systems of the day you you may well be reading this on a tele type where it's being sent to a line printer and then you get to the edge of the sort of lined paper and you can't print any more characters anyway so you're gonna have to go into another line or you're reading it on a sort of small display which can't display any more than about 80 characters anyway so in some ways it wasn't that much of a limitation because you just have multiple lines one after the other for each line of the message you're wanting to send and the machines that they were using couldn't display graphics that well either most of them they were purely text-based machines or they encoded graphics using text characters sort of slashes and things and other escape codes to display things so it was a limitation but it wasn't that much of a limitation given the computers that they were using at the time of course though now we want to send any sort of file and to do it and it wasn't that long afterwards as tom says there was a program called uun code and the corresponding one called uud code which would take any arbitrary binary file and convert it into a series of characters that would fit the spec that could be sent over email uun code worked but it was very much a manual process you could only you'd have to find the ascii soup cut it out and feed it into the decoder program so what happened in the late 90s is there was another standard proposed called mime multi-purpose internet mail extensions which sort of came up with a slightly different way of doing it enables you to have different sections in the email that can be encoded in different ways it'll explain what they're encoded it'll explain what type of files there is a sort of a an image is it a an audio file what type of audio file that sort of thing in there but it still had to do the same sort of thing it still had to take the binary data that represents the actual image or whatever it is we're trying to send we'll use an 
image as an example and it has to convert it into a form that could still fit the old email standards we're gonna have to take a series of bytes one after the other and effectively convert them into a series of characters that we can decode at the other end to produce the original file again let's suppose we want to send an image so the first four bytes that we're going to want to send of that image are well obviously 42 13 10 and let's pick another number oh let's go for 56 and that's only the beginning of a whole load of more bytes but we said that we've only got the values 1 to 127 that we can send as part of an email that's what the specification says and also some of those bytes have special meaning so for example 13 the number 13 actually means if you think about an old typewriter move the carriage from one end right back to the beginning so you can type from the left again line feed has a number 10 in the ascii character set means move down onto the next line so carriage return line feed go on to the next line so these values we can't just send we need some way of taking these bytes and converting them into characters that we can send now the trick that's used is to not think about them as bytes as sort of decimal numbers but to actually think about them as a series of eight bits one after the other so the number 42 that's represented in binary is zero zero one zero one zero one zero i'm doing this from memory so i could well make a mistake at which point sean will correct me as i go so 13 is zero zero zero zero one one oh one ten is zero zero zero zero one zero one zero and 56 is zero zero one one one zero zero zero we get the picture rather than thinking about it as a series of bytes eight bit values we just think about it as a stream of bits and what we're going to do is we're going to cut this stream of bits up into a smaller chunk so rather than killing it up into eight chunks that we started with we'll cut it up into say five bit chunks or six bit 
chunks or whatever it is we decide is the best one to do to make that so let's think about why that would work well let's think about the characters that we've got to encode values so what characters have we got that we can actually use well we've got the letters a b c through to z and if we think about that that gives us 26 possible things could we use those 26 characters to encode things well if we took 5 bits that would require 32 characters that's more than we can get in a to z but if we took four bits we would find that we'd only need sixteen possible values two to the power of four is sixteen so we've got sixteen possible values in there zero three to fifteen and we could encode each of those values using a different letter give 0 a 1 b 2 c and so on until we've encoded all 16 possible values so we could just do that we could say okay let's take the first four bits that are here zero zero one zero they are sort of that's two we would then write that down as c and we could do the same for each of the subsequent chunks of four bits that were there but there's a problem with doing that and actually if you think about it each byte of the message was eight bit long and so actually what we'd end up doing is converting it to two characters and so the file that we'd end up with would double in size so doing it with just four bits while we can match that into the letters that we can send ends up sort of producing a file that doubles in size so perhaps not the best way to do it so let's rethink things a bit what about if we were to use five bit chunks or five bit chunks that has 32 possible values from zero through to 31 but we've only got 26 letters in the alphabet we've got a b c and so on through to z so that's not enough letters to encode the actual values but we've still got other values that we could use so for example we could follow the z with zero and then one two 3 4 and 5 and that would give us 32 possible values so now we could go through the same thing and 
we would take the first 5 bits which in this case would give us the value of 0 0 1 0 1 which has the value of five and we would take the fifth letter of the alphabet that's the first one of the message which would be e so we do that and then we do the next five bits zero one zero zero zero which is what's eight so we take the eighth letter of the alphabet which is sean h h and so now we've encoded ten bits which takes us into two characters and we could then do the same for the next five bits and we could do that and convert the whole message and the advantage of doing that is that it will take up it will take up more space all these encoding systems are going to take up more space to send the message and if we could just send the raw eight bit count bytes across because we're having to encode them into fewer values so we're going to end up using more characters because we're encoding them into fewer values we can't help that so we could do it with five bits but it's still a bit wasteful we know we can't go up to seven bits but seven bits we would have to use all the possible ones we can send and we know that some of them like zero we can't use because of the specification 10 has to follow 13 according to the specification so we couldn't send that so actually we haven't got enough with seven so what they actually chose was to use six bits which gives us 64 possible values so the encoding system that was developed was called base64 because we take the bytes we want to use and encode them from eight bits per symbol into six bits per symbol in a way that we can then convert them back now how do we do it well it's exactly the same technique as what we've looked at already i'm just going to undo this we're going to take chunks from the bytes that make up the message six bits of a time to form the characters we're going to need so we're going to need to define 64 characters that we can use so the way we do that well we'll start off using the capital a through to capital 
z that gives us 26. we can also use lowercase a through to lowercase zed that gives us 52 and then we can use zero one two three four five six seven eight and nine that gives us now 62 possible things we need a couple of extra symbols that we can type so when they develop the standard they use plus and they use the forward slash as well and that gives us 64 possible values so if we take six chunks of the bits that make up the message when we concatenated them one after the other we can then encode them into these symbols to form the characters that we can send over so let's have a look at our message again and work out how we do that now because we've chosen six bits three bytes that's three times eight numbers of bits will then code as four six bit characters for these six bit symbols we're using so every three bytes of our message will become four bytes of the encoded version so we can look at the three bytes that we've got at the beginning here and we've got 42 13 and 10 as the ones we're going to send so we can take these bytes and we first of all need to sort of group them in the same order both at the sending point and the receiving point so we get the numbers in the right way so the rules that have set forward for the base64 encoding is that the first byte becomes the most significant bit so we'd write that one down first zero zero one zero one zero one zero the next byte becomes the middle significant bits for one of a better way of writing it so that would be zero zero zero zero one one zero one in our message and then finally byte three would become the least significant bit so we're mapping these into a 24 bit number and now we start breaking that down into six bit chunks and then we can convert them into the characters in what's effectively our lookup table that we've generated before a b c d e etcetera's capital z a b c lowercase through lowercase said the numbers and so on so let's take the first six bits we always start from the most significant 
point. 0 0 1 0 1 0 is ten, so we want symbol number ten in our sequence. Now, we start with A being zero, so that would be K, I think. So we take the first six bits and map them to the capital K there. We now do the same for the next six bits, so the next one is 32, which is the lowercase g. Then we take the next six, which is going to be 52, and symbol 52 is the digit 0. And then we do it for the final set of six bits, which happen to be 10 again, so that becomes the capital K that we had at the beginning. So the first three bytes that we encode, which were 42, 13, 10, the numbers that made up those bytes in the file we created, would encode as capital K, lowercase g, 0, capital K. We could then convert the next three bytes, and that would become another series of four characters. We do the next three bytes after that, which give us another series of four characters, and we could keep going like that until we come to the end of the binary file that we want to send.

But there is a slight wrinkle here, because our binary file might not be an exact multiple of three bytes long. It might be four bytes, or it might be 902 bytes long. So it might not be a multiple of three, and we need some way to encode that. The way we do that is to use an extra symbol. At the end, if we are encoding, say, two bytes, we encode those two bytes using the same scheme, pad with zeros at the end, and then write an equals character to say "only use two bytes of this". And if we're only encoding one extra byte at the end, we use two equals characters to say we're only encoding one extra byte. So by this technique we can take the message and convert every three bytes of it into four characters that we can send over email. And actually this technique has been used in lots of other places where you need to encode binary data into a character-based form that can then be sent. At the other end you just do the
reverse. With our message here, K was the first set of six bits, so we would look that up in the table and see that it's value ten, so we'd write down the six-bit value for ten, which is 0 0 1 0 1 0. We do the same for g, which is 32, we do the same for 0, which is 52, and of course I can see the values already on screen here, and we do the same for the final K, which is ten again, as at the beginning, so we get 0 0 1 0 1 0. So we get that string of bits making up our 24 bits, and then we partition that into eights. We take the most significant eight bits and we get the binary value which, of course, is our original 42. Then we take the next eight bits, which give us the binary value for 13, and we take the final eight bits of that 24-bit number, the least significant bits, which give us the third byte, the binary value of ten.

So whenever we send a binary file, what your computer has to do is take the bytes of that file, split them up into a series of bits, in this case six bits at a time, and map each six bits to a particular character in the encoding so that we can send it. It makes the file bigger: in this case it's going to increase by about a third in size each time you encode it. But it does mean that we can send it over the network, as an email, or over any other system which makes use of this.

People have tried other systems which don't increase the size of the message as much. PDF, for example: initially they tried to design that in a way that it could just be sent over email without needing to be encoded, or certainly without needing uuencoding at all. So they used what they called ASCII 85. What ASCII 85 does is choose 85 characters from the symbols that would supposedly be transmittable over email without needing to be converted. So they used A through Z, lowercase and uppercase as before, they used the numbers, and a few other symbols so they could get
up to the 85, and you would do exactly the same thing, although this time it was done around powers of 85, I think, so you raise numbers to powers and things. It was a slightly more complicated system. It worked, and you could encode things, but the trouble they found was that not all systems that were transmitting email at that time, in the early 90s, used ASCII to encode things. I think there was a problem when it was sent through some email systems that were running on IBM machines using EBCDIC, if I remember right, and certain characters wouldn't be encoded properly in the EBCDIC version, or whatever it was that was used. So as it was sent through those email systems the file would get corrupted, and in the end people were just starting to uuencode it, or MIME encode it using Base64, anyway. So it sort of fell out of fashion in PDF.

So there are various other techniques you can use, but Base64 has very much become the standard now. If you want to take some binary data and encode it into a form that can be sent over something that only supports ASCII, Base64 encoding is the way that you do it.

So they developed a system called MIME, which is a multimedia extension, and MIME then signals that inside this...
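
The Base64 procedure walked through above (group three bytes into a 24-bit number, cut it into four six-bit values, look each one up in the 64-symbol table, and mark a short final group with equals signs) can be sketched in a few lines of Python. This is only an illustrative sketch: the names `ALPHABET`, `encode_base64` and `decode_base64` are made up for this example, the decoder assumes well-formed input whose length is a multiple of four, and real programs would normally just call the standard library's `base64` module, which is used here only as a cross-check.

```python
import base64

# The 64-symbol table: A-Z, a-z, 0-9, then '+' and '/'. 'A' is value 0.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Encode bytes as Base64 text, three bytes (24 bits) at a time."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 bytes short in the last group
        # Pad a short final group with zero bytes; first byte is most significant.
        value = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Cut the 24-bit number into four 6-bit values and look each one up.
        symbols = [ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Symbols that hold only padding are replaced by '='.
        if missing:
            symbols[4 - missing:] = "=" * missing
        out.extend(symbols)
    return "".join(out)


def decode_base64(text: str) -> bytes:
    """Reverse the encoding; assumes well-formed, padded Base64 text."""
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        padding = group.count("=")
        value = 0
        for ch in group:
            # Rebuild the 24-bit number six bits at a time ('=' counts as zero).
            value = (value << 6) | (0 if ch == "=" else ALPHABET.index(ch))
        # Split back into three bytes and drop the ones that were only padding.
        out.extend(value.to_bytes(3, "big")[:3 - padding])
    return bytes(out)


if __name__ == "__main__":
    message = bytes([42, 13, 10])      # the three bytes from the example
    encoded = encode_base64(message)
    print(encoded)                     # Kg0K
    print(list(decode_base64(encoded)))  # [42, 13, 10]
    # Cross-check against the standard library implementation.
    print(encoded == base64.b64encode(message).decode("ascii"))  # True
```

Dropping the last byte of the example gives `Kg0=`, and keeping only the first byte gives `Kg==`, which matches the one-equals and two-equals padding rule described above.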