Yoda Parsing - Computerphile

A Compiler Analysis Exercise: Understanding the Basics of Code Generation

The task is to decide what a compiler should do once it has analyzed its input: generate code. In the earlier furry-grammar exercise the parser only checked sentences and did nothing with them, so the code generation here is deliberately simple. We take the subject-verb-object parse tree for the input sentence, move the object to the front, leave the subject in the middle, and put the verb at the end. This is a good exercise in how to hang actions off a yacc ("Yet Another Compiler-Compiler") grammar.
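The reordering described above can be sketched in a few lines. This is a minimal illustration in Python rather than yacc, and the names `SVO` and `yoda_order` are hypothetical, not taken from the original files. In a yacc grammar the same step would typically sit in the action attached to the sentence rule; here a small named tuple stands in for the parse tree.

```python
from typing import NamedTuple


class SVO(NamedTuple):
    """Stand-in for the parse tree of one sentence (hypothetical structure)."""
    subject: str
    verb: str
    obj: str


def yoda_order(tree: SVO) -> str:
    """'Code generation': emit the object first, then the subject, then the verb."""
    return f"{tree.obj} {tree.subject} {tree.verb}"


tree = SVO(subject="the robot", verb="stroked", obj="two furry dice")
print(yoda_order(tree))  # two furry dice the robot stroked
```

Keeping the three parts as separate fields means the output order is decided in exactly one place, which is roughly what attaching an action to a single grammar rule achieves.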

For a bit of fun, and to keep things simple, we work within a limited vocabulary of words. The parser analyzes the input and checks that it really is in subject-verb-object order. If it is, then as the action of the parser we "Yoda-ize" it: turn it into Yoda order and output it that way. Ideally we'd have a speech synthesizer to speak the result out loud, but for now let's see how it works.
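Working with a limited vocabulary means every input word must first be recognized and given a category, which is the job lex does in the original setup. The sketch below shows that step in Python. The word list and the category names in `VOCAB` are assumptions made for illustration; the real lex file may differ.

```python
from typing import List, Tuple

# Hypothetical vocabulary; the real lex file may list different words.
VOCAB = {
    "the": "ARTICLE",
    "two": "NUMBER",
    "furry": "ADJECTIVE",
    "robot": "NOUN", "dog": "NOUN", "man": "NOUN", "dice": "NOUN",
    "bit": "VERB", "kicked": "VERB", "stroked": "VERB",
}


def tokenize(sentence: str) -> List[Tuple[str, str]]:
    """Return (category, word) pairs, rejecting words outside the vocabulary."""
    tokens = []
    for word in sentence.lower().rstrip(".").split():
        if word not in VOCAB:
            raise ValueError(f"unknown word: {word!r}")
        tokens.append((VOCAB[word], word))
    return tokens


print(tokenize("The robot stroked two furry dice."))
```

A word outside the table, such as "wookiee", raises an error here rather than being guessed at, which mirrors how a small fixed vocabulary behaves.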

We've compiled a program to do all this, called yoda. It waits for input, and the first test piece is our standard sentence, the favorite in this silly grammar: "The robot stroked two furry dice."

Furry Dice Analysis

The program reports which grammar rules it used to analyze the sentence and confirms that the input is in SVO (subject-verb-object) order, which is a necessary starting point. The interesting part is the transformation into Yoda order.
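To make the idea of "reporting which rules were used" concrete, here is a small Python sketch of a recognizer that records a rule number each time a piece of the sentence is matched. The grammar, the rule numbering, and the name `parse_svo` are all assumptions for illustration; they do not reproduce the numbering printed by the real program.

```python
from typing import List, Tuple

# Hypothetical vocabulary and rule numbering, for illustration only.
CATEGORY = {
    "the": "ARTICLE", "two": "NUMBER", "furry": "ADJECTIVE",
    "robot": "NOUN", "dog": "NOUN", "man": "NOUN", "dice": "NOUN",
    "bit": "VERB", "kicked": "VERB", "stroked": "VERB",
}


def parse_svo(sentence: str) -> Tuple[List[int], Tuple[str, str, str]]:
    """Check SVO order; return the rules used and the (subject, verb, object) parts."""
    words = sentence.lower().rstrip(".").split()
    cats = [CATEGORY.get(word) for word in words]
    rules: List[int] = []

    if cats[:2] != ["ARTICLE", "NOUN"]:
        raise ValueError("expected a subject such as 'the robot'")
    rules.append(2)  # rule 2: subject -> ARTICLE NOUN

    if cats[2:3] != ["VERB"]:
        raise ValueError("expected a verb after the subject")
    rules.append(3)  # rule 3: verb -> VERB

    if cats[3:] == ["ARTICLE", "NOUN"]:
        rules.append(4)  # rule 4: object -> ARTICLE NOUN
    elif cats[3:] == ["NUMBER", "ADJECTIVE", "NOUN"]:
        rules.append(5)  # rule 5: object -> NUMBER ADJECTIVE NOUN
    else:
        raise ValueError("expected an object after the verb")

    rules.append(1)  # rule 1: sentence -> subject verb object
    return rules, (" ".join(words[:2]), words[2], " ".join(words[3:]))


print(parse_svo("The robot stroked two furry dice."))
print(parse_svo("The dog bit the man."))
```

The two test sentences use different object rules in this sketch, so their rule lists differ even though both are valid SVO sentences.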

The program picks out the object from the end of the input sentence and promotes it to the front. The subject comes next, and the verb goes last. The result is "Two furry dice the robot stroked," our Yoda-ized version of the original sentence.

Another Test Piece: The Dog Bit the Man

For the next test piece, we try something very clear and simple: "The dog bit the man." The program analyzes this sentence as well; the list of rules used looks slightly different from last time, but it is still SVO. In Yoda order the sentence becomes "The man the dog bit."

Even in this reordering it is still clear who is getting bitten: the dog is the subject, and it is the dog biting the man. The word order has changed to object-subject-verb, but the roles of subject and object are preserved.

Expanding the Vocabulary

The grammar could use contributions to expand its vocabulary. We're putting out a zip file with all of the lex and yacc files that make this up, so you can try it out and re-run the whole thing if you have a Linux system or similar. For those who don't, the zip file also includes the intermediate, complete C program file that these preprocessors generate, so you can come in at the middle and just compile that.

The C Program: A Challenging Task

Compiling the C file will probably be fine, but don't get frustrated by missing libraries. On Unix or Linux you should be okay if you follow the instructions. On Windows or Macintosh it can be made to work, but ports of the tools often leave out the supporting libraries, so it may take extra effort. Binaries for 64-bit Intel-based Linux are also included, which some of you may be able to execute directly.

Once you have the basic thing working, you can have a lot of fun making Yoda's vocabulary much more Star Wars-related, filling it with words like Jedi, lightsaber, and Death Star. As Sean pointed out, they aren't called robots in Star Wars; they're called droids. So you could translate the word "robot" into "droid."
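Extending the vocabulary and swapping "robot" for "droid" can both be pictured as small table changes. The Python sketch below is only an illustration: the set `NOUNS`, the table `STAR_WARS`, and the function `retheme` are hypothetical names, and the real lex and yacc files may organize their word lists quite differently.

```python
# Hypothetical word list; the real lex file may be organized differently.
NOUNS = {"robot", "dog", "man", "dice"}
NOUNS |= {"jedi", "lightsaber", "droid"}  # Star Wars additions

# Hypothetical substitution table applied word by word.
STAR_WARS = {"robot": "droid"}


def retheme(sentence: str) -> str:
    """Swap each word for its Star Wars equivalent, if it has one."""
    return " ".join(STAR_WARS.get(word, word) for word in sentence.split())


print(retheme("two furry dice the robot stroked"))  # two furry dice the droid stroked
```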

A Translator: Turning Yoda Order Back into SVO

Going the other way is another thing to try: a translator that takes a sentence in Yoda order and puts it back into standard English SVO order. The video then closes with a short clip about a different kind of translation: start from London, subtract England, add Japan, and hope for Tokyo. The answer does come out as Tokyo, though weirdly it appears twice.
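For the Yoda-to-SVO direction, a minimal Python sketch is shown below. It rests on a strong assumption that is not stated in the source: the subject is always exactly two words (an article and a noun) and the verb is a single word, so the object is whatever comes before them. The function name `from_yoda` is hypothetical, and a fuller version would more likely be a second grammar with its own actions.

```python
def from_yoda(sentence: str) -> str:
    """Turn object-subject-verb word order back into subject-verb-object."""
    words = sentence.lower().rstrip(".").split()
    if len(words) < 4:
        raise ValueError("need an object, a two-word subject, and a verb")
    # Assumed layout: <object ...> <article> <noun> <verb>
    obj, subject, verb = words[:-3], words[-3:-1], words[-1]
    return " ".join(subject + [verb] + obj)


print(from_yoda("two furry dice the robot stroked"))  # the robot stroked two furry dice
print(from_yoda("the man the dog bit"))  # the dog bit the man
```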

Our yoda program may not be perfect, but it is a fun starting point for exploring parsing, actions attached to grammar rules, and very simple code generation.

Transcript

I thought, as a bit of fun, we could extend what we've done already. You don't have to watch the videos; you might want to go back and watch the previous videos later on. We, certainly here in Western Europe, when we utter a sentence, tend to be happy with subject-verb-object order: "the man goes to town". It's really quite common across a lot of languages. What we've done is put together a vocabulary with lots of silly things in it, like "the dog", "the man", "the robot" for the subject, and "bit", "kick", "stroke" for the verb. The object, of course, is the thing that these actions are done on. So you can have "the robot kicked the dog", whatever you like.

But then we got to thinking: subject-verb-object, is this favored by all beings in the universe? Are there some beings out there that, regardless of the actual details of the language and the words, be it Finnish, English, French, Spanish, don't like subject-verb-object orderings? They'd like to do it a weird way around. How about object-subject-verb? So instead of saying "the man goes to town", as we would, you say "to town the man goes". That sounded to me, 20 years ago when I first stumbled on this, very much like Yoda the Jedi Master.

For those of you coming into this cold and direct, because you saw the word "Yoda" and grepped over the entire universe for what this could possibly mean: you've landed back here in Nottingham, and you're finding that we've done a Yoda syntax transformer. We started off by doing the furry grammar, and being able to make up sentences like "the robot stroked two furry dice". But we didn't do anything with it; that's all we did in those early ones. But there are details there you might find interesting: how is it decided that "the robot stroked two furry dice" is in some sense legal and okay? Because that's what we've done. We're basically saying it's okay: you use rule four, use rule three, use rule six.

So we were sitting there wondering what you would typically do in a compiler. We've analyzed what's wanted; in a compiler you generate code. So what's our code generation? Because we're not doing anything at the moment, our code generation is going to be so simple. It's going to be: take the subject-verb-object parse tree, as it's called, for the input sentence; swap around the object to the front; leave the subject in the middle and the verb at the end. So it's quite a good exercise on how to hang your actions off what's called the yacc grammar that implements this, and I think you might actually enjoy that.

So, just for a bit of fun, in this limited vocabulary of words we've got, what we're going to do is analyze the input and say: have you really put this in in subject-verb-object, boring order? And if you have done that correctly, then as the action of our parser we will Yoda-ize it. We will turn it into Yoda order and put it out that way. I just wish we had a speech synthesizer in here. Sean, bring one with you next time so that we could speak it.

But let's see. I have got a compiled-up program to do all of this. It is called yoda. It's waiting for input. The standard sentence, the one we like best of all in this silly grammar we've put together, is "the robot stroked two furry dice", so that shall be our first test piece. Oh, look at that! Not only has it analyzed for me which rules in the grammar were used to analyze our subject-verb-object sentence, and to be happy that it is in SVO order (that's a necessary starting point), but then the transformation, the action of our brilliant Yoda compiler, if you like, is: Yoda says "two furry dice the robot stroked". So in other words, we have picked out from the input sentence what the object was at the end; we've promoted that to the front; then we've left the subject as the next piece after that; and the verb comes last. "Two furry dice the robot stroked."

Go on, ask me another one, Sean. Let's see if it works. "Well, let's go for a very clear, simple one: the dog bit the man." "The dog bit the man." Happy with the analysis; it looks slightly different to last time, but it is still subject-verb-object. Yoda would say "the man the dog bit". I think that works, don't you, Sean? You do. You've always got to ask: even in that reordering, is it clear who's getting bitten? Yeah, we're very clear that that is the subject; it is the dog biting the man still. Yes.

Because this grammar has this one cute phrase of "stroked two furry dice", I've called the whole grammar "furry", but this is Yoda-ized furry speak now. So maybe we need people to contribute to this and expand its vocabulary. We're putting out a zip file full of all of the lex and yacc files that make this up. Some of you could try out and re-run the whole thing, if you've got Linux systems, basically. Even for those of you that haven't, I'm also including the intermediate and complete C program file that those preprocessors generate, so you could always come in at the middle and try compiling the C file. It will probably be okay. Don't get frustrated by missing libraries. If you're on Unix or Linux you should be okay if you follow the instructions. For those of you brave souls running C on either Windows or Macintosh, I've been out on the web and looked it up, and you can get it to work, but what happens is people translate the tools but forget about the libraries. But never mind, let's see where we get to, and I hope you all have a lot of fun with us. I've even included the binaries for 64-bit Intel-based Linux here. Some of you may even be able to just execute those; I don't know. But many of you might want to recompile the C, and hopefully, if the libraries are there, you may be able to get the whole thing working again.

Once you've succeeded in getting the basic thing going, you may want to have a lot of fun making Yoda's vocabulary much more Star Wars-related. I've come off the furry grammar that I was already doing, just as a bit of a silly, very elementary exercise, but now you could fill up your vocab strings with Jedi, lightsaber, Death Star, all this kind of stuff. Sean has just pointed out to me they're not called robots in Star Wars, they're called droids. Is that right? So you can translate the word "robot" into "droid". Or you could even come backwards, you know: if you give it Yoda-speak, we want it back as SVO, subject-verb-object. So back from Yoda ordering into English ordering will be another thing to do.

"A translator?" A translator, yeah. If I go from London, and then subtract England, and then add, I don't know, Japan, we'd hope for Tokyo. We'd hope for Tokyo, and we get Tokyo. We get Tokyo twice, weirdly.