A Compiler Analysis Exercise: Understanding the Basics of Code Generation
The task at hand is to look at what a compiler has to do, focusing specifically on code generation. Since we're not attempting anything ambitious at the moment, our code generation will be simple yet effective. We'll take the subject-verb-object parse tree for the input sentence and move the object to the front, leaving the subject in the middle and the verb at the end. This is a good exercise in understanding how to hang actions off the rules of a yacc (Yet Another Compiler-Compiler) grammar.
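The reordering described above can be sketched in a few lines of Python. This is only an illustration, not the program's actual grammar action: the `Clause` type and the `to_yoda_order` function are hypothetical names, and the sketch copies words through unchanged, so its punctuation and choice of articles need not match what the real program prints.

```python
from typing import List, NamedTuple


class Clause(NamedTuple):
    """A flattened subject-verb-object parse tree (hypothetical structure)."""
    subject: List[str]
    verb: List[str]
    obj: List[str]


def to_yoda_order(clause: Clause) -> List[str]:
    """Emit the object first, then the subject, then the verb."""
    return clause.obj + clause.subject + clause.verb


clause = Clause(
    subject=["the", "robot"],
    verb=["stroked"],
    obj=["two", "furry", "dice"],
)
print(" ".join(to_yoda_order(clause)))  # prints: two furry dice the robot stroked
```

In a parser generator, the body of `to_yoda_order` is the kind of thing that would sit in the action attached to the sentence rule, running once the subject, verb, and object have all been recognized.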
For a bit of fun, and to keep things simple, we've limited our vocabulary to a small set of words. We'll analyze the input sentence and determine whether it is in correct subject-verb-object order. If it is, then as the action of our parser we'll transform it into Yoda order and output it that way. Ideally we'd have a speech synthesizer in place to speak it out loud, but for now let's just see how it works.
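To make the "is it in subject-verb-object order?" check concrete, here is a minimal hand-written recognizer in Python. It is a sketch under assumptions: the `LEXICON` table, the token categories, and the noun-phrase shape (optional determiner, any number of adjectives, one noun) are all invented for illustration and are not taken from the program's real grammar files.

```python
from typing import List, Optional, Tuple

# Hypothetical toy lexicon: word -> category. "two" is treated as a determiner.
LEXICON = {
    "the": "DET", "a": "DET", "two": "DET",
    "furry": "ADJ",
    "robot": "NOUN", "dice": "NOUN", "dog": "NOUN", "man": "NOUN",
    "stroked": "VERB", "bit": "VERB",
}

Token = Tuple[str, str]  # (word, category)


def tokenize(sentence: str) -> List[Token]:
    """Lower-case the sentence, drop final punctuation, tag each word."""
    words = sentence.lower().rstrip(".!?").split()
    return [(w, LEXICON.get(w, "UNKNOWN")) for w in words]


def parse_noun_phrase(tokens: List[Token], pos: int) -> Optional[Tuple[List[str], int]]:
    """Match DET? ADJ* NOUN at pos; return (words, next position) or None."""
    start = pos
    if pos < len(tokens) and tokens[pos][1] == "DET":
        pos += 1
    while pos < len(tokens) and tokens[pos][1] == "ADJ":
        pos += 1
    if pos < len(tokens) and tokens[pos][1] == "NOUN":
        pos += 1
        return [w for w, _ in tokens[start:pos]], pos
    return None


def parse_svo(sentence: str) -> Optional[Tuple[List[str], str, List[str]]]:
    """Return (subject, verb, object) if the sentence is SVO, else None."""
    tokens = tokenize(sentence)
    subject = parse_noun_phrase(tokens, 0)
    if subject is None:
        return None
    subject_words, pos = subject
    if pos >= len(tokens) or tokens[pos][1] != "VERB":
        return None
    verb = tokens[pos][0]
    obj = parse_noun_phrase(tokens, pos + 1)
    if obj is None or obj[1] != len(tokens):
        return None  # no object, or unparsed words left over
    return subject_words, verb, obj[0]


print(parse_svo("The robot stroked two furry dice."))
print(parse_svo("Stroked the robot two furry dice."))  # not SVO, so None
```

A generated parser does the same job from a declarative grammar instead of hand-written functions; the point here is only that a sentence either matches the noun-phrase, verb, noun-phrase pattern or is rejected.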
We've compiled a program to do all this, called YODA (Yet Another Dictionary-Oriented Grammar Analyzer). It waits for input, and its first test piece is our standard sentence: "The robot stroked two furry dice."
Furry Dice Analysis
Our program has analyzed the input sentence and reported the grammar rules it used to parse it. It's happy to report that the sentence is in SVO (subject-verb-object) order, which is the necessary starting point. The transformation into Yoda order is where things get interesting.
The program has picked out the object from the end of the input sentence and promoted it to the front. It has then left the subject in the middle and moved the verb to the last position. The resulting sentence is "Furry dice, a robot stroked," which is our Yoda-ized version of the original sentence.
Another Test Piece: The Dog Bit the Man
For our next test piece, we'll try something very clear and simple: "The dog bit the man." Our program analyzes this sentence as well, confirming that it's in SVO order. When transformed into Yoda order, the resulting sentence is "The man, a dog bit," which reads quite differently from the original.
While this Yoda-ized version may not be immediately clear, we can still determine who's getting bitten: the dog is biting the man. The word order is now object-subject-verb, but the grammatical roles identified in the SVO parse are preserved through the transformation.
Expanding the Vocabulary
The program becomes more useful as its vocabulary is expanded. We're including a zip file containing all the lex and yacc files that make it up, so you can try them out and re-run them yourself. You'll need a Linux system or a similar environment to get started. We've also included the intermediate, complete C program file that these preprocessors generate, which you can try compiling as well.
The C Program: A Challenging Task
Compiling the C program is a bit more complicated than the other parts of YODA, mainly because it needs specific libraries to be available when it is built. On Unix or Linux this shouldn't be too difficult, but on Windows or Macintosh it may take some extra effort.
Once you've got the basic thing working, you can have a lot of fun expanding Yoda's vocabulary and making it more Star Wars-related. As Sean cleverly observed, in Star Wars these machines are called "droids" rather than "robots." You can translate the word robot into droid, or come up with new words that fit within the YODA grammar.
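One simple way to picture the robot-to-droid idea is a substitution table applied to the words just before they are output. The Python sketch below is purely illustrative: the `SYNONYMS` table and the `starwarsify` function are hypothetical names, and the real program may well handle this inside its lexer or grammar rules instead.

```python
from typing import Dict, List

# Hypothetical substitution table: plain English word -> Star Wars word.
SYNONYMS: Dict[str, str] = {
    "robot": "droid",
    "robots": "droids",
}


def starwarsify(words: List[str]) -> List[str]:
    """Replace each word that has a Star Wars equivalent; keep the rest."""
    return [SYNONYMS.get(word, word) for word in words]


print(" ".join(starwarsify(["furry", "dice", "a", "robot", "stroked"])))
# prints: furry dice a droid stroked
```

Adding genuinely new words is a different job from renaming existing ones: each new word also has to be given a category (noun, verb, and so on) so that the grammar knows where it is allowed to appear.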
A Translator: Turning SVO to Yoda
If we want to turn any sentence from standard English (SVO) into Yoda order, we'll need a translator. As a thought experiment, imagine starting from London and subtracting England, which leaves something that doesn't sound very familiar at first. Then, if we add Japan, we might end up at Tokyo: a weird and wacky outcome.
While this isn't exactly how translation works, it's an amusing thought experiment that highlights the complexity of language processing and code generation. Our YODA program may not be perfect, but it's a fun starting point for exploring the basics of compiler analysis and natural language processing.