How to Build a Quine

A quine is simply a program that prints its own source code. This is my explanation of how they are built.

Writing a quine seems like a chicken-and-egg problem, but if you strictly enforce the physical separation between data and code (that is, between defining the string to print and the code that prints it out), it's a straightforward programming exercise.

I divided the quine program into three parts: the data, the code that prints the first part of the program and the data, and the code that uses the data to print the entirety of the code (both this part and the previous one). The data is just a representation of the code parts of the program. After breaking the program into these three parts, it becomes clear that the two code parts (the print-it-out part and the turn-it-into-code part) are two simple loops that can be written without knowing what the data contains. The key to the trick is to write the code that uses the data first and then turn that code into data after the fact, thus giving the program knowledge of how to print itself.

Here's the full paste of the quine, which should make it easier to follow along:

I found it easier to store the code inside the data part of the program as the decimal ASCII values of its characters. This representation avoided unfun issues such as getting the quoting and escaping right. So I wrote the whole program and left the data variable blank, since it would be built from code that was yet to be written!
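As a rough illustration of why the numeric form sidesteps quoting, here is a minimal sketch in Python (not the PHP of the original program). The helper name `encode` and the sample string are made up for the example.

```python
def encode(text):
    """Return the decimal character code of every character in text."""
    return [ord(ch) for ch in text]

# A sample line packed with characters that normally need escaping:
# double quotes, single quotes, and a backslash.
awkward = "say \"hi\" and 'bye' \\ done"

codes = encode(awkward)

# Written out as source text, the encoded form contains nothing but
# digits, commas, and spaces, so no quoting rules apply to it.
literal = ", ".join(str(n) for n in codes)
print(literal)
```

Whatever the code contains, its encoded form can be pasted between a pair of brackets without further thought.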

The first part of the code focused on printing the PHP boilerplate, the newlines down to the first variable, and the opening of the variable declaration. I then looped through the data array of integers and printed them, followed by the closing characters of the variable declaration.
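The same step might look like this in Python. This is only a sketch under a few assumptions: Python has no counterpart to the PHP opening-tag boilerplate, so only the declaration is rebuilt, and the variable name `data`, the function name `declaration`, and the placeholder payload are invented for the example.

```python
# Placeholder payload: the character codes for the text "pass".
data = [112, 97, 115, 115]

def declaration(values):
    """Rebuild the source line that declares the data list."""
    text = "data = ["                  # opening of the declaration
    for index, value in enumerate(values):
        if index > 0:
            text += ", "               # separator between the integers
        text += str(value)             # the integer itself, as digits
    return text + "]"                  # closing characters

print(declaration(data))  # prints: data = [112, 97, 115, 115]
```

The loop never interprets the numbers; it only writes them back out the way they appear in the source.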

The second part of the code turned the data array of ASCII values into their associated characters and printed them. Once again, this was just a simple loop that went through each ASCII value, converted it to its character representation, and then printed out the character. To me, this is the interesting part, as the code is actually printing itself out.
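A minimal Python sketch of that decoding loop follows; the name `decode` and the sample payload are assumptions made for the illustration.

```python
# Sample payload: the character codes for the text "print(1)".
data = [112, 114, 105, 110, 116, 40, 49, 41]

def decode(values):
    """Turn each character code back into its character."""
    text = ""
    for value in values:
        text += chr(value)
    return text

print(decode(data))  # prints: print(1)
```

In a real quine the decoded text is the program's own code portion, which is why this loop ends up printing the very lines that are running.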

To actually build the data array, I wrote a second program that took a block of text and printed out a comma-delimited string of the ASCII values. I copied and pasted the code portion of my quine, which was everything below the data array, and ran it through this second program, taking its output and placing it into the quine program as the data variable.
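A helper along those lines could be sketched in Python as below. The function name `encode_file` and the file name `code_portion.txt` are hypothetical, and the demo writes its own small input file so the block runs on its own.

```python
import os
import tempfile

def encode_file(path):
    """Read a text file and return its characters as comma-delimited codes."""
    with open(path, encoding="ascii") as handle:
        text = handle.read()
    return ", ".join(str(ord(ch)) for ch in text)

# Demo: save a one-line "code portion" to a scratch file, then encode it.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "code_portion.txt")
    with open(path, "w", encoding="ascii", newline="") as handle:
        handle.write('print("hi")\n')
    result = encode_file(path)

print(result)
```

The printed string is what gets pasted between the brackets of the data declaration. Note that the trailing newline of the code portion is encoded too (as 10), which matters when the output is later compared byte for byte.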

Now I had the full quine done. I had the data array, which represented the actual code of the quine; one code part that printed out the PHP boilerplate and the data array itself; and another code part that looped through the data array and printed the character for each ASCII value, which turns out to be the remaining code.
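Putting the pieces together, a Python analogue of the finished program can be assembled as follows. This is a sketch, not the original PHP: the names `CODE` and `build_quine` are invented, and the two-line code portion is one possible way to write the two code parts.

```python
# The code portion: first reprint the data declaration, then decode the
# data back into characters (which reproduces these two lines).
CODE = (
    'print("data = [" + ", ".join(map(str, data)) + "]")\n'
    'print("".join(map(chr, data)), end="")\n'
)

def build_quine(code):
    """Turn the code into data after the fact and put it in front."""
    numbers = ", ".join(str(ord(ch)) for ch in code)
    return "data = [" + numbers + "]\n" + code

quine = build_quine(CODE)

# Show the finished program; saved to a file, it prints its own source.
print(quine, end="")
```

The order of work mirrors the description above: the code portion is written first with no data at all, and the data line is generated from it afterwards.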

Running the program and then diffing its output against the original program produced no differences. Identical programs, success.
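The same check can be sketched in Python. As a simplification, this version runs the candidate in-process with `exec` and compares with `difflib` instead of using the command line and `diff`; the function name `quine_diff` and both sample programs are assumptions for the example.

```python
import contextlib
import difflib
import io

def quine_diff(source):
    """Run source and return the diff between it and what it printed."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(source, {})
    return list(difflib.unified_diff(
        source.splitlines(keepends=True),
        captured.getvalue().splitlines(keepends=True),
        fromfile="source",
        tofile="output",
    ))

# A well-known one-line Python quine, and a program that is not a quine.
tiny_quine = "s='s=%r;print(s%%s)';print(s%s)\n"
not_a_quine = 'print("hello")\n'

print(quine_diff(tiny_quine))             # an empty list means identical
print(len(quine_diff(not_a_quine)) > 0)   # any diff lines mean a mismatch
```

An exact comparison is unforgiving about whitespace, so a missing or extra trailing newline shows up as a difference.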

After I finished this, I wanted to do the trick where the first program prints a second program in another language, which prints a program in a third language, and so on, until the last one prints the original program. After writing the first quine and seeing how it worked, this turned out to be a simple exercise in quoting and escaping among the various programming languages.

The key to the multi-language quine (sometimes called a quine relay) is that only the very last program (i.e., the one that prints out the original program) is the one that does any real work. All of the other ones are programs that just say “print this string” where the string is the next program. I modified my PHP quine, translating the code portions into JavaScript, and having the PHP program simply print the JavaScript. This gave me a PHP program that printed out a JavaScript program that would then print out the original PHP program.
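The division of labour can be sketched with a two-stage cycle. To keep it runnable in one interpreter, this sketch assumes both stages are written in Python, so Python's `repr` stands in for the quoting rules of the other language. The names `WORKER_CODE`, `build_worker`, and `run` are invented for the example.

```python
import contextlib
import io

# Code portion of the worker stage: rebuild its own source from the data,
# then emit the printer stage as one print statement wrapped around it.
WORKER_CODE = r'''me = "data = [" + ", ".join(map(str, data)) + "]\n" + "".join(map(chr, data))
print("print(" + repr(me) + ', end="")')
'''

def build_worker(code):
    """Prefix the code with a data declaration that encodes it."""
    numbers = ", ".join(str(ord(ch)) for ch in code)
    return "data = [" + numbers + "]\n" + code

def run(source):
    """Execute a program and return everything it prints."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(source, {})
    return captured.getvalue()

worker = build_worker(WORKER_CODE)   # the stage that does the real work
printer = run(worker)                # the stage that only prints a string

print(run(printer) == worker)        # printer reproduces the worker
print(run(worker) == printer)        # worker reproduces the printer
```

The printer stage is a single print statement with no data or loops of its own; all of the self-reference lives in the worker, which is the only stage that knows how to rebuild the cycle.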

To add more languages, I would take the existing print statement, wrap the string in the next language's print function, recalculate my data array and then move to the next language. After doing it once, I realized that there wasn't anything technically difficult about these multi-language quines and felt somewhat embarrassed about passing it off as something impressive. I imagine it's a lot like a magician's act: there's a lot of fireworks and pretty ladies and the tiger is wearing a top hat, but the trick is that there is a mirror brought in at the end that makes the actual “magic” happen and the rest is just for show. I added a few common languages, figured that was enough garnish, and stopped at a PHP->C->Python->JS->PHP program.
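The wrapping step on its own might be sketched like this in Python. The helper names are hypothetical, the JavaScript wrapper assumes Node.js (it uses `process.stdout.write` to avoid an extra trailing newline), and the sketch leaves out the recalculation of the data array that has to follow each wrap.

```python
import json

def wrap_python(source):
    """Return a Python program whose only job is to print source."""
    return "print(" + repr(source) + ', end="")\n'

def wrap_javascript(source):
    """Return a Node.js program whose only job is to print source."""
    # json.dumps produces a double-quoted, fully escaped string literal,
    # which JavaScript also accepts as a string literal.
    return "process.stdout.write(" + json.dumps(source) + ");\n"

inner = 'print("hello")\n'

# Each wrap adds one more layer of quoting around the previous program.
chain = wrap_javascript(wrap_python(inner))
print(chain)
```

Every layer escapes the quotes and backslashes of the layer inside it, so the nesting grows quickly; that growth is the "quoting and escaping" work, and none of it involves any self-reference.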

Reading how a quine is built is one thing, but actually working through the problem and seeing that code and data, which we usually keep separate, can be melded into the same thing is really a revelatory experience. It may be a bit like learning how a magic trick is done, but the deeper, Lisp-like knowledge that you get from actually seeing that “code is data and data is code” is well worth it.
