Why are computer graphics so inefficient

For any seasoned UNIX user it is simply a given that working with text on a computer is more efficient, by orders of magnitude, than working with graphics. And indeed anyone who has worked on a computer at all will realize that graphics, beyond anything else, will test the limits of your hardware. But why are computers so inefficient when it comes to graphics, and will advances in modern technology mitigate this, even to the point of making text obsolete?

Monk power

To answer the first question, let's use a thought experiment. Computers were rare in the dark ages, so a lord would usually have to rely on a monk to read or write a letter (assuming the likely event that the lord in question was illiterate). Based on measurements I did on myself reading and writing Psalm 149 in the Bible, one could expect a monk to have roughly this efficiency:

149th Psalm
Words:	138
Characters:	757

Action	Minutes	Words/Min
Writing:	5.82	24
Reading:	0.77	180

Now a professional monk might perform better, let's say a write speed of 25 words per minute and a read speed of 250 words per minute. In fact, let's be generous and say that our monk knows shorthand, which will allow him a write speed of about 100 words per minute. Just as we use horsepower to measure a machine's physical workload, we can now use monkpower (or MP) to measure a machine's mental workload, i.e. how much reading and writing it can do per minute. Our new MP unit then means a write speed of 100 words/minute and a read speed of 250 words/minute.

How computers read text

Unlike a monk who sees a letter and recognizes it, computers have no real understanding of the alphabet at all. Instead you feed them a bunch of signals called bits, zeros and ones. You can think of this process as interpreting Morse code, with zeros and ones instead of dots and dashes. Now, when the computer receives these bits, it searches a look-up table to see what value it should return. Let's assume 00 returns a, 01 b, 10 c, and 11 d. How do we define e then? Clearly we need more bits per letter. 5 bits suffice to include all 26 letters of the alphabet (2^5 = 32). The classic ASCII encoding had 7 bits per letter, allowing for 128 unique characters (2^7), including upper and lower case letters, numbers and a few special characters. Later this was expanded to 8 bits per letter, in order to support international character sets. 8 bits are collectively called a byte, so a single text character takes up 1 byte (8 bits) of space. So the real value of 'a' is actually 01100001. Note that the computer does not understand what the letter "a" is, it is just the thingamajig it returns when it is given 01100001 as its input.
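To see the look-up table in action you can ask the shell for the bit pattern behind a letter. This is only a minimal sketch, assuming a POSIX shell; the function name char_bits is made up for the illustration.

```shell
# Print the 8-bit pattern for a single ASCII character.
char_bits() {
    # A leading quote makes printf return the character's numeric code.
    code=$(printf '%d' "'$1")
    bits=""
    i=0
    # Peel off one bit at a time, lowest bit first.
    while [ "$i" -lt 8 ]; do
        bits="$((code % 2))$bits"
        code=$((code / 2))
        i=$((i + 1))
    done
    printf '%s\n' "$bits"
}

char_bits a   # prints 01100001
char_bits b   # prints 01100010
```

Neighbouring letters differ by exactly one step in the table, which is all the "understanding" the machine has.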

As a side note, the modern UTF-8 encoding of Unicode supports virtually all alphabet characters and symbols known to man. The number of bytes used depends on what character you are typing. UTF-8 is backwards compatible with ASCII, so a normal English letter still only takes 1 byte of space, but a Chinese character will take up more space (typically 3 bytes).
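The varying width is easy to check by counting bytes. A small sketch, assuming a POSIX shell; byte_count is a made-up helper, and the characters are written as octal escapes of their UTF-8 bytes so that the example does not depend on how this page itself is encoded.

```shell
# Count the bytes a string occupies once written out.
byte_count() {
    # The argument is used as a printf format so the octal escapes expand.
    printf "$1" | wc -c | tr -d ' '
}

byte_count 'a'               # 1 byte: plain ASCII letter
byte_count '\303\245'        # 2 bytes: the letter a-with-ring (U+00E5)
byte_count '\344\270\255'    # 3 bytes: the Chinese character zhong (U+4E2D)
```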

You can probably see that enabling a computer to read and write plain text is relatively simple, it is just a little bit more complex than interpreting Morse code, since it must support a few more characters (Morse code does not distinguish between upper and lower case for instance). And how well does it perform when reading/writing Psalm 149? Actually that would be very hard to measure accurately today, since computers are way too fast. Instead let us check how long it takes the computer to read and write the whole Bible, not just once, but 100 times. The following is taken from my workstation, an 8 core i7 at 3.50 GHz:

The Bible * 100
Words:	92,628,900
Characters:	516,381,600

Action	Minutes	Words/Min
Writing:	0.0076	12,188,013,158
Reading:	0.005	18,525,780,000

In Monk Powers:
Writing:	121,880,132 MP
Reading:	74,103,120 MP

Theoretically it would take a monk about 260 hours to copy the Bible by hand, although in practice it would be more like a year, since a monk, unlike a computer, cannot run 24/7, and he probably wouldn't scribble the holy book in sloppy shorthand. But don't worry too much about the inaccuracy of our new monkpower unit, horsepower isn't accurate either... However, when my computer copies 100 Bibles in 0.458 seconds, it is clear that monkpower as a unit is too fine grained. The Catholic Church employs about 200,000 priests, so let's up the ante to Vatican Power (1 VP = 200,000 MP):

In Vatican Powers:
Writing:	609.40 VP
Reading:	370.52 VP
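
The conversions above are plain division, and can be reproduced with a few lines of shell. This is just a sketch, assuming awk is installed; the helper names to_mp and to_vp are invented for the illustration.

```shell
# Convert words per minute into monkpower (MP) and Vatican power (VP).
# 1 MP writes 100 words/min and reads 250 words/min; 1 VP = 200,000 MP.
to_mp() {   # usage: to_mp WORDS_PER_MINUTE WORDS_PER_MINUTE_OF_ONE_MONK
    awk -v wpm="$1" -v monk="$2" 'BEGIN { printf "%.0f\n", wpm / monk }'
}
to_vp() {   # usage: to_vp MONKPOWER
    awk -v mp="$1" 'BEGIN { printf "%.2f\n", mp / 200000 }'
}

to_mp 12188013158 100    # writing: 121880132 MP
to_mp 18525780000 250    # reading: 74103120 MP
to_vp 121880132          # writing: 609.40 VP
to_vp 74103120           # reading: 370.52 VP
```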

Apparently as far as text is concerned my computer is a Holy Vatican Warlord Assassin! But why are we talking about monks? Well, the point I am trying to get across is simply how ridiculously powerful and efficient computers are when it comes to handling text. Essentially, the British Empire at the height of its reign could not hope to match my computer's ability to read and write text, it would literally require hundreds of millions of bureaucrats. Imagine how much money and effort is saved when a single cheap computer can replace all those bureaucrats! Therein lies the real value of computers... But how well do computers handle graphics?

How computers create graphics

Reading the previous discussion about how computers "understand" text, it might dawn on you that these machines aren't all that clever. So how can you make a computer create graphics? There is no magic or "intelligence" involved, the picture must simply be described in a painstakingly formal manner. First you carve up the screen into tiny square dots, or pixels. On my screen there are 1920 * 1080, or 2,073,600, such pixels. Next you specify a color for each pixel.

Naturally a computer does not understand what a color is. But computer screens are built in a way that allows them to produce a certain color whenever they receive a certain value. So just like a computer doesn't need to understand English to produce text, it just needs a way to encode letters, so the computer doesn't need to understand art to produce colors, it just needs a way to encode color values. Colors are usually written as RGB values, such as #00ffff for cyan. Without going into all the details, these RGB values hold 24 bits of data, which allows for about 16.7 million distinct colors (2^24), plenty for practical purposes. Often color is encoded as 32 bits, which allows for transparency as well. That means that each pixel requires a 4 byte color value.

To illustrate this process: A blind monk goes to a filing cabinet spanning millions of drawers and feels his way to a drawer marked #00ffff, which contains cyan pebbles. He takes one, wobbles over to the screen and glues this tiny pebble on the top right hand corner. Then he goes and fetches another pebble, then another... Finally all 2 million pebbles have been glued to the screen and the mosaic is finished. Of course the blind monk has no artistic appreciation of his work, however stunningly beautiful it may be, to him it's just busywork. Assuming 32 bit colors and a 1920x1080 screen, the data would take 8,294,400 bytes, or 7.91 Megabytes.

A short side note here about RGB: the alien looking notation "#00ffff" which means cyan (RGB means Red Green Blue, "#00ffff" actually means 0% red 100% green 100% blue, which produces the color cyan) is ironically written for the benefit of us humans. What the computer actually reads is 000000001111111111111111.
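
Taking the notation apart is a matter of reading two hex digits per channel. A minimal sketch, assuming a POSIX shell with cut; hex_to_rgb is a made-up name.

```shell
# Split a color such as "#00ffff" into its red, green and blue components.
hex_to_rgb() {
    hex=${1#\#}                              # drop the leading '#'
    r=$(printf '%s' "$hex" | cut -c1-2)
    g=$(printf '%s' "$hex" | cut -c3-4)
    b=$(printf '%s' "$hex" | cut -c5-6)
    # printf reads a 0x prefix as hexadecimal and prints it as decimal.
    printf '%d %d %d\n' "0x$r" "0x$g" "0x$b"
}

hex_to_rgb '#00ffff'   # prints 0 255 255
```

Each channel runs from 0 to 255, so 255 is the "100%" mentioned above.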

Now suppose you are watching a film at 30 frames per second, that would require nearly 250 MB of memory per second. To make a comparison, it will take you about 90 hours to read the Bible, which takes up 5 MB of space. Watching Blues Brothers however would only take up 2½ hours of your time, but the DVD takes up 7.7 GB of space, in other words about 1500 Bibles worth of text (which is about 3 times more than you would ever read in your entire lifespan - assuming you read a lot).
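
The frame and per-second figures follow directly from the screen size. A quick sketch of the arithmetic, assuming 4 bytes per pixel and counting a megabyte as 1024 * 1024 bytes; the variable names are only for illustration.

```shell
width=1920; height=1080; bytes_per_pixel=4; fps=30

# One full frame, and one second of uncompressed frames.
frame=$((width * height * bytes_per_pixel))
per_second=$((frame * fps))

echo "$frame bytes per frame"            # 8294400
echo "$per_second bytes per second"      # 248832000
awk -v b="$per_second" 'BEGIN { printf "%.1f MB per second\n", b / (1024 * 1024) }'
```

The last line prints 237.3 MB per second, which is the "nearly 250" quoted above.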

A bottomless ocean

But wait a minute, supposing one second of film requires ¼ GB of space, how can a two and a half hour film only take up 7.7 GB? Well, DVD films have a lower resolution than 1920x1080. And they are highly compressed, the full mosaic is not stored as it were. If the Blues Brothers DVD had a 1920x1080 resolution with 32 bit color, 30 frames per second and no compression, it would take about 2 TB of space (that's about 2,000 GB or about 2,000,000 MB).

With no compression, at 1920x1080 resolution, the Blues Brothers DVD would take a whopping 2 Terabytes
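
For the curious, the 2 Terabyte figure can be checked in the shell. This is a rough sketch, assuming the 148 minute running time mentioned further down, 4 bytes per pixel, and a terabyte of 1024^4 bytes; film_bytes is a made-up helper.

```shell
# Size of an uncompressed film in bytes.
film_bytes() {   # usage: film_bytes WIDTH HEIGHT BYTES_PER_PIXEL FPS MINUTES
    echo $(( $1 * $2 * $3 * $4 * $5 * 60 ))
}

total=$(film_bytes 1920 1080 4 30 148)
echo "$total bytes"                                   # 2209628160000
awk -v b="$total" 'BEGIN { printf "%.2f TB\n", b / (1024 ^ 4) }'
```

Other resolutions or frame rates can be tried by changing the arguments.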

At this point we may begin to understand why vendors keep making videos in higher resolutions. Just how demanding can film resolutions become before we reach the limit of human visual capacity? Well, our eyes can typically see about 576 million pixels in 120 degrees. But simply splashing our retina at capacity is not enough, in the real world we can move our eyes and our heads and look around in full 3D. In order to actually film 360 degrees in all directions with 3D you would need at least 18 cameras (120 degrees * 3 * 3 * 2 since you need two angles in each camera direction to create a 3D feel). Assuming 576 Mp * 18 angles * 4 bytes (32 bit) * 30 FPS and no compression, that is about 1.1 TB of data per second, and the 148 minute Blues Brothers movie would take a whopping 10 PB or so (that's roughly 10,000 Terabytes!).

Note however the words "3D feel". The impressive engineering feat described above would only allow you to create a very realistic looking sphere. You could move your head around, sure, but if you started to walk the illusion would soon break. It is like those old western movie sets, where the shops and houses look real from the town street, but if you peek around the corners you will see that it's only a facade. Suppose you wanted a movie where you could walk about freely, where you could knock Elwood's hat off or duck away from Carrie Fisher's flamethrower? In that case simply filming in all directions from one point would not be enough, you would have to simulate graphics in true 3D. How much space would that require?

That depends... Let's say you wanted to create a Star Trek Holodeck at 20 x 20 x 20 meters. Let's further suppose that such a Holodeck would have a resolution fine grained enough to max out the human visual range of 576 Mp per square meter. That is 24,000 points per meter, or about 13.8 trillion points per cubic meter. At 4 bytes per point, such a simulation would take up about 50 TB per cubic meter, or roughly 400 PB for the entire 8,000 cubic meter Holodeck (that's about 400,000 TB). As you move along in the story the images in this Holodeck must be updated. Let's suppose the Blues Brothers movie, all 148 minutes of it, requires you to move at a general pace of 60 miles per hour (people who have seen the movie will agree that this is a low estimate...), then you cover 148 miles, or about 238 km, and the entire 20 meter room needs to be updated a total of 11,909 times. That brings the movie up to a grand total of roughly 5 ZB (that's Zettabytes - or - about 5 billion Terabytes).

Loading the Blues Brothers DVD onto a Holodeck would require approximately 5 billion Terabytes

Not even a nuclear power plant in the back yard could power such a graphics card! There is a lot of supposition in the math here, but the point is simply this: Computer graphics are a bottomless ocean, there is simply no limit to how demanding we can make them.

Can modern technology make graphics more efficient

No. As computers get faster, they can draw graphics faster and thus handle the load more efficiently. Nevertheless graphics compared to text will always be extremely inefficient. Suppose we make our computers 1000 times more powerful, well, then we can watch Blues Brothers with full screen resolution, and we don't need to bother with compression or optimizations. Suppose we make them 10,000,000 times more powerful, well then we can actually produce movies that max out our visual capacity in fake 3D splendor. But none of this makes graphics efficient. In terms of disk space, an old fashioned DVD movie is already about 10,000 times less efficient at storing information than plain text, and this inefficiency will only grow worse as movies are produced in ever increasing quality. It is interesting to note that while my computer can rival the combined powers of humanity itself when it comes to shredding text, it still struggles to play high resolution videos.

This does not mean that computer graphics have no place in our society. Modern technology has enabled us to watch hi-res sci-fi movies that could never be created with purely physical movie sets, and to build MRI scanners that would be quite impractical without computer graphics. The point of this discussion is not that we should all boycott graphics, I enjoy MRI scans as much as the next man, my point is simply that computer graphics are expensive. Useful? Yes. Fun? Absolutely! But not at all efficient, and they never will be. That realization alone can do wonders for your workflow.

But who cares if graphics are expensive? Our super beefy hardware can take it. You said it yourself, a single DVD can hold more books than you can ever read. And who likes reading anyway? Aren't modern computers making text obsolete?

Is text obsolete

That is a strange question. If you have read this far, you surely realize that text does convey information even in our technologically sophisticated age. Because the internet (and Hollywood) is awash with short movies based on novels and other textual stories, there is a common misconception that videos somehow convey information in less time than reading text, but this is not the case. You may find it more exciting to watch a 5 minute YouTube video than to read a 500 page book, to be sure, but that is not a relevant comparison. The simple fact is that a 5 minute video cannot contain the amount of information found in a 500 page book. Even if this book is read aloud in a video, it would actually take longer to watch that film than it would to read the text.

A side note here: Some people I have talked to seem to have the strange idea that converting a movie to text would require more time reading the book than it would watching the movie. The basic argument is that it would take a lot of time to read a detailed description of the images shown in the movie. But this idea is false. The medium of novels is the plot, not the graphics, so most graphical details are left to the reader's imagination. As an example, take this scene from The Hound of the Baskervilles:

«My first impression as I opened the door was that a fire had broken out, for the room was so filled with smoke that the light of the lamp upon the table was blurred by it. As I entered, however, my fears were set at rest, for it was the acrid fumes of strong tobacco which took me by the throat and set me coughing.»

What was the color of the carpet in Mr. Sherlock Holmes's office? What books were in his bookcase, was there even a bookcase? These and a million more details are left to the imagination of the reader. In a movie of course all such details must be painted out in painstaking detail. Notice also the plot here, in a book you are free to describe someone's impressions, his thoughts and fears, what he smells and the feeling in the back of his throat. Such plot finesse is not easy to do in a movie. How can the viewer know that the boy and the girl care passionately for each other unless they explicitly show it? The medium of movies is graphics, not the plot, so most plot details are left to the viewer's imagination.

In actuality it takes more time to watch a movie than to read a book, if the movie contains the same amount of information. To illustrate this point: The book of Jonah in the Bible spans 3 pages and would take approximately 6 minutes to read. A feature length film produced by Jehovah's Witnesses, called "The Story of Jonah", covers the contents of this book quite faithfully. It doesn't add much more than a few nature scenes as the prophet walks from place to place, and two small scenes providing some comic relief. The movie takes 45 minutes to watch.* Suppose you wanted a film covering the whole Bible, assuming the same format, that film would take about 500 hours. That is about the same as the entire Star Trek franchise (as of 2018), and the production cost would be mind boggling! Watching the 78 minute Disney video "The Jungle Book", instead of spending 220 minutes reading the original masterwork from Rudyard Kipling, is like reading a 5 page comic loosely inspired by the book, instead of reading the 113 page original. This is the reason why people who have read the book always complain when watching the movie. The movie never spans the entire content of the book, if it did, it would take a hundred hours to watch!

Can film replace text? In terms of reading perhaps, but it would require incredible efforts to produce it, and contrary to what many may think, it would require more time and attention from its viewers. Dumbing down a book into a short film takes away content, it is similar to reading a one line summary of the book. It is much easier to read such a one line description of course, but don't kid yourself into thinking that this is somehow equivalent to studying the full text.

The Power of Text

For us UNIX aficionados however the usefulness of text goes beyond simple readability. Text is searchable, editable, scriptable, useful. As an example, I spent maybe an hour writing a simple script that prints a Bible verse from an Epub file. This would have been totally impossible to do if the Bible came as a 500 hour long movie!

Another example. I made a script that extracts the Epub text and places each chapter of the Bible in its own file in a directory named after the book it is in, for instance Genesis chapter 1 is in books/Genesis/1. Now given this how would you count the number of times Moses is mentioned in the Bible? How about this:

$ grep Moses books/*/* | wc -l

That returns 822 in roughly 0 seconds (strictly speaking this counts the lines that mention Moses, which is close enough here). This solution has a slight problem though. The Bible chapters I am using have footnotes appended at the end, each of which begins with the caret sign (^). How can we count the number of times Moses is mentioned purely in the Biblical text, excluding the footnotes? How about this:

$ for chap in books/*/*; do sed '/\^/q' "$chap"; done | grep Moses | wc -l

That returns 814, apparently Moses is featured 8 times in the footnotes. These quick examples illustrate an important point: text is information that computers can use and manipulate. You can work with it, mine out information, feed it to the pipeline. Both examples took me about 10 minutes to come up with and test. Binary data, such as images, movies or sound files, however fun, are much harder to work with! In fact from a practical point of view it may even be said that non-textual data is inherently non-computable, it is only usable through specialized software that is incredibly hard to develop. For instance writing a piece of software that counts the number of times Moses is mentioned in a film would be extremely difficult and highly specialized. You could not use such a program for anything else, and it would take you 10 years or more to write it - if you were a really good programmer.
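
If you want to be pedantic about the counting, there are two details worth knowing: grep piped to wc -l counts matching lines rather than individual mentions, and sed '/\^/q' still prints the first footnote line before it quits. The sketch below shows a stricter variant on a tiny made-up corpus (not the real Bible files), assuming a grep that supports the common -o option and footnotes that start with a caret in the first column.

```shell
# Build a tiny stand-in corpus: body text, then footnotes starting with '^'.
dir=$(mktemp -d)
mkdir -p "$dir/books/Exodus"
printf '%s\n' 'Moses spoke, and Moses listened.' \
              'The people waited.' \
              '^ Footnote: Moses is named twice above.' \
              '^ Another note about Moses.' > "$dir/books/Exodus/1"

# Lines that mention Moses, footnotes included.
lines=$(cat "$dir"/books/*/* | grep -c Moses)
# Every single occurrence, footnotes included.
hits=$(cat "$dir"/books/*/* | grep -o Moses | wc -l | tr -d ' ')
# Occurrences in the body only: quit each file before printing its first footnote.
body=$(for chap in "$dir"/books/*/*; do
           sed -n -e '/^\^/q' -e p "$chap"
       done | grep -o Moses | wc -l | tr -d ' ')

echo "lines=$lines hits=$hits body=$body"   # lines=3 hits=4 body=2
rm -r "$dir"
```

On real data the difference between the variants is usually small, but it is nice to know which question you are actually asking.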

Let's suppose you wanted to collect all the above Bible chapters into a single file and email it to each member of a Bible study group. You could do so with the following script:

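# Start from an empty file so that re-running does not append a second copy.
: > bible.txt
# ls -tr lists the oldest files first.
for book in $(ls -tr books); do
    for chap in $(ls -tr "books/$book"); do
        cat "books/$book/$chap" >> bible.txt
    done
done
# bible_group is assumed to hold one email address per line.
for addr in $(cat bible_group); do
    mail "$addr" < bible.txt
done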

The script here uses ls -t, which sorts files by modification time, newest first (and reverses it with -r to get oldest to newest). It exploits the fact that the script we used to create these files in the first place wrote them in Bible order, so their timestamps follow the order of the text. It took about a minute to write and execute this script. How would you sort the 1189 chapters of the Bible and collect them into a single file of 2587 pages using only graphical tools? A file of this size would likely crash Microsoft Word, and even if Word doesn't crash it would take a million years to load (printed Bibles usually have fewer pages since the text is very compact). How would you send a copy to each of your friends with GUI tools, would you do so one by one? Doing such a job manually would likely take you a week or more.
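
Should the timestamps ever be lost (say the files were copied or restored from a backup), the same collection can be built from the names alone. The following is only a sketch on a toy directory, under two assumptions that are not part of my original setup: chapters are named by number, and a helper file book_order (hypothetical) lists the books in the desired order.

```shell
# Toy data standing in for the real chapter files.
dir=$(mktemp -d)
mkdir -p "$dir/books/Genesis" "$dir/books/Exodus"
for n in 1 2 10; do echo "Genesis $n" > "$dir/books/Genesis/$n"; done
echo "Exodus 1" > "$dir/books/Exodus/1"
# Hypothetical helper file listing the books in order, one per line.
printf '%s\n' Genesis Exodus > "$dir/book_order"

: > "$dir/bible.txt"                       # start from an empty file
while read -r book; do
    # sort -n orders the chapter names numerically: 1, 2, 10 (not 1, 10, 2)
    for chap in $(ls "$dir/books/$book" | sort -n); do
        cat "$dir/books/$book/$chap" >> "$dir/bible.txt"
    done
done < "$dir/book_order"

result=$(cat "$dir/bible.txt")
echo "$result"
rm -r "$dir"
```

This prints Genesis 1, Genesis 2, Genesis 10 and Exodus 1, in that order, regardless of when the files were written.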

So computers are not only efficient at reading and writing text, but working with text on a computer requires much less time and effort from the user as well. This realization lies at the heart of UNIX. So in summary, is text obsolete? If so then information, automation and efficiency are equally obsolete.

Update: I stumbled upon a blog post that conveys my message in a much simpler and more elegant way, thanks S. K! (now aren't you glad I told you, after reading my lengthy tirade?)