The price of a messy codebase: No LaTeX for the iPad
Any LaTeX user with an iPad has had the same thought: I want to use my favourite document creation system on my favourite device. Despite everything I am about to say about the LaTeX codebase, there is nothing like it for composing beautiful documents, and the iPad is the most beautiful platform out there, so it is natural to try to combine the two. This is the story of our failure to do so. It is also a cautionary tale about the consequences of a messy codebase: forty years of work on TeX will most likely not make it into the tablet era, because the resulting codebase is too chaotic to port.
The App Store guidelines insist that any iPad app be a single executable. Jailbreaking would ease this restriction, but in this case Apple’s gatekeepers are right. A 4GB TeX distribution dependent on over 100 binaries is not acceptable on the iPad. It is incapable of delivering the slick user experience that the iOS platform’s adherents expect and love. If we were going to bring LaTeX to the iPad, we would have to do so in a way that was usable and appealing to the majority of the iOS userbase. In short, it would have to jump platforms as a single binary.
That didn’t seem like much of an obstacle. We would set armv7 as the target architecture, and link the resulting code with an Objective-C frontend. This dream lasted a few days, until we discovered that TeX, the typesetting engine underlying LaTeX, isn’t written in C. TeX is written in WEB, Donald Knuth’s “literate” programming language. WEB source is a mix of Pascal and documentation: TANGLE extracts the Pascal source code from it, and WEAVE extracts the documentation as TeX source.
The first step in compiling WEB code is to run TANGLE, which produces unhelpful Pascal source files. Thoughtfully, the TeX build system includes a translator that turns these into compilable C files. Although WEB was hugely influential as the progenitor of modern source code documentation, it is now obsolete, and modern extensions to TeX have been written in C. These extensions are compiled alongside the translated WEB code. It is not hard to imagine the effect this has on the readability of the codebase.
The convoluted build system causes additional complications when cross-compiling. At one point an object file is built and linked with the WEB application’s source to produce the CTANGLE executable. This executable translates part of TeX’s original WEB source code to C, which is then linked with that same object file as part of the target executable. When cross-compiling, this object file must first be built for x86_64 so that CTANGLE can run on the build machine, then replaced with an armv7 version for the final binary. Problems like this were painful and frustrating, but with a suitably intelligent build script they were just a matter of patience and time. The killer obstacle for us was kpathsea.
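Before getting to kpathsea, the two-pass dance with that object file is easier to see written down as an ordered plan. What follows is only a sketch in Python, not our actual build script: the step names are made up for illustration, and the real TeX build has many more steps than this.

```python
# Hypothetical step names; the real TeX build has many more steps.
def cross_build_plan(host="x86_64", target="armv7"):
    """Return ordered (step, architecture) pairs for the two-pass build."""
    return [
        # Pass 1: everything needed to run the translator on the build machine.
        ("compile shared object file", host),
        ("link CTANGLE", host),
        ("run CTANGLE to translate WEB to C", host),
        # Pass 2: the same object file again, this time for the device.
        ("recompile shared object file", target),
        ("compile translated C", target),
        ("link final binary", target),
    ]


if __name__ == "__main__":
    for step, arch in cross_build_plan():
        print(f"[{arch}] {step}")
```

The point of the sketch is that the shared object file appears twice, once per architecture, and a build script has to make sure the host copy never reaches the final link.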
As best we could tell, kpathsea is a library for finding fonts and scripts in the sprawling TeX distribution, and it is written in C. Unfortunately it calls bash scripts, which in turn call other C executables. To move forward we would have had to reimplement these bash scripts in C, or perhaps even link in the bash executable. Instead, we stopped. This was the turning point where a scratch rewrite looked more appealing than porting the existing code.
Possibly kpathsea did us a favour. LaTeX is too large (4GB) and too slow for the iPad. It takes a lethargic three seconds to typeset a two-page document on my brand new MacBook Pro, so I can’t believe it would take less than ten seconds on my iPad. This is totally unacceptable, especially as I have to run TeX twice to fill in references, or four times if I use BibTeX. ARGGGHHH……
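To put rough numbers on that, here is a small Python sketch of the usual run sequence and what it would cost. The job name `doc` is a placeholder, the ten seconds per run is only my guess from above, and charging the BibTeX pass the same as a TeX pass is a simplification.

```python
def build_sequence(uses_bibtex):
    """Return the commands for a full build; 'doc' is a placeholder job name."""
    if uses_bibtex:
        # First pass writes the citations, bibtex resolves them,
        # and two more passes pull the bibliography and references in.
        return ["latex doc", "bibtex doc", "latex doc", "latex doc"]
    # Two passes: one to record the references, one to fill them in.
    return ["latex doc", "latex doc"]


def estimated_seconds(uses_bibtex, seconds_per_run):
    """Crude estimate that charges every run the same amount of time."""
    return len(build_sequence(uses_bibtex)) * seconds_per_run


if __name__ == "__main__":
    print(estimated_seconds(False, 10))  # 20
    print(estimated_seconds(True, 10))   # 40
```

On those assumptions a two-page document with a bibliography would keep me waiting around forty seconds per build.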
That is enough ranting for now. The problem is that LaTeX is great. Once you’ve set it up and learnt the syntax, how else can you create documents that look so good? I badly want it on my iPad, and it is a tragedy that I won’t have it in the near future. I don’t feel like I can blame this on Apple’s gatekeeping, because in this case they are right. LaTeX’s deficiencies can hide behind a shiny frontend on a blazingly fast MacBook Pro, but on an iPad it will feel the same as OpenOffice does on a netbook.
LaTeX needs a scratch rewrite. The speed, bloat, and complexity cannot be solved with the current uneasy mix of C, WEB, and Bash. Even if translating the code to C and then refactoring and commenting it were practical, it would still leave a 4GB distribution, much of which is obsolete and unnecessary, especially METAFONT. Now that operating systems ship with a range of standard fonts, LaTeX has no need for its custom font package. XeLaTeX has made great progress in using native system fonts, adding full Unicode support as a side effect, but for the most part it shares TeX’s codebase and its maladies. I can see no choice other than starting from scratch.
At the risk of inflaming GPL advocates, I hope that the LaTeX of the future will be under the LGPL. The LGPL is more compatible with the App Store guidelines, and it has been in the community’s best interests for commercial products to interact with LaTeX on desktop systems. Given the restriction that iPad software must consist of a single binary, the LGPL is a must if the commercial sector is not going to be shut out on tablet systems.
I have to admit I’m not volunteering to begin the Great LaTeX Rewrite; I don’t think I am the hero to lead the LaTeX faithful into the tablet era, but I hope that someone reading this is. I love LaTeX, and I love my iPad, but for now I can only love one at a time.