initial commit

2019-05-15 16:55:03 +02:00 · 60052b2f16
commit 60052b2f16
28 changed files with 6045 additions and 0 deletions
--- a/32
+++ b/32
@ -0,0 +1,32 @@
+1 test-file
+2 MANIFEST
+D books/
+D books/tools/
+3 bootstrap
+4 bootstrap2
+5 sortpages
+6 Makefile
+7 heap.c
+8 heap.h
+9 mempool.c
+10 mempool.h
+11 util.c
+12 util.h
+13 repair.c
+14 subst.c
+15 subst.h
+16 unmunge.c
+17 munge.c
+18 yapp.doc
+19 yapp
+20 psgen
+21 makemanifest
+D books/ps/
+22 prolog.ps
+23 charmap.ps
+D books/example/
+24 Makefile
+25 .cvsignore
+26 filelist
+27 footer.ps
+28 us-constitution.gz
--- a/477
+++ b/477
@ -0,0 +1,477 @@
+PREFACE
+-------
+
+This book grew out of a project to publish source code for cryptographic
+software, namely PGP (Pretty Good Privacy), a software package for the
+encryption of electronic mail and computer files. PGP is the most widely
+used software in the world for email encryption. Pretty Good Privacy, Inc
+(or "PGP") has published the source code of PGP for peer review, a long-
+standing tradition in the history of PGP. The first time a fully implemented
+cryptographic software package was published in its entirety in book form
+was "PGP Source Code and Internals," by Philip Zimmermann, published by The
+MIT Press, 1995, ISBN 0-262-24039-4.
+
+Peer review of the source code is important to get users to trust the
+software, since any weaknesses can be detected by knowledgeable experts who
+make the effort to review the code. But peer review cannot be completely
+effective unless the experts conducting the review can compile and test the
+software, and verify that it is the same as the software products that are
+published electronically. To facilitate that, PGP publishes its source code
+in printed form that can be scanned into a computer via OCR (optical
+character recognition) technology.
+
+Why not publish the source code in electronic form? As you may know,
+cryptographic software is subject to U.S. export control laws and
+regulations. The new 1997 Commerce Department Export Administration
+Regulations (EAR) explicitly provide that "A printed book or other printed
+material setting forth encryption source code is not itself subject to the
+EAR." (see 15 C.F.R. §734.3(b)(2)). PGP, in an overabundance of caution,
+has only made available its source code in a form that is not subject to
+those regulations. So, books containing cryptographic source code may be
+published, and after they are published they may be exported, but only
+while they are still in printed form.
+
+Electronic commerce on the Internet cannot fully be successful without
+strong cryptography. Cryptography is important for protecting our privacy,
+civil liberties, and the security of our personal and business transactions
+in the information age. The widespread deployment of strong cryptography
+can help us regain some of the privacy and security that we have lost due
+to information technology. Further, strong cryptography (in the form of
+PGP) has already proven itself to be a valuable tool for the protection of
+human rights in oppressive countries around the world, by keeping those
+governments from reading the communications of human rights workers.
+
+This book of tools contains no cryptographic software of any kind, nor does
+it call, connect, nor integrate in any way with cryptographic software. But
+it does contain tools that make it easy to publish source code in book form.
+And it makes it easy to scan such source code in with OCR software rapidly
+and accurately.
+
+Philip Zimmermann
+prz@acm.org
+
+November 1997
+
+
+
+INTRODUCTION
+------------
+
+This book contains tools for printing computer source code on paper in
+human-readable form and reconstructing it exactly using automated tools.
+While standard OCR software can recover most of the graphic characters,
+non-printing characters like tabs, spaces, newlines and form feeds cause
+problems.
+
+In fact, these tools can print any ASCII text file; it's just that the
+attention these tools pay to spacing is particularly valuable for computer
+source code. The two-dimensional indentation structure of source code is
+very important to its comprehensibility. In some cases, distinctions
+between non-printing characters are critical: the standard make utility
+will not accept spaces where it expects to see a tab character.
+
+Producing a byte-for-byte identical copy of the original is also valuable
+for authentication, as you can verify a checksum.
+
+There are five problems we have addressed:
+
+1. Getting good OCR accuracy.
+2. Preserving whitespace.
+3. Preserving lines longer than can be printed on the page.
+4. Dealing with data that isn't human-readable.
+5. Detecting and correcting any residual errors.
+
+The first problem is partly addressed by using a font designed for OCR
+purposes, OCR-B. OCR-A is a very ugly font that contains only the digits 0
+through 9 and a few special punctuation symbols. OCR-B is a very readable
+monospaced font that contains a full ASCII set, and has been popular as a
+font on line printers for years because it distinguishes ambiguous
+characters and is clear even if fuzzy or distorted.
+
+The most unusual thing about the OCR-B font is the way that it prints a
+lower-case letter l, with a small hook on the bottom, something like an
+upper-case L. This is to distinguish it from the numeral 1. We also made
+some modifications to the font, to print the numeral 0 with a slash, and
+to print the vertical bar in a broken form. Both of these are such common
+variants that they should not present any intelligibility barrier. Finally,
+we print the underscore character in a distinct manner that is hopefully
+not visually distracting, but is clearly distinguishable from the minus
+sign even in the absence of a baseline reference.
+
+The most significant part of getting good OCR accuracy is, however, using
+the OCR tools well. We've done a lot of testing and experimentation and
+present here a lot of information on what works and what doesn't.
+
+To preserve whitespace, we added some special symbols to display spaces,
+tabs, and form feeds. A space is printed as a small triangular dot
+character, while a hollow rightward-pointing triangle (followed by blank
+spaces to the right tab stop) signifies a tab. A form feed is printed as
+a yen symbol, and the printed line is broken after the form feed.
+
+Making the dot triangular instead of square helps distinguish it from a
+period. To reduce the clutter on the page and make the text more readable,
+the space character is only printed as a small dot if it follows a blank
+on the page (a tab or another space), or comes immediately before the end
+of the line. Thus, the reader (human or software) must be able to
+distinguish one space from no spaces, but can find multiple spaces by
+counting the dots (and adding one).
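
The display rule above can be sketched in a few lines of Python (not the language of the tools themselves). The name make_visible, the default tab width of 8, and the use of the Latin-1 currency sign and middle dot as stand-ins for the printed triangle glyphs are assumptions made for illustration; form feeds and long lines are left out.

```python
TAB_SYM = "\u00a4"    # currency sign, stand-in for the hollow triangle
SPACE_DOT = "\u00b7"  # middle dot, stand-in for the small triangular dot


def make_visible(line, tab_width=8):
    """Render one source line with visible tabs and selected spaces."""
    out = []
    col = 0              # current column on the printed page
    prev_blank = False   # was the previous source character a tab or space?
    for i, ch in enumerate(line):
        if ch == "\t":
            # Tab symbol, then blanks up to the next tab stop.
            pad = tab_width - col % tab_width - 1
            out.append(TAB_SYM + " " * pad)
            col += pad + 1
            prev_blank = True
        elif ch == " ":
            # Dot only after a blank or at the very end of the line.
            at_end = i == len(line) - 1
            out.append(SPACE_DOT if prev_blank or at_end else " ")
            col += 1
            prev_blank = True
        else:
            out.append(ch)
            col += 1
            prev_blank = False
    return "".join(out)
```

With this sketch, two spaces between words come out as one blank and one dot, so a reader recovers the count by adding one to the number of dots.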
+
+The format is designed so that 80 characters (the still-common punched-card
+line length), plus checksums, can be printed on one line of an 8.5x11" (or
+A4) page. Longer lines are managed with the simple technique of appending a
+big ugly black blob to the first part of the line, indicating that the next
+printed line should be concatenated with the current one with no intervening
+newline. Hopefully, its use is infrequent.
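
As a rough illustration of the long-line technique, the Python sketch below breaks a line after 79 columns whenever it is longer than 80, and joins the pieces again. The function names, and the pilcrow used as a stand-in for the black blob, are assumptions for this example; it also assumes tabs have already been expanded and that the text itself never ends in a pilcrow.

```python
CONT = "\u00b6"  # pilcrow, stand-in for the printed continuation blob


def split_long(text, limit=80):
    """Break text into printed lines, marking each continued piece."""
    parts = []
    while len(text) > limit:
        parts.append(text[:limit - 1] + CONT)
        text = text[limit - 1:]
    parts.append(text)
    return parts


def join_continued(parts):
    """Undo split_long: glue marked pieces onto the line that follows."""
    lines, current = [], ""
    for part in parts:
        if part.endswith(CONT):
            current += part[:-1]
        else:
            lines.append(current + part)
            current = ""
    if current:
        # A trailing continuation with nothing after it; keep what we have.
        lines.append(current)
    return lines
```

A line of exactly 80 characters is printed unchanged; only longer lines pick up the marker.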
+
+While ASCII text is by far the most popular form, some source code is not
+readable in the usual way. It may be an audio clip, a graphic image bitmap,
+or something else that is manipulated with a specialized editing tool. For
+printing purposes, these tools just print any such files as a long string
+of gibberish in a 64-character set designed to be easy to OCR unambiguously.
+Although the tools recognize such binary data and apply extra consistency
+checks, that can be considered a separate step.
+
+Finally, the problem of residual errors arises. OCR software is not perfect,
+and uses a variety of heuristics and spelling-check dictionaries to clean up
+any residual errors in human-language text. This isn't reliable enough for
+source code, so we have added per-page and per-line checksums to the printed
+material, and a series of tools to use those checksums to correct any
+remaining errors and convert the scanned text into a series of files again.
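
To make the per-line checksum idea concrete, here is a small Python sketch. It is not the format used in this book: the CRC polynomial (0x1021 with a zero initial value), the four-hex-digit prefix, and the names crc16, tag_line and check_line are all assumptions chosen for illustration.

```python
def crc16(data, crc=0):
    """Bit-at-a-time CRC-16 (polynomial 0x1021, zero initial value)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def tag_line(text):
    """Prefix a line with its checksum (hypothetical layout)."""
    return "%04x %s" % (crc16(text.encode("latin-1")), text)


def check_line(tagged):
    """Return True if the checksum prefix matches the rest of the line."""
    prefix, _, text = tagged.partition(" ")
    try:
        want = int(prefix, 16)
    except ValueError:
        return False
    return want == crc16(text.encode("latin-1"))
```

A single misread character, such as a colon scanned where a semicolon was printed, changes the CRC, so check_line rejects the line and a repair step knows where to look.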
+
+This "munged" form is what you see in most of the body of this book. We
+think it does a good job of presenting source code in a way that can be read
+easily by both humans and computers.
+
+The tools are command-line oriented and a bit clunky. This has a purpose
+beyond laziness on the authors' parts: it keeps them small. Keeping them
+small makes the "bootstrapping" part of scanning this book easier, since you
+don't have the tools to help you with that.
+
+
+
+SCANNING
+--------
+
+Our tests were done with OmniPage 7.0 on a Power Macintosh 8500/120 and an
+HP ScanJet 4c scanner with an automatic document feeder. The first part of
+this is heavily OmniPage-specific, as that appears to be the most widely
+available OCR software.
+
+The tools here were developed under Linux, and should be generally portable
+to any Unix platform. Since this book is about printing and scanning source
+code, we assume the readers have enough programming background to know how
+to build a program from a Makefile, understand the hazards of CR, LF or CRLF
+line endings, and such minor details without explicit mention.
+
+The first step to getting OmniPage 7 to work well is to set it up with
+options to disable all of its more advanced features for preserving font
+changes and formatting. Look in the Settings menu.
+
+· Create a Zone Contents File with all of ASCII in it, plus the extra
+  bullet, currency, yen and pilcrow symbols. Name it "Source Code".
+· Create a Source Code style set. Within it, create a Source Code zone style
+  and make it the default.
+· Set the font to something fixed-width, like Courier.
+· Set a fixed font size (10 point) and plain text, left-aligned.
+· Set the tab character to a space.
+· Set the text flow to hard line returns.
+· Set the margins to their widest.
+· The font mapping options are irrelevant.
+
+Go to the settings panel and:
+
+· Under Scanner, set the brightness to manual. With careful setting of the
+  threshold, this generates much better results than either the automatic
+  threshold or the 3D OCR. Around 144 has been a good setting for us; you
+  may want to start there.
+· Under OCR, you'll build a training file to use later, but turn off
+  automatic page orientation and select your Source Code style set in the
+  Output Options. Also set a reasonable reject character. (For testing, we
+  used the pi symbol, which came across from the Macintosh as a weird
+  sequence, but you can use anything as long as you make the appropriate
+  definition in subst.c.)
+
+Do an initial scan of a few pages and create a manual zone encompassing
+all of the text. Leave some margin for page misalignment, and leave space
+on the sides for the left-right shift caused by the book binding being in
+different places on odd and even pages.
+
+Set the Zone Contents and the Style set to the Source Code settings. After
+setting the Style Set, the Zone Style should be automatically set correctly
+(since you set Source Code as the default).
+
+Then save the Zone Template, and in the pop-up menu under the Zone step on
+the main toolbar you can now select it.
+
+Now we're ready to get characters recognized. The first results will be
+terrible, with lots of red (unrecognizable) and green (suspicious) text in
+the recognized window. Some tweaking will improve this enormously.
+
+The first step is setting a good black threshold. Auto brightness sets the
+threshold too low, making the character outlines bleed and picking up a lot
+of glitches on mostly-blank pages. Try training OCR on the few pages you've
+scanned and look at the representative characters. Adjust the threshold so
+the strokes are clear and distinct, neither so thin they are broken nor so
+thick they smear into each other. The character that bleeds worst is
+lowercase w, while the underscore and tab symbols have the thinnest lines
+that need worry.
+
+You'll have to re-scan (you can just click the AUTO button) until you get
+satisfactory results.
+
+The next step is training. You should scan a significant number of pages
+and teach OmniPage about any characters it has difficulty with. There are
+several characters which have been printed in unusual ways which you must
+teach OmniPage about before it can recognize them reliably. We also have
+some characters that are unique, which the tools expect to be mapped to
+specific Latin-1 characters to be processed.
+
+The characters most in need of training are as follows:
+
+· Zero is printed 'slashed.'
+· Lowercase L has a curled tail to distinguish it clearly from other
+  vertical characters like 1 and I.
+· The or-bar or pipe symbol '|' is printed "broken" with a gap in the
+  middle to distinguish it similarly.
+· The underscore character has little "serifs" on the end to distinguish
+  it from a minus sign. We also raised it just a tad higher than the
+  normal underscore character, which was too low in the character cell to
+  be reliably seen by OmniPage.
+· Tabs are printed as a hollow right-pointing triangle, followed by blanks
+  to the correct alignment position. If not trained enough, OmniPage
+  guesses this is a capital D. You should train OmniPage to recognize this
+  symbol as a currency symbol (Latin-1 octal 244).
+· Any spaces in the original that follow a space, or a blank on the printed
+  page, are printed as a tiny black triangle. You should train OmniPage to
+  recognize this as a center dot or bullet (Latin-1 octal 267). We didn't
+  use a standard center dot because OmniPage confused it with a period.
+· Any form feeds in the original are printed as a yen currency symbol
+  (Latin-1 octal 245).
+· Lines over 80 columns long are broken after 79 columns by appending a big
+  ugly black block. You should train OmniPage to recognize this as a
+  pilcrow (paragraph symbol, Latin-1 octal 266). We did this because after
+  deciding something black and visible was suitable, we found out the font
+  we used doesn't have a pilcrow in it.
+
+The zero and the tab character, because of their frequency, deserve special
+attention.
+
+In addition, look for any unrecognized characters (in red) and retrain those
+pages. If you get an unrecognized character, that character needs training,
+but Caere says that "good examples" are best to train on, so if the training
+doesn't recognize a slightly fuzzy K, and there's a nice crisp K available
+to train on, use that.
+
+Other things that need training:
+
+· ~ (tilde), ^ (caret), ` (backquote) and ' (quote). These get dropped
+  frequently unless you train them.
+· i, j and ; (semicolon). These get mixed up.
+· 3 and S. These also get mixed up.
+· Q can fail to be recognized.
+· C and [ can be confused.
+· c/C, o/O, p/P, s/S, u/U, v/V, w/W, y/Y and z/Z are often confused. This
+  can be helped by some training.
+· r gets confused with c and n. I don't understand c, but it happens.
+· f gets confused with i.
+
+The OCR training pages have lots of useful examples of troublesome
+characters. Scan a few pages of material, training each page, then scan a
+few dozen pages and look for recognition problems. Look for what OmniPage
+reports as troublesome, and when you have the repair program working, use
+it to find and report further errors. Train a few pages particularly dense
+in problems and append the troublesome characters to the training file, then
+re-recognize the lot.
+
+Double-check your training file for case errors. It's easy to miss the shift
+key in the middle of a lot of training, and doing so produces terrible results
+even though OmniPage won't report anything amiss. We have spent a while
+wondering why OmniPage wasn't recognizing capital S or capital W, only to
+find that OmniPage was just doing what it was trained to do.
+
+We have heard some reports that OmniPage has problems with large training
+files. We have observed OmniPage suffering repeatable internal errors
+sometimes after massive training additions, but they were cured by deleting
+a few training images. Appending more training images to the training file
+did not cause the problem to re-appear.
+
+Repairing the OCR results
+
+If the only copy of the tools you have is printed in this book, see the next
+chapter on bootstrapping at this point. Here, we assume that you have the
+tools and they work.
+
+When you have some reasonable OCR results, delete any directory pages. With
+no checksum information, they just confuse the postprocessing tools. (The
+tools will just stop with an error when they get to the "uncorrectable"
+directory name and you'll have to delete it then, so it's not fatal if you
+forget.) Copy the data to a machine that you have the repair and unmunge
+utilities on.
+
+The repair utility attempts automatic table-driven correction of common
+scanning errors. You have to recompile it to change the tables, but are
+encouraged to if you find a common problem that it does not correct reliably.
+If it gets stuck, it will deposit you into your favorite editor on or
+slightly after the offending line. (The file you will be editing is the
+unprocessed portion of the input.) After you correct the problem and quit
+the editor, repair will resume.
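
The idea of table-driven correction can be sketched as follows, in Python. This is only an illustration of the approach, not the algorithm in repair.c: the confusion table, the use of Python's binascii.crc_hqx as a stand-in checksum, and the names checksum and repair_line are assumptions. The sketch tries one substitution at a time and accepts the first candidate whose checksum matches.

```python
import binascii

# Hypothetical confusion table: scanned character -> likely originals.
CONFUSIONS = {
    "0": "O", "O": "0",
    "1": "lI", "l": "1I", "I": "1l",
    "3": "S", "S": "3",
}


def checksum(text):
    """Stand-in line checksum (CRC-CCITT from the standard library)."""
    return binascii.crc_hqx(text.encode("latin-1"), 0)


def repair_line(scanned, want):
    """Return a corrected line whose checksum is want, or None if stuck."""
    if checksum(scanned) == want:
        return scanned
    for i, ch in enumerate(scanned):
        for alt in CONFUSIONS.get(ch, ""):
            candidate = scanned[:i] + alt + scanned[i + 1:]
            if checksum(candidate) == want:
                return candidate
    return None  # the caller would hand the line to an editor
```

Returning None corresponds to the point where a human has to step in; a fuller version would also try insertions, deletions and multiple substitutions.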
+
+"Your favorite editor" is taken from the $VISUAL and $EDITOR environment
+variables, or the -e option to repair.
+
+The repair utility never alters the original input file. It will produce
+corrected output for file in file.out, and when it has to stop, it writes
+any remaining uncorrected input back out to file.in (via a temporary
+file.dump) and lets you edit this file. If you re-run repair on file and
+file.in exists, repair will restart from there, so you may safely quit and
+re-run repair as often as you like. (But if you change the input file, you
+need to delete the .in file for repair to notice the change.)
+
+Statistics on repair's work are printed to file.log. This is an excellent
+place to look to see if any characters require more training.
+
+As it works, repair prints the line it is working on. If you see it make a
+mistake or get stuck, you can interrupt it (control-C or whatever is
+appropriate), and it will immediately drop into the editor. If you interrupt
+it a second time, it will exit rather than invoking the editor. If the
+editor returns a non-zero result code (fails), repair will also stop. (E.g.
+:cq in vim.)
+
+One thing that repair fixes without the least trouble is the number of
+spaces expected after a printing tab character. It's such an omnipresent OCR
+software error that repair doesn't even log it as a correction.
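
Fixing the blanks after a printed tab needs no checksum at all, because the correct count follows from the column the tab symbol sits in. A minimal Python sketch, assuming the tab symbol has been scanned as the Latin-1 currency sign and a tab width of 8 (the name fix_tab_padding is made up for this example):

```python
TAB_SYM = "\u00a4"  # currency sign, as the tab symbol is trained


def fix_tab_padding(line, tab_width=8):
    """Replace the blanks after each tab symbol with the correct count."""
    out = []
    col = 0
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == TAB_SYM:
            # Skip however many blanks the OCR software reported.
            j = i + 1
            while j < len(line) and line[j] == " ":
                j += 1
            pad = tab_width - col % tab_width - 1
            out.append(TAB_SYM + " " * pad)
            col += pad + 1
            i = j
        else:
            out.append(ch)
            col += 1
            i += 1
    return "".join(out)
```

This works because a real space that follows a tab is printed as a dot, so every plain blank after a tab symbol must be padding.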
+
+In some cases, repair can miscorrect a line and go on to the next line,
+possibly even more than once, finally giving up a few lines below the actual
+error. If you are having trouble spotting the error, one helpful trick is to
+exit the editor and let repair try to fix the page again, but interrupt it
+while it is still working on the first line, before it has found the
+miscorrection.
+
+The Nasty Lines
+
+Some lines of code, particularly those containing long runs of underscore or
+minus characters, are particularly difficult to scan reliably. The repair
+program has a special "nasty lines" feature to deal with this. If a file
+named "nastylines" (or as specified by the -l option) exists, its lines are
+checksummed and are treated as total replacements for any input line with
+the same checksum. So, for example, if you place a blank line in the
+nastylines file, any scanner noise on blank lines will be ignored.
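
The nasty-lines lookup amounts to a table keyed by checksum. The Python sketch below shows the idea; binascii.crc_hqx stands in for the real line checksum, and the names load_nasty and apply_nasty are hypothetical.

```python
import binascii


def checksum(text):
    """Stand-in line checksum (CRC-CCITT from the standard library)."""
    return binascii.crc_hqx(text.encode("latin-1"), 0)


def load_nasty(lines):
    """Map the checksum of each known-good line to the line itself."""
    return {checksum(line): line for line in lines}


def apply_nasty(table, scanned, want):
    """Use the stored line if its checksum is the one printed on the page."""
    return table.get(want, scanned)
```

With a blank line in the table, any scanned line whose printed checksum is that of an empty line is replaced by an empty line, whatever specks the scanner picked up.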
+
+The "nastylines" file is re-read every time repair restarts after an edit,
+so you can add more lines as the program runs. (The error-correction patterns
+should be done this way, too, but that'll have to wait for the next release.)
+
+Sortpages
+
+If, in the course of scanning, the pages have been split up or have gotten
+out of order, a perl script called sortpages can restore them to the proper
+order. It can merge multiple input files, discard duplicates, and warn about
+any missing pages it encounters. This script requires that the pages have
+been repaired, so that the page headers can be read reliably. The repair
+program does not care about the order it works on pages in; it examines each
+page independently. Unmunge, however, does need the pages in order.
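
What sortpages does can be outlined in Python as below. This is a sketch, not a translation of the perl script: it assumes the page headers have already been parsed into (page number, text) pairs, and the name sort_pages is made up.

```python
def sort_pages(pages):
    """Order pages by number, drop duplicates, and list missing numbers."""
    by_number = {}
    for number, body in pages:
        # Keep the first copy of each page; later copies are duplicates.
        by_number.setdefault(number, body)
    numbers = sorted(by_number)
    missing = []
    if numbers:
        missing = [n for n in range(numbers[0], numbers[-1] + 1)
                   if n not in by_number]
    ordered = [(n, by_number[n]) for n in numbers]
    return ordered, missing
```

Pages from several scanning sessions can simply be concatenated into one list before the call; the missing list is what a warning would be printed from.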
+
+Unmunging
+
+After repair has finished its work, the unmunge program strips out the
+checksums and, based on the page headers, divides the data up among various
+files. Its first argument is the file to unpack. The optional second argument
+is a manifest file that lists all of the files and the directories they go
+in. Supplying this (an excellent idea) lets unmunge recreate a directory
+hierarchy and warn about missing files.
+
+When you have unmunged everything and reconstructed the original source code,
+you are done. Unmunge verifies all of the checksums independently of repair,
+as a sanity check, and you can have high confidence that the files are
+exactly the same as the originals that were printed.
+
+
+
+BOOTSTRAPPING
+-------------
+
+There's a problem using the postprocessing tools to correct OCR errors, when
+the code being OCRed is the tools themselves. We've tried to provide a
+reasonably easy way to get the system up and running starting from nothing
+but a copy of OmniPage.
+
+You could just scan all of the tools in, correct any errors by hand, delete
+the error-checking information in a text editor, and compile them. But
+finding all the errors by hand is painful in a body of code that large.
+With the aid of perl (version 5), which provides a lot of power in very
+little code, we have provided some utilities to make this process easier.
+
+The first-stage bootstrap is a one-page perl script designed to be as small
+and simple as possible, because you'll have to hand-correct it. It can verify
+the checksums on each line, and drop you into the editor on any lines where
+an error has occurred. It also knows how to strip out the visible spaces and
+tabs, how to correct spacing errors after visible tab characters, and how to
+invoke an editor on the erroneous line.
+
+Scan in the first-stage bootstrap as carefully as possible, using OmniPage's
+warnings to guide you to any errors, and either use a text editor or the
+one-line perl command at the top of the file to remove the checksums and
+convert any funny printed characters to whitespace form.
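
For readers who want to see what converting the printed characters back to whitespace involves, here is a Python sketch. It is not the perl one-liner from the bootstrap file. It assumes the symbols were scanned as the Latin-1 characters named in the training list (currency sign for tab, middle dot for space, yen for form feed) and that checksums have already been removed; the name to_whitespace is hypothetical.

```python
import re


def to_whitespace(line):
    """Turn visible tab, space and form-feed symbols back into whitespace."""
    # A tab symbol and the padding blanks after it are one tab character.
    line = re.sub("\u00a4 *", "\t", line)
    # Each visible dot is one space.
    line = line.replace("\u00b7", " ")
    # The yen symbol stands for a form feed.
    line = line.replace("\u00a5", "\f")
    return line
```

Because the padding after a tab symbol is discarded here, the exact number of blanks the OCR software reported does not affect the recovered text.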
+
+The first thing to do is try running it on itself, and correct any errors you
+find this way. Note that the script writes its output to the file named in
+the page header, so you should name your hand-corrected version differently
+(or put it in a different directory) to avoid having it overwritten.
+
+The second-stage bootstrap is a much denser one-pager, with better error
+detection; it can detect missing lines and missing pages, and takes an
+optional second argument of a manifest file which it can use to put files
+in their proper directories. It's not strictly necessary, but it's only one
+more (dense) page and you can check it against itself and the original
+bootstrap.
+
+Both of the bootstrap utilities can correct tab spacing errors in the OCR
+output. Although this doesn't matter in most source code, it is included
+in the checksums.
+
+Once you have reached this point, you can scan in the C code for repair and
+unmunge. The C unmunge is actually less friendly than the bootstrap
+utilities, because it is only intended to work with the output of repair.
+It is, however, much faster, since computing CRCs a bit at a time in an
+interpreted language is painfully slow for large amounts of data. It can
+also deal with binary files printed in radix-64.
+
+
+
+PRINTING
+--------
+
+Despite the title of this book, this process of producing a book is not well
+documented, since it's been evolving up to the moment of publication. There
+is, however, a very useful working example of how to produce a book
+(strikingly similar to this book) in the example directory, all controlled
+by a Makefile.
+
+Briefly, a master perl script called psgen takes three parameters: a file
+list, a page numbers file to write to, and a volume number (which should
+always be 1 for a one-volume book). It runs the listed files through the
+munge utility, wraps them in some simple PostScript, and prepends a prolog
+that defines the special characters and PostScript functions needed by the
+text.
+
+The file list also includes per-file flags. The most important is the
+text/binary marker. Text files can also have a tab width specified, although
+munge knows how to read Emacs-style tab width settings from the end of a
+source file.
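
A reader inspecting the example filelist might parse it along these lines. The Python sketch below is based only on what that file looks like: T for text, B for binary, D for directory, and a digit after T read as a tab width. The last point is an assumption, as are the names parse_filelist and ENTRY. Lines with other flags, such as the leading V line, are skipped because their meaning is not described here.

```python
import re

# Flag letter, optional digits, whitespace, then the path.
ENTRY = re.compile(r"^([TBD])(\d*)\s+(\S+)$")
KINDS = {"T": "text", "B": "binary", "D": "dir"}


def parse_filelist(lines):
    """Return one dict per T, B or D entry; skip anything else."""
    entries = []
    for raw in lines:
        match = ENTRY.match(raw.strip())
        if not match:
            continue  # e.g. the "V 1 8" line
        flag, width, path = match.groups()
        entries.append({
            "kind": KINDS[flag],
            "tab_width": int(width) if width else None,
            "path": path,
        })
    return entries
```

An entry without a digit gets a tab_width of None, leaving the choice to whatever default or in-file setting munge applies.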
+
+The prolog is assembled by psgen from various other files and definitions,
+using a simple preprocessor called yapp (Yet Another Preprocessor). This
+process includes some book-specific information like the page footer.
+
+Producing the final PostScript requires the necessary non-standard fonts
+(Futura for the footers and OCRB for the code) and the psutils package,
+which provides the includeres utility used to embed the fonts in the
+PostScript file. The fonts should go in the books/ps directory, as
+"Futura.pfa" and the like.
+
+The pagenums file can be used to produce a table of contents. For this book,
+we generated the front matter (such as this chapter) separately, told psgen
+to start on the next page after this, and concatenated the resultant
+PostScript files for printing. The only trick was making the page footers
+look identical.
--- a/example/.cvsignore
+++ b/example/.cvsignore
@ -0,0 +1,3 @@
+pagenums
+MANIFEST
+code.ps
--- a/example/Makefile
+++ b/example/Makefile
@ -0,0 +1,23 @@
+BOOKROOT=..
+TOOLSDIR=$(BOOKROOT)/tools
+PSDIR=$(BOOKROOT)/ps
+YAPP=$(TOOLSDIR)/yapp
+MAKEMANIFEST=$(TOOLSDIR)/makemanifest
+PSGEN=BOOKROOT=$(BOOKROOT) $(TOOLSDIR)/psgen
+INCLUDERES=(cd $(PSDIR); includeres)
+
+code.ps pagenums: filelist footer.ps MANIFEST books
+	$(PSGEN) -P2 -l3 -DfooterFile=footer.ps filelist pagenums 1 \
+		| $(INCLUDERES) > code.ps
+
+books:
+	ln -s $(BOOKROOT) books
+
+MANIFEST: filelist
+	$(MAKEMANIFEST) $< > $@
+
+clean:
+	rm -f `cat .cvsignore`
+
+gv%: %.ps
+	gv $<
--- a/example/filelist
+++ b/example/filelist
@ -0,0 +1,32 @@
+V 1 8
+T MANIFEST
+D books/
+D books/tools/
+T books/tools/bootstrap
+T books/tools/bootstrap2
+T4 books/tools/sortpages
+T books/tools/Makefile
+T books/tools/heap.c
+T books/tools/heap.h
+T books/tools/mempool.c
+T books/tools/mempool.h
+T books/tools/util.c
+T books/tools/util.h
+T books/tools/repair.c
+T books/tools/subst.c
+T books/tools/subst.h
+T books/tools/unmunge.c
+T books/tools/munge.c
+T books/tools/yapp.doc
+T4 books/tools/yapp
+T4 books/tools/psgen
+T4 books/tools/makemanifest
+D books/ps/
+T books/ps/prolog.ps
+T books/ps/charmap.ps
+D books/example/
+T books/example/Makefile
+T books/example/.cvsignore
+T books/example/filelist
+T books/example/footer.ps
+B books/example/us-constitution.gz
--- a/example/footer.ps
+++ b/example/footer.ps
@ -0,0 +1,5 @@
+% A program to print the page footer, using the magic P function,
+% which takes a string and a font.
+(Tools for Publishing Source Code via OCR ) /Futura P
+(\343) /Symbol P	% Copyright symbol
+( 1997 Pretty Good Privacy, Inc.) /Futura P
--- a/example/us-constitution.gz
+++ b/example/us-constitution.gz
--- a/ps/charmap.ps
+++ b/ps/charmap.ps
@ -0,0 +1,68 @@
+%%BeginResource: procset Latin1-vec 0 0
+/Latin1-vec [
+/.notdef	/.notdef	/.notdef	/.notdef
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/space		/exclam		/quotedbl	/numbersign	
+/dollar		/percent	/ampersand	/${rightQuoteGlyph}
+/parenleft	/parenright	/asterisk	/plus	
+/comma		/hyphen		/period		/slash	
+/${zeroGlyph}	/one		/two		/three	
+/four		/five		/six		/seven	
+/eight		/nine		/colon		/semicolon	
+/less		/equal		/greater	/question	
+/at		/A		/B		/C		
+/D		/E		/F		/G		
+/H		/I		/J		/K		
+/L		/M		/N		/O		
+/P		/Q		/R		/S		
+/T		/U		/V		/W		
+/X		/Y		/Z		/bracketleft		
+/backslash	/bracketright	/asciicircum	/${underscoreGlyph}
+/${leftQuoteGlyph} /a		/b		/c		
+/d		/e		/f		/g		
+/h		/i		/j		/k		
+/l		/m		/n		/o		
+/p		/q		/r		/s		
+/t		/u		/v		/w		
+/x		/y		/z		/braceleft		
+/${barGlyph}	/braceright	/tilde		/.notdef
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/.notdef	/.notdef	/.notdef	/.notdef	
+/space		/exclamdown	/cent		/sterling	
+/${tabGlyph}	/yen		/brokenbar	/section	
+/dieresis	/copyright	/ordfeminine	/guillemotleft	
+/logicalnot	/hyphen		/registered	/macron	
+/degree		/plusminus	/twosuperior	/threesuperior
+/acute		/mu		/${pilcrowGlyph} /${bulletGlyph}
+/cedilla	/dotlessi	/ordmasculine	/guillemotright	
+/onequarter	/onehalf	/threequarters	/questiondown	
+/Agrave		/Aacute		/Acircumflex	/Atilde	
+/Adieresis	/Aring		/AE		/Ccedilla	
+/Egrave		/Eacute		/Ecircumflex	/Edieresis	
+/Igrave		/Iacute		/Icircumflex	/Idieresis	
+/Eth		/Ntilde		/Ograve		/Oacute	
+/Ocircumflex	/Otilde		/Odieresis	/multiply	
+/Oslash		/Ugrave		/Uacute		/Ucircumflex	
+/Udieresis	/Yacute		/Thorn		/germandbls	
+/agrave		/aacute		/acircumflex	/atilde	
+/adieresis	/aring		/ae		/ccedilla	
+/egrave		/eacute		/ecircumflex	/edieresis	
+/igrave		/iacute		/icircumflex	/idieresis	
+/eth		/ntilde		/ograve		/oacute	
+/ocircumflex	/otilde		/odieresis	/divide	
+/oslash		/ugrave		/uacute		/ucircumflex	
+/udieresis	/yacute		/thorn		/ydieresis	
+]def
+%%EndResource
--- a/ps/prolog.ps
+++ b/ps/prolog.ps
@ -0,0 +1,306 @@
+##set pageNumFont="Futura"
+##set dirNameFont="Futura-Heavy"
+##set fontsNeeded="${font} Symbol Futura Futura-Heavy"
+##set includeFontComments=<<"END"
+%%IncludeResource: font ${font}
+%%IncludeResource: font Symbol
+%%IncludeResource: font Futura
+%%IncludeResource: font Futura-Heavy
+END
+##if ${font} eq Courier
+##set charShrinkFactor=0.93
+##set zeroGlyph=Oslash
+##set underscoreGlyph=underscore
+##set bulletGlyph=bullet
+##set tabGlyph=currency
+##set leftQuoteGlyph=quoteleft
+##set rightQuoteGlyph=quoteright
+##set pilcrowGlyph=paragraph
+##set barGlyph=bar
+##else
+##set charShrinkFactor=1
+##set zeroGlyph=Oslash
+##set underscoreGlyph=underscore2
+##set bulletGlyph=bullet2
+##set tabGlyph=tabsym
+##set leftQuoteGlyph=grave
+##set rightQuoteGlyph=quoteright	### was "acute"
+##set pilcrowGlyph=erase
+##set barGlyph=orsym
+##set do_custom_chars=1
+##endif
+%!PS-Adobe-3.0
+%%Orientation: Portrait
+%%Pages: (atend)
+%%DocumentNeededResources: font ${fontsNeeded}
+%%DocumentMedia: Letter 612 792 74 white ()
+%%EndComments
+%%BeginDefaults
+%%PageMedia: Letter
+%%PageResources: font ${fontsNeeded}
+%%EndDefaults
+%%BeginProlog
+%%BeginResource: procset Custom-Preamble 0 0
+%
+% Document definitions
+% (Upper case to avoid collisions)
+%
+
+% 8.5x11 paper is 612x792 points, but 24 points near the edge or so
+% shouldn't be used.
+/Topmargin 770 def
+/Leftmargin 30 def
+/Rightmargin 612 Leftmargin sub def
+/Botmargin 22 def
+/Bindoffset 40 def
+
+/Lineskip -10 def
+% How much to shrink characters by?
+/Factor ${charShrinkFactor} def
+/Fontsize 9.5 Factor mul def
+% (1000 units is std height, so Courier at 6/10 aspect ratio is 600.)
+% Widen to make up for scaling loss.
+/Charwidth
+  Rightmargin Leftmargin sub Bindoffset sub 87 div Fontsize div 1000 mul
+def
+
+% Print a header (expects page number on stack)
+/OddPageStart
+{ save exch /MyFont findfont Fontsize scalefont setfont 
+  /CurrentLeft Leftmargin Bindoffset add def
+  /CurrentRight Rightmargin def
+  CurrentLeft Topmargin moveto } def
+
+/EvenPageStart
+{ save exch /MyFont findfont Fontsize scalefont setfont 
+  /CurrentLeft Leftmargin def
+  /CurrentRight Rightmargin Bindoffset sub def
+  CurrentLeft Topmargin moveto } def
+
+% /MyFont findfont [Fontsize 0 0 Fontsize 0 0] makefont setfont
+
+% Print the name of the directory in a large font
+/DirPage
+{
+  /${dirNameFont} findfont 14 scalefont setfont
+  0 -10 rmoveto (Directory) show
+  CurrentLeft 30 add currentpoint exch pop 20 sub moveto show
+} def
+
+% Advance a line
+/L {show CurrentLeft currentpoint exch pop Lineskip add moveto} bind def 
+
+% Print the "inside" footer line using P (string font => )
+% We do some magic involving redefining P to first measure the
+% width of this string and then print it, so you must use it
+% to do all printing.
+/Foot {
+##ifdef footerFile
+##include "${footerFile}"
+##endif
+} def
+
+% /P is defined in the Setup section
+
+% Print an odd footer
+/OddPageEnd
+ { CurrentLeft Botmargin moveto CurrentRight Botmargin lineto
+   1 setlinewidth stroke
+   CurrentLeft Botmargin 10 sub moveto
+   Foot
+   10 string cvs dup stringwidth
+   pop CurrentRight exch sub currentpoint exch pop moveto
+   /${pageNumFont} P
+   showpage
+   restore
+} def
+
+% Print an even footer
+/EvenPageEnd
+ { CurrentLeft Botmargin moveto CurrentRight Botmargin lineto
+   1 setlinewidth stroke
+   Leftmargin Botmargin 10 sub moveto
+   /${pageNumFont} P 
+   CurrentRight FootWidth sub currentpoint exch pop moveto
+   Foot
+   showpage
+   restore
+} def
+
+##ifdef do_custom_chars
+% A 1000-point OCRB discunderline consists of:
+% 111.45  -173.688 moveto
+% 609.356 -173.688 lineto
+% 609.356  -70.9227 lineto
+% 111.45   -70.9227 lineto
+% closepath
+% 720.0    -0.0 moveto
+% Line thickness is
+% 102.7653 pts.
+
+% This would suggest the following values:
+/underleft 111.45 def
+/underright 609.356 def
+/underthick 102.7653 def
+/underup underthick def
+/underdown 0 def
+/underserif 25 def
+
+% These look better in GhostScript, but not on a real Adobe rasterizer
+%/underright 600 def
+%/underleft 100 def
+%/underthick 75 def
+
+% The default bullet character is
+% 254.0 341.0 moveto
+% 254.0 170.0 lineto
+% 465.0 170.0 lineto
+% 465.0 341.0 lineto
+% closepath
+% Our modified version is based on:
+/bullwid 204 def
+/bullht 176.75 def
+/bullleft 254 341 add bullwid sub 2 div def
+/bullright 254 341 add bullwid add 2 div def
+/bullbot 254 def
+/bulltop bullbot bullht add def
+
+% And a custom-created tab symbol
+/tableft 250 def
+/tabright 550 def
+/tabtop 550 def
+/tabbot 50 def
+/tablinewidth 35 def
+
+% Let's try a vertical bar
+% OCRB defines (|)
+% 411.062 -173.688 moveto
+% 411.062 741.043 lineto
+% 308.297 741.043 lineto
+% 308.297 -173.688 lineto
+% closepath
+% 720.0 -0.0 moveto
+/orleft 308.297 def
+/orright 411.062 def
+/orbot -173.688 def
+/ortop 741.043 def
+/orbreak 150 def	% Width of break
+/orbbot ortop orbot add orbreak sub 2 div def	% Bottom of break
+/orbtop ortop orbot add orbreak add 2 div def	% Top of break
+##endif
+
+% newfontname encoding-vec fontname -> -	make a new encoded font
+/MF2 {
+  % Make a dict for the new font, with room for the /Metrics
+  findfont dup length 1 add dict begin
+  % Copy everything except the FID entry
+  {1 index /FID eq {pop pop} {def} ifelse} forall
+  % Set the encoding vector
+  /Encoding exch def
+
+##ifdef do_custom_chars
+  % Create a new expanded CharStrings dictionary
+  CharStrings dup length 5 add dict
+  begin { def } forall
+  % Create a custom underscore character
+  /underscore2 {
+	pop
+	//Charwidth 0 % width, bounding box follows
+	//underleft //underdown neg //underright //underthick //underup add
+	setcachedevice
+	//underleft //underthick //underup add moveto
+	//underleft //underserif add //underthick //underup add lineto
+	//underleft //underserif add //underthick lineto
+	//underright //underserif sub //underthick lineto
+	//underright //underserif sub //underthick //underup add lineto
+	//underright //underthick //underup add lineto
+	//underright //underdown neg lineto
+	//underright //underserif sub //underdown neg lineto
+	//underright //underserif sub 0 lineto
+	//underleft //underserif add 0 lineto
+	//underleft //underserif add //underdown neg lineto
+	//underleft //underdown neg lineto
+	closepath fill
+  } bind def
+  % Create a custom bullet character.
+  /bullet2 {
+	pop
+	//Charwidth 0 % width, bounding box follows
+	//bullleft //bullbot //bullright //bulltop
+	setcachedevice
+	//bullleft //bullbot moveto
+	//bullleft //bullright add 2 div //bulltop lineto
+	//bullright //bullbot lineto
+	closepath fill
+  } bind def
+  % Create a custom tab character.
+  /tabsym {
+	pop
+	//Charwidth 0 % width, bounding box follows
+	//tableft //tablinewidth sub //tabbot //tablinewidth sub
+	//tabright //tablinewidth add //tabtop //tablinewidth add
+	setcachedevice
+	//tablinewidth setlinewidth
+	true setstrokeadjust
+	0 setlinejoin
+	//tableft //tabbot moveto
+	//tabright //tabtop //tabbot add 2 div lineto
+	//tableft //tabtop lineto
+	closepath stroke
+  } bind def
+  /orsym {
+	pop
+	//Charwidth 0 % width, bounding box follows
+	//orleft //orbot //orright //ortop
+	setcachedevice
+	//orleft //orbot moveto
+	//orleft //orbbot lineto
+	//orright //orbbot lineto
+	//orright //orbot lineto
+	closepath
+	//orleft //ortop moveto
+	//orleft //orbtop lineto
+	//orright //orbtop lineto
+	//orright //ortop lineto
+	closepath fill
+  } bind def
+  /CharStrings currentdict end def
+##endif
+
+  % Create a new dict to be the /Metrics values
+  CharStrings dup length dict
+  % Now fill in the metrics dict with the desired width
+  begin { pop Charwidth def } forall /Metrics currentdict end def
+  % End of definitions
+  currentdict end 
+  % Define the font
+  definefont pop
+} bind def
+
+% Check PostScript language level.
+/gs_languagelevel /languagelevel where { pop languagelevel } { 1 } ifelse def
+
+%%EndResource
+##include "charmap.ps"
+${includeFontComments}
+%%EndProlog
+
+
+%%BeginSetup
+
+/MyFont Latin1-vec /${font} MF2
+/#copies 1 def
+
+% Compute the width of the /Foot string, by defining P to
+% add up the x-width of the characters.
+/P { findfont 9 scalefont setfont stringwidth pop add } def
+/FootWidth 0 Foot def
+% Redefine P to print, as usual
+/P { findfont 9 scalefont setfont show } def
+%%BeginResource: procset foo 0 0
+% This is an example
+%%EndResource
+%%EndSetup
--- a/tools/Makefile
+++ b/tools/Makefile
@ -0,0 +1,30 @@
+all: unmunge repair munge
+
+OPT = -g -O -W -Wall
+COMMON_OBJS = util.o
+
+UNMUNGE_OBJS = $(COMMON_OBJS) unmunge.o
+MUNGE_OBJS = $(COMMON_OBJS) munge.o
+REPAIR_OBJS = $(COMMON_OBJS) heap.o mempool.o subst.o repair.o
+
+unmunge: $(UNMUNGE_OBJS)
+	$(CC) $(OPT) -o $@ $(UNMUNGE_OBJS)
+
+munge: $(MUNGE_OBJS)
+	$(CC) $(OPT) -o $@ $(MUNGE_OBJS)
+
+repair: $(REPAIR_OBJS)
+	$(CC) $(OPT) -o $@ $(REPAIR_OBJS)
+
+.c.o:
+	$(CC) $(OPT) -o $@ -c $<
+
+clean:
+	-rm -f *.o munge unmunge repair core *.core
+
+unmunge.o: util.h
+munge.o: util.h
+repair.o: heap.h mempool.h util.h subst.h
+heap.o: heap.h
+mempool.o: mempool.h
+subst.o: subst.h
--- a/tools/bootstrap
+++ b/tools/bootstrap
@ -0,0 +1,68 @@
+#!/usr/bin/perl -s
+#
+# bootstrap -- Simpler version of unmunge for bootstrapping
+#
+# Unmunge this file using:
+#   perl -ne 'if (s/^ *[^-\s]\S{4,6} ?//) { s/[\244\245\267]/ /g; print; }'
+#
+# $Id: bootstrap,v 1.15 1997/11/14 03:52:53 mhw Exp $
+
+sub Fatal	{ print STDERR @_;  exit(1); }
+sub Max		{ my ($a, $b) = @_;  ($a > $b) ? $a : $b; }
+sub TabSkip	{ $tabWidth - 1 - (length($_[0]) % $tabWidth); }
+
+($tab,$yen,$pilc,$cdot,$tmp1,$tmp2)=("\244","\245","\266","\267","\377","\376");
+$editor = $ENV{'VISUAL'} || $ENV{'EDITOR'} || 'vi';
+$inFile = $ARGV[0];
+doFile: {
+    open(IN, "<$inFile") || die;
+    for ($lineNum = 1; ($_ = <IN>); $lineNum++) {
+	s/^\s+//;  s/\s+$//;	# Strip leading and trailing spaces
+	next if (/^$/);		# Ignore blank lines
+	($prefix, $seenCRCStr, $dummy, $_) = /^(\S{2})(\S{4})( (.*))?/;
+
+	# Correct the number of spaces after each tab
+	while (s/$tab( *)/$tmp1 . ($tmp2 x &Max(length($1), &TabSkip($`)))/e) {}
+	s/ ( +)/" " . ($cdot x length($1))/eg;	# Correct center dots
+	s/$tmp1/$tab/g;  s/$tmp2/ /g;  # Restore tabs and spaces from correction
+	s/\s*$/\n/;		# Strip trailing spaces, and add a newline
+
+	$crc = $seenCRC = 0;			# Calculate CRC
+	for ($data = $_; $data ne ""; $data = substr($data, 1)) {
+	    $crc ^= ord($data);
+	    for (1..8) {
+		$crc = ($crc >> 1) ^ (($crc & 1) ? 0x8408 : 0);
+	    }
+	}
+	if ($crc != hex($seenCRCStr)) {		# CRC mismatch
+	    close(IN);  close(OUT);
+	    unlink(@filesCreated);
+	    @filesCreated = ();
+	    @oldStat = stat($inFile);
+	    system($editor, "+$lineNum", $inFile);
+	    @newStat = stat($inFile);
+	    redo doFile if ($oldStat[9] != $newStat[9]);  # Check mod date
+	    &Fatal("Line $lineNum invalid: $_");
+	}
+
+	if ($prefix eq '--') {			# Process header line
+	    ($code, $pageNum, $file) = /^(\S{19}) Page (\d+) of (.*)/;
+	    $tabWidth = hex(substr($code, 11, 1));
+	    if ($file ne $lastFile) {
+		print "$file\n";
+		&Fatal("$file: already exists\n") if (!$f && (-e $file));
+		close(OUT);
+		open(OUT, ">$file") || &Fatal("$file: $!\n");
+		push(@filesCreated, ($lastFile = $file));
+	    }
+	} else {				# Unmunge normal line
+	    s/$tab( *)/"\t".(" " x (length($1) - &TabSkip($`)))/eg;
+	    s/$yen\n/\f/;	# Handle form feeds
+	    s/$pilc\n//;	# Handle continuation lines
+	    s/$cdot/ /g;	# Center dots -> spaces
+
+	    print OUT;
+	}
+    }
+    close(IN);  close(OUT);
+}
--- a/tools/bootstrap2
+++ b/tools/bootstrap2
@ -0,0 +1,72 @@
+#!/usr/bin/perl -s
+#
+# bootstrap2 -- Second stage bootstrapper, a version of unmunge
+#
+# $Id: bootstrap2,v 1.4 1997/11/14 03:52:54 mhw Exp $
+
+sub Cleanup	{ close(IN);  close(OUT);  unlink(@files);  @files = (); }
+sub Fatal	{ &Cleanup();  print STDERR @_;  exit(1); }
+sub TabSkip	{ $tabWidth - 1 - (length($_[0]) % $tabWidth); }
+sub TabFix	{ my ($needed, $actual) = (&TabSkip($_[0]), length($_[1]));
+    $tmp1 . ($tmp2 x $needed) . (" " x ($actual - $needed)); }
+sub HumanEdit	{ my ($file, $line, @message) = ($inFile, @_);  &Cleanup();
+    @old = stat($file);  system($editor, "+$line", $file);  @new = stat($file);
+    redo doFile if ($old[9] != $new[9]);	# Check mod date
+    &Fatal("Line $line, ", @message); }
+
+($tab,$yen,$pilc,$cdot,$tmp1,$tmp2)=("\244","\245","\266","\267","\377","\376");
+$editor = $ENV{'VISUAL'} || $ENV{'EDITOR'} || 'vi';
+($inFile, $manifest, @rest) = @ARGV;
+if ($manifest ne "") {		# Read manifest file
+    open(MANIFEST, "<$manifest") || &Fatal("$manifest: $!\n");
+    while (<MANIFEST>) { $dir = $1 if /^D\s+(.*)$/;
+	$index[$1] = $dir . $2 if /^(\d+)\s+(.*)$/; }
+}
+doFile: {
+    $seenPCRC = $pcrc1 = 0;  $lastFlags = 1;  $lastFileNum = 0;
+    open(IN, "<$inFile") || &Fatal("$inFile: $!\n");
+    for ($line = 1; ($_ = <IN>); $line++) {
+	s/^\s+//;  s/\s+$//;	# Strip leading and trailing spaces
+	next if (/^$/);		# Ignore blank lines
+	($prefix, $seenCRCStr, $dummy, $_) = /^(\S{2})(\S{4})( (.*))?/;
+	while (s/$tab( *)/&TabFix($`, $1)/eo) {}  # Correct spaces after tabs
+	s/($tmp2| )( +)/$1 . ($cdot x length($2))/ego;	# Correct center dots
+	s/$tmp1/$tab/go;  s/$tmp2/ /go;  # Restore tabs/spaces from correction
+	s/\s*$/\n/;		# Strip trailing spaces, and add a newline
+
+	$crc = 0;  $pcrc = $pcrc1;		# Calculate CRCs
+	for ($data = $_; $data ne ""; $data = substr($data, 1)) {
+	    $crc ^= ord($data);  $pcrc1 ^= ord($data);
+	    for (1..8) { $crc = ($crc >> 1) ^ (($crc & 1) ? 0x8408 : 0);
+		$pcrc1 = ($pcrc1 >> 1) ^ (($pcrc1 & 1) ? 0xedb88320 : 0); }
+	}
+	($seenPLCRC, $seenCRC) = map { hex($_) } ($prefix, $seenCRCStr);
+	&HumanEdit($line, "CRC failed: $_") if $crc != $seenCRC;
+	if ($prefix eq '--') {			# Process header line
+	    &HumanEdit($line - 1, "Page CRC failed") if $pcrc != $seenPCRC;
+	    ($humanHdr, $pageNum, $file) = /^\S{19} (Page (\d+) of (.*))/;
+	    ($vers, $flags, $seenPCRC, $tabWidth, $prodNum, $fileNum) =
+		map { hex($_) } /^(\S)(\S\S)(\S{8})(\S)(\S{3})(\S{4})/;
+	    if ($fileNum != $lastFileNum) {
+		print STDERR "MISSING files\n" if $fileNum != $lastFileNum + 1;
+		&Fatal("Missing pages\n") if $pageNum != 1 || !($lastFlags & 1);
+		if ($manifest ne "") {
+		    ($_ = $index[$fileNum]) =~ m%([^/]*)$%;
+		    &Fatal("Manifest mismatch\n") if ($file ne $1);
+		    ($file = $_) =~ s|/+|mkdir($`, 0777), "/"|eg;  # mkdir -p
+		}
+		&Fatal("$file: already exists\n") if (!$f && (-e $file));
+		close(OUT);  open(OUT, ">$file") || &Fatal("$file: $!\n");
+		push(@files, $file);  print "$fileNum $file\n";
+	    } else {
+		&Fatal("MISSING pages\n") if ($pageNum != $lastPageNum + 1);
+	    }
+	    ($lastFlags,$lastFileNum,$lastPageNum) = ($flags,$fileNum,$pageNum);
+	    $pcrc1 = 0;
+	} else {				# Unmunge normal line
+	    &HumanEdit($line, "CRC failed: $_") if ($pcrc1 >> 24) != $seenPLCRC;
+	    s/$tab( *)/"\t".(" " x (length($1) - &TabSkip($`)))/ego;
+	    s/$yen\n/\f/o;  s/$pilc\n//o;  s/$cdot/ /go;  print OUT;
+	}
+    }
+}
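The per-line and per-page checks in bootstrap and bootstrap2 are plain bit-at-a-time CRCs. As a reading aid, here is a minimal C sketch of the same two loops, assuming (as the Perl above does) an initial value of 0 and no final inversion. The names crc16_line, crc32_update and page_prefix are hypothetical; they are not taken from util.c, which is not shown in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16 over one line: reflected polynomial 0x8408, initial value 0.
 * Mirrors the inner loop of the Perl bootstrap scripts. */
static uint16_t crc16_line(const char *s, size_t len)
{
    uint16_t crc = 0;
    size_t i;
    int bit;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        crc ^= (unsigned char)s[i];
        for (bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc >> 1) ^ ((crc & 1) ? 0x8408 : 0));
    }
    return crc;
}

/* Running CRC-32 over a page: reflected polynomial 0xedb88320.
 * Pass the previous value back in to continue across lines. */
static uint32_t crc32_update(uint32_t crc, const char *s, size_t len)
{
    size_t i;
    int bit;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        crc ^= (unsigned char)s[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
    }
    return crc;
}

/* The two-digit line prefix is the top 8 bits of the running page CRC. */
static unsigned page_prefix(uint32_t running_crc)
{
    return (unsigned)(running_crc >> 24);
}
```

For instance, crc16_line("123456789", 9) yields 0x2189, the published check value for this CRC-16 variant (often listed as CRC-16/KERMIT). A transcription checker would compare that value with the four hex digits after the prefix, and page_prefix() of the running CRC-32 with the first two.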
--- a/tools/heap.c
+++ b/tools/heap.c
@ -0,0 +1,144 @@
+/*
+ * heap.c -- Simple priority queue.  Takes pointers to cost values
+ * (presumably the first field in a larger structure) and returns
+ * them in increasing order of cost.
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Written by Colin Plumb and Mark H. Weaver
+ *
+ * $Id: heap.c,v 1.2 1997/07/05 02:55:23 colin Exp $
+ */
+
+#include <stdio.h>	/* For fprintf(stderr, "Out of memory") */
+#include <stdlib.h>	/* For malloc() & co. */
+
+#include "heap.h"
+
+#define HeapParent(i)			((i) / 2)
+#define HeapLeftChild(i)		((i) * 2)
+#define HeapRightChild(i)		((i) * 2 + 1)
+#define HeapElem(h, i)			(h)->elems[i]
+#define HeapMinElem(h)			HeapElem(h, 1)
+#define HeapElemCost(e)			(*(e))
+#define HeapCost(h, i)			HeapElemCost(HeapElem(h, i))
+#define HeapSize(h)				((h)->numElems)
+
+static void
+SiftDown(Heap const *heap, HeapCost *e)
+{
+	HeapIndex size = HeapSize(heap), parent = 1, child;
+	HeapCost cparent = HeapElemCost(e), cchild;
+
+	for (;;) {
+		child = 2*parent;
+		if (child > size)
+			break;
+		cchild = HeapCost(heap, child);
+		if (child < size && cchild > HeapCost(heap, child+1)) {
+			cchild = HeapCost(heap, child+1);
+			child++;
+		}
+		if (cparent <= cchild)
+			break;	/* Stop sifting down */
+		HeapElem(heap, parent) = HeapElem(heap, child);
+		parent = child;
+	}
+	HeapElem(heap, parent) = e;
+}
+
+/* Debug tool: verify heap property */
+void
+HeapVerify(Heap *heap)
+{
+	HeapIndex i;
+
+	for (i = 2; i <= HeapSize(heap); i++)
+		if (HeapCost(heap, i) < HeapCost(heap, HeapParent(i)))
+			fprintf(stderr, "DEBUG: VerifyHeap failed at elem %u\n", i);
+}
+
+/* Remove and return the minimum cost from the heap. */
+HeapCost *
+HeapGetMin(Heap *heap)
+{
+	HeapIndex lastElem = HeapSize(heap);
+	HeapCost *retval;
+
+	if (!lastElem)
+		return NULL;
+	retval = HeapMinElem(heap);
+	HeapSize(heap) = lastElem-1;
+	SiftDown(heap, HeapElem(heap, lastElem));
+	return retval;
+}
+
+/* Helper - set heap size, reallocating if needed */
+static void
+HeapResize(Heap *heap, HeapIndex newNumElems)
+{
+	if (newNumElems >= heap->elemsAllocated) {
+		HeapIndex newAllocSize = heap->elemsAllocated * 2;
+
+		if (newAllocSize <= newNumElems)
+			newAllocSize = newNumElems + 1;
+		heap->elems = (HeapCost **)realloc((void *)heap->elems,
+									  sizeof(*heap->elems) * newAllocSize);
+		if (heap->elems == NULL) {
+			fprintf(stderr, "Fatal error: Out of memory growing heap\n");
+			exit(1);
+		}
+		heap->elemsAllocated = newAllocSize;
+	}
+	heap->numElems = newNumElems;
+}
+
+/* Add an element to the heap */
+void
+HeapInsert(Heap *heap, HeapCost *newElem)
+{
+	HeapIndex parent, i = ++HeapSize(heap);
+	HeapCost cost = HeapElemCost(newElem);
+
+	HeapResize(heap, i);
+	/* Sift up until parent = 0 */
+	while ((parent = HeapParent(i)) && HeapCost(heap, parent) > cost) {
+		HeapElem(heap, i) = HeapElem(heap, parent);
+		i = parent;
+	}
+	heap->elems[i] = newElem;
+}
+
+/* Initialize a new heap */
+void
+HeapInit(Heap *heap, HeapIndex initSize)
+{
+	initSize++;	/* Add one for temporary element */
+	if (initSize < 1)
+		initSize = 1;
+	heap->elems = (HeapCost **)malloc(initSize * sizeof(*heap->elems));
+	if (heap->elems == NULL) {
+		fprintf(stderr, "Fatal error: Out of memory creating heap\n");
+		exit(1);
+	}
+	heap->elemsAllocated = initSize;
+	heap->numElems = 0;
+}
+
+/* Free up a heap's resources. */
+void
+HeapDestroy(Heap *heap)
+{
+	free((void *)heap->elems);
+	heap->elemsAllocated = 0;
+	heap->numElems = 0;
+	heap->elems = NULL;
+}
+
+/*
+ * Local Variables:
+ * tab-width: 4
+ * End:
+ * vi: ts=4 sw=4
+ * vim: si
+ */
--- a/tools/heap.h
+++ b/tools/heap.h
@ -0,0 +1,43 @@
+/*
+ * heap.h -- Simple priority queue.  Takes pointers to cost values
+ * (presumably the first field in a larger structure) and returns
+ * them in increasing order of cost.
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Written by Colin Plumb and Mark H. Weaver
+ *
+ * $Id: heap.h,v 1.6 1997/10/31 04:22:46 mhw Exp $
+ */
+
+#ifndef HEAP_H
+#define HEAP_H 1
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <limits.h>
+
+typedef int HeapCost;
+#define COST_INFINITY INT_MAX
+typedef unsigned HeapIndex;
+
+typedef struct Heap {
+	HeapCost	**elems;
+	HeapIndex	numElems, elemsAllocated;
+} Heap;
+
+void HeapInit(Heap *heap, HeapIndex initSize);
+void HeapDestroy(Heap *heap);
+void HeapInsert(Heap *heap, HeapCost *newElem);
+HeapCost *HeapGetMin(Heap *heap);
+void HeapVerify(Heap *heap);
+
+#endif
+
+/*
+ * Local Variables:
+ * tab-width: 4
+ * End:
+ * vi: ts=4 sw=4
+ * vim: si
+ */
--- a/tools/makemanifest
+++ b/tools/makemanifest
@ -0,0 +1,31 @@
+#!/usr/bin/perl
+
+$fileNum = 0;
+while(<>)
+{
+	/^([VDTB])(\S*)\s+(.*)/ || die("Bad filelist, line $.");
+	($type, $options, $name) = ($1, $2, $3);
+
+	if ($type eq "D")
+	{
+		$dir = $name;
+		print "D $dir\n";
+	}
+	elsif ($type eq "V")
+	{
+		# Do nothing
+	}
+	else
+	{
+		$fileNum++;
+		$tail = $name;
+		$tail =~ s|^.*/||;
+		die("Bad filelist, line $.") if $name ne $dir . $tail;
+		print "$fileNum $tail\n";
+	}
+}
+
+#
+# vi: ai ts=4
+# vim: si
+#
--- a/tools/mempool.c
+++ b/tools/mempool.c
@ -0,0 +1,137 @@
+/*
+ * mempool.c - Pooled memory allocation, similar to GNU obstacks.
+ *
+ * $Id: mempool.c,v 1.5 1997/11/13 23:53:08 colin Exp $
+ */
+#include <assert.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>	/* For malloc() & free() */
+
+#include "mempool.h"
+
+/*
+ * The memory pool allocation functions
+ *
+ * These are based on a linked list of memory blocks, usually of uniform
+ * size.  New memory is allocated from the tail of the current block,
+ * until that is inadequate, then a new block is allocated.
+ * The entire pool can be freed at once by calling memPoolEmpty().
+ */
+struct PoolBuf {
+	struct PoolBuf *next;
+	unsigned size;
+	/* Data follows */
+};
+
+/* The prototype empty pool, including the default allocation size. */
+static struct MemPool EmptyPool = { 0, 0, 0, 4096, 0 , 0, 0};
+
+/* Initialize the pool for first use */
+void
+memPoolInit(struct MemPool *pool)
+{
+	*pool = EmptyPool;
+}
+
+/* Set the pool's purge function */
+void
+memPoolSetPurge(struct MemPool *pool, int (*purge)(void *), void *arg)
+{
+	pool->purge = purge;
+	pool->purgearg = arg;
+}
+
+/* Free all the memory in the pool */
+void
+memPoolEmpty(struct MemPool *pool)
+{
+	struct PoolBuf *buf;
+
+	while ((buf = pool->head) != 0) {
+		pool->head = buf->next;
+		free(buf);
+	}
+	pool->freespace = 0;
+	pool->totalsize = 0;
+}
+
+
+/*
+ * Restore a pool to a marked position, freeing subsequently allocated
+ * memory.
+ */
+void
+memPoolCutBack(struct MemPool *pool, struct MemPool const *cutback)
+{
+	struct PoolBuf *buf;
+
+	assert(pool);
+	assert(cutback);
+	assert(pool->totalsize >= cutback->totalsize);
+
+	while((buf = pool->head) != cutback->head) {
+		pool->head = buf->next;
+		free(buf);
+	}
+	*pool = *cutback;
+}
+
+/*
+ * Allocate a chunk of memory for a structure.  Alignment is assumed to be
+ * a power of 2.  It could be generalized, if that ever becomes relevant.
+ * Note that alignment is from the beginning of an allocated chunk, which
+ * is guaranteed by ANSI to be as aligned as can possibly matter.
+ */
+void *
+memPoolAlloc(struct MemPool *pool, unsigned len, unsigned alignment)
+{
+	char *p;
+	unsigned t;
+
+	/* Where to allocate next object */
+	p = pool->freeptr;
+	/* How far it is from the beginning of the chunk. */
+	t = p - (char *)pool->head;
+	/* How much to round up freeptr to make alignment */
+	t = -t & --alignment;
+
+	/* Okay, does it fit? */
+	if (pool->freespace >= len+t) {
+		pool->freespace -= len+t;
+		p += t;
+		pool->freeptr = p + len;
+		return p;
+	}
+
+	/* It does not fit in the current chunk.  Go for a bigger chunk. */
+
+	/* First, figure out how much to skip at the beginning of the chunk */
+	alignment &= -(unsigned)sizeof(struct PoolBuf);
+	alignment += sizeof(struct PoolBuf);
+	/* Then, figure out a chunk size that will fit */
+	t = pool->chunksize;
+	assert(t);
+	while (len + alignment > t)
+		t *= 2;
+	while ((p = malloc(t)) == 0) {
+		/* If that didn't work, try purging or smaller allocations */
+		if (!pool->purge || !pool->purge(pool->purgearg)) {
+			t /= 2;
+			if (len + alignment > t) {
+				fputs("Out of memory!\n", stderr);
+				exit(1);	/* Failed */
+			}
+		}
+	}
+
+	/* Update the various pointers. */
+	pool->totalsize += t;
+	((struct PoolBuf *)p)->next = pool->head;
+	((struct PoolBuf *)p)->size = t;
+	pool->head = (struct PoolBuf *)p;
+	pool->freespace = t - len - alignment;
+	p += alignment;
+	pool->freeptr = p + len;
+
+	return p;
+}
--- a/tools/mempool.h
+++ b/tools/mempool.h
@ -0,0 +1,36 @@
+/* $Id: mempool.h,v 1.2 1997/11/13 23:53:09 colin Exp $ */
+
+#ifndef MEMPOOL_H
+#define MEMPOOL_H
+
+typedef struct MemPool {
+	struct PoolBuf *head;
+	char *freeptr;
+	unsigned freespace;
+	unsigned chunksize;	/* Default starting point */
+	unsigned long totalsize;
+	int (*purge)(void *);	/* Return non-zero to retry alloc */
+	void *purgearg;
+} MemPool;
+
+/* A global pool for miscellaneous stuff. */
+extern struct MemPool MiscPool;
+
+/*
+ * Nice clean interfaces
+ */
+void memPoolInit(struct MemPool *pool);
+void memPoolSetPurge(struct MemPool *pool, int (*purge)(void *), void *arg);
+void memPoolEmpty(struct MemPool *pool);
+void memPoolCutBack(struct MemPool *dest, struct MemPool const *cutback);
+void *memPoolAlloc(struct MemPool *pool, unsigned len, unsigned alignment);
+#ifdef DEADCODE
+char const *memPoolStore(struct MemPool *pool, char const *str);
+#endif
+
+/* Lookie here!  An ANSI-compliant alignment finder! */
+#define alignof(type) (sizeof(struct{type _x; char _y;}) - sizeof(type))
+
+#define memPoolNew(pool, type) memPoolAlloc(pool, sizeof(type), alignof(type))
+
+#endif /* MEMPOOL_H */
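The alignof() macro and the `-t & --alignment` step in memPoolAlloc() both rely on alignments being powers of 2. The following is a small stand-alone C sketch of the two ideas, offered under that assumption. ALIGN_OF and round_up are hypothetical names chosen for illustration (ALIGN_OF avoids clashing with the C11 alignof macro); neither is part of mempool.h.

```c
#include <assert.h>
#include <stddef.h>

/* Same trick as mempool.h: the padding a struct needs after `type` so
 * that a following array element stays aligned equals the alignment. */
#define ALIGN_OF(type) (sizeof(struct { type x_; char y_; }) - sizeof(type))

/* Round offset up to the next multiple of alignment.
 * (0 - offset) & (alignment - 1) is the distance to that multiple,
 * which is the quantity memPoolAlloc() computes with -t & --alignment. */
static size_t round_up(size_t offset, size_t alignment)
{
    /* alignment must be a nonzero power of 2 */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return offset + ((0 - offset) & (alignment - 1));
}
```

With these definitions, round_up(5, 4) is 8 and round_up(8, 4) stays 8, while ALIGN_OF(char) is always 1. Values for wider types depend on the platform ABI.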
--- a/tools/munge.c
+++ b/tools/munge.c
@ -0,0 +1,543 @@
+/*
+ * munge.c -- Program to convert a text file into "munged" form,
+ *            suitable for reconstruction from printed form.  Tabs are
+ *            made visible and checksums are added to each line and each
+ *            page to protect against transcription errors.
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
+ * Written by Mark H. Weaver
+ *
+ * $Id: munge.c,v 1.32 1997/11/12 23:28:53 mhw Exp $
+ */
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <ctype.h>
+#include <stdlib.h>
+
+#include "util.h"
+
+/*
+ * The file is divided into pages, and the format of each page is
+ *
+--f414 000b2dc79af40010002 Page 1 of munge.c
+
+bc38e5 /*
+40a838  * munge.c -- Program to convert a text file into munged form
+647222  *
+193f28  * Copyright (C) 1997 Pretty Good Privacy, Inc.
+827222  *
+699025  * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
+0d050c  * Written by Mark H. Weaver
+ *
+ * Where the first 2 columns are the high 8 bits (in hex) of a running
+ * CRC-32 of the page (the string "--", unlikely to be confused with
+ * any digits, indicates a page header line) and the next 4 columns
+ * are a CRC-16 of the rest of the line.  Then a space (not counted in
+ * the CRC), and the line of text.  Tabs are printed as the currency
+ * symbol (ISO Latin 1 character 164) followed by the appropriate number
+ * of spaces, and any form feeds are printed as a yen symbol (Latin 1 165).
+ * The CRC is computed on the transformed line, including the trailing
+ * newline.  No trailing whitespace is permitted.
+ *
+ * The header line contains a (hex) number of the form 0ffcccccccctpppnnnn,
+ * where the digit 0 is a version number, ff are flags, cccccccc is the CRC-32
+ * of the page, t is the tab size (usually 4 or 8; 0 for binary files that
+ * are sent in radix-64), ppp is the product number (usually 1, different
+ * for different books), and nnnn is the file number (sequential from 1).
+ *
+ * This is followed by " Page %u of " and the file name.
+ */
+
+typedef struct MungeState
+{
+	EncodeFormat const *	fmt;
+	EncodeFormat const *	hFmt;
+	int				binaryMode, tabWidth;
+	long			origLineNumber;
+	long			productNumber, fileNumber, pageNumber, lineNumber;
+	unsigned long	fileOffset;
+	CRC				pageCRC;
+	char const *	fileName;
+	char const *	fileNameTail;
+	char *			pageBuffer;	/* Buffer large enough to hold one page */
+	char *			pagePos;	/* Current position in pageBuffer */
+	word16			hdrFlags;
+	FILE *			file;
+	FILE *			out;
+} MungeState;
+
+
+void ChecksumLine(EncodeFormat const *fmt, char const *line, size_t length,
+				  char *prefix, CRC *pageCRC)
+{
+	CRC			lineCRC;
+	CRC			runCRCPart = 0;
+
+	lineCRC = CalculateCRC(fmt->lineCRC, 0, (byte const *)line, length);
+	if (pageCRC != NULL)
+	{
+		*pageCRC = CalculateCRC(fmt->pageCRC, *pageCRC,
+								(byte const *)line, length);
+		runCRCPart = RunningCRCFromPageCRC(fmt, *pageCRC);
+	}
+
+	prefix += EncodeCheckDigits(fmt, runCRCPart, fmt->runningCRCBits, prefix);
+	prefix += EncodeCheckDigits(fmt, lineCRC, fmt->lineCRC->bits, prefix);
+
+	*prefix++ = ' ';	/* Write a space over the null byte */
+}
+
+/* Returns 1 for convenience */
+int PrintFileError(MungeState *state, char const *message)
+{
+	fprintf(stderr, "%s in %s %s %lu\n", message, state->fileName,
+			state->binaryMode ? "offset" : "line",
+			state->binaryMode ? state->fileOffset : state->origLineNumber);
+	return 1;
+}
+
+int MungeLine(MungeState *state, char *buffer, int length,
+			  char *line, int *bufferUsed)
+{
+	int		i = 0, j = 0, jOld = 0;
+	char	ch;
+
+	for (i = 0; i < length && j < LINE_LENGTH; i++)
+	{
+		jOld = j;
+		ch = buffer[i];
+		if (ch == '\t')
+		{
+			line[j++] = TAB_CHAR;
+			if (state->tabWidth < 1)
+				return PrintFileError(state,
+									  "ERROR: Tab found in radix64 stream");
+			else
+				while (j % state->tabWidth && j < LINE_LENGTH)
+					line[j++] = TAB_PAD_CHAR;
+		}
+		else if (ch == '\n')
+		{
+			if (i + 1 < length)
+				return PrintFileError(state,
+								"UNEXPECTED ERROR: fgets read past newline!?");
+			break;
+		}
+		else if (ch == '\f')
+		{
+			break;
+		}
+		else if (ch == ' ' && (j <= 0 || line[j-1] == ' ' ||
+							   line[j-1] == SPACE_CHAR ||
+							   i+1 >= length || buffer[i+1] == '\n'))
+		{
+			line[j++] = SPACE_CHAR;
+		}	
+		else if (ch >= ' ' && ch <= '~')
+			line[j++] = ch;
+		else
+			return PrintFileError(state, "ERROR: Non-ASCII char");
+	}
+
+	if (i < length && buffer[i] == '\n')
+	{
+		i++;
+		state->origLineNumber++;
+	}
+	else if (i < length && buffer[i] == '\f' && j < LINE_LENGTH)
+	{
+		i++;
+		line[j++] = FORMFEED_CHAR;
+	}
+	else
+	{
+		/* If there's no newline, we need to add the continuation marker */
+		if (i > 0 && j >= LINE_LENGTH)
+		{
+			/* Remove the last character if we're out of room */
+			i--;
+			j = jOld;
+		}
+		line[j++] = CONTIN_CHAR;
+	}
+
+	/* Strip trailing spaces */
+	while (j > 0 && isspace((unsigned char)line[j - 1]))
+		j--;
+
+	if (j > LINE_LENGTH)	/* This should never happen */
+		return PrintFileError(state, "ERROR: Internal error, line too long");
+
+	/* Add trailing newline and NUL terminator */
+	line[j++] = '\n';
+	line[j++] = '\0';
+
+	/* Return number of chars used from buffer */
+	*bufferUsed = i;
+
+	return 0;
+}
+
+static void
+Encode3(byte const src[3], char dest[4])
+{
+	dest[0] = radix64Digits[                     (src[0]>>2 & 0x3f)];
+	dest[1] = radix64Digits[(src[0]<<4 & 0x30) | (src[1]>>4 & 0x0f)];
+	dest[2] = radix64Digits[(src[1]<<2 & 0x3c) | (src[2]>>6 & 0x03)];
+	dest[3] = radix64Digits[(src[2]    & 0x3f)];
+}
+
+static int
+EncodeLine(byte const *src, int srcLen, char *dest)
+{
+	char *	destp = dest;
+	byte	tempSrc[3];
+
+	for (; srcLen >= 3; srcLen -= 3)
+	{
+		Encode3(src, destp);
+		src += 3; destp += 4;
+	}
+
+	if (srcLen > 0)
+	{
+		memset(tempSrc, 0, sizeof(tempSrc));
+		memcpy(tempSrc, src, srcLen);
+		Encode3(tempSrc, destp);	/* Encode the zero-padded copy */
+		src += 3; destp += 4; srcLen -= 3;
+		while (srcLen < 0)
+			destp[srcLen++] = RADIX64_END_CHAR;
+	}
+
+	return destp - dest;
+}
+
+static int
+MungeBinaryLine(MungeState *state, byte const *buffer, int length, char *line)
+{
+	char	binLine[128];
+	int		binLength;			/* Destination length */
+	int		used;
+
+	binLength = EncodeLine(buffer, length, binLine);
+
+	/* Append newline */
+	binLine[binLength++] = '\n';
+	binLine[binLength] = '\0';
+
+	return MungeLine(state, binLine, binLength, line, &used);
+}
+
+int MaybePageBreak(MungeState *state)
+{
+	EncodeFormat const *	fmt = state->fmt;
+	EncodeFormat const *	hFmt = state->hFmt;
+
+	if (state->lineNumber >= LINES_PER_PAGE)
+	{
+		char	line[512];
+		char *	lineData	= line + PREFIX_LENGTH;
+		char *	p			= lineData;
+		
+		p += EncodeCheckDigits(hFmt, 0, HDR_VERSION_BITS, p);
+		p += EncodeCheckDigits(hFmt, state->hdrFlags, HDR_FLAG_BITS, p);
+		p += EncodeCheckDigits(hFmt, state->pageCRC, fmt->pageCRC->bits, p);
+		p += EncodeCheckDigits(hFmt, state->tabWidth, HDR_TABWIDTH_BITS, p);
+		p += EncodeCheckDigits(hFmt, state->productNumber, HDR_PRODNUM_BITS, p);
+		p += EncodeCheckDigits(hFmt, state->fileNumber, HDR_FILENUM_BITS, p);
+
+		sprintf(p, " Page %ld of %s\n", state->pageNumber + 1,
+				state->fileNameTail);
+
+		if (strlen(lineData) > LINE_LENGTH + 1)
+		{
+			PrintFileError(state, "ERROR: Header line too long");
+			fprintf(stderr, "> %s", lineData);
+			return -1;
+		}
+
+		/* Compute checksums and prefix them to line */
+		ChecksumLine(fmt, lineData, strlen(lineData), line, NULL);
+
+		fprintf(state->out, "%c%c%s\n%s\f", HDR_PREFIX_CHAR,
+				fmt->headerTypeChar, line + 2, state->pageBuffer);
+
+		state->pageNumber++;
+		state->lineNumber = 0;
+		state->pageCRC = 0;
+		state->pagePos = state->pageBuffer;		/* Clear page buffer */
+	}
+	return 0;
+}
+
+/*
+ * Search for Emacs "tab-width: " marker in file.
+ * Emacs is stricter about the format, but this will do.
+ */
+int FindTabWidth(MungeState *state)
+{
+	char const * const	tabWidthMarker = " tab-width: ";
+	char				buffer[512];
+	char *				p;
+	int					length;
+	int					tabWidth = 0;
+
+	fseek(state->file, -(long)(sizeof(buffer) - 1), SEEK_END);
+	length = fread(buffer, 1, sizeof(buffer) - 1, state->file);
+	buffer[length] = '\0';
+	p = strstr(buffer, tabWidthMarker);
+	if (p != NULL)
+	{
+		p += strlen(tabWidthMarker);
+		while (*p != '\0' && *p != '\n' && isspace((unsigned char)*p))
+			p++;
+		tabWidth = strtol(p, &p, 10);
+		while (*p != '\0' && *p != '\n' && isspace((unsigned char)*p))
+			p++;
+		if (*p != '\n' || tabWidth < 2)
+			tabWidth = 0;
+		else if (tabWidth > 16)
+			fprintf(stderr, "WARNING: Weird tab-width (%d), %s\n",
+							tabWidth, state->fileName);
+	}
+	return tabWidth;
+}
+
+/*
+ * Open the given source file and send the munged output to the
+ * FILE *, with the given options.
+ */
+int MungeFile(char const *fileName, FILE *out, EncodeFormat const *fmt,
+			  int binaryMode, int defaultTabWidth,
+			  long productNumber, long fileNumber)
+{
+	MungeState *	state;
+	int				length, used;
+	char			line[PREFIX_LENGTH + LINE_LENGTH + 10];
+	char *			lineData = line + PREFIX_LENGTH;
+	char			buffer[128];
+	int				result = 0;
+
+	state = (MungeState *)calloc(1, sizeof(*state));
+	state->fmt = fmt;
+	state->hFmt = &hexFormat;
+	state->origLineNumber = 1;
+	state->fileName = fileName;
+	state->pageCRC = 0;
+	state->productNumber = productNumber;
+	state->fileNumber = fileNumber;
+	state->pageNumber = 0;
+	state->lineNumber = 0;
+	state->fileOffset = 0;
+	state->binaryMode = binaryMode;
+	state->pageBuffer = malloc(PAGE_BUFFER_SIZE);
+	state->pageBuffer[0] = '\0';
+	state->pagePos = state->pageBuffer;
+	state->hdrFlags = 0;
+	state->out = out;
+
+	state->fileNameTail = strrchr(state->fileName, '/');
+	if (state->fileNameTail == NULL)
+		state->fileNameTail = state->fileName;
+	else
+		state->fileNameTail++;
+
+	state->file = fopen(state->fileName, binaryMode ? "rb" : "r");
+	if (state->file == NULL)
+	{
+		result = errno;
+		fprintf(stderr, "ERROR opening %s: %s\n",
+				state->fileName, strerror(result));
+		goto error;
+	}
+	
+	if (state->binaryMode)
+	{
+		state->tabWidth = 0;
+	}
+	else
+	{
+		state->tabWidth = FindTabWidth(state);
+		if (state->tabWidth == 0)
+			state->tabWidth = defaultTabWidth;
+		rewind(state->file);
+	}
+
+	while (!feof(state->file))
+	{
+		if (state->binaryMode)
+		{
+			length = fread(buffer, 1, BYTES_PER_LINE, state->file);
+			if (length < 1)
+			{
+				if (feof(state->file))
+					break;
+				goto fileError;
+			}
+			if ((result = MaybePageBreak(state)))
+				goto error;
+			if ((result = MungeBinaryLine(state, buffer, length, lineData)))
+				goto error;
+			state->fileOffset += length;
+		}
+		else
+		{
+			if (fgets(buffer, sizeof(buffer), state->file) == NULL)
+			{
+				if (feof(state->file))
+					break;
+				goto fileError;
+			}
+			length = strlen(buffer);
+			if ((result = MaybePageBreak(state)))
+				goto error;
+			if ((result = MungeLine(state, buffer, length, lineData, &used)))
+				goto error;
+
+			if (used < length)
+				if (fseek(state->file, used - length, SEEK_CUR))
+					goto fileError;
+		}
+
+		/* Compute checksums and prefix them to the line */
+		ChecksumLine(fmt, lineData, strlen(lineData), line, &state->pageCRC);
+
+		strcpy(state->pagePos, line);
+		length = strlen(state->pagePos);
+		/* Suppress trailing whitespace on blank lines */
+		if (length == PREFIX_LENGTH+1 && state->pagePos[length-1] == '\n') {
+			state->pagePos[--length-1] = '\n';
+			state->pagePos[length] = '\0';
+		}
+		state->pagePos += length;
+
+		state->lineNumber++;
+	}
+
+	if (state->lineNumber > 0)
+	{
+		/* Force a final page break */
+		state->lineNumber = LINES_PER_PAGE;
+		state->hdrFlags |= HDR_FLAG_LASTPAGE;
+		if ((result = MaybePageBreak(state)))
+			goto error;
+	}
+
+	result = 0;
+	goto done;
+
+fileError:
+	result = ferror(state->file);
+
+error:
+done:
+	if (state != NULL)
+	{
+		if (state->file != NULL)
+			fclose(state->file);
+		free(state);
+	}
+	return result;
+}
+
+int main(int argc, char *argv[])
+{
+	int		result = 0;
+	int		i, j;
+	int		defaultTabWidth = 4;
+	int		binaryMode = 0;
+	long	productNumber = 1;
+	long	fileNumber = 1;
+	char *	endOfNumber;
+	EncodeFormat const *	fmt = NULL;
+
+	InitUtil();
+
+	for (i = 1; i < argc && argv[i][0] == '-'; i++)
+	{
+		if (0 == strcmp(argv[i], "--"))
+		{
+			i++;
+			break;
+		}
+		for (j = 1; argv[i][j] != '\0'; j++)
+		{
+			if (isdigit(argv[i][j]))
+			{
+				defaultTabWidth = argv[i][j] - '0';
+				if (defaultTabWidth < 2 || defaultTabWidth > 9)
+					fprintf(stderr, "WARNING: Weird default tab-width (%d)\n",
+									defaultTabWidth);
+			}
+			else if (argv[i][j] == 'b')
+			{
+				binaryMode = 1;
+			}
+			else if (argv[i][j] == 'F')
+			{
+				fmt = FindFormat(argv[i][j+1]);
+				if (!fmt || argv[i][j+2] != '\0')
+				{
+					fprintf(stderr, "ERROR: Invalid format char\n");
+					exit(1);
+				}
+				break;
+			}
+			else if (argv[i][j] == 'p')
+			{
+				productNumber = strtol(&argv[i][j+1], &endOfNumber, 10);
+				if (*endOfNumber != '\0')
+				{
+					fprintf(stderr, "ERROR: Invalid product number\n");
+					exit(1);
+				}
+				break;
+			}
+			else if (argv[i][j] == 'f')
+			{
+				fileNumber = strtol(&argv[i][j+1], &endOfNumber, 10);
+				if (*endOfNumber != '\0')
+				{
+					fprintf(stderr, "ERROR: Invalid file number\n");
+					exit(1);
+				}
+				break;
+			}
+			else
+			{
+				fprintf(stderr, "ERROR: Unrecognized option -%c\n", argv[i][j]);
+				exit(1);
+			}
+		}
+	}
+	if (!fmt)
+		fmt = binaryMode ? &radix64Format : &hexFormat;
+
+	for (; i < argc; i++)
+	{
+		if ((result = MungeFile(argv[i], stdout, fmt, binaryMode,
+								defaultTabWidth, productNumber,
+								fileNumber)) != 0)
+		{
+			/* If result > 0, message should have already been printed */
+			if (result < 0)
+				fprintf(stderr, "ERROR: %s\n", strerror(result));
+			exit(1);
+		}
+		fileNumber++;
+	}
+	
+	return 0;
+}
+
+/*
+ * Local Variables:
+ * tab-width: 4
+ * End:
+ * vi: ts=4 sw=4
+ * vim: si
+ */
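The tab-width lookup in FindTabWidth above can be tried in isolation. The following is a minimal sketch, not code from munge.c: ParseTabWidth is a hypothetical helper that applies the same rules to a string already held in memory (a marker, a number of at least 2, then only blanks up to the newline). Unlike munge, which only warns about widths above 16, this sketch rejects them so the result always fits in an int.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of munge.c): return the tab width named
 * by an Emacs-style " tab-width: N" marker in text, or 0 if there is no
 * usable marker.
 */
int ParseTabWidth(const char *text)
{
	static const char	marker[] = " tab-width: ";
	const char *		p;
	char *				end;
	long				width;

	assert(text != NULL);
	p = strstr(text, marker);
	if (p == NULL)
		return 0;
	p += strlen(marker);
	width = strtol(p, &end, 10);
	if (end == p)
		return 0;			/* No digits after the marker */
	/* Skip blanks, but stop at the newline that must end the line */
	while (*end != '\0' && *end != '\n' && isspace((unsigned char)*end))
		end++;
	/* Assumption of this sketch: widths above 16 are rejected outright */
	if (*end != '\n' || width < 2 || width > 16)
		return 0;
	return (int)width;
}
```

A caller would read the last few hundred bytes of the file into a NUL-terminated buffer, as FindTabWidth does, and fall back to the default width when the helper returns 0.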
--- a/tools/psgen
+++ b/tools/psgen
@ -0,0 +1,324 @@
+#!/usr/bin/perl
+#
+# psgen -- PostScript generator for code portion of source books
+#
+# Reads in a list of files/dirs from <filelist>, runs munge on each of
+# them, and generates a single postscript file to stdout.  The page numbers
+# for each file/dir are put into the file <pagenums>.
+#
+# usage: psgen [ options... ] <filelist> <pagenums> <volume #>  > foo.ps
+#			-l<firstLogicalPage>
+#			-p<firstPhysicalPage>
+#			-f<font>
+#			-D<defs> (passed to yapp)
+#			-P<productNumber>
+#			-o<mungedOutFile>
+#			-e				(auto edit errors)
+#
+# $Id: psgen,v 1.18 1997/11/13 21:44:16 colin Exp $
+#
+
+$bookRoot = $ENV{"BOOKROOT"} || ".";
+$toolsDir = "$bookRoot/tools";
+$psDir = "$bookRoot/ps";
+$editor = $ENV{"EDITOR"} || "vi";
+
+# Configuration settings - external file names
+$mungeProg = "$toolsDir/munge";
+$yappProg = "$toolsDir/yapp";
+$preambleFile = "$psDir/prolog.ps";
+$tempFile = "/tmp/psgen-$$";
+
+# Parse arguments
+$firstLogPage = $firstPhysPage = 0;
+$productNumber = 1;
+$font = "OCRB";
+$autoEdit = 0;
+while ($#ARGV >= 0 && $ARGV[0] =~ /^-/)
+{
+	$_ = shift @ARGV;
+	if (/^--$/)
+	{
+		last;
+	}
+	elsif (/^-l(\d+)$/)
+	{
+		$firstLogPage = $1;
+	}
+	elsif (/^-p(\d+)$/)
+	{
+		$firstPhysPage = $1;
+	}
+	elsif (/^-f(.+)$/)
+	{
+		$font = $1;
+	}
+	elsif (/^-D(.+)$/)
+	{
+		$yappDefs .= " " . $_;
+	}
+	elsif (/^-P(\d+)$/)
+	{
+		$productNumber = $1;
+	}
+	elsif (/^-o(.+)$/)
+	{
+		$mungedOutFile = $1;
+	}
+	elsif (/^-e$/)
+	{
+		$autoEdit = 1;
+	}
+	else
+	{
+		die "Unrecognized option: '$_'\n";
+	}
+}
+$fileListFile = shift @ARGV || die "Missing file list argument (arg 1)";
+$pageNumFile = shift @ARGV || die "Missing page number file argument (arg 2)";
+$volume = shift @ARGV || die "Missing volume number argument (arg 3)";
+
+# Determine initial page numbers
+{
+	my $nextLogPage = 1;
+	my $nextPhysPage = 3;
+	my $volNum = 0;		# Which volume's page numbers we're reading
+
+	if ($volume > 1)
+	{
+		open(OLDPAGENUMS, "<$pageNumFile") || die;
+		while (<OLDPAGENUMS>)
+		{
+			if (/^Volume\s+(\d+)$/)
+			{
+				$volNum = $1;
+			}
+			elsif (/^Next:\s+(\d+)\s*$/ && $volNum == $volume - 1)
+			{
+				$nextLogPage = $1;
+			}
+		}
+		close(OLDPAGENUMS);
+	}
+	else
+	{
+		unlink($pageNumFile);
+	}
+	$firstLogPage = $nextLogPage if ($firstLogPage == 0);
+	$firstPhysPage = $nextPhysPage if ($firstPhysPage == 0);
+}
+
+# Names of PostScript operators invoked.  These are the interface
+# between this file and the $preambleFile.
+$oddPageStartPS = "OddPageStart";
+$evenPageStartPS = "EvenPageStart";
+$oddPageEndPS = "OddPageEnd";
+$evenPageEndPS = "EvenPageEnd";
+$dirPagePS = "DirPage";
+# This is short because it's emitted every line
+$linePS = "L";
+
+# Handle an error from munge.
+# A result of 0 means to retry, 1 means to exit
+sub MungeError
+{
+	my $result = 1;
+
+	open(FILEH, "<$tempFile") || die;
+	while (<FILEH>)
+	{
+		print STDERR;
+		if (/ in (.*) line (\d+)$/)
+		{
+			my ($fileName, $lineNumber) = ($1, $2);
+
+			if ($autoEdit)
+			{
+				my @statResult = stat($fileName);
+				my $oldMTime = $statResult[9];
+
+				system("'$editor' '+$lineNumber' '$fileName' 1>&2");
+				@statResult = stat($fileName);
+				$result = ($statResult[9] == $oldMTime);
+				last;
+			}
+		}
+	}
+	close(FILEH);
+	unlink($tempFile) || die "Couldn't unlink $tempFile";
+	return $result;
+}
+
+sub CopyFileToPS
+{
+	local $fileName = $_[0];
+	local $args = "'-I$psDir' '-Dfont=$font'";
+	local $_;
+
+	$args .= $yappDefs;
+	open(FILEH, "$yappProg $args '$fileName' |") || die;
+	while (<FILEH>)
+	{
+		print PSOUT $_;
+	}
+	close(FILEH) || exit(1);
+	1;
+}
+
+# Wrap a string in parens as required by PostScript, with proper quoting.
+sub StringPS
+{
+	local $str = $_[0];
+
+	$str =~ s/([\\()])/\\$1/g;
+	"(" . $str . ")";
+}
+
+# Emit a start of page.  The PostScript DSC %%Page: header
+# (followed by logical page number, then physical) and
+# the top-of-page function (which is passed the page number as a string)
+sub PageStartPS
+{
+	local $pageNum = $_[0];
+
+	"%%Page: " . ($pageNum + $firstLogPage) . " " .
+				 ($pageNum + $firstPhysPage) . "\n" .
+		&StringPS($pageNum + $firstLogPage) .
+		((($pageNum + $firstLogPage) % 2) ? $oddPageStartPS
+										  : $evenPageStartPS) . "\n";
+}
+
+sub PageEndPS
+{
+	local $pageNum = $_[0];
+
+	((($pageNum + $firstLogPage) % 2) ? $oddPageEndPS : $evenPageEndPS) . "\n";
+}
+
+# Save the page number to a table-of-contents file
+sub SavePageNum
+{
+	local ($fileName, $pageNum) = @_;
+
+	print PAGENUMS ($pageNum + $firstLogPage), ": $fileName\n";
+}
+
+# The main code.
+
+open(PSOUT, ">-") || die;
+open(FILELIST, "<$fileListFile") || die;
+open(PAGENUMS, ">>$pageNumFile") || die;
+if ($mungedOutFile ne "")
+{
+	open(MUNGEDOUT, ">$mungedOutFile") || die;
+}
+
+print PAGENUMS "Volume $volume\n";
+
+&CopyFileToPS($preambleFile);
+
+$fileNumber = 0;
+$pageNum = 0;	# This is 0-based, since it is added to $first{Log,Phys}Page
+$enable = 0;
+
+while (<FILELIST>)
+{
+	/^([VDTB])(\S*)\s+(.*)/ || die "Illegal file list line $.";
+
+	local ($fileType, $options, $arg) = ($1, $2, $3);
+
+	if ($fileType eq "V")
+	{
+		@args = split(/\s+/, $arg);
+		if ($enable = ($args[0] == $volume))
+		{
+			$defaultTabWidth = int($args[1]);
+		}
+	}
+	elsif ($fileType eq "D")
+	{
+		next unless $enable;	# Do nothing if we're in the wrong volume
+		$dirName = $arg;
+		&SavePageNum($dirName, $pageNum);
+		print PSOUT &PageStartPS($pageNum);
+		print PSOUT &StringPS($dirName), $dirPagePS, "\n";
+		print PSOUT &PageEndPS($pageNum);
+		$pageNum++;
+	}
+	else
+	{
+		my $done = 0;
+
+		$fileNumber++;
+		$fileName = $arg;
+		next unless $enable;	# Do nothing if we're in the wrong volume
+		&SavePageNum($fileName, $pageNum);
+		$quotedFileName = $fileName;
+		$quotedFileName =~ s/'/'\\''/g;
+		$tabWidth = ($options =~ /(\d)/) ? $1 : $defaultTabWidth;
+		$args = ($fileType eq "B") ? "-b" : "";
+		$args .= " -$tabWidth -p$productNumber -f$fileNumber";
+		while (!$done)
+		{
+			if (open(FILE, "$mungeProg $args '$quotedFileName' 2>$tempFile |"))
+			{
+				$line = <FILE>;
+				print MUNGEDOUT $line;
+
+				while ($line ne "")
+				{
+					print PSOUT &PageStartPS($pageNum);
+
+					while ($line ne "" and $line !~ /^\f/)
+					{
+						chomp $line;
+						print PSOUT &StringPS($line), $linePS, "\n";
+						$line = <FILE>;
+						print MUNGEDOUT $line;
+					}
+					$line =~ s/^\f//;
+
+					print PSOUT &PageEndPS($pageNum);
+					$pageNum++;
+				}
+
+				if (close(FILE))
+				{
+					$done = 2;
+				}
+				else
+				{
+					$done = &MungeError();
+				}
+			}
+			else
+			{
+				$done = &MungeError();
+			}
+		}
+		if ($done == 1)
+		{
+			die;
+		}
+	}
+}
+
+# Print PostScript DSC trailer with the correct number of pages
+print PSOUT "%%Trailer\n%%Pages: ", $pageNum, "\n%%EOF\n";
+
+print PAGENUMS "Pages: ", $pageNum, "\n";
+print PAGENUMS "Next: ", ((($pageNum+1) & ~1) + $firstLogPage), "\n";
+
+close(PAGENUMS) || die;
+close(FILELIST) || die;
+close(PSOUT) || die;
+
+if ($mungedOutFile ne "")
+{
+	close(MUNGEDOUT) || die;
+}
+
+#
+# vi: ai ts=4
+# vim: si
+#
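The quoting done by StringPS in psgen (a backslash in front of every backslash and parenthesis, then the whole string wrapped in parentheses) can be sketched in C as follows. EscapePSString is a hypothetical name, and the fixed-size output buffer is an assumption of this sketch; the Perl original simply builds a new string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical C counterpart of psgen's StringPS: write src to dest as a
 * PostScript string literal, wrapped in parentheses, with a backslash in
 * front of every backslash and parenthesis.
 * Returns 0 on success, or -1 if dest (of destSize bytes) is too small.
 */
int EscapePSString(const char *src, char *dest, size_t destSize)
{
	size_t	used = 0;

	assert(src != NULL && dest != NULL);
	/* Even an empty string needs "(", ")" and the terminating NUL */
	if (destSize < 3)
		return -1;
	dest[used++] = '(';
	for (; *src != '\0'; src++)
	{
		int	needsEscape = (*src == '\\' || *src == '(' || *src == ')');

		/* Leave room for this character, the ")" and the NUL */
		if (used + (needsEscape ? 2 : 1) + 2 > destSize)
			return -1;
		if (needsEscape)
			dest[used++] = '\\';
		dest[used++] = *src;
	}
	dest[used++] = ')';
	dest[used] = '\0';
	return 0;
}
```

Each munged line would pass through such a routine before the short line operator is appended, so that parentheses in the source text cannot end the PostScript string early.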
--- a/tools/repair.c
+++ b/tools/repair.c
--- a/tools/sortpages
+++ b/tools/sortpages
@ -0,0 +1,185 @@
+#!/usr/bin/perl
+#
+# $Id: sortpages,v 1.8 1997/12/11 19:20:58 mhw Exp $
+#
+
+@fileNameFromNumber = ();
+@pagesFound = ();
+$theProductNumber = 0;
+
+for $fileIndex (0..$#ARGV)
+{
+	$fileName = $ARGV[$fileIndex];
+	open(FILE, "<$fileName") || die;
+	while (!eof(FILE))
+  	{
+  		$filePos = tell(FILE);
+  		$_ = <FILE>;
+ 		if (/^\f?-\S/)
+  		{
+  			my ($versionHex, $flagsHex, $pageCRCHex, $tabWidthHex,
+				$productNumberHex, $fileNumberHex, $pageNumber, $name)
+					  = (/^\f?-\S\S{4}\ 		# CRC followed by a space
+						 ([0-9a-f])				# Format version
+						 ([0-9a-f]{2})			# Flags
+						 ([0-9a-f]{8})			# Running CRC32
+						 ([0-9a-f])				# Tab width (0 means radix64)
+						 ([0-9a-f]{3})			# Product number
+						 ([0-9a-f]{4})			# File number
+						 \ Page\ (\d+)\ of\ (.*)/x);
+			my $version = hex($versionHex);
+			my $flags = hex($flagsHex);
+			my $productNumber = hex($productNumberHex);
+			my $fileNumber = hex($fileNumberHex);
+
+			unless ($version == 0 && $productNumber > 0
+						&& $fileNumber > 0 && $pageNumber > 0
+						&& $name ne "")
+			{
+				print STDERR "ERROR: Invalid header info ",
+							 "at $fileName line $.\n";
+				exit(1);
+			}
+
+			if (!defined($fileNameFromNumber[$fileNumber]))
+			{
+				$fileNameFromNumber[$fileNumber] = $name;
+			}
+			elsif ($fileNameFromNumber[$fileNumber] ne $name)
+			{
+				print STDERR "ERROR: Mismatched filename ",
+							 "at $fileName line $.\n";
+				exit(1);
+			}
+
+			if (!$theProductNumber)
+			{
+				$theProductNumber = $productNumber;
+			}
+			elsif ($theProductNumber != $productNumber)
+			{
+				print STDERR "ERROR: Different product number ",
+							 "at $fileName line $.\n";
+				exit(1);
+			}
+
+			push @pagesFound, (sprintf "%5d:%4d:%d:%d:%d",
+					 $fileNumber, $pageNumber, $flags, $fileIndex, $filePos);
+		}
+	}
+	close(FILE) || die;
+}
+
+@pagesFound = sort @pagesFound;
+
+$result = 0;
+$lastFileNumber = 0;
+$lastPageNumber = 0;
+$nextFileNumber = 1;
+$nextPageNumber = 1;
+$fileIndexOpen = -1;
+foreach (@pagesFound)
+{
+	my ($fileNumber, $pageNumber, $flags, $fileIndex, $filePos) = split /:/;
+
+	$fileNumber = int($fileNumber);
+	$pageNumber = int($pageNumber);
+
+	if ($fileNumber == $lastFileNumber && $pageNumber == $lastPageNumber)
+	{
+		print STDERR "DUPLICATE: File $fileNumber, page $pageNumber, skipped\n";
+		next;
+	}
+
+	if ($nextFileNumber < $fileNumber && $nextPageNumber != 1)
+	{
+		print STDERR "MISSING: File $nextFileNumber, ",
+					 "pages $nextPageNumber - END\n";
+		$nextPageNumber = 1;
+		$nextFileNumber++;
+		$result = 1;
+	}
+	if ($nextFileNumber < $fileNumber)
+	{
+		print STDERR "MISSING: Files $nextFileNumber - ",
+					 $fileNumber-1, "\n";
+		$nextFileNumber = $fileNumber;
+		$nextPageNumber = 1;
+		$result = 1;
+	}
+	if ($nextFileNumber != $fileNumber)
+	{
+		print STDERR "ERROR: Internal error, unexpected fileNumber\n";
+		exit(1);
+	}
+
+	if ($nextPageNumber < $pageNumber)
+	{
+		print STDERR "MISSING: File $fileNumber, pages $nextPageNumber - ",
+					 $pageNumber-1, "\n";
+		$nextPageNumber = $pageNumber;
+		$result = 1;
+	}
+	if ($nextPageNumber != $pageNumber)
+	{
+		print STDERR "ERROR: Internal error, unexpected pageNumber\n";
+		exit(1);
+	}
+
+	if ($fileIndexOpen != $fileIndex)
+	{
+		if ($fileIndexOpen >= 0)
+		{
+			close(FILE) || die;
+			$fileIndexOpen = -1;
+		}
+		$fileName = $ARGV[$fileIndex];
+		open(FILE, "<$fileName") || die;
+		$fileIndexOpen = $fileIndex;
+	}
+	seek(FILE, $filePos, 0) || die($!);
+
+	$_ = <FILE>;
+	print;
+	while (<FILE>)
+	{
+		last if /^\f?-\S/;
+		print;
+	}
+	$lastFileNumber = $fileNumber;
+	$lastPageNumber = $pageNumber;
+
+	if ($flags & 1)		# Bit 0 of flags indicates last page of file
+	{
+		$nextFileNumber++;
+		$nextPageNumber = 1;
+	}
+	else
+	{
+		$nextPageNumber++;
+	}
+}
+
+if ($nextPageNumber != 1)
+{
+	print STDERR "MISSING: File $nextFileNumber, ",
+				 "pages $nextPageNumber - END\n";
+	$nextPageNumber = 1;
+	$nextFileNumber++;
+	$result = 1;
+}
+
+print STDERR "Highest file number encountered: ", $nextFileNumber - 1, "\n";
+
+if ($fileIndexOpen >= 0)
+{
+	close(FILE) || die;
+	$fileIndexOpen = -1;
+}
+
+exit($result);
+
+#
+# vi: ai ts=4
+# vim: si
+#
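The page header layout that sortpages matches with its regular expression can also be parsed by hand. The sketch below follows the field widths given in that expression (1 hex digit of version, 2 of flags, 8 of running CRC, 1 of tab width, 3 of product number, 4 of file number, then " Page N of name"). The PageHeader struct, its field names, and both function names are hypothetical, and the four check characters after the format character are skipped rather than verified.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Field names here are illustrative, not taken from the tools' headers */
typedef struct PageHeader
{
	unsigned long	version, flags, pageCRC, tabWidth;
	unsigned long	productNumber, fileNumber;
	long			pageNumber;
	char			name[128];
} PageHeader;

/* Read exactly ndigits lowercase hex digits at *pp, advancing *pp */
static int HexField(const char **pp, int ndigits, unsigned long *out)
{
	unsigned long	value = 0;
	int				i;

	for (i = 0; i < ndigits; i++)
	{
		char	c = (*pp)[i];

		if (c >= '0' && c <= '9')
			value = value * 16 + (unsigned long)(c - '0');
		else if (c >= 'a' && c <= 'f')
			value = value * 16 + (unsigned long)(c - 'a' + 10);
		else
			return -1;
	}
	*pp += ndigits;
	*out = value;
	return 0;
}

/* Returns 0 and fills in *hdr if line is a page header, else -1 */
int ParsePageHeader(const char *line, PageHeader *hdr)
{
	static const char	pageTag[] = " Page ";
	static const char	ofTag[] = " of ";
	const char *		p = line;
	char *				end;
	size_t				len;
	int					i;

	assert(line != NULL && hdr != NULL);
	if (*p == '\f')						/* Optional leading form feed */
		p++;
	if (*p++ != '-')
		return -1;
	/* Format character plus four check characters, none of them blank */
	for (i = 0; i < 5; i++, p++)
		if (*p == '\0' || isspace((unsigned char)*p))
			return -1;
	if (*p++ != ' ')
		return -1;
	if (HexField(&p, 1, &hdr->version) || HexField(&p, 2, &hdr->flags)
			|| HexField(&p, 8, &hdr->pageCRC)
			|| HexField(&p, 1, &hdr->tabWidth)
			|| HexField(&p, 3, &hdr->productNumber)
			|| HexField(&p, 4, &hdr->fileNumber))
		return -1;
	if (strncmp(p, pageTag, strlen(pageTag)) != 0)
		return -1;
	p += strlen(pageTag);
	if (!isdigit((unsigned char)*p))
		return -1;
	hdr->pageNumber = strtol(p, &end, 10);
	if (strncmp(end, ofTag, strlen(ofTag)) != 0)
		return -1;
	p = end + strlen(ofTag);
	len = strcspn(p, "\n");				/* Name runs to end of line */
	if (len == 0 || len >= sizeof(hdr->name))
		return -1;
	memcpy(hdr->name, p, len);
	hdr->name[len] = '\0';
	return 0;
}
```

The header strings used to exercise this sketch are made up for illustration; a real header also carries a line checksum that unmunge verifies before trusting any of these fields.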
--- a/tools/subst.c
+++ b/tools/subst.c
@ -0,0 +1,222 @@
+/*
+ * subst.c -- Repair substitution tables
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Written by Colin Plumb
+ *
+ * $Id: subst.c,v 1.14 1997/11/03 22:12:00 colin Exp $
+ *
+ * IT IS EXPECTED that users of this program will play with these tables
+ * and the cost values in the subst.h header.  (Some day, they'll all
+ * get moved to an external config file.)
+ *
+ * NOTE: Other costs are hiding in the Filter functions in repair.c.
+ * Remember to keep them all on the same scale.
+ */
+
+/*
+ * The repair program copies its input to its output, making various
+ * substitutions, until it manages to produce a version that satisfies
+ * the parser.  This includes having a correct CRC for each line.
+ * Each substitution has a cost, and the combinations are tried in order
+ * of increasing cost.  NOTE that even translating "A"->"A" counts as
+ * a substitution, although it may have zero cost.
+ *
+ * The intention is to correct transcription errors, where the
+ * errors have a distinctly non-uniform distribution.  Slight
+ * differences in cost produce a preference in trying some errors
+ * first.  If an error costs half as much as another, combinations
+ * of two of that error will be compared to one of the more expensive.
+ * Too many cheap substitutions will result in repair spending
+ * a very long time searching before considering the more expensive
+ * substitutions.
+ *
+ * The following parameters and the raw substitution tables are expected
+ * to be edited by the user based on experience.  Eventually, this
+ * will be moved into an external config file, but for now it's a matter
+ * of recompiling.
+ */
+
+#include "subst.h"
+#include "util.h"
+
+/* What the OCR software reports for "unrecognizable" */
+#define UNRECOG_STRING "~\274"
+
+/*
+ * The input substitutions to make (one-to-one).   These are listed in
+ * the order of correction. i.e. uncorrected input first, then corrected
+ * output.  Substitutions are one-way; to get two-way, list it twice.
+ */
+
+struct RawSubst const substSingles[] = {
+	/* Identity substitutions - note that period (.) is excluded */
+	{ "!\"#$%&'()*+,-./0123456789:;<=>?" SPACE_STRING,
+	  "!\"#$%&'()*+,-./0123456789:;<=>?" SPACE_STRING, 0, 0, NULL },
+	{ "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_\t" TAB_STRING,
+	  "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_\t" TAB_STRING, 0, 0, NULL },
+	{ "`abcdefghijklmnopqrstuvwxyz{|}~\f" FORMFEED_STRING,
+	  "`abcdefghijklmnopqrstuvwxyz{|}~\f" FORMFEED_STRING, 0, 0, NULL },
+#if (TAB_PAD_CHAR & 128)	/* Not already included? */
+	{ TAB_PAD_STRING, TAB_PAD_STRING, 0, 0, NULL },
+#endif
+	{ "\r\n" CONTIN_STRING, "\n\n" CONTIN_STRING, 0, 0, NULL },
+
+	/* Occasionally these just get inserted as glitches */
+	{ ".,'`", NULL, 5, 10, FilterNearBlanks },
+	/* This is now pretty infrequent */
+	{ "-_", "_-", 0, 10, FilterAfterRepeat },
+
+	/*
+	 * Capitalization errors are common in some cases
+	 * c/C, s/S, u/U are fucked up all the time.
+	 * Also o/O, v/V and w/W.  x, y and z also give some problems.
+	 */
+	{ "cilmopsuvwxyz", "CILMOPSUVWXYZ", 7, 13, FilterNearLower },
+	{ "CILMOPSUVWXYZ", "cilmopsuvwxyz", 7, 13, FilterNearUpper },
+	/* Other errors */
+	{ "g9aaiji;xX00Si", "9gg2ji;i%%oO3f", 10, 0, NULL },
+	/* This seems to happen a lot */
+	{ "c", "r", 9, 0, NULL },
+
+	{ "j", ";", 9, 0, NULL },
+	{ "' ", "``", 10, 0, NULL },
+
+	/* Uncommon errors */
+
+	/* Weird stuff that's happened in the checksum part */
+	/* A highish weight is okay here */
+	{ "sSEdJl", "554437", 15, 0, NULL },
+	{ "LESsPZ", "bb8a22", 15, 0, NULL },
+
+	/* Weird stuff that has happened */
+	{ "BasAeaeRoooo", "3334a@QQpqbd", 5, 15, FilterIsBinary },
+	{ "oooo", "pqbd", 0, 15, FilterIsBinary },
+	{ "ttTCCflO", "iff{[lfG", 12, 0, NULL },
+#if 0
+	/* If the line-breaks get screwed up, use these */
+	{ " ", "\n", 10, COST_INFINITY, FilterChecksumFollows },
+	{ "\n", " ", COST_INFINITY, 10, FilterChecksumFollows },
+	{ "\n", NULL, COST_INFINITY , 11, FilterChecksumFollows },
+#endif
+
+{ NULL, NULL, 0, 0, NULL }
+};
+
+/* The many-to-many substitutions */
+struct RawSubst const substMultiples[] = {
+	{ "''", "\"", 2, 0, NULL },
+	{ "``", "\"", 2, 0, NULL },
+	{ ",'", "\"", 2, 0, NULL },
+	{ "',", "\"", 2, 0, NULL },
+	{ ",,", "\"", 2, 0, NULL },
+	/* Extra inserted spaces are common */
+	{ " ", " ", COST_INFINITY,  0, FilterFollowsSpace },
+	{ " ", "", 0, 15, FilterFollowsSpace },
+	{ "\t", " ", COST_INFINITY,  0, FilterFollowsSpace },
+	{ "\t", "", 0, 10, FilterFollowsSpace },
+	/* Convert between SPACE_CHAR dots and periods */
+	{ ".", SPACE_STRING, 1, COST_INFINITY, FilterFollowsSpace },
+	{ ".", " "SPACE_STRING, COST_INFINITY, 10, FilterFollowsSpace },
+	{ SPACE_STRING, ".", 15, 5, FilterFollowsSpace },
+	{ SPACE_STRING, " "SPACE_STRING, COST_INFINITY, 5, FilterFollowsSpace },
+
+	/* Replace "unknown" by zero - it often is */
+	{ UNRECOG_STRING, "0", 1, 0, NULL },
+	{ UNRECOG_STRING, "_", 2, 0, NULL },
+	{ UNRECOG_STRING, ")", 3, 0, NULL },
+	{ UNRECOG_STRING, "^", 4, 0, NULL },
+	/* Except that these glitches are common */
+	{ UNRECOG_STRING"'", "\\\"", 0, 0, NULL },
+	{ UNRECOG_STRING"'", "\"", 1, 0, NULL },
+	{ "'"UNRECOG_STRING, "\"", 0, 0, NULL },
+	{ UNRECOG_STRING UNRECOG_STRING , "\"", 0, 0, NULL },
+	/* Something else that has been seen */
+	{ "V'", "\\\"", 5, 0, NULL },
+
+	/* A common transposition */
+	{ "\"'", "'\"", 5, 0, NULL },
+	{ "'\"", "\"'", 5, 0, NULL },
+	/* These also happen fairly often */
+	{ " \"", "''", 5, 0, NULL },
+	{ "\" ", "''", 5, 0, NULL },
+
+	/* Common glitches */
+	{ "\t.\n", "\n", 5, 0, NULL },
+	{ "\t,\n", "\n", 5, 0, NULL },
+	{ "\t-\n", "\n", 5, 0, NULL },
+	{ "\t_\n", "\n", 5, 0, NULL },
+	{ "\t'\n", "\n", 5, 0, NULL },
+	{ "\t`\n", "\n", 5, 0, NULL },
+	{ "\t~\n", "\n", 5, 0, NULL },
+	{ "\t:\n", "\n", 5, 0, NULL },
+	{ "\t"SPACE_STRING"\n", "\n", 5, 0, NULL },
+
+	/* Less common */
+	{ " .\n", "\n", 10, 0, NULL },
+	{ " ,\n", "\n", 10, 0, NULL },
+	{ " -\n", "\n", 10, 0, NULL },
+	{ " _\n", "\n", 10, 0, NULL },
+	{ " '\n", "\n", 10, 0, NULL },
+	{ " `\n", "\n", 10, 0, NULL },
+	{ " ~\n", "\n", 10, 0, NULL },
+	{ " :\n", "\n", 10, 0, NULL },
+	{ " "SPACE_STRING"\n", "\n", 10, 0, NULL },
+
+	/* Even less common */
+	{ ".\n", "\n", 15, 0, NULL },
+	{ ",\n", "\n", 15, 0, NULL },
+	{ "-\n", "\n", 15, 0, NULL },
+	{ "_\n", "\n", 15, 0, NULL },
+	{ "'\n", "\n", 15, 0, NULL },
+	{ "`\n", "\n", 15, 0, NULL },
+	{ "~\n", "\n", 15, 0, NULL },
+	{ ":\n", "\n", 15, 0, NULL },
+	{ SPACE_STRING"\n", "\n", 15, 0, NULL },
+
+	/* Weird stuff that has happened */
+	{ "lJ", "U", 10, 0, NULL },
+	{ "ll", "U", 10, 0, NULL },
+	{ "l1", "U", 10, 0, NULL },
+	{ "il", "U", 10, 0, NULL },	/* Fairly common, actually */
+	{ "li", "U", 10, 0, NULL },
+	{ "l)", "U", 10, 0, NULL },
+	{ "Ll", "U", 10, 0, NULL },
+	{ "LI", "U", 10, 0, NULL },
+	{ "L1", "U", 10, 0, NULL },
+
+	{ "lo", "b", 10, 0, NULL },
+	{ "cl", "d", 10, 0, NULL },
+	{ "cliff", "diff", 2, 0, NULL },
+	{ "*\n", "*/\n", 10, 0, NULL },
+
+	/* That big black block has odd things happen to it */
+	{ "d", CONTIN_STRING, 10, 0, NULL },
+	{ "d\n", CONTIN_STRING"\n", 3, 0, NULL },
+	{ "S", CONTIN_STRING, 10, 0, NULL },
+	{ "S\n", CONTIN_STRING"\n", 3, 0, NULL },
+
+	/* Tab-stop wonders */
+	{ TAB_STRING, TAB_STRING"", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING" ", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING"  ", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING"   ", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING"    ", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING"     ", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING"      ", 0, 0, TabFilter },
+	{ TAB_STRING, TAB_STRING"       ", 0, 0, TabFilter },
+	/* Some scan errors */
+	{ "D ", TAB_STRING"", 1, 5, TabFilter },
+	{ "D ", TAB_STRING" ", 1, 5, TabFilter },
+	{ "D ", TAB_STRING"  ", 1, 5, TabFilter },
+	{ "D ", TAB_STRING"   ", 1, 5, TabFilter },
+	{ "D ", TAB_STRING"    ", 1, 5, TabFilter },
+	{ "D ", TAB_STRING"     ", 1, 5, TabFilter },
+	{ "D ", TAB_STRING"      ", 1, 5, TabFilter },
+	{ "D ", TAB_STRING"       ", 1, 5, TabFilter },
+#if TAB_PAD_CHAR != ' '
+#error Fix those tab patterns!
+#endif
+{ NULL, NULL, 0, 0, NULL }
+};
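The idea behind the cost columns in these tables (try the cheaper substitutions first, and accept a candidate only when the line checksum agrees) can be shown with a much smaller search than the one repair performs. The sketch below is a deliberate simplification under stated assumptions: it considers only one single-character substitution per line, it scans every candidate instead of using a priority heap, and ToyChecksum is a plain byte sum standing in for the real per-line CRC. SimpleSubst, ToyChecksum and RepairLine are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One single-character substitution and its cost (illustrative only) */
typedef struct SimpleSubst
{
	char	from, to;
	int		cost;
} SimpleSubst;

/* Stand-in for the per-line CRC: a plain byte sum, NOT the real checksum */
unsigned ToyChecksum(const char *s)
{
	unsigned	sum = 0;

	while (*s != '\0')
		sum = (sum + (unsigned char)*s++) & 0xffff;
	return sum;
}

/*
 * Find the cheapest single substitution that makes line match the wanted
 * checksum, and apply it in place.  Returns the cost paid (0 if the line
 * was already correct), or -1 if no single substitution from the table
 * repairs the line.
 */
int RepairLine(char *line, unsigned wanted,
			   const SimpleSubst *table, size_t tableLen)
{
	int		bestCost = -1;
	size_t	bestPos = 0, bestEntry = 0;
	size_t	pos, t;

	assert(line != NULL && table != NULL);
	if (ToyChecksum(line) == wanted)
		return 0;
	for (pos = 0; line[pos] != '\0'; pos++)
	{
		for (t = 0; t < tableLen; t++)
		{
			char	saved = line[pos];
			int		matches;

			if (saved != table[t].from)
				continue;
			if (bestCost >= 0 && table[t].cost >= bestCost)
				continue;				/* Cannot beat the current best */
			line[pos] = table[t].to;	/* Try it ... */
			matches = (ToyChecksum(line) == wanted);
			line[pos] = saved;			/* ... and undo it */
			if (matches)
			{
				bestCost = table[t].cost;
				bestPos = pos;
				bestEntry = t;
			}
		}
	}
	if (bestCost < 0)
		return -1;
	line[bestPos] = table[bestEntry].to;
	return bestCost;
}
```

A byte sum collides easily, so two different substitutions can both satisfy it; in that case the cheaper one wins, which is the same preference the cost values in the tables above are meant to express.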
--- a/tools/subst.h
+++ b/tools/subst.h
@ -0,0 +1,66 @@
+/*
+ * subst.h -- Header for repair substitutions
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Written by Colin Plumb
+ *
+ * $Id: subst.h,v 1.9 1997/11/03 22:12:00 colin Exp $
+ */
+
+/*
+ * Give up if the list of pending changes to attempt grows to this many
+ * elements.  Each element is 32 bytes, so 128K is 4 MB of memory.
+ * (Other than this, repair's memory usage is fairly modest.)
+ */
+#define MAX_HEAP (1<<17)
+
+/*
+ * There is a hack in the code to find a single substitution that will fix a
+ * line, even if it's not in the tables.  It gets added to the tables "on
+ * probation", with an infinite cost, and if it leads to a successful
+ * correction of the entire page, is "learned" for future use and its
+ * cost reduced to something finite.
+ * (This is not remembered across runs of the program, though.
+ * Edit the tables in the source to fix it.)
+ */
+#define DYNAMIC_COST_LEARNED 15
+
+/*
+ * This negative-cost bonus for passing the end of a line with the right
+ * CRC makes the search engine reluctant to backtrack past a correct CRC,
+ * greatly improving efficiency.  It's rather a hack, though.  Think of
+ * this in terms of "how many errors should be considered in the current
+ * line before considering the possibility of errors in the previous line?"
+ *
+ * This bonus is halved for lines that are the result of a correction
+ * that was computed from the checksum, since a correct checksum is
+ * much less significant in such a case.
+ */
+#define COST_LINE -30
+
+/* The cost of a full-line nastyline substitution. */
+#define NASTY_COST 5
+
+/* Type describing filter functions used in substitutions */
+struct ParseNode;
+struct Substitution;
+#include "heap.h"
+typedef HeapCost FilterFunc(struct ParseNode *parent, char const *limit,
+	struct Substitution const *subst);
+FilterFunc TabFilter,              FilterFollowsSpace, FilterNearBlanks;
+FilterFunc FilterNearUpper,        FilterNearLower,    FilterNearXDigit;
+FilterFunc FilterAfterRepeat,      FilterCharConst,    FilterChecksumFollows;
+FilterFunc FilterLikelyUnderscore, FilterIsDynamic,    FilterIsBinary;
+
+/* The external substitution format */
+typedef struct RawSubst {
+	char const *input;
+	char const *output;
+	HeapCost cost, cost2;
+	FilterFunc *filter;
+} RawSubst;
+
+/* The substitutions to make */
+extern struct RawSubst const substSingles[];
+extern struct RawSubst const substMultiples[];
--- a/tools/unmunge.c
+++ b/tools/unmunge.c
@ -0,0 +1,666 @@
+/*
+ * unmunge.c -- Program to convert a munged file to original form
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
+ * Written by Mark H. Weaver
+ *
+ * $Id: unmunge.c,v 1.13 1997/11/13 23:27:08 mhw Exp $
+ */
+
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+/*#include <direct.h>   teun: MS VC wants direct.h for mkdir */
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <ctype.h>
+#include <stdlib.h>
+#include <assert.h>
+
+#include "util.h"
+
+typedef struct UnMungeState
+{
+	char const *	mungedFileName;
+	char			dirName[128];
+	char			fileName[128];
+	char *			fileNameTail;
+	int				binaryMode, tabWidth;
+	long			productNumber, fileNumber, pageNumber, lineNumber;
+	long			manifestLineNumber;
+	word16			hdrFlags;
+	CRC				pageCRC, seenPageCRC;
+	FILE *			manifest;
+	FILE *			file;
+	FILE *			out;
+} UnMungeState;
+
+
+/* Returns number of characters decoded, or -1 on error */
+static int
+Decode4(char const src[4], byte dest[3])
+{
+	int		i, length;
+	byte	srcVal[4];
+
+	for (i = 0; i < 4 && src[i] != RADIX64_END_CHAR; i++)
+		if ((srcVal[i] = Radix64DigitValue(src[i])) == (byte) -1)
+			return -1;
+
+	length = i - 1;
+	if (length < 1)
+		return -1;
+
+	for (; i < 4; i++)
+		srcVal[i] = 0;
+
+	dest[0] = (srcVal[0] << 2) | (srcVal[1] >> 4);
+	dest[1] = (srcVal[1] << 4) | (srcVal[2] >> 2);
+	dest[2] = (srcVal[2] << 6) | (srcVal[3]);
+
+	return length;
+}
+
+/*
+ * Return number of characters decoded, or -1 on error
+ */
+static int
+DecodeLine(char const *src, char *dest, int srclength)
+{
+	int destlength = 0;
+	int result;
+
+	if (srclength % 4 || !srclength)
+		return -1;	/* Must be a multiple of 4 */
+
+	while (srclength -= 4) {
+		if (Decode4(src, dest + destlength) != 3)
+			return -1;
+		src += 4;
+		destlength += 3;
+	}
+	result = Decode4(src, dest + destlength);
+	if (result < 1)
+		return -1;
+	return destlength + result;
+}
+
+int PrintFileError(UnMungeState *state, char const *message)
+{
+	fprintf(stderr, "%s, %s line %ld\n", message,
+			state->mungedFileName, state->lineNumber);
+	return 1;
+}
+
+int ReadManifest(UnMungeState *state, long fileNumberWanted,
+				 char const *fileTailPrefix, long prefixLen)
+{
+	long		fileNumber = 0;
+	long		firstMissingFileNum = 0, lastMissingFileNum = 0;
+	char		buffer[512];
+	char *		p;
+
+	if (state->manifest == NULL)
+	{
+		if (fileNumberWanted != 0)
+		{
+			assert(fileTailPrefix != NULL);
+			strncpy(state->fileName, fileTailPrefix, sizeof(state->fileName));
+			state->fileName[sizeof(state->fileName) - 1] = '\0';
+			state->fileNameTail = state->fileName;
+		}
+		return 0;
+	}
+	while (fgets(buffer, sizeof(buffer), state->manifest))
+	{
+		if ((p = strchr(buffer, '\n')) != NULL)
+			*p = '\0';
+		state->manifestLineNumber++;
+		if (buffer[0] == 'D')
+		{
+			if (buffer[1] != ' ')
+				goto invalidManifest;
+			strncpy(state->dirName, buffer + 2, sizeof(state->dirName));
+			if (state->dirName[sizeof(state->dirName) - 1] != '\0')
+				goto invalidManifest;
+		}
+		else
+		{
+			fileNumber = strtol(buffer, &p, 10);
+			if (p == buffer || *p != ' ')
+				goto invalidManifest;
+			p++;
+
+			if (fileNumberWanted == 0 || fileNumber < fileNumberWanted)
+			{
+				if (firstMissingFileNum == 0)
+					firstMissingFileNum = fileNumber;
+				lastMissingFileNum = fileNumber;
+				continue;
+			}
+			else if (fileNumber > fileNumberWanted)
+				break;
+			else
+			{
+				size_t		len;
+
+				len = strlen(state->dirName);
+				assert(sizeof(state->fileName) >= sizeof(state->dirName));
+				memcpy(state->fileName, state->dirName, len);
+				strncpy(state->fileName + len, p,
+						sizeof(state->fileName) - len);
+				if (strncmp(p, fileTailPrefix, prefixLen) != 0)
+				{
+					fprintf(stderr, "Mismatched filename, headers say '%s',\n"
+							"  manifest says '%s'\n",
+							fileTailPrefix, p);
+					return 1;
+				}
+				p = state->dirName;
+				while ((p = strchr(p, '/')) != NULL)
+				{
+					*p = '\0';
+					mkdir(state->dirName, 0777);
+					*p++ = '/';
+				}
+				state->fileNameTail = state->fileName + len;
+				break;
+			}
+		}
+	}
+	if (firstMissingFileNum != 0)
+	{
+		fprintf(stderr, "Missing files %ld-%ld\n",
+				firstMissingFileNum, lastMissingFileNum);
+	}
+	if (fileNumberWanted != 0 && fileNumber != fileNumberWanted)
+	{
+		fprintf(stderr, "Can't find file %ld in manifest file\n",
+				fileNumberWanted);
+		return 1;
+	}
+	return 0;
+
+invalidManifest:
+	fprintf(stderr, "Error parsing manifest file, line %ld\n",
+			state->manifestLineNumber);
+	return 1;
+}
+
+int UnMungeFile(char const *mungedFileName, char const *manifestFileName,
+				int forceOverwrite, int forcePartialFiles)
+{
+	UnMungeState *	state;
+	EncodeFormat const *	fmt = NULL;
+	char			buffer[512];
+	char			outbuf[BYTES_PER_LINE+1];
+	char *			line;
+	char *			lineData;
+	char *			p;
+	int				length;
+	int				result = 0;
+	int				skipPage = 0;
+	CRC				lineCRC;
+	word32			num;
+
+	state = (UnMungeState *)calloc(1, sizeof(*state));
+	if (state == NULL)
+	{
+		fprintf(stderr, "ERROR: Out of memory\n");
+		return 1;
+	}
+	state->mungedFileName = mungedFileName;
+
+	if (manifestFileName != NULL)
+	{
+		if ((state->manifest = fopen(manifestFileName, "r")) == NULL)
+			goto errnoError;
+	}
+
+	if ((state->file = fopen(state->mungedFileName, "r")) == NULL)
+		goto errnoError;
+
+	while (!feof(state->file))
+	{
+		if (fgets(buffer, sizeof(buffer), state->file) == NULL)
+		{
+			if (feof(state->file))
+				break;
+			goto fileError;
+		}
+
+		state->lineNumber++;
+
+		line = buffer;
+		/* Strip leading whitespace */
+		while (isspace((byte)*line))
+			line++;
+		if (*line == '\0')
+			continue;
+
+		/* Strip trailing whitespace */
+		p = line + strlen(line);
+		while (p > line && (byte)p[-1] < 128 && isspace(p[-1]))
+			p--;
+
+		lineData = line + PREFIX_LENGTH;
+
+		/* Pad up to at least PREFIX_LENGTH */
+		while (p < lineData)
+			*p++ = ' ';
+		*p++ = '\n';
+		*p = '\0';
+		length = p - lineData;
+
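+		if (line[0] == HDR_PREFIX_CHAR)
+			fmt = FindFormat(line[1]);
+		if (fmt == NULL)
+		{
+			/* Unknown header type, or data before the first header */
+			result = PrintFileError(state, line[0] == HDR_PREFIX_CHAR
+										? "ERROR: Invalid header type"
+										: "ERROR: Missing header line");
+			goto error;
+		}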
+
+		lineCRC = CalculateCRC(fmt->lineCRC, 0, (byte const *)lineData, length);
+
+		p = line + EncodedLength(fmt, fmt->runningCRCBits);
+		if (DecodeCheckDigits(fmt, p, NULL, fmt->lineCRC->bits, &num)
+				|| lineCRC != num)
+		{
+			result = PrintFileError(state, "ERROR: Line CRC failed");
+			goto error;
+		}
+
+		if (line[0] == HDR_PREFIX_CHAR)
+		{
+			int			formatVersion;
+			int			flags;
+			CRC			seenPageCRC;
+			int			tabWidth;
+			long		productNumber;
+			long		fileNumber;
+			long		pageNumber;
+			char *		fileNameTail;
+			int			skipNextPage = 0;
+			char *		p;
+			EncodeFormat const *	hFmt = &hexFormat;
+
+			/* Parse header line */
+			p = lineData;
+
+			if (DecodeCheckDigits(hFmt, p, &p, HDR_VERSION_BITS, &num))
+			{
+			invalidHeader:
+				result = PrintFileError(state, "ERROR: Invalid header");
+				goto error;
+			}
+			formatVersion = num;
+
+			if (DecodeCheckDigits(hFmt, p, &p, HDR_FLAG_BITS, &num))
+				goto invalidHeader;
+			flags = num;
+
+			if (DecodeCheckDigits(hFmt, p, &p, fmt->pageCRC->bits, &num))
+				goto invalidHeader;
+			seenPageCRC = num;
+
+			if (DecodeCheckDigits(hFmt, p, &p, HDR_TABWIDTH_BITS, &num))
+				goto invalidHeader;
+			tabWidth = num;
+
+			if (DecodeCheckDigits(hFmt, p, &p, HDR_PRODNUM_BITS, &num))
+				goto invalidHeader;
+			productNumber = num;
+
+			if (DecodeCheckDigits(hFmt, p, &p, HDR_FILENUM_BITS, &num))
+				goto invalidHeader;
+			fileNumber = num;
+
+			if (sscanf(p, " Page %ld of ", &pageNumber) < 1)
+				goto invalidHeader;
+
+			if (formatVersion > 0)
+			{
+				result = PrintFileError(state,
+										"ERROR: Format too new for "
+											"this version of unmunge");
+				goto error;
+			}
+
+			p = strstr(p, " of ");
+			if (p == NULL)
+				goto invalidHeader;
+
+			fileNameTail = p + 4;
+			p = fileNameTail + strlen(fileNameTail);
+			if (p < fileNameTail + 3 || p[-1] != '\n')
+				goto invalidHeader;
+			else
+				p[-1] = '\0';
+
+			if (state->out != NULL && state->pageCRC != state->seenPageCRC)
+			{
+				result = PrintFileError(state,
+								"ERROR: Page CRC mismatch on page before");
+				goto error;
+			}
+
+			if ((state->hdrFlags & HDR_FLAG_LASTPAGE) && state->out != NULL)
+			{
+				fclose(state->out);
+				state->out = NULL;
+			}
+
+			if (state->out != NULL)
+			{
+				if (pageNumber != state->pageNumber + 1 ||
+						fileNumber != state->fileNumber ||
+						productNumber != state->productNumber ||
+						tabWidth != state->tabWidth ||
+						strcmp(fileNameTail, state->fileNameTail) != 0)
+				{
+					if (fileNumber == state->fileNumber &&
+							pageNumber > state->pageNumber + 1)
+					{
+						(void)PrintFileError(state,
+									"ERROR: Missing pages of this file");
+						if (forcePartialFiles && !state->binaryMode)
+						{
+							fputs("\n\n@@@@@@ Missing pages here! @@@@@@\n\n",
+								  state->out);
+						}
+						else
+						{
+							skipNextPage = 1;
+							fclose(state->out);
+							state->out = NULL;
+							remove(state->fileName);
+						}
+					}
+					else
+					{
+						(void)PrintFileError(state,
+									"ERROR: Missing pages of previous file");
+						if (forcePartialFiles && !state->binaryMode)
+						{
+							fputs("\n\n@@@@@@ Missing pages here! @@@@@@\n\n",
+								  state->out);
+							/* Make it non-fatal, though... */
+							fclose(state->out);
+							state->out = NULL;
+						}
+						else
+						{
+							fclose(state->out);
+							state->out = NULL;
+							remove(state->fileName);
+						}
+					}
+				}
+			}
+			if (state->out == NULL)
+			{
+				if (pageNumber != 1 && !skipPage)
+					(void)PrintFileError(state,
+							 "ERROR: File doesn't begin with page 1");
+
+				state->binaryMode = (tabWidth == 0);
+
+				if (pageNumber != 1 && (state->binaryMode
+										|| !forcePartialFiles))
+				{
+					skipNextPage = 1;
+				}
+				else
+				{
+					/* TODO: Use global filelist to get pathname */
+					result = ReadManifest(state, fileNumber, fileNameTail,
+										  strlen(fileNameTail));
+					if (result != 0)
+						goto error;
+
+					if (!forceOverwrite)
+					{
+						FILE *	file;
+
+						/* Make sure file doesn't already exist */
+						file = fopen(state->fileName, "r");
+						if (file != NULL)
+						{
+							fclose(file);
+							fprintf(stderr, "ERROR: %s already exists\n",
+									state->fileName);
+							result = 1;
+							goto error;
+						}
+					}
+
+					state->out = fopen(state->fileName,
+									   state->binaryMode ? "wb" : "w");
+					if (state->out == NULL)
+						goto errnoError;
+
+					if (pageNumber != 1)
+						fputs("\n\n@@@@@@ Missing pages here! @@@@@@\n\n",
+							  state->out);
+				}
+			}
+
+			state->pageCRC = 0;
+			state->seenPageCRC = seenPageCRC;
+			state->hdrFlags = (word16)flags;
+			state->pageNumber = pageNumber;
+			state->fileNumber = fileNumber;
+			state->productNumber = productNumber;
+			state->tabWidth = tabWidth;
+			skipPage = skipNextPage;
+		}
+		else if (!skipPage)
+		{
+			if (state->out == NULL)
+			{
+				result = PrintFileError(state, "ERROR: Missing header line");
+				goto error;
+			}
+
+			/* Normal data line */
+			state->pageCRC = CalculateCRC(fmt->pageCRC, state->pageCRC,
+											   (byte const *)lineData,
+											   length);
+			line[2] = '\0';
+			if (DecodeCheckDigits(fmt, line, NULL, fmt->runningCRCBits, &num)
+				|| RunningCRCFromPageCRC(fmt, state->pageCRC) != num)
+			{
+				result = PrintFileError(state, "ERROR: Running CRC failed");
+				goto error;
+			}
+
+			if (state->binaryMode)
+			{
+				length = DecodeLine(lineData, outbuf, length-1);
+				if (length < 0 || length > BYTES_PER_LINE) {
+					result = PrintFileError(state,
+									"ERROR: Corrupt radix-64 data");
+					goto error;
+				}
+				fwrite(outbuf, 1, length, state->out);
+			}
+			else
+			{
+				p = lineData;
+				while (*p != '\0')
+				{
+					if (*p == TAB_CHAR)
+					{
+						p++;
+						putc('\t', state->out);
+						while ((p - lineData) % state->tabWidth)
+						{
+							if (*p == '\n')
+								break;
+							else if (*p == ' ')
+								p++;
+							else
+							{
+								result = PrintFileError(state,
+												"ERROR: Not enough spaces "
+												"after a tab character");
+								goto error;
+							}
+						}
+					}
+					else if (*p == FORMFEED_CHAR)
+					{
+						p++;
+						if (*p != '\n')
+						{
+							result = PrintFileError(state,
+											"ERROR: Formfeed character "
+											"not at end of line");
+							goto error;
+						}
+						p++;	/* Skip newline */
+						putc('\f', state->out);
+					}
+					else if (*p == CONTIN_CHAR)
+					{
+						p++;
+						if (*p != '\n')
+						{
+							result = PrintFileError(state,
+											"ERROR: Continuation character "
+											"not at end of line");
+							goto error;
+						}
+						p++;	/* Skip newline */
+					}
+					else if (*p == SPACE_CHAR)
+					{
+						putc(' ', state->out);
+						p++;
+					}
+					else
+					{
+						putc(*p, state->out);
+						p++;
+					}
+				}
+			}
+		}
+	}
+	if (state->out != NULL)
+	{
+		if (!(state->hdrFlags & HDR_FLAG_LASTPAGE))
+		{
+			result = PrintFileError(state, "ERROR: Missing pages");
+			goto error;
+		}
+		if (state->pageCRC != state->seenPageCRC)
+		{
+			result = PrintFileError(state,
+							"ERROR: Page CRC failed on previous page");
+			goto error;
+		}
+	}
+
+	/* Check for missing files at the end */
+	result = ReadManifest(state, 0, NULL, 0);
+	goto done;
+
+errnoError:
+	result = errno;
+	goto printError;
+
+fileError:
+	/* ferror() only returns a flag, not an errno value for strerror() */
+	result = (errno != 0) ? errno : EIO;
+
+printError:
+	fprintf(stderr, "ERROR: %s\n", strerror(result));
+
+error:
+done:
+	if (state != NULL)
+	{
+		if (state->out != NULL)
+			fclose(state->out);
+		if (state->file != NULL)
+			fclose(state->file);
+		if (state->manifest != NULL)
+			fclose(state->manifest);
+		free(state);
+	}
+	return result;
+}
+
+void UsageAndExit(int result)
+{
+	fprintf(stderr,
+			"Usage: unmunge [-fp] <file> [<manifest>]\n"
+			"  -f  Force overwrites of existing files\n"
+			"  -p  Force unmunge of partial files\n");
+	exit(result);
+}
+
+int main(int argc, char *argv[])
+{
+	int		result = 0;
+	int		forceOverwrite = 0;
+	int		forcePartialFiles = 0;
+	char *	fileName = NULL;
+	char *	manifestFileName = NULL;
+	int		i, j;
+
+	InitUtil();
+
+	for (i = 1; i < argc && argv[i][0] == '-'; i++)
+	{
+		if (0 == strcmp(argv[i], "--"))
+		{
+			i++;
+			break;
+		}
+		for (j = 1; argv[i][j] != '\0'; j++)
+		{
+			if (argv[i][j] == 'h')
+				UsageAndExit(0);
+			else if (argv[i][j] == 'f')
+				forceOverwrite = 1;
+			else if (argv[i][j] == 'p')
+				forcePartialFiles = 1;
+			else
+			{
+				fprintf(stderr, "ERROR: Unrecognized option -%c\n", argv[i][j]);
+				UsageAndExit(1);
+			}
+		}
+	}
+
+	if (i < argc)
+		fileName = argv[i++];
+	if (i < argc)
+		manifestFileName = argv[i++];
+	if (fileName == NULL || i < argc)
+		UsageAndExit(1);
+
+	if ((result = UnMungeFile(fileName, manifestFileName,
+							  forceOverwrite, forcePartialFiles)) != 0)
+	{
+		/* If result > 0, message should have already been printed */
+		if (result < 0)	/* assumed to be a negated errno value */
+			fprintf(stderr, "ERROR: %s\n", strerror(-result));
+		exit(1);
+	}
+
+	return 0;
+}
+
+/*
+ * Local Variables:
+ * tab-width: 4
+ * End:
+ * vi: ts=4 sw=4
+ * vim: si
+ */
+
--- a/tools/util.c
+++ b/tools/util.c
@ -0,0 +1,198 @@
+/*
+ * util.c -- Miscellaneous shared code/data
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Written by Mark H. Weaver
+ *
+ * $Id: util.c,v 1.11 1997/11/07 00:44:10 mhw Exp $
+ */
+
+#include <stdlib.h>
+#include "util.h"
+
+char const hexDigits[] = "0123456789abcdef";
+char const radix64Digits[] =
+#if 0	/* Standard */
+	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
+#else	/* Modified form that avoids hard-to-OCR characters */
+	"ABCDEFGHIJKLMNPQRSTVWXYZabcdehijklmnpqtuwy145689\\^!#$%&*+=/:<>?@";
+#endif
+
+signed char hexDigitsInv[256];
+signed char radix64DigitsInv[256];
+
+/* teun: moved initialisation of all three CRCPolys to InitUtil() */
+
+/* CRC-CCITT: x^16 + x^12 + x^5 + 1 */
+CRCPoly	crcCCITTPoly;
+/*
+ * PRZ's magic 24-bit polynomial - (x+1) * (irreducible of degree 23)
+ * x^24 +x^23 +x^18 +x^17 +x^14 +x^11 +x^10 +x^7 +x^6 +x^5 +x^4 +x^3 +x +1
+ * (Developed by Neal Glover).  Note: this is bit-reversed from the form
+ * used in PGP, 0x1864cfb.
+ */
+CRCPoly	crc24Poly;
+/* CRC-32: x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 */
+CRCPoly	crc32Poly;
+
+EncodeFormat const	hexFormat =
+{
+	NULL,				/* nextFormat */
+	'-',				/* headerTypeChar */
+	hexDigits,			/* digits */
+	hexDigitsInv,		/* digitsInv */
+	4,					/* bitsPerDigit */
+	16,					/* radix */
+	&crcCCITTPoly,		/* lineCRC */
+	&crc32Poly,			/* pageCRC */
+	8,					/* runningCRCBits */
+	24,					/* runningCRCShift */
+	0xFF				/* runningCRCMask */
+};
+
+EncodeFormat const	radix64Format =
+{
+	&hexFormat,			/* nextFormat */
+	'A',				/* headerTypeChar */
+	radix64Digits,		/* digits */
+	radix64DigitsInv,	/* digitsInv */
+	6,					/* bitsPerDigit */
+	64,					/* radix */
+	&crc24Poly,			/* lineCRC */
+	&crc32Poly,			/* pageCRC */
+	12,					/* runningCRCBits */
+	20,					/* runningCRCShift */
+	0xFFF				/* runningCRCMask */
+};
+
+EncodeFormat const *	firstFormat = &radix64Format;
+
+
+static void InitCRCPoly(CRCPoly *poly)
+{
+	int		i, oneBit;
+	CRC		crc = 1;
+
+	poly->table[0] = 0;
+	for (oneBit = 0x80; oneBit > 0; oneBit >>= 1) {
+		crc = (crc >> 1) ^ ((crc & 1) ? poly->poly : 0);
+		for (i = 0; i < 0x100; i += 2 * oneBit)
+			poly->table[i + oneBit] = poly->table[i] ^ crc;
+	}
+}
+
+CRC CalculateCRC(CRCPoly const *poly, CRC crc,
+				 byte const *buffer, size_t length)
+{
+	while (length--)
+		crc = (crc >> 8) ^ poly->table[(crc & 0xFF) ^ (*buffer++)];
+	return crc;
+}
+
+CRC ReverseCRC(CRCPoly const *poly, CRC crc, byte b)
+{
+	int		i;
+	CRC		highBit = poly->highBit;	/* CRC, not int: 0x80000000 may not fit in an int */
+
+	for (i = 0; i < 8; i++) {
+		if (crc & highBit)		/* highBit is 2^(poly->bits-1) */
+			crc = ((crc ^ poly->poly) << 1) ^ 1;
+		else
+			crc <<= 1;
+	}
+	return crc ^ b;
+}
+
+static void InitDigitsInv(char const *digits, signed char *digitsInv)
+{
+	int		i;
+
+	for (i = 0; i < 256; i++)
+		digitsInv[i] = -1;
+	for (i = 0; digits[i]; i++)
+		digitsInv[(byte)digits[i]] = i;
+}
+
+/* Returns the number of chars encoded */
+int EncodeCheckDigits(EncodeFormat const *fmt, word32 num,
+					  int numBits, char *dest)
+{
+	int		destLen = EncodedLength(fmt, numBits);
+	word32	digitMask = fmt->radix - 1;
+	int		i;
+
+	for (i = destLen - 1; i >= 0; i--)
+	{
+		dest[i] = EncodeDigit(fmt, num & digitMask);
+		num >>= fmt->bitsPerDigit;
+	}
+	return destLen;
+}
+
+/* Returns 1 if there's an error */
+int DecodeCheckDigits(EncodeFormat const *fmt, char const *src, char **endPtr,
+					  int numBits, word32 *valuePtr)
+{
+	word32	value = 0;
+	int		digitValue;
+	int		i = EncodedLength(fmt, numBits);
+
+	while (i--)
+	{
+		digitValue = DecodeDigit(fmt, *src++);
+		if (digitValue < 0)
+		{
+			/* Invalid digit found */
+			*valuePtr = 0;
+			if (endPtr)
+				*endPtr = NULL;
+			return 1;
+		}
+		value = (value << fmt->bitsPerDigit) | digitValue;
+	}
+	*valuePtr = value;
+	if (endPtr)
+		*endPtr = (char *)src;
+	return 0;
+}
+
+EncodeFormat const *FindFormat(char headerTypeChar)
+{
+	EncodeFormat const *	fmt = firstFormat;
+
+	while (fmt && fmt->headerTypeChar != headerTypeChar)
+		fmt = fmt->nextFormat;
+	return fmt;
+}
+
+void InitUtil(void)
+{
+	/* teun: removed "{ }" for MS VC compile */
+
+	crcCCITTPoly.bits = 16;
+	crcCCITTPoly.poly = 0x8408;
+	crcCCITTPoly.highBit = 0x8000;
+
+	crc24Poly.bits = 24;
+	crc24Poly.poly = 0xdf3261;
+	crc24Poly.highBit = 0x800000;
+
+	crc32Poly.bits = 32;
+	crc32Poly.poly = 0xedb88320;
+	crc32Poly.highBit = 0x80000000;
+
+	InitCRCPoly(&crcCCITTPoly);
+	InitCRCPoly(&crc24Poly);
+	InitCRCPoly(&crc32Poly);
+	InitDigitsInv(hexDigits, hexDigitsInv);
+	InitDigitsInv(radix64Digits, radix64DigitsInv);
+}
+
+
+/*
+ * Local Variables:
+ * tab-width: 4
+ * End:
+ * vi: ts=4 sw=4
+ * vim: si
+ */
--- a/tools/util.h
+++ b/tools/util.h
@ -0,0 +1,149 @@
+/*
+ * util.h -- Miscellaneous defines
+ *
+ * Copyright (C) 1997 Pretty Good Privacy, Inc.
+ *
+ * Written by Mark H. Weaver
+ *
+ * $Id: util.h,v 1.23 1997/11/12 23:28:56 mhw Exp $
+ */
+
+#ifndef UTIL_H
+#define UTIL_H 1
+
+#include <stddef.h>		/* size_t, used in the CalculateCRC prototype */
+
+typedef unsigned long	word32;
+typedef unsigned short	word16;
+typedef unsigned char	byte;
+
+#define FMT32	"%08lx"
+#define FMT16	"%04x"
+#define FMT8	"%02x"
+
+#define TAB_CHAR		'\244'	/* Currency symbol, like o on top of x */
+#define TAB_STRING		"\244"
+#define TAB_PAD_CHAR	' '		/* The fact that this is space has leaked. */
+#define TAB_PAD_STRING	" "		/* It may not be freely changed. */
+#define FORMFEED_CHAR	'\245'	/* Yen symbol, like = on top of Y */
+#define FORMFEED_STRING	"\245"
+#define SPACE_CHAR		'\267'	/* Middle dot, or bullet */
+#define SPACE_STRING	"\267"
+#define CONTIN_CHAR		'\266'	/* Pilcrow (paragraph symbol) */
+#define CONTIN_STRING	"\266"
+
+#define BYTES_PER_LINE	60		/* When using radix 64 */
+
+#define LINES_PER_PAGE	72		/* Exclusive of 2 header lines */
+#define LINE_LENGTH		80
+#define PREFIX_LENGTH	7		/* Length of prefix, including the space */
+
+#define HDR_PREFIX_CHAR		'-'
+#define RADIX64_END_CHAR	'-'
+
+typedef struct EncodeFormat		EncodeFormat;
+typedef word32					CRC;
+typedef word16					CRCFragment;
+
+typedef struct
+{
+	CRC			table[256];
+	int			bits;
+	CRC			poly;
+	CRC			highBit;
+} CRCPoly;
+
+struct EncodeFormat
+{
+	EncodeFormat const *nextFormat;
+	char				headerTypeChar;
+	char const *		digits;
+	signed char const *	digitsInv;
+	int					bitsPerDigit;
+	int					radix;
+	CRCPoly const *		lineCRC;
+	CRCPoly	const *		pageCRC;
+	int					runningCRCBits;
+	int					runningCRCShift;
+	int					runningCRCMask;
+};
+
+
+#define HDR_ENC_LENGTH		19		/* Length of encoded prefix on header */
+
+#define HDR_VERSION_BITS	4
+#define HDR_FLAG_BITS		8
+/* Page CRC bits omitted, since it's not constant */
+#define HDR_TABWIDTH_BITS	4
+#define HDR_PRODNUM_BITS	12
+#define HDR_FILENUM_BITS	16
+
+
+/* Enough to hold one whole page of munged data */
+/* There is no point in making this excessively large */
+#define PAGE_BUFFER_SIZE	8192
+
+#if PAGE_BUFFER_SIZE < (LINES_PER_PAGE + 2) * (LINE_LENGTH + PREFIX_LENGTH + 2)
+#error PAGE_BUFFER_SIZE is too small
+#endif
+
+
+/* Header flags */
+#define HDR_FLAG_LASTPAGE	0x01	/* Indicates last page of file */
+
+
+#define elemsof(array) (sizeof(array)/sizeof(*(array)))
+
+
+extern char const	hexDigits[];
+extern char const	radix64Digits[];
+
+extern signed char	hexDigitsInv[256];
+extern signed char	radix64DigitsInv[256];
+
+extern CRCPoly		crcCCITTPoly, crc24Poly, crc32Poly;
+
+extern EncodeFormat const		hexFormat, radix64Format;
+extern EncodeFormat const *		firstFormat;
+
+
+#define HexDigitValue(ch)		hexDigitsInv[(byte)(ch)]
+#define Radix64DigitValue(ch)	radix64DigitsInv[(byte)(ch)]
+
+/* Returns the number of chars needed to encode the given number of bits */
+#define EncodedLength(fmt, numBits)	\
+		(((numBits) + (fmt)->bitsPerDigit - 1) / (fmt)->bitsPerDigit)
+#define EncodeDigit(fmt, value)		((fmt)->digits[value])
+#define DecodeDigit(fmt, digit)		((fmt)->digitsInv[(byte)(digit)])
+
+#define AdvanceCRC(poly, crc, b)	\
+		(((crc) >> 8) ^ (poly)->table[((crc) ^ (b)) & 0xFF])
+
+#define RunningCRCFromPageCRC(fmt, pageCRC)	\
+		(((pageCRC) >> (fmt)->runningCRCShift) & (fmt)->runningCRCMask)
+
+
+CRC CalculateCRC(CRCPoly const *poly, CRC crc,
+				 byte const *buffer, size_t length);
+CRC ReverseCRC(CRCPoly const *poly, CRC crc, byte b);
+
+/* Returns the number of chars encoded */
+int EncodeCheckDigits(EncodeFormat const *fmt, word32 num,
+					  int numBits, char *dest);
+
+/* Returns 1 if there's an error */
+int DecodeCheckDigits(EncodeFormat const *fmt, char const *src, char **endPtr,
+					  int numBits, word32 *valuePtr);
+
+EncodeFormat const *FindFormat(char headerTypeChar);
+
+void InitUtil(void);
+
+
+#endif /* !UTIL_H */
+
+/*
+ * Local Variables:
+ * tab-width: 4
+ * End:
+ * vi: ts=4 sw=4
+ * vim: si
+ */
--- a/tools/yapp
+++ b/tools/yapp
@ -0,0 +1,286 @@
+#!/usr/bin/perl
+#
+# Yet another preprocessor
+#
+# $Id: yapp,v 1.5 1997/10/24 07:51:05 mhw Exp $
+#
+
+%vars = ('' => '$');
+@incPath = (".");
+
+sub Error
+{
+	print STDERR $_[0], "\n";
+	exit(1);
+}
+
+sub VarSubst
+{
+	my ($varName, $undefOkay) = @_;
+
+	if (defined($vars{$varName}))
+	{
+		return $vars{$varName};
+	}
+	elsif (!$undefOkay)
+	{
+		&Error("Undefined variable '$varName' in $fileName line $.");
+	}
+}
+
+sub NullFilter
+{
+	0;
+}
+
+sub IfFilter
+{
+	local $_ = $_[0];
+
+	if (/^##else(\s+.*)?/)
+	{
+		return 1;
+	}
+	elsif (/^##endif(\s+.*)?/)
+	{
+		return 2;
+	}
+	else
+	{
+		return 0;
+	}
+}
+
+sub DoFile
+{
+	my $parentName = $fileName;	# including file, for error messages
+	local $fileName = $_[0];
+	my $path;
+	local *FILE;
+
+	if ($fileName =~ m|^/|)
+	{
+		$path = $fileName;
+	}
+	else
+	{
+		for $dir (@incPath)
+		{
+			if (-e "$dir/$fileName")
+			{
+				$path = "$dir/$fileName";
+				last;
+			}
+		}
+	}
+	if (!defined($path))
+	{
+		&Error("Can't find '$fileName'" .
+			   (defined($parentName) ? ", from $parentName line $." : ""));
+	}
+
+	open(FILE, "<$path") || &Error("Can't open $path: $!");
+	&DoOpenFile(*FILE, *NullFilter, 0);
+	close(FILE) || die;
+	0;
+}
+
+sub DoPrepass
+{
+	local ($_, $skipFlag) = @_;
+
+	return "" if /^###/;
+	s/\s*###.*//;								# Strip comments
+	s/\$\{(\w*)\}/&VarSubst($1, $skipFlag)/eg;	# Do variable substitutions
+	$_;
+}
+
+sub DoOpenFile
+{
+	local *FILE = $_[0];
+	local *filter = $_[1];
+	my $skipFlag = $_[2];
+	my $result;
+	local $_;
+
+	while (<FILE>)
+	{
+		$_ = &DoPrepass($_, $skipFlag);
+		if ($result = &filter($_))
+		{
+			return $result;
+		}
+		elsif (/^##(\w*)(\s+(.*))?/)
+		{
+			my ($cmd, $params) = ($1, $3);
+
+			if ($cmd =~ /^if(n?def)?$/)
+			{
+				my $condition;
+				my $ifStartLine = $.;
+
+				if ($cmd eq "if")
+				{
+					if ($params =~ /^(\d+)\s*$/)
+					{
+						$condition = int($1);
+					}
+					elsif ($params =~ /^(\d+)\s*([=!]=|[<>]=?)\s*(\d+)\s*$/)
+					{
+						my ($left, $op, $right) = ($1, $2, $3);
+
+						$condition = eval($left . $op . $right);
+					}
+					elsif ($params =~ /^(\S+)\s*(eq|ne)\s*(\S+)\s*$/)
+					{
+						my ($left, $op, $right) = ($1, $2, $3);
+
+						$left =~ s/([\\'])/\\$1/g;
+						$right =~ s/([\\'])/\\$1/g;
+						$condition = eval("'$left' $op '$right'");
+					}
+					else
+					{
+						&Error("Invalid ##if params: '$params' " .
+							   "in $fileName line $.");
+					}
+				}
+				elsif ($cmd =~ /^ifn?def$/)
+				{
+					if ($params =~ /^(\w+)\s*$/)
+					{
+						$condition = defined($vars{$1});
+						$condition = !$condition if ($cmd eq "ifndef");
+					}
+					else
+					{
+						&Error("Invalid ##$cmd param: '$params' " .
+							   "in $fileName line $.");
+					}
+				}
+
+				# Do main body of if
+				$result = &DoOpenFile(*FILE, *IfFilter,
+									  $skipFlag || !$condition);
+
+				if ($result == 1)	# an '##else' was found
+				{
+					# Handle else
+					$result = &DoOpenFile(*FILE, *IfFilter,
+										  $skipFlag || $condition);
+				}
+
+				if ($result == 1)	# a second '##else' was found
+				{
+					&Error("Two ##else's in a row in $fileName line $.");
+				}
+				elsif ($result == 0)	# EOF was encountered
+				{
+					&Error("Unterminated ##if " .
+						   "in $fileName line $ifStartLine");
+				}
+			}
+			elsif ($cmd eq "include")
+			{
+				if ($skipFlag)
+				{
+				}
+				elsif ($params =~ /^"(.*)"\s*$/)
+				{
+					my $incFile = $1;
+
+					&DoFile($incFile);
+				}
+				else
+				{
+					&Error("Invalid ##include params: '$params'");
+				}
+			}
+			elsif ($cmd eq "set")
+			{
+				if ($params =~ /^(\w+)=<<(")(.*)"\s*$/ or
+					$params =~ /^(\w+)=<<(')(.*)'\s*$/)
+				{
+					my $varName = $1;
+					my $quoteChar = $2;
+					my $endTag = $3 . "\n";
+					my $value;
+
+					while (<FILE>)
+					{
+						if ($_ eq $endTag)
+						{
+							chop $value;
+							last;
+						}
+						else
+						{
+							if ($quoteChar eq '"')
+							{
+								$_ = &DoPrepass($_, $skipFlag);
+							}
+							$value .= $_;
+						}
+					}
+					if (!$skipFlag)
+					{
+						$vars{$varName} = $value;
+					}
+				}
+				elsif ($params =~ /^(\w+)="(.*)"\s*$/ or
+					   $params =~ /^(\w+)=(\S*)\s*$/)
+				{
+					if (!$skipFlag)
+					{
+						$vars{$1} = $2;
+					}
+				}
+				else
+				{
+					&Error("Invalid ##set command: '$params'");
+				}
+			}
+			else
+			{
+				&Error("Unrecognized command: '$_'");
+			}
+		}
+		elsif (!$skipFlag)
+		{
+			print;
+		}
+	}
+	return 0;
+}
+
+$optEnable = 1;
+
+foreach (@ARGV)
+{
+	if ($optEnable and /^-/)
+	{
+		if (/^--$/)
+		{
+			$optEnable = 0;
+		}
+		elsif (/^-D(\w+)=(.*)$/)
+		{
+			$vars{$1} = $2;
+		}
+		elsif (/^-I(.*)$/)
+		{
+			unshift @incPath, $1;
+		}
+		else
+		{
+			&Error("Unrecognized option: '$_'");
+		}
+	}
+	else
+	{
+		&DoFile($_);
+	}
+}
+
+#
+# vi: ai ts=4
+# vim: si
+#
--- a/tools/yapp.doc
+++ b/tools/yapp.doc
@ -0,0 +1,48 @@
+YAPP is a simple macro preprocessor designed to do minor tweaking to
+another program's inputs.
+
+In its input, anything of the form ${foo} is expanded with the variable
+named foo.  It is an error if ${foo} is not defined.
+If you need to escape a dollar sign for some reason, the variable
+with the empty name, ${}, has the value "$".
+
+The result of macro expansion is *not* re-expanded.  A ${foo} inside a
+##set value is expanded once, when the definition is made.
+
+After variable expansion, lines are checked to see if they are control lines.
+Control lines begin with ## in the first column.  All such lines are deleted
+and do not appear in the output.  ### starts a comment, either on a line of
+its own or after other text.  Other commands are:
+
+##set variable=value
+
+value may have one of the following forms:
+token:  Trailing whitespace is stripped.  The token may not contain
+any whitespace.  Use quotes if it's complicated.
+"string":  The string may have embedded quotes, and whitespace after
+	the closing quote.
+<<"DELIM":  This is a here-document, and the value is all of the following
+lines up until, but not including, the newline that precedes a line
+that consists solely of DELIM, for any DELIM string.
+DELIM must be in quotes.  You have two options:
+"DELIM": Expand macros in the body of the here-document.
+'DELIM': Do not expand macros in the here-document.
+
+##include "filename": Insert the named file in place of the current line.
+
+##if num == num
+##if num != num
+##if num < num
+##if num > num
+##if num <= num
+##if num >= num
+##if token eq token
+##if token ne token
+##ifdef symbol
+##ifndef symbol
+##else
+##endif
+You can figure this one out.  Macros in between are expanded as usual
+(so the ##else or ##endif may be in a macro expansion), but the result
+is ignored.  String comparison is allowed only between simple words.
+##ifdef symbol is true if ${symbol} is defined.