troff and Unix Text Processing
This page aims to be a catalog of all of the freely available information about the typesetting program troff and a guide to text processing using traditional Unix tools. The notable exceptions are TeX and LaTeX, which are a world unto themselves.
What is troff?
troff, pronounced tee-roff and short for "typesetter roff", is a typesetting program that takes text mixed with a markup language and outputs a typeset version in one of several file formats, usually PostScript or its binary cousin PDF.
It has its origins in a program called RUNOFF, an archaism meaning "to make copies," written by Jerome H. Saltzer for the CTSS operating system in the 1960s. A number of increasingly sophisticated ports to other platforms were made over the next several years, resulting in roff. Joe F. Ossanna of Bell Labs created the first version known as troff to drive a Graphic Systems CAT phototypesetter. He also wrote nroff ("newer roff"), which is responsible for rendering man pages on Unix and Linux systems to the present day.
Ossanna died before a device-independent version could be written, so Brian Kernighan (co-author of the book The C Programming Language and a longtime member of the Bell Labs Unix group) took over the code, which had by then been rewritten from PDP-11 assembly language into C, and reworked it to be device independent. This version was known as ditroff, for "device independent troff", and is the canonical form of troff duplicated and extended by the several more recent implementations.
troff is an extremely powerful typesetting tool, but also very low level. (Kernighan likened it to an assembly language for typesetting.) As a result, a number of now-standard macro packages were written to make it easier for casual users. Additionally, there is a standard set of preprocessors that translate high-level specifications of things like diagrams and tables into native troff instructions.
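To make the division of labor between preprocessors, macro packages, and the formatter concrete, here is a minimal sketch in Python that assembles a conventional pipeline. It assumes GNU troff is installed as groff and that the preprocessors are available under their traditional names pic, tbl, and eqn. The function name build_pipeline and the file name paper.ms are hypothetical, chosen only for illustration.

```python
import shlex

# Conventional ordering: pic runs first, then tbl, then eqn, then the formatter.
PREPROCESSOR_ORDER = ["pic", "tbl", "eqn"]


def build_pipeline(source, preprocessors=(), macros="ms", device="ps"):
    """Return a shell pipeline string that formats `source` (hypothetical helper)."""
    unknown = set(preprocessors) - set(PREPROCESSOR_ORDER)
    if unknown:
        raise ValueError(f"unsupported preprocessors: {sorted(unknown)}")

    stages = []
    source_consumed = False
    for name in PREPROCESSOR_ORDER:
        if name in preprocessors:
            # Only the first stage reads the file; later stages read stdin.
            if source_consumed:
                stages.append(name)
            else:
                stages.append(f"{name} {shlex.quote(source)}")
                source_consumed = True

    # -m selects the macro package, -T selects the output device.
    formatter = f"groff -m{macros} -T{device}"
    if not source_consumed:
        formatter += " " + shlex.quote(source)
    stages.append(formatter)
    return " | ".join(stages)


print(build_pipeline("paper.ms", ["eqn", "tbl"]))
```

Each preprocessor rewrites only the regions it understands and passes everything else through untouched, which is why the stages can be chained this way. groff can also run the same preprocessors itself through its -p, -t, and -e options.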
Today, troff and its more popular but vastly larger and slower competitor TeX are among the few actively maintained, full-featured, freely available typesetting languages. Books and papers are still written for troff, and man pages are still output to terminal windows by nroff. The five open source implementations listed below sport various extensions of the original version.
Resources
Implementations
- GNU troff (groff) is the most widely used implementation by virtue of its inclusion in most Linux distributions.
- Heirloom troff is an enhanced derivative of the OpenSolaris version. It notably handles PostScript, OpenType, and TrueType fonts, has microtypography features, and has hyphenation support for a large number of languages.
- Neatroff is a new implementation by Ali Gholami Rudi that clones most of the extensions of the other ports and adds support for bidirectional text and advanced font handling.
- mandoc is a dedicated formatter that implements the subset of roff needed for the man and mdoc macro packages.
- unroff is an implementation written in Scheme and extensible in the same language which supports HTML output.
Documentation
General documentation for troff and its variants appears here. Some of this, especially the Bell Labs docs, goes back decades but remains useful because current implementations are largely backward compatible.
- Troff User's Manual - This is the official user manual from Bell Labs, written by Joseph Ossanna and revised by Brian Kernighan in 1992.
- A TROFF Tutorial - An introduction to using raw troff by Brian Kernighan, 1978.
- A Typesetter-independent TROFF - Brian Kernighan's 1982 paper describing ditroff.
- The GNU Troff Manual is available both in PDF and HTML.
- The Groff and Friends HOWTO - An introductory tutorial on groff by Dean Allen Provins, 2001.
- Heirloom Documentation Tools Nroff/Troff User's Manual - An extension to the Ossanna/Kernighan original by Gunnar Ritter.
- Basic Formatting with troff/nroff - From the internet edition of UNIX Unleashed.
Macro Packages
man
The man macro package is, perhaps unsurprisingly, for formatting manual pages ("man pages") on *nix systems.
- man manual page (GNU groff)
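As an illustration of what man source looks like, here is a minimal sketch in Python that emits a skeleton manual page. It uses only the .TH, .SH, and .B macros from the man package. The function name man_page and the example command hello are hypothetical.

```python
def man_page(name, section, summary, description):
    """Return minimal man-macro source for a command (hypothetical helper).

    Caveat: a description line that starts with '.' or "'" would be read
    as a control line by the formatter; this sketch does not escape it.
    """
    lines = [
        f".TH {name.upper()} {section}",  # title heading: page name and section
        ".SH NAME",                       # section heading
        f"{name} \\- {summary}",          # \- typesets a minus sign
        ".SH SYNOPSIS",
        f".B {name}",                     # .B sets its argument in bold
        ".SH DESCRIPTION",
        description,
    ]
    return "\n".join(lines) + "\n"


print(man_page("hello", 1, "print a greeting", "Prints a greeting and exits."), end="")
```

Assuming the output is saved as hello.1, it could be previewed in a terminal with a command such as nroff -man hello.1.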
mdoc
A semantically richer alternative to man that originated in BSD, mdoc is also intended for writing man pages.
- mdoc manual page (GNU groff)
me
The me macro package is designed for formatting technical research papers.
- me manual page (GNU groff)
mm
The mm macro package was designed for writing technical memoranda at Bell Labs.
- mm manual page (GNU groff)
mom
The mom macros are Peter Schaffter's effort to provide a more modern, easier-to-use macro package for authors working with groff.
ms
The ms macros are suitable for reports, letters, books, and technical documentation.
- Typing Documents on the UNIX System: Using the -ms Macros with Troff and Nroff - M.E. Lesk's 1978 paper on the ms macros.
- ms manual page (GNU groff)
www (GNU groff)
The www macro package is a (currently alpha) attempt to provide HTML output for GNU groff.
- www manual page (GNU groff)
Preprocessors
chem
- CHEM - A Program for Typesetting Chemical Diagrams: User Manual by Jon Bentley, Lynn Jelinski, and Brian Kernighan, 1992.
dformat
eqn
- A System for Typesetting Mathematics - Brian Kernighan and Lorinda Cherry's original 1975 paper describing eqn.
- Typesetting Mathematics: User's Guide, 2nd edition - Brian Kernighan and Lorinda Cherry's 1978 user manual for eqn.
- eqn man page, Heirloom version.
grap
- Grap: A Language for Typesetting Graphics, Tutorial and User Manual - Jon Bentley and Brian Kernighan's 1991 paper on grap.
- Ted Faber's implementation of grap.
- grap man page, Heirloom version.
grn
ideal
- IDEAL User's Manual by Christopher J. Van Wyk, 1981.
pic
- PIC: A Graphics Language for Typesetting - Brian Kernighan's user manual for Bell Labs pic.
- Making Pictures with GNU PIC - Eric S. Raymond's user manual for GNU pic.
- pic man page, Heirloom version.
- Examples of pic Macros - W. Richard Stevens demonstrates some of the pic macros from his books.
refer
- refer man page, Heirloom version.
- Some Applications of Inverted Indexes on the UNIX System - M.E. Lesk's paper on refer.
tbl
- Tbl: A Program to Format Tables - M.E. Lesk's 1977 manual for tbl.
- tbl man page, Heirloom version.
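For a sense of the input tbl expects, here is a minimal sketch in Python that turns rows of cells into a table description between .TS and .TE. It assumes the default tab separator, the allbox option, a centered bold header row (cb), and left-aligned data columns (l). The function name to_tbl is hypothetical.

```python
def to_tbl(header, rows):
    """Return tbl input for a boxed table (hypothetical helper).

    Caveat: a data line that starts with '.' would be treated as a troff
    request; this sketch does not guard against that.
    """
    ncols = len(header)
    table = [list(header)] + [list(row) for row in rows]
    for row in table:
        if len(row) != ncols:
            raise ValueError("every row needs the same number of cells")
        if any("\t" in str(cell) or "\n" in str(cell) for cell in row):
            raise ValueError("cells must not contain tabs or newlines")

    lines = [
        ".TS",                            # start of table
        "allbox;",                        # options line, ended by a semicolon
        " ".join(["cb"] * ncols),         # format for the header row
        " ".join(["l"] * ncols) + ".",    # format for remaining rows, ended by a period
    ]
    lines += ["\t".join(str(cell) for cell in row) for row in table]
    lines.append(".TE")                   # end of table
    return "\n".join(lines) + "\n"


print(to_tbl(["Preprocessor", "Handles"],
             [["eqn", "mathematics"], ["pic", "diagrams"], ["tbl", "tables"]]), end="")
```

The result is meant to be embedded in a larger troff document and passed through tbl before the formatter sees it.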
External Tools
- Tools for Printing Indexes - Jon Bentley and Brian Kernighan present a collection of awk scripts for producing indexes, source code included.
- troff2page is a troff-to-HTML converter written in Lua by Dorai Sitaram.