Best Practices for TEI in Libraries

A guide for mass digitization, automated workflows, and promotion of interoperability with XML using the TEI

Durable URLs:

Best Practices for TEI in Libraries (version 4.0.0, published September 2018) is the fourth version of a document formerly known as TEI Text Encoding in Libraries: Guidelines for Best Encoding Practices, which has been updated to comply with the Text Encoding Initiative’s Guidelines for Text Encoding and Interchange (P5). These guidelines were originally created for large, library-based digitization projects, but they are also useful as a general approach to digitization and encoding. This version of the Best Practices for TEI in Libraries was created by a workgroup of TEI community members.

Library text digitization projects vary widely and serve a variety of purposes. With this in mind, these Best Practices are meant to be as inclusive as possible by specifying five encoding levels. These levels allow for a range of practice, from wholly automated text creation and encoding to encoding that requires expert content knowledge, analysis, and editing. The encoding levels are not strictly cumulative. Higher levels tend to build upon lower levels by including more elements, but they are not supersets: some elements used at lower levels are not used at higher levels, often because more specific elements replace generic ones.

The Best Practices are maintained in a set of ODD files and are constructed as a TEI customization. The ODD files allow you to produce prose documentation and schemas in various formats. These ODDs are stored in a GitHub repository.

The current and past versions of this document are listed below:

Version   Documentation   XML Source, ODD files, and/or Schemas
4.0.0     documentation
3.1.0a    documentation   local copy of generated schemas and documentation
3.0       documentation   tagged release of ODD files in GitHub
2.1       documentation   TEI version
2.0       documentation   n/a
1.0       documentation   n/a

Some users are particularly interested in the recommendations for the TEI header found in the Best Practices. Users who want to incorporate only the header recommendations from the Best Practices into another TEI customization should copy the relevant portions of bptl-header.odd: the explanatory table is contained in <div type="elrecs" xml:id="headertable">, and the element specification is contained in <specGrp xml:id="libHeadSpec" xml:lang="en">.
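
As a rough illustration of locating those two portions, the sketch below uses Python's standard xml.etree.ElementTree module to find elements by their xml:id values. It is not part of the Best Practices. The SAMPLE_ODD string is a hypothetical, heavily reduced stand-in for bptl-header.odd (it is not valid TEI), and the helper name extract_by_xml_id is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# xml:id lives in the reserved XML namespace; TEI elements live in the TEI namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, heavily reduced stand-in for bptl-header.odd.
# The real file is far larger and its contents differ.
SAMPLE_ODD = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="elrecs" xml:id="headertable"><p>Explanatory table goes here.</p></div>
    <specGrp xml:id="libHeadSpec" xml:lang="en">
      <elementSpec ident="teiHeader" mode="change"/>
    </specGrp>
    <div xml:id="unrelated"><p>Not part of the header recommendations.</p></div>
  </body></text>
</TEI>"""


def extract_by_xml_id(root, wanted_ids):
    """Return {xml:id: element} for each wanted id found in the tree."""
    found = {}
    for element in root.iter():
        xml_id = element.get(XML_ID)
        if xml_id in wanted_ids:
            found[xml_id] = element
    return found


# Serialize TEI elements without a namespace prefix when writing them out.
ET.register_namespace("", TEI_NS)

root = ET.fromstring(SAMPLE_ODD)
parts = extract_by_xml_id(root, {"headertable", "libHeadSpec"})

for xml_id, element in parts.items():
    # ET.tostring(element, encoding="unicode") would give the fragment to copy.
    print(xml_id, "->", element.tag)
```

To try this against the real file, replace ET.fromstring(SAMPLE_ODD) with ET.parse("bptl-header.odd").getroot(), assuming a local copy under that name. Note that ElementTree's default parser discards XML comments, so copying the fragments by hand in an XML editor may be preferable when the comments matter.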

Users might also be interested in Thutmose II, an XSLT stylesheet for translating MARCXML records into TEI headers according to version 3.0 of the Best Practices.

Please use the GitHub issue tracker to submit bug reports and feature requests.