The Alexander Architectural Archives are undergoing some renovations—and not just in the East Wing: the Archives’ EAD-encoded finding aids are being converted to schema-compliant documents. As one of the University of Texas’s smaller campus repositories, we at the Architectural Archives are serving as a test case for a larger university-wide (and broader TARO) effort to adopt the XML Schema standard. During this transitional period, my job has been to fix any errors that result from the batch-conversion process; however, as part of the larger scope of this project, my job also entails modifying certain elements in our finding aids to better reflect current descriptive practices with respect to use policy, sponsorship, and materials stored within Texas ScholarWorks (formerly the University of Texas Digital Repository).
If you’re still wondering what “Schema compliance” means: XML by definition has no set tag vocabulary or structure, so XML schemas (note the little “s”) define the grammar of XML documents, and these schemas fall into different families. The two families that we’re primarily concerned with are the DTD and XML Schema (big “S”). Encoded Archival Description (EAD) is the standard data vocabulary (with a tag library maintained by the Library of Congress) for describing archival records, and the schema (whether a DTD or a Schema document) controls the ordering and structure of your XML instance. Although XSD (XML Schema Definition) has been a W3C recommendation since 2001, many of our local UT archives have used DTDs to define the structure and semantics of their XML-encoded finding aids. The reluctance to use Schema stems from the fact that the technology was not always fully supported by XML parsers and that the EAD schema for use in finding aids was not released until 2012.
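To make the two families concrete, here is a minimal sketch in Python of how the same finding aid opens under each one. The sample documents, the `eadid` value, and the helper name `grammar_family` are invented for illustration; the sketch assumes the EAD 2002 schema namespace (`urn:isbn:1-931666-22-9`) and only inspects the root element, so it is a rough check rather than real validation.

```python
import xml.etree.ElementTree as ET

# Namespace assumed here for the EAD 2002 XML Schema.
EAD_NS = "urn:isbn:1-931666-22-9"

# DTD-style finding aid: the grammar is named in a DOCTYPE declaration
# and the elements live in no namespace. (Hypothetical sample.)
DTD_STYLE = """<!DOCTYPE ead SYSTEM "ead.dtd">
<ead><eadheader><eadid>example-001</eadid></eadheader></ead>"""

# Schema-style finding aid: no DOCTYPE; the root declares a namespace
# and points at the XSD through xsi:schemaLocation. (Hypothetical sample.)
SCHEMA_STYLE = """<ead xmlns="urn:isbn:1-931666-22-9"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">
  <eadheader><eadid>example-001</eadid></eadheader>
</ead>"""


def grammar_family(xml_text):
    """Guess the grammar family from the root element's namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag == "{%s}ead" % EAD_NS:
        return "XML Schema"
    return "DTD or no namespace"


print(grammar_family(DTD_STYLE))
print(grammar_family(SCHEMA_STYLE))
```

The content of the two documents is identical; what changes in conversion is how the document announces the grammar it claims to follow.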
Yet the question remains: why convert now?
One motivation for switching to XSD-compliant XML is that XML Schema is namespace aware and supports datatypes, so it imposes more restrictions and offers more detailed enforcement on values (date, language, and repository encodings, for example). Being namespace aware also means that an XML file can refer to specific structured vocabularies when linking elements; this is what enables large-scale interoperability and data synchronization. The XLink capabilities of XSD-compliant documents also allow for more complex links between digital objects. Additionally, the newest EAD tag library (EAD3), which we haven’t adopted yet, moves toward deprecating ambiguous elements. These EAD emendations form part of a larger trend toward standardizing archival description: by reducing EAD’s flexibility, we discourage idiosyncratic institutional practices and so reduce inconsistencies among repositories. Finally, looking forward, the EAD tag library revisions and the widespread adoption of XSD as our standard document grammar are part of the movement toward linked data and the semantic web.
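As a rough illustration of value enforcement and namespaced links, the sketch below reads a small schema-style fragment with Python's standard library. The sample fragment, the URL, and the function names are hypothetical, and the regular expression is a deliberately simplified stand-in for the kind of date pattern a schema can enforce; it is not the pattern from the EAD schema itself, and a real workflow would use a schema-aware validator instead.

```python
import re
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used for lookups (EAD 2002 schema and XLink).
NS = {
    "ead": "urn:isbn:1-931666-22-9",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical fragment: one well-formed normalized date, one that is not.
SAMPLE = """<ead xmlns="urn:isbn:1-931666-22-9"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <archdesc level="collection">
    <did>
      <unitdate normal="1923/1957">1923-1957</unitdate>
      <unitdate normal="circa 1930">circa 1930</unitdate>
      <dao xlink:href="https://example.org/object/1"/>
    </did>
  </archdesc>
</ead>"""

# Simplified ISO 8601 check: YYYY, YYYY-MM, or YYYY-MM-DD, optionally
# followed by "/" and a second date to express a range.
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?(/\d{4}(-\d{2}(-\d{2})?)?)?$")


def invalid_dates(xml_text):
    """Return the normal attributes that fail the simplified date pattern."""
    root = ET.fromstring(xml_text)
    return [
        d.get("normal")
        for d in root.iterfind(".//ead:unitdate", NS)
        if not ISO_DATE.match(d.get("normal", ""))
    ]


def linked_objects(xml_text):
    """Return the xlink:href targets of every dao element."""
    root = ET.fromstring(xml_text)
    href = "{%s}href" % NS["xlink"]  # namespaced attribute name
    return [d.get(href) for d in root.iterfind(".//ead:dao", NS)]


print(invalid_dates(SAMPLE))
print(linked_objects(SAMPLE))
```

The point of the sketch is that a value such as "circa 1930" in a normalized date attribute is something a schema can reject outright, whereas under a DTD the attribute is just text.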