Working with the EAD
From DLXS Documentation
EAD 2002 DTD Overview
These instructions assume that you have already encoded your finding aids files in the XML-based EAD 2002 DTD. If you have finding aids encoded using the older EAD 1.0 standard, or are using the SGML version of EAD2002, you will need to convert your files to the XML version of EAD2002. When converting from SGML to XML, a number of character set issues may arise. See Data Conversion and Preparation: Unicode, XML, and Normalization.
Resources for converting from EAD 1.0 to EAD2002 and/or from SGML EAD to XML EAD are available from:
- The Society of American Archivists EAD Tools page: http://www.archivists.org/saagroups/ead/tools.html
- Library of Congress EAD conversion tools: http://lcweb2.loc.gov/music/eadmusic/eadconv12/ead2002_r.htm
Other good sources of information about EAD encoding practices and practical issues involved with EADs are:
- Library of Congress EAD page http://www.loc.gov/ead/ (This is the home of the EAD standard.)
- EAD2002 tag library http://www.loc.gov/ead/tglib/index.html
- The Society of American Archivists EAD Help page: http://www.archivists.org/saagroups/ead/
- Various EAD Best Practice Guidelines listed on the Society of American Archivists EAD essentials page: http://www.archivists.org/saagroups/ead/essentials.html (the links to BPGs are at the bottom of the page)
- The EAD listserv http://listserv.loc.gov/listarch/ead.html
Practical EAD Encoding Issues
The EAD standard was designed as a loose standard in order to accommodate the large variety in local practices for paper finding aids and make it easy for archives to convert from paper to electronic form. As a result, conformance with the EAD standard still allows a great deal of variety in encoding practices.
The DLXS software is primarily designed as a system for mounting University of Michigan collections. In the case of finding aids, the software has been designed to accommodate the encoding practices of the Bentley Historical Library. The more similar your data and setup are to the Bentley's, the easier it will be to integrate your finding aids collection with DLXS. If your practices differ significantly from the Bentley's, you will probably need to do some preprocessing of your files and/or make changes to DLXS.
More information on the Bentley's encoding practices and workflow:
- Overview of Bentley's workflow process for Finding Aids http://bentley.umich.edu/EAD/eadproject.php
- Description of Bentley Finding Aids and their presentation on the web http://bentley.umich.edu/EAD/system.php
- Bentley MS Word EAD templates and macros http://bentley.umich.edu/EAD/bhlfiles.php
- Description of EAD tags used in Bentley EADs http://bentley.umich.edu/EAD/bhltags.php
Types of changes to accommodate differing encoding practices and/or interface changes
- Custom preprocessing
- Add dummy EAD to data
- Modify prep scripts (Makefile, preparedocs.pl, validateeach.csh)
- Modify *inp files (DOCTYPE declarations and entities)
- Modify fabricated regions (*.extra.srch)
- Modify CollMgr entries
- Modify findaidclass.cfg (change table of contents sections)
- Subclass FindaidClass.pm
- Modify XSL
- Modify XML templates
- Modify CSS
Specific Encoding Issues
There are a number of encoding issues that may affect the data preparation, indexing, searching, and rendering of your finding aids. Some of them are:
- Preprocessing and Data Prep issues
- <eadid> should be less than about 20 characters in length
- Attribute ids must be unique within the entire collection
- If you use attribute ids and corresponding targets within your EADs, preparedocs.pl may need to be modified.
- Character Encoding issues
- UTF-8 Byte Order Marks (BOM) should be removed from EADs prior to concatenation
- XML processing instructions should be removed from EADs prior to concatenation
- Multiline DOCTYPE declarations are currently not properly handled by the data prep scripts. (A patch is available to fix this problem: http://www.dlxs.org/products/archive-by-CDROM/13/Patches/24August2007/ )
- If your DOCTYPE declaration contains entities, you need to modify the appropriate *inp files accordingly. (An example, samplefa.text.entity.example.inp, is included in the patch.)
- Out-of-the-box <dao> handling may need to be modified for your needs
- Fabricated region issues (some of these involve XSL as well)
- If your <unittitle> element precedes your <origination> element in the top level <did>, you will have to modify the maintitle fabricated region query in *.extra.srch. See Troubleshooting: Title of Finding Aid does not show up.
- If you do not use a <frontmatter> element, you will have to either a) create and populate frontmatter elements in your EADs manually, b) run your EADs through some preprocessing XSL to create and populate frontmatter elements, or c) create a fabricated region to provide an appropriate "Title Page" region based on the <eadheader>, in which case you may also need to change the XSL and/or subclass FindaidClass to change the code that handles the Title Page region.
- Table of Contents and Focus Region issues
- If you do not use a <frontmatter> element, you may have to make the changes mentioned above to get the title page to show in the table of contents and to display when the user clicks on the "Title Page" link in the table of contents.
- If your encoding practices for <bioghist> differ from the Bentley's, you may need to make changes in findaidclass.cfg or create a subclass of FindaidClass and override FindaidClass::GetBioghistTocHead, and/or change the appropriate XSL files.
- If you want <relatedmaterial> and/or <separatedmaterial> to show up in the table of contents (TOC) on the left hand side of the Finding Aids, you may have to modify findaidclass.cfg and make other modifications to the code. This also applies if there are other sections of the finding aid not listed in the out-of-the-box findaidclass.cfg %gSectHeadsHash.
- See also Customizing Findaid Class: Working with the table of contents
- XSL issues
- If you have encoded <unitdate>s as siblings of <unittitle>s, you may have to modify the appropriate XSL templates.
- If you want the middleware to use the <head> element for labeling sections instead of the default hard-coded values in findaidclass.cfg, you may need to change fabricated regions and/or make changes to the XSL and/or possibly modify findaidclass.cfg or subclass FindaidClass.
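Several of the items under "Preprocessing and Data Prep issues" above (removing the BOM and XML processing instructions, keeping the <eadid> short, and keeping attribute ids unique across the collection) can be checked before the files are handed to the DLXS prep scripts. The following is a minimal sketch in Python, not part of DLXS; the function names, the regular expressions, and the exact 20-character limit are assumptions made for illustration, and the regex-based id scan is only an approximation of real XML parsing.

```python
import re
from collections import defaultdict

BOM = "\ufeff"  # how a UTF-8 byte order mark appears after decoding as UTF-8
# XML declaration and other processing instructions, e.g. <?xml version="1.0"?>
PI_RE = re.compile(r"<\?.*?\?>\s*", re.DOTALL)
EADID_RE = re.compile(r"<eadid\b[^>]*>(.*?)</eadid>", re.DOTALL)
# Approximate match for id="..." attributes (hypothetical helper pattern)
ID_ATTR_RE = re.compile(r"""\sid\s*=\s*(["'])(.*?)\1""")

# The text says "less than about 20 characters"; treat this as adjustable.
MAX_EADID_LEN = 20


def clean_ead(text):
    """Strip a leading BOM and all XML processing instructions."""
    if text.startswith(BOM):
        text = text[len(BOM):]
    return PI_RE.sub("", text)


def check_collection(docs):
    """Take a dict of file name -> EAD text and return a list of warnings."""
    warnings = []
    seen = defaultdict(list)  # id value -> files in which it occurs
    for name, text in docs.items():
        match = EADID_RE.search(text)
        if match is None:
            warnings.append(f"{name}: no <eadid> found")
        else:
            eadid = match.group(1).strip()
            if len(eadid) >= MAX_EADID_LEN:
                warnings.append(
                    f"{name}: <eadid> is {len(eadid)} characters long"
                )
        for _quote, value in ID_ATTR_RE.findall(text):
            seen[value].append(name)
    for value, names in sorted(seen.items()):
        if len(names) > 1:
            warnings.append(
                f"id '{value}' appears {len(names)} times: {', '.join(names)}"
            )
    return warnings
```

A typical use would be to read each EAD file as UTF-8, pass the text through clean_ead, collect the results in a dictionary keyed by file name, and review the output of check_collection before concatenating the files. DOCTYPE declarations are deliberately left untouched here, since they are handled by the prep scripts and *inp files described above.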