Mounting a Finding Aids Collection
From DLXS Documentation
Revision as of 10:32, 14 August 2007
This topic describes how to mount a Findaid Class collection.
Workshop materials are located here.
Overview
Examples
Overview of Data Preparation and Indexing Steps
Data Preparation
- validating the files individually against the EAD 2002 DTD:
make validateeach
- concatenating the files into one larger XML file:
make prepdocs
- validating the concatenated file against the dlxsead2002 DTD:
make validate
- "normalizing" the concatenated file:
make norm
- validating the normalized concatenated file against the dlxsead2002 DTD:
make validate
The end result of these steps is a file containing the concatenated EADs wrapped in a <COLL> element, which validates against the dlxsead2002 DTD and is ready for indexing:
<COLL>
<ead><eadheader><eadid>1</eadid>...</eadheader>... content</ead>
<ead><eadheader><eadid>2</eadid>...</eadheader>... content</ead>
<ead><eadheader><eadid>3</eadid>...</eadheader>... content</ead>
</COLL>
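To make the shape of this output concrete, here is a minimal Python sketch of the concatenation idea: remove each file's XML declaration and DOCTYPE declaration, then wrap what remains in <COLL>. It is an illustration only, not the preparedocs.pl script that make prepdocs actually runs. The function names strip_prolog and concatenate_eads are hypothetical, and the regular expressions are a simplification that assumes no internal DTD subset contains a "]" inside a quoted value.

```python
import re

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECL = re.compile(r"<\?xml\b.*?\?>", re.DOTALL)
# Matches a DOCTYPE declaration, with or without an internal subset in [...]
# (simplified: assumes no "[" or ">" inside the quoted identifiers)
DOCTYPE = re.compile(r"<!DOCTYPE\b[^\[>]*(\[.*?\]\s*)?>", re.DOTALL)


def strip_prolog(ead_text):
    """Remove the XML declaration and DOCTYPE declaration from one EAD."""
    without_decl = XML_DECL.sub("", ead_text, count=1)
    without_doctype = DOCTYPE.sub("", without_decl, count=1)
    return without_doctype.strip()


def concatenate_eads(ead_texts):
    """Wrap the prolog-free EADs in a single <COLL> element."""
    body = "\n".join(strip_prolog(text) for text in ead_texts)
    return "<COLL>\n" + body + "\n</COLL>\n"
```

If either pattern fails to match a file's prolog completely, the leftover characters stay in front of that file's <ead> element in the concatenated output.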
WARNING! If there are extra characters, or some other problem with the part of the program that strips out the XML declaration and the DOCTYPE declaration, the file will end up like this:
<COLL>
baddata<ead><eadheader><eadid>1</eadid>...</eadheader>... content</ead>
baddata<ead><eadheader><eadid>2</eadid>...</eadheader>... content</ead>
baddata<ead><eadheader><eadid>3</eadid>...</eadheader>... content</ead>
</COLL>
In this case you will get "character data not allowed" or similar errors during the make validate step. You can troubleshoot by looking at the concatenated file and/or checking your original EADs.
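One way to locate such stray characters is to look for non-whitespace text sitting directly inside <COLL>, outside any <ead> element. The following Python sketch does this with the standard library's xml.etree.ElementTree. The function name find_stray_text is hypothetical, and the sketch assumes the concatenated file is still well-formed XML, has no undeclared entity references, and is small enough to read into memory.

```python
import xml.etree.ElementTree as ET


def find_stray_text(coll_xml):
    """Return (position, text) pairs for character data directly inside <COLL>.

    position is the number of child elements that precede the stray text,
    so 0 means the text appears before the first <ead>.
    """
    root = ET.fromstring(coll_xml)
    problems = []
    # Text between the <COLL> start tag and the first child element.
    if root.text and root.text.strip():
        problems.append((0, root.text.strip()))
    # Text that follows each child element (its "tail").
    for index, child in enumerate(root, start=1):
        if child.tail and child.tail.strip():
            problems.append((index, child.tail.strip()))
    return problems
```

An empty list means no character data was found between the <ead> elements. Each reported position points to the EAD that follows the stray text, which is the original file to check first.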
Indexing
- make singledd indexes words for texts that have been concatenated into one large file for a collection. This is the recommended process.
- make xml indexes the XML structure by reading the DTD. Validates as it indexes.
- make post builds and indexes fabricated regions based on the XPAT queries stored in the workshopfa.extra.srch file.
Working with the EAD
EAD 2002 DTD Overview
These instructions assume that you have already encoded your finding aids files in the XML-based EAD 2002 DTD. If you have finding aids encoded using the older EAD 1.0 standard, or are using the SGML version of EAD 2002, you will need to convert your files to the XML version of EAD 2002. When converting from SGML to XML, a number of character set issues may arise. These are largely the same issues that were described for Text Class; see Data Conversion: Unicode, XML, and Normalization (../conversion/index.html).
Resources for converting from EAD 1.0 to EAD2002 and/or from SGML EAD to XML EAD are available from:
- The Society of American Archivists EAD Tools page: http://www.archivists.org/saagroups/ead/tools.html
- Library of Congress EAD conversion tools: http://lcweb2.loc.gov/music/eadmusic/eadconv12/ead2002_r.html
Other good sources of information about EAD encoding practices and practical issues involved with EADs are:
- Library of Congress EAD page http://www.loc.gov/ead/ (this is the home of the EAD standard)
- EAD2002 tag library http://www.loc.gov/ead/tglib/index.html
- The Society of American Archivists EAD Help page: http://www.archivists.org/saagroups/ead/
- Various EAD Best Practice Guidelines listed on the Society of American Archivists EAD essentials page: http://www.archivists.org/saagroups/ead/essentials.html (the links to BPGs are at the bottom of the page)
- The EAD listserv http://listserv.loc.gov/listarch/ead.html
The EAD standard was designed as a "loose" standard in order to accommodate the large variety in local practices for paper finding aids and make it easy for archives to convert from paper to electronic form. As a result, conformance with the EAD standard still allows a great deal of variety in encoding practices.
The DLXS software is primarily designed as a system for mounting University of Michigan collections. In the case of finding aids, the software has been designed to accommodate the encoding practices of the Bentley Historical Library. The more similar your data and setup are to the Bentley’s, the easier it will be to integrate your finding aids collection with DLXS. If your practices differ significantly from the Bentley’s, you will probably need to preprocess your files and/or modify various files in DLXS. We have found that most of the issues in implementing Findaid Class for member institutions stem from differences in encoding practices. We will cover various issues that commonly arise.
More information on the Bentley's encoding practices and workflow:
- Overview of Bentley's workflow process for Finding Aids http://bentley.umich.edu/EAD/eadproj.htm
- Description of Bentley Finding Aids and their presentation on the web http://bentley.umich.edu/EAD/findaids.htm
- Bentley MS Word EAD templates and macros http://bentley.umich.edu/EAD/bhlfiles.htm
- Description of EAD tags used in Bentley EADs http://bentley.umich.edu/EAD/bhltags.htm
Types of changes to accommodate differing encoding practices and/or interface changes
- Custom preprocessing
- Add dummy EAD to data
- Modify prep scripts (Makefile, preparedocs.pl, validateeach.csh)
- Modify *inp files (DOCTYPE declarations and entities)
- Modify fabricated regions (*.extra.srch)
- Modify CollMgr entries
- Modify findaidclass.cfg (change table of contents sections)
- Subclass FindaidClass.pm
- Modify XSL
- Modify XML templates
- Modify CSS