Mounting a Finding Aids Collection

From DLXS Documentation

Revision as of 19:49, 28 June 2007 by Cboulay (Talk | contribs)

These instructions assume that you have already encoded your finding aids in XML using the EAD 2002 DTD. Only you, after looking at your texts and your encoding practices, can do the intellectual work that this encoding requires. While DLPS does not have any quick and easy tools for this stage of data preparation, we do have a recommended model: the Bentley Historical Library. The more similar your data and setup are to the Bentley’s, the easier it will be to integrate your finding aids collection with DLXS. DLXS does not support the process of creating new finding aids files: how you do this is up to you. The Bentley Historical Library, however, provides a discussion of some specific tools and a workflow process you might use to create these files.

This topic includes the following sections:

  • Findaid Class and Behaviors Overview
  • Setting Up the Collection: Practical Issues
  • Preparing Data and Directories
  • Building the Index with XPAT
  • Working with Fabricated Regions in FindaidClass
  • Mounting the Collection Online
  • Linking from Finding Aids Using the ID Resolver
  • Using Findaid Class Graphics Files
  • Findaid Class Processing Instructions


Findaid Class and Behaviors Overview

The Findaid Class consists of EAD 2002-encoded Finding Aids. You can learn more about the EAD 2002 DTD and EAD 2002 in general at the Library of Congress EAD web site. The Findaid Class relies on a single XML Document Type Definition (DTD) file to deliver all collections in the class. This file is essentially identical to the EAD 2002 DTD with one extra wrapping element added. DLXS then uses XPAT to index the XML, and the Findaid Class middleware makes it possible for users to search the resources on the web.

The behavior of the Finding Aids Class is similar in many ways to that of Text Class. Access minimally includes full-text searching across collections or within a particular collection of Finding Aids, viewing Finding Aids in a variety of display formats, and the creation of personal collections (a “bookbag”) of Finding Aids. The general characteristics of the Finding Aids class are the following:

  • Allows search and retrieval of EAD 2002-encoded Finding Aids and portions thereof
  • Allows searching across multiple collections of Finding Aids simultaneously
  • Allows searching of each collection independently
  • Allows bookmarking of individual Finding Aids
  • Requires minimal administrative data
  • Uses a single data model and shared middleware for all collections in the system
  • Permits access restrictions at the collection level

The Finding Aids class provides no functionality for creating and managing electronic texts in SGML.

Findaid Class Behaviors

The Findaid Class is typically used for either campus or public access. Its behaviors include the following:

  • Cross-collection searching in any combination of collections
  • Selection of collections
  • Collection-specific searching
  • Collection-specific browsing of
  • Simple and Boolean searching
  • Searching within a user-selected Finding Aid
  • Ability to review and revise previous searches
  • Viewing of sections of a Finding Aid or the full text in HTML, and display in context of search terms found
  • Ability to select particular Finding Aids for saving in a session-based personal collection, or bookbag, and to download or email these
  • Keeping a record of user search history during a session

Representative Resources

Bentley Historical Library Finding Aids

University of Michigan Special Collections Finding Aids

Setting Up the Collection: Practical Issues

1. Choosing Unique Collection IDs

There are two areas of practice that can affect your online collection beyond the hands-on encoding or conversion of word-processed finding aids. The first is the use of IDs as attributes on elements. This does not refer to the EADID, but to IDs within the finding aids that identify an element so that it can be referenced from somewhere else. Each ID within a document must be unique (and the DTD enforces this). Before you assign IDs to any part of the collection, you should consider the consequences of joining all your finding aids into one collection. In this case, IDs will need to be unique across the entire collection. One way to ensure uniqueness is to prefix ID values with the eadid for a given document. At this time, there is no functionality in DLXS that requires you to have IDs on any elements, but you may want to use them for your own internal purposes.
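
The uniqueness requirement can be checked before the files are joined. The following is a minimal sketch, not part of DLXS: it uses Python's standard ElementTree parser to collect every id attribute from a set of finding aids and report the values that occur more than once. The function name, the file labels, and the two simplified sample documents are all assumptions made for illustration; real EAD files carry a DOCTYPE and may use entities that this parser cannot resolve.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_ids(documents):
    """Return {id value: [labels]} for id values that occur more than once.

    `documents` maps a label (for example a file name) to an XML string.
    """
    seen = defaultdict(list)
    for label, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for element in root.iter():
            id_value = element.get("id")
            if id_value is not None:
                seen[id_value].append(label)
    return {value: labels for value, labels in seen.items() if len(labels) > 1}


# Hypothetical, heavily simplified finding aids that both use id="d1".
SAMPLE = {
    "aid1.xml": '<ead><eadheader><eadid>aid1</eadid></eadheader>'
                '<archdesc><did id="d1"/></archdesc></ead>',
    "aid2.xml": '<ead><eadheader><eadid>aid2</eadid></eadheader>'
                '<archdesc><did id="d1"/><dsc id="c1"/></archdesc></ead>',
}

duplicates = find_duplicate_ids(SAMPLE)
for id_value, labels in duplicates.items():
    print(id_value, "is used in", ", ".join(labels))
```

An empty result means the id values are already unique across the files checked, so no prefixing is needed for that set.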

For example, you might want to put an ID on an element that you then want to link to with a target attribute. If the element contains the full name of an organization that is commonly abbreviated, and you would like to point the abbreviated name back to the full name, the full corpname element would have an ID attribute and the abbreviated corpname element would have a target containing the ID value. From the point of view of developers at UM DLXS, this is a fairly typical problem that we ourselves have encountered.
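
The prefixing approach can be sketched as follows. This is an illustration under stated assumptions, not a DLXS tool: the function name, the separator, the sample eadid value, and the sample markup are all hypothetical, and the sample puts the target on a ref element. The sketch rewrites every id and target attribute so that both sides of a link keep pointing at each other after the eadid prefix is added.

```python
import xml.etree.ElementTree as ET


def prefix_ids(xml_text, separator="-"):
    """Prefix every id and target attribute value with the document's eadid."""
    root = ET.fromstring(xml_text)
    eadid = root.findtext(".//eadid")
    if not eadid or not eadid.strip():
        raise ValueError("no <eadid> value found")
    prefix = eadid.strip() + separator
    for element in root.iter():
        for attribute in ("id", "target"):
            value = element.get(attribute)
            # Skip values that already carry the prefix so a rerun is harmless.
            if value is not None and not value.startswith(prefix):
                element.set(attribute, prefix + value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample: a full name with an id, and an abbreviation linking to it.
SAMPLE = (
    "<ead><eadheader><eadid>umich-bhl-0001</eadid></eadheader>"
    '<archdesc><bioghist><p><corpname id="fullname">University of Michigan'
    '</corpname> (<ref target="fullname">UM</ref>)</p></bioghist></archdesc></ead>'
)

result = prefix_ids(SAMPLE)
print(result)
```

Because XML ID values must begin with a letter or underscore, this only yields valid IDs when the eadid itself starts that way; an eadid that begins with a digit would need a different prefix scheme.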

We recommend that you choose a collection ID that is unique across all collections, regardless of class. Even if you do not want to implement cross-class searching at this point, choosing a unique ID makes this kind of search possible to implement in the future. It also allows you to store your collections on the same server.

Next, request authorization for the collection ID. Will the collection be public, restricted, or item-level restricted?

2. Working with Fabricated Regions

The second area of practice that can affect your online collection is the set of decisions you make about fabricated regions. Briefly, fabricated regions are ways of grouping elements you have marked up. When you are ready to index your data, XPAT looks at a file named extra.srch, which groups queries into discrete regions. For example, you might have lots of different tags related to names: <persname>, <famname>, <corpname>, etc. DLXS groups these queries into the fabricated region <name> for convenient, faster indexing. If you decide not to use all the possible markup options, you should take out the empty regions. Some of these regions, however, are cached in many other places in the user interface, not all of which are documented. General recommendations for thinking about markup and fabricated regions include the following:

  1. Think about what elements you want to mark up. You may want to mark up more elements than you think you need and not display all the searchable options at this point. If you only mark up your texts to a basic level, however, you may have fewer options later.
  2. If you have empty regions, you should remove them from the extra.srch file, and remember to also take them out of the other files that reference them. We are working on documenting these locations.

See Working with Fabricated Regions in FindaidClass, and Working with Fabricated Regions (TextClass) for more information.
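
To decide which regions are likely to be empty, it can help to count how often the underlying tags actually occur in your data. The sketch below is an assumption-laden illustration, not part of DLXS: it neither reads nor edits extra.srch, and the function name and sample document are hypothetical. It only reports which of the name-related tags mentioned above never appear in a document.

```python
import xml.etree.ElementTree as ET

# The name-related tags given as an example in the text above.
NAME_TAGS = ("persname", "famname", "corpname")


def count_tags(xml_text, tags):
    """Count occurrences of each tag in `tags`, including tags with zero hits."""
    counts = {tag: 0 for tag in tags}
    for element in ET.fromstring(xml_text).iter():
        if element.tag in counts:
            counts[element.tag] += 1
    return counts


# Hypothetical, simplified finding aid that never uses <famname>.
SAMPLE = (
    "<ead><archdesc><controlaccess>"
    "<persname>Smith, Jane</persname>"
    "<persname>Doe, John</persname>"
    "<corpname>University of Michigan</corpname>"
    "</controlaccess></archdesc></ead>"
)

counts = count_tags(SAMPLE, NAME_TAGS)
unused = [tag for tag, n in counts.items() if n == 0]
print("counts:", counts)
print("never used:", unused)
```

A tag that never appears across the whole collection is a candidate for an empty region; which entries in extra.srch and the other files depend on it still has to be checked by hand.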

Preparing Data and Directories

The rules that govern FindaidClass and the ways UM DLXS organizes finding aids were decided by the larger finding aids community. We have provided the DTDs and other files you need to use to mount your collection. While you can, theoretically, alter these files, we don’t recommend it. Of the four DLXS classes, it is most difficult to change options in FindaidClass. This section describes how to set up your directories and data to mount your collection in the following topics:

  • Setting Up Directories
  • Preparing Your Data
  • Cleaning Up Your Data
  • Validating the Concatenated File
  • Possible Validation Errors

The data preparation process includes the following steps:

  1. Validating the files individually against the EAD 2002 DTD.
  2. Concatenating the files into one larger XML file.
  3. Validating the concatenated file against the dlxsead2002 DTD.
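
The concatenation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DLXS procedure: the function name is hypothetical, the wrapper element name is passed in as a parameter because the source text does not name it (the "COLL" used below is an assumption to verify against the dlxsead2002 DTD), and the regular expressions handle only simple XML declarations and DOCTYPE declarations. The standard-library parser used here checks well-formedness only; validating against a DTD requires a separate validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Simplified patterns: an XML declaration at the start of a file, and a
# DOCTYPE with an optional internal subset that contains no nested "]".
XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>")
DOCTYPE = re.compile(r"<!DOCTYPE[^>\[]*(\[.*?\])?\s*>", re.DOTALL)


def concatenate(documents, wrapper_tag):
    """Join XML strings into one document under a single wrapper element."""
    bodies = []
    for xml_text in documents:
        ET.fromstring(xml_text)  # raises ParseError if not well-formed
        body = XML_DECL.sub("", xml_text, count=1)
        body = DOCTYPE.sub("", body, count=1)
        bodies.append(body.strip())
    return "<{0}>\n{1}\n</{0}>\n".format(wrapper_tag, "\n".join(bodies))


# Hypothetical, simplified finding aids.
SAMPLE = [
    '<?xml version="1.0"?>\n<!DOCTYPE ead SYSTEM "ead.dtd">\n'
    "<ead><eadheader><eadid>aid1</eadid></eadheader></ead>",
    '<?xml version="1.0"?>\n<!DOCTYPE ead SYSTEM "ead.dtd">\n'
    "<ead><eadheader><eadid>aid2</eadid></eadheader></ead>",
]

combined = concatenate(SAMPLE, "COLL")  # wrapper name is an assumption
print(combined)
```

Stripping the per-file XML and DOCTYPE declarations matters because a single XML document may contain only one of each, at the top; the declaration for the concatenated file would then be added once, pointing at the dlxsead2002 DTD.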