Mounting Finding Aids: Release 14/Workshop working copy

From DLXS Documentation

(Difference between revisions)
==Mounting the Collection Online==
These are the final steps in deploying a Findaid Class collection online. Here the '''Collection Manager''' will be used to create and edit a '''Collection Database''' entry for '''workshopfa'''. The '''Collection Manager''' will also be used to check the '''Group Database'''. Finally, we need to work with the collection map and set up the collection's web directory.

=== Create and edit an entry in the Collection Database for your collection with CollMgr ===

Each collection has a record in the collection database that holds collection-specific configuration for the middleware. CollMgr (Collection Manager) is a web-based interface to the collection database that provides functionality for editing each collection's record. Collections can be checked out for editing, checked in for testing, and released to production. In general, a new collection needs to have a CollMgr record created from scratch before the middleware can be used.

Step 1.  Create a workshopfa CollMgr entry by copying from samplefa.

'''A.  Log in to CollMgr.'''  The URL should be:
<nowiki>http://path_to_cgi/cgi/c/collmgr </nowiki>

The CollMgr page is usually set up to use Apache basic authentication.  The username and password should have been set up when you set up your virtual host in Apache (see [[sample apache virtual host]]).

'''B. Select Manage Collections:Findaid Class:'''

[[Image:collmgr1.png|alt text]]
[[Image:collmgr2.png|alt text]]

'''C.  Select samplefa and click on "copy a collection"''' (Note: In the image below workshopfa already exists, but in your clean install it will not exist.)

[[Image:collmgr3.png|alt text]]

'''D. Enter your collection id''' (workshopfa)
[[Image:collmgr4.png|alt text]]

'''E. Change all occurrences of "samplefa" to "workshopfa".'''  For example, in the section below the webdir should be changed from "s/samplefa" to "w/workshopfa". (You also need to copy and rename the appropriate files from $DLXSROOT/web/s/samplefa to $DLXSROOT/web/w/workshopfa.)

'''WARNING! If you forget to change one of the entries it can lead to very confusing results.''' For example, if you forget to change the "dd" file entry from "/idx/s/samplefa/samplefa.dd" to "/idx/w/workshopfa/workshopfa.dd", the middleware will search the samplefa collection while the rest of the configuration points to workshopfa, which results in erratic behavior and potentially confusing error messages.

'''F. Change the entry for the subclassmodule from "/FindaidClass/SamplefaFC" to "FindaidClass".'''  This means that this collection will use the default FindaidClass.pm instead of the SamplefaFC subclass.
(If you want to subclass Findaid Class instead, replace "SamplefaFC" with the name of your collection-specific subclass.)

[[Image:collmgr6.png|alt text]]

'''G. Set the containerdepth field to the depth of containers in your collection'''

[[Image:collmgr5.png|alt text]]

For example, if you have levels c01 to c05, set the containerdepth to 5.  You can use the xpat command {ddinfo regionnames} to list the regions in your data and find the highest c level, which is the number to put here.

 xpatu $DLXSROOT/idx/s/samplefa/samplefa.dd
 >> {ddinfo regionnames}

If containerdepth is set to a number higher than the deepest level in your data, xpat will search for the missing c0x elements and produce errors.  This can occur whenever xpat queries the "c0xhead" fabricated region.  For example, we set the containerdepth to 7 for the samplefa collection (which only has c01-c06) and then got the following error message when we tried to view a KWIC (search terms in context) view for the Post Family Papers in our web browser:

Message: Query error in samplefa, samplefa.dd, query=pr.region.c0xhead
(region "c0xhead" ^ ( region "c07" incl *detailslicesearch ));,
Error=No information for region c07 in the data dictionary. syntax error before: ))
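
If the region list is long, the highest c0x level can be picked out mechanically. The following is a minimal sketch, not part of DLXS: the function name max_container_depth is hypothetical, and it assumes you have saved the output of {ddinfo regionnames} to a file in which region names such as c01 or c06 appear as separate words.

```shell
# Hypothetical helper (not shipped with DLXS): print the deepest c0x level
# named in a saved region list, e.g. the output of {ddinfo regionnames}.
# Assumption: region names such as c01 ... c12 appear as whole words.
max_container_depth() {
    # -o prints only the matches, -w requires whole words, so "c0xhead" is skipped
    grep -owE 'c[0-9]{2}' "$1" \
        | tr -d 'c' \
        | sort -n \
        | tail -n 1 \
        | sed 's/^0//'
}
```

For example, running max_container_depth regions.out on a list that contains c01 through c05 prints 5, which is the value to enter in the containerdepth field.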

You will also probably want to edit:

*fields related to the dynamic browse page (See [[#Create_a_browse_page |Create a browse page]])
*fields related to searching and sorting in the user interface:  regionsearch, termsearch, sortfields (Note that these need to match the entries in your [[#Make_Collection_Map |map file]])

''More Documentation''

* [[Collection Manager Field Descriptions]]

=== Review the Groups Database Entry with CollMgr ===

Another function of CollMgr allows the grouping of collections for cross-collection searching. Any number of collection groups may be created for Findaid Class. Findaid Class supports a group with the groupid "all". It is not a requirement that all collections be in this group, though that's the basic idea. Groups are created and modified using CollMgr.

=== Make Collection Map ===

Collection mapper files exist to identify the regions and operators used by the middleware when interacting with the search forms. Each collection will need one, but most collections can use a fairly standard map file, such as the one in the '''samplefa''' collection. The map files for all Findaid Class collections are stored in $DLXSROOT/misc/f/findaid/maps.

You can find an example map file for the sample finding aids collection at $DLXSROOT/misc/f/findaid/maps/samplefa.map. Rather than modifying this file, you should copy it so that you always have an unmodified copy to refer to.

You can use the following commands to copy the samplefa.map file to use as a basis for your collection:

  cd $DLXSROOT/misc/f/findaid/maps
  cp samplefa.map workshopfa.map

Map files contain mapped items where one term or name for the item is mapped to another term or name. For example, a term used by an HTML form to refer to a searchable region (e.g., "entire finding aid") can be mapped to an XPAT searchable region (e.g., EAD). For more general background on map files, see [[Working with Map Files]].

Currently, the format of the map files is XML, and each collection map file conforms to a simple DTD (we have considered other possible ways of mapping terms, such as a database where one could map from one column's data to another). The middleware reads the map file into a TerminologyMapper object, after which the CGI program can at any time ask the object for the mappings for terms. Each mapped item and its various terms are contained within a <MAPPING> element.

Each mapping element in a map file consists of the following:
;label
: This element determines what will display in the user's browser when constructing searches. It must match the value used in the collmgr. (See step 2.)

;synthetic
: This element contains the variable name as it is used in the cgi.

;native
: The "native" element provides an appropriate XPAT search that the system will use to discover the appropriate content. The search may be simple (e.g., region EADID) or complex (e.g., ((region DID within region ARCHDESC) not within region DSC)).

;nativeregionname
: The element name itself, as it is indexed, without terms used in the XPAT search.

Map files take the language that is used in the forms and translate it into language for the cgi and for XPAT. For example, if you want your users to be able to search within names, you need to add a mapping that specifies how you want headings and categories to appear in the search interface (case is important, as is pluralization!), how the cgi variable is set (usually in all caps, and not stepping on an existing variable), and how XPAT will identify and retrieve this natively (in XPAT search language).
The first part of the map file is operator mapping, for the form, the cgi, and XPAT, and the second part is for region mapping. You might note that some of the fields that are defined in the map file correspond to some of the fabricated regions.
Note: The larger the map file, the slower your site will run, so you don’t necessarily want to map everything, such as variations of singular and plural fields.

=== ''More Documentation'' ===

* [[Working with Map Files]]

----

=== Set Up the Collection's Web Directory ===

Each collection may have a <span class="unixcommand">web</span> directory with custom Cascading Style Sheets, interface templates, graphics, and JavaScript. The default is for a collection to use the web templates at <span class="unixcommand">$DLXSROOT/web/f/findaid</span>. Collection-specific templates and other files can be placed in a collection-specific web directory, which is necessary if you have any customization at all. ''DLXS Middleware uses [../ui/index.html#fallback fallback] to find HTML related templates, chunks, graphics, js and css files.''

For a minimal collection, you will want two files: index.html and <span class="unixcommand">findaidclass-specific.css</span>.

<blockquote>

 mkdir -p $DLXSROOT/web/w/workshopfa
 cp $DLXSROOT/web/s/samplefa/index.html $DLXSROOT/web/w/workshopfa/index.html
 cp $DLXSROOT/web/s/samplefa/findaidclass-specific.css $DLXSROOT/web/w/workshopfa/findaidclass-specific.css

</blockquote>

As always, we'll need to change the collection name and paths in the copied files. You might want to change the look radically, if your HTML skills are up to it.
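
One way to check that no reference to the sample collection was left behind in the copied files is to search the new web directory for the old collection id. This is only a sketch: find_leftover_refs is a hypothetical helper name, not a DLXS tool, and it simply wraps a recursive grep.

```shell
# Hypothetical helper (not shipped with DLXS): list every line under a
# directory that still mentions the old collection id.
#   $1 = directory to scan, for example $DLXSROOT/web/w/workshopfa
#   $2 = old collection id, for example samplefa
find_leftover_refs() {
    # -r recurses into subdirectories, -n prints the line number of each hit;
    # grep exits non-zero when nothing is found, which signals a clean tree
    grep -rn -- "$2" "$1"
}
```

A call such as find_leftover_refs "$DLXSROOT/web/w/workshopfa" samplefa prints each file name, line number, and matching line, so paths and links that still point at samplefa can be edited by hand.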

Note that the browse link on the index.html page is hard-coded to go to the samplefa static browse.html page. You may want to change this to point to a dynamic browse page (see below). The URL for the dynamic browse page is ".../cgi/f/findaid/findaid-idx?c=workshopfa;page=browse".

If you would prefer a dynamic home page, you can copy and modify the home.xml and home.xsl files from $DLXSROOT/web/f/findaid/. Note that they are currently set up to be the home page for all finding aids collections, so you will have to do some considerable editing. However, they contain a number of PIs that you may find useful. In order for these pages to actually be used by DLXS, they have to be present in your $DLXSROOT/web/w/workshopfa/ directory and '''there can't be an index.html page in that directory.''' The easiest thing to do, if you have an existing index.html page, is to rename it to "index.html.foobar" or something similar. <br />

 
 +
=== Create a browse page ===
 +
 
 +
See the documentation: [[Setting up Dynamic Browsing]]
 +
 
 +
=== Try It Out ===
 +
 
 +
<nowiki>http://$DLXSROOT/cgi/f/findaid/findaid-idx?c=workshopfa</nowiki>
 +
 
 +
[[#top|Top]]
==[[Troubleshooting Finding Aids]]==

Revision as of 11:34, 8 July 2008

Main Page > Mounting Collections: Class-specific Steps > Mounting a Finding Aids Collection

Release 14 Working Copy


This page is under construction and will be in flux until the workshop

This topic describes how to mount a Findaid Class collection.

Workshop materials are located at http://www.dlxs.org/training/workshop200707/findaidclass/fcoutline.html

Overview

The Finding Aids Class is in many ways similar in behavior to Text Class. Access minimally includes full text searching across collections or within a particular collection of Finding Aids, viewing Finding Aids in a variety of display formats, and creation of personal collections ("bookbag") of Finding Aids.

To mount a Finding Aids Collection, you will need to complete the following steps:

  1. Prepare your data and set up a directory structure
  2. Validate and normalize your data
  3. Build the Index
  4. Mount the collection online

Findaid Class Behaviors Overview

This section describes the basic Findaid Class behaviors.

Examples of Findaid Class Implementations and Practices

This section contains links to public implementations of DLXS Findaid Class as well as documentation on workflow and implementation issues. If you are a member of DLXS and have a collection or resource you would like to add, or wish to add more information about your collection, please edit this section.

University of Michigan, Bentley Historical Library Finding Aids
Out-of-the-box DLXS 13 implementation.
Overview of Bentley's workflow process for Finding Aids
See also the links in Practical EAD Encoding Issues for background on the Bentley EAD workflow and encoding practices
University of Tennessee Special Collections Libraries
DLXS Findaid Class version ?
University of Pittsburgh, Historic Pittsburgh Finding Aids
DLXS Findaid Class version ?
Background on Pittsburgh Finding Aids workflow
University of Wisconsin, Archival Resources in Wisconsin: Descriptive Finding Aids
DLXS Findaid Class version ?
University of Minnesota Libraries, Online Finding Aids
DLXS Findaid Class version ?
EAD Implementation at the University of Minnesota
Getty Research Institute Special Collections Finding Aids
Heavily customized DLXS11a. Background on Getty customization and user interface changes to DLXS
J. Paul Getty Trust Institutional Archives Finding Aids
Heavily customized DLXS11a.

Working with the EAD

Preparing Data and Directories

Finding Aids Data Preparation


Preprocessing

Validating and Normalizing Your Data

Step 1: Validating the files individually against the EAD 2002 DTD

cd $DLXSROOT/bin/w/workshopfa
make validateeach


The Makefile runs the following command:

% $DLXSROOT/prep/w/workshopfa/validateeach.csh


What's happening: The Makefile runs the C shell script validateeach.csh (validateeach.sh in Release 14) in the prep directory. The script creates a temporary file without the public DOCTYPE declaration, and then runs onsgmls on each of the resulting XML files in the data subdirectory to make sure they conform to the EAD 2002 DTD. If validation errors occur, error files will be written to the data subdirectory with the same name as the finding aid file but with an extension of .err. If there are validation errors, fix the problems in the source XML files and re-run.

Check the error files by running the following command:

 ls -l $DLXSROOT/prep/w/workshopfa/data/*err

If there are any *err files, you can look at them with the following command:

 less $DLXSROOT/prep/w/workshopfa/data/*err
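
When there are many finding aids, a short summary can be easier to read than paging through every file. The sketch below is not part of DLXS. It assumes that onsgmls marks error lines with ":E:" in the same way that it marks warnings with ":W:", and report_validation_errors is a hypothetical function name.

```shell
# Hypothetical helper (not shipped with DLXS): for each non-empty .err file in
# a directory, print the file name followed by the number of error lines.
# Assumption: onsgmls tags errors with ":E:" and warnings with ":W:".
report_validation_errors() {
    for f in "$1"/*.err; do
        # skip empty files and the unexpanded pattern when no .err file exists
        [ -s "$f" ] || continue
        printf '%s %s\n' "$f" "$(grep -c ':E:' "$f")"
    done
}
```

Calling report_validation_errors "$DLXSROOT/prep/w/workshopfa/data" lists only the files that need attention. A count of 0 means the file contains nothing but warnings.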

Common error messages and solutions:

onsgmls: Command not found
  The path to your installation of the onsgmls binary is incorrect in $DLXSROOT/prep/s/samplefa/validateeach.csh.

onsgmls:/l1/dev/tburtonw/misc/sgml/xml.dcl:1:W: SGML declaration was not implied
  This is a warning (note the :W:), not an error, and can be ignored. The warning can be silenced by changing line 6 of $DLXSROOT/prep/s/samplefa/validateeach.csh (or your customized version)

from:

onsgmls -s -f $file.err $DLXSROOT/misc/sgml/xml.dcl $DLXSROOT/prep/s/samplefa/samplefa.text.inp $file.tmp

to:

onsgmls -wxml -w no-explicit-sgml-decl -s -f $file.err $DLXSROOT/misc/sgml/xml.dcl $DLXSROOT/prep/s/samplefa/samplefa.text.inp $file.tmp

entityref errors such as "general entity 'foobar' not defined"
  If you use entityrefs in your EADs, you may see errors relating to problems resolving entities (see Example entityref errors). The solution is to add the entity declarations to the doctype declaration in these two files:

  $DLXSROOT/prep/s/samplefa/samplefa.text.inp
    This is the doctype declaration used by the validateeach.csh script that points to the EAD2002 DTD.
  $DLXSROOT/prep/s/samplefa/samplefa.xml.inp
    This is the doctype declaration that points to the dlxsead2002 DTD. The dlxsead2002 DTD is used by the "make validate" target of the Makefile to validate the concatenated file containing all of your EADs.

Step 2: Concatenating the files into one larger XML file (and running some preprocessing commands)

cd $DLXSROOT/bin/w/workshopfa
make prepdocs

The Makefile runs the following command:

$DLXSROOT/bin/w/workshopfa/preparedocs.pl $DLXSROOT/prep/w/workshopfa/data $DLXSROOT/obj/w/workshopfa/workshopfa.xml $DLXSROOT/prep/w/workshopfa/logfile.txt

This runs the preparedocs.pl script on all the files in the specified data directory and writes the output to the workshopfa.xml file in the appropriate /obj subdirectory. It also writes a logfile to the /prep directory.

The Perl script does two sets of things:

  1. Concatenates all the files
  2. Runs a number of preprocessing steps on all the files

Concatenating the files

The script finds all XML files in the data subdirectory, and then strips off the XML declaration and doctype declaration from each file before concatenating them together. It also wraps the concatenated EADs in a <COLL> tag. The end result looks like:


<COLL>
<ead><eadheader><eadid>1</eadid>...</eadheader>... content</ead>
<ead><eadheader><eadid>2</eadid>...</eadheader>... content</ead>
<ead><eadheader><eadid>3</eadid>...</eadheader>... content</ead>
</COLL>

WARNING! If there are extra characters, or some other problem with the part of the program that strips out the XML declaration and the DOCTYPE declaration, the file will end up like:


<COLL>
baddata<ead><eadheader><eadid>1</eadid>...</eadheader>... content</ead>
baddata<ead><eadheader><eadid>2</eadid>...</eadheader>... content</ead>
baddata<ead><eadheader><eadid>3</eadid>...</eadheader>... content</ead>
</COLL>

This will cause the document to be invalid, since the dlxsead2002.dtd does not allow anything between the closing tag of one finding aid (</ead>) and the opening tag of the next (<ead>).

Some of the possible causes of such a problem are:

  • UTF-8 Byte Order Marks at the beginning of the file
  • DOCTYPE declaration on more than one line
  • XML processing instructions
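
The first of these causes is easy to test for before running "make prepdocs". The following is a minimal sketch, not part of DLXS. The function name has_bom is hypothetical, and the check relies only on the fact that a UTF-8 Byte Order Mark is the three bytes EF BB BF at the start of a file.

```shell
# Hypothetical helper (not shipped with DLXS): succeed if a file starts with
# the UTF-8 Byte Order Mark (bytes EF BB BF).
has_bom() {
    # read the first three bytes and print them as hex without offsets
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Example use (commented out so that nothing runs when the sketch is loaded):
# for f in "$DLXSROOT"/prep/w/workshopfa/data/*.xml; do
#     has_bom "$f" && echo "BOM found: $f"
# done
```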

Preprocessing steps

The perl program also does some preprocessing on all the files. These steps are customized to the needs of the Bentley. You should look at the perl code and modify it so it is appropriate for your encoding practices.

The preprocessing steps are:

  • finds all id attributes and prepends a number to them
  • adds a prefix string "dao-bhl" to all DAO links (You probably will want to change this)
  • removes empty persname, corpname, and famname elements

The output of the combined concatenation and preprocessing steps is a single XML file, named for the collection, which is deposited into the obj subdirectory.

If your collections need to be transformed in any way, or if you do not want the transformations to take place (the DAO changes, for example), edit the preparedocs.pl file to effect the changes. Some changes you may want to make include:

  • Changing the algorithm used to make id attributes unique. For example, if your encoding practices use id attributes and targets, the out-of-the-box algorithm will break the relationship between the attributes and targets. One possible modification is to prepend the eadid or filename to all id and target attributes.
  • Modifying the program to read a list of files or list of eadids so that the files are concatenated in a particular order. The default sort order for search results is occurrence order, which translates to the order in which the EADs are concatenated. If you write a script which looks at the EADs for some element that you want to sort by and then outputs a list of filenames sorted in that order, you could then pass that list to a modified preparedocs.pl so that it concatenates the files in the order listed.
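
A script of the kind described in the last item might look like the sketch below. It is an illustration only: list_eads_by_eadid is a hypothetical name, the sort key is the eadid (any other element could be substituted), and it assumes that each <eadid> element sits on a single line. The stock preparedocs.pl does not read such a list, so it would still have to be modified to accept one.

```shell
# Hypothetical helper (not shipped with DLXS): print the XML files in a
# directory, one per line, ordered by the content of their first <eadid>.
# Assumption: the <eadid> start tag, content, and end tag are on one line.
list_eads_by_eadid() {
    for f in "$1"/*.xml; do
        [ -f "$f" ] || continue
        # pull the text of the first eadid element out of the file
        id=$(sed -n 's/.*<eadid[^>]*>\([^<]*\)<\/eadid>.*/\1/p' "$f" | head -n 1)
        # emit "key<TAB>filename" so that sort can order by the key
        printf '%s\t%s\n' "$id" "$f"
    done | LC_ALL=C sort | cut -f 2
}
```

The output of list_eads_by_eadid "$DLXSROOT/prep/w/workshopfa/data" could be redirected to a file and handed to a modified preparedocs.pl.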

Step 3: Validating the concatenated file against the dlxsead2002 DTD

make validate

The Makefile runs the following command:

onsgmls -wxml -s -f $DLXSROOT/prep/w/workshopfa/workshopfa.errors $DLXSROOT/misc/sgml/xml.dcl   $DLXSROOT/prep/w/workshopfa/workshopfa.xml.inp $DLXSROOT/obj/w/workshopfa/workshopfa.xml

This runs the onsgmls command against the concatenated file using the dlxsead2002 DTD, and writes any errors to the workshopfa.errors file in the appropriate subdirectory in $DLXSROOT/prep/c/collection.

Note that we are running this using workshopfa.xml.inp, not workshopfa.text.inp. The workshopfa.xml.inp file points to $DLXSROOT/misc/sgml/dlxsead2002.ead, which is the dlxsead2002 DTD. The dlxsead2002 DTD is exactly the same as the EAD2002 DTD, but adds a wrapping element, <COLL>, so that more than one ead element (more than one finding aid) can be combined into one file. It is, of course, a good idea to validate the file now before going further.


Run the following command:

 ls -l $DLXSROOT/prep/w/workshopfa/workshopfa.errors

If there is a workshopfa.errors file, run the following command to look at the errors reported:

 less $DLXSROOT/prep/w/workshopfa/workshopfa.errors


Common causes of error messages and solutions

make: onsgmls: Command not found
  The OSGMLNORM variable in the Makefile does not point to the correct location of onsgmls for your installation, or OpenSP is not installed.

If there were no errors when you ran "make validateeach" but you are now seeing errors
  There was very likely a problem with the preparedocs.pl processing:
  • The DOCTYPE declaration did not get completely removed. (The current scripts don't always remove multiline DOCTYPE declarations.)
  • There was a UTF-8 Byte Order Mark at the beginning of one or more of the concatenated files.

onsgmls:/l1/dev/tburtonw/misc/sgml/xml.dcl:1:W: SGML declaration was not implied
  This warning can be ignored, but if you see any other errors, STOP! You need to determine the cause of the problem, fix it, and rerun the steps until there are no errors from make validate. If you continue with the next steps in the process with an invalid XML document, the errors will compound and it will be very difficult to trace the cause of the problem. To avoid seeing this warning, add the "-w no-explicit-sgml-decl" flag to the Makefile on line 83. Change line 83 of the Makefile

from:

onsgmls -wxml -s -f $(PREPDIR)$(NAMEPREFIX).errors $(XMLDECL) $(XMLDOCTYPE) $(XMLFILE)

to:

onsgmls -wxml -w no-explicit-sgml-decl -s -f $(PREPDIR)$(NAMEPREFIX).errors $(XMLDECL) $(XMLDOCTYPE) $(XMLFILE)

There is a patch available which strips off Byte Order Marks, removes XML processing instructions, removes multiline DOCTYPE declarations, and also implements the change to the onsgmls warning flag noted above: DLXS13 August 24 Findaid Class Patch.

Step 4: Normalizing the concatenated file

make norm

The Makefile runs a series of copy statements and two main commands:


1.)   /l/local/bin/osgmlnorm -f $DLXSROOT/prep/s/samplefa/samplefa.errors $DLXSROOT/misc/sgml/xml.dcl $DLXSROOT/prep/s/samplefa/samplefa.xml.inp $DLXSROOT/obj/s/samplefa/samplefa.xml.prenorm > /l1/dev/tburtonw/obj/s/samplefa/samplefa.xml.postnorm
2.)  /l/local/bin/osx -bUTF-8 -xlower -xempty -xno-nl-in-tag -f /l1/dev/tburtonw/prep/s/samplefa/samplefa.errors /l1/dev/tburtonw/misc/sgml/xml.dcl /l1/dev/tburtonw/prep/s/samplefa/samplefa.xml.inp /l1/dev/tburtonw/obj/s/samplefa/samplefa.xml.postnorm > /l1/dev/tburtonw/obj/s/samplefa/samplefa.xml.postnorm.osx 


These commands ensure that your collection data is normalized. What this means is that any attributes are put in the order in which they were defined in the DTD. Even though your collection data is XML and attribute order should be irrelevant (according to the XML specification), due to a bug in one of the supporting libraries used by xmlrgn (part of the indexing software), attributes must appear in the order in which they are defined in the DTD. If you have "out-of-order" attributes and don't run make norm, you will get "invalid endpoints" errors during the make post step.

The first command, which normalizes the document, writes its errors to $DLXSROOT/prep/s/samplefa/samplefa.errors. Be sure to check this file.

The second command, which runs osx to convert the normalized document back into XML, produces many error messages, which are written to standard output. They occur because we are using an XML DTD (the EAD 2002 DTD) and osx uses it to validate the SGML document created by the osgmlnorm step. These are the only errors which may generally be ignored. However, if the next recommended step, which is to run "make validate" again, reveals an invalid document, you may want to rerun osx and look at the errors for clues. (Only do this if you are sure that the problem is not being caused by XML processing instructions in the documents, as explained below.)

Step 5: Validating the normalized file against the dlxsead2002 DTD

make validate

We run this step again to make sure that the normalization process did not produce an invalid document. This is necessary because under some circumstances the "make norm" step can result in invalid XML. One known cause of this is the presence of XML processing instructions, for example "<?Pub Caret1?>". Although XML processing instructions are supposed to be ignored by any XML application that does not understand them, osgmlnorm and osx are SGML tools, and they end up munging the output XML. The recommended workaround is to add a preprocessing step that removes any XML processing instructions from your EADs before you run "make prepdocs", or to include some code in preparedocs.pl that strips out XML processing instructions prior to concatenating the EADs.

Building the Index


Indexing Overview

After you have followed all the steps to set up your directories and prepare your files, as described in Validating and Normalizing Your Data, indexing the collection is fairly straightforward. To create an index for use with the Findaid Class interface, you will need to index the words in the collection, then index the XML (the structural metadata, if you will), and then finally "fabricate" regions based on a combination of elements (for example, defining what the "main entry" is, without adding a <MAINENTRY> tag around the appropriate <AUTHOR> or <TITLE> element).

The main work in the indexing step is making sure that the fabricated regions in the workshopfa.extra.srch file match the characteristics of your collection.

Note: If the final "make validate" step in Validating the normalized file against the dlxsead2002 DTD produced errors, you will need to fix the problem before running the indexing steps. Attempting to index an invalid document will lead to indexing problems and/or corrupt indexes.

The Makefile in the $DLXSROOT/bin/c/collection directory contains the commands necessary to build the index, and can be executed easily.

To create an index for use with the Findaid Class interface, you will need to index the words in the collection, then index the XML (the structural metadata, if you will), and then finally "fabricate" structures based on a combination of elements (for example, defining who the "main author" of a finding aid is, without adding a <mainauthor> tag around the appropriate <author> in the eadheader element).

The Makefile should be in the $DLXSROOT/bin/c/collection directory.

cd $DLXSROOT/bin/c/collection

The following commands can be used to make the index:


make singledd indexes words for texts that have been concatenated into one large file for a collection.

make xml indexes the XML structure by reading the DTD. It validates as it indexes.

make post builds and indexes fabricated regions based on the XPAT queries stored in the workshopfa.extra.srch file. Because every collection is different, the *extra.srch file will probably need to be adapted for your collection. If you try to index/build fabricated regions from elements not used in your finding aids collection, you will see errors like:

Error found: <Error>syntax error before: ")</Error>  

when you run the make post command.

Step by Step Instructions for Indexing

Step 1: Indexing the text

 cd $DLXSROOT/bin/w/workshopfa
 make singledd

The Makefile runs the following commands:

 cp /l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.blank.dd
 	/l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 /l/local/xpat/bin/xpatbld -m 256m -D /l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 cp /l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 	/l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.presgml.dd

Step 2: Indexing the XML

 make xml

The makefile runs the following commands:

 cp /l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.presgml.dd
 	/l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 /l/local/xpat/bin/xmlrgn -D /l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 	/l1/workshop/test02/dlxs/misc/sgml/xml.dcl
 	/l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.inp
 	/l1/workshop/test02/dlxs/obj/w/workshopfa/workshopfa.xml
 
 cp /l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 	/l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.prepost.dd


Step 3: Configuring fabricated regions

Fabricated regions are set up in the $DLXSROOT/prep/c/collection/collection.extra.srch file. The sample file $DLXSROOT/prep/s/samplefa/samplefa.extra.srch was designed for use with the Bentley's encoding practices. If your encoding practices differ from the Bentley's, or if your collection does not have all the elements that the samplefa.extra.srch xpat queries expect, you will need to edit your *.extra.srch file.

We recommend a combination of the following:

  1. Iterative work to ensure make post does not report errors
  2. Iterative work to ensure that searching and rendering work properly with your encoding practices.
  3. Up front analysis

Run the "make post" and iterate until there are no errors reported.

Run the "make post" step and look at the errors reported. Then modify *.extra.srch and rerun "make post". Repeat this until "make post" does not report any errors.

The most common cause of "make post" errors is a fabricated region definition that includes an element which is not present in your collection.

For example if you do not have any <corpname> elements in any of the EADs in your collection and you are using the out-of-the-box samplefa.extra.srch, you will see an error message when xpat tries to index the mainauthor region using this rule:

(
     (region "persname" + region "corpname" + region "famname" + region "name")
      within 
       (region "origination" within 
          ( region "did" within 
               (region "archdesc")
          )
       )
      ); 
{exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/mainauthor.rgn"}; export;~sync "mainauthor"; 


If you don't expect to ever use an element, then you can eliminate it from the fabricated region definitions. An alternative, useful if you have only a small sample of the EADs you will be mounting and you expect that some of the EADs you will be getting later might have the element that is currently missing from your collection, is to add a "dummy" EAD to your collection. The "dummy" EAD should contain all the elements you ever expect to use (or that are required by the *.extra.srch file), and all of its elements except the <eadid> should be empty.

Exercise the web user interface

Once make post does not report errors, you can follow the rest of the steps to put your collection on the web. Then carefully exercise the web user interface looking for the following symptoms:

  • Searches that don't work properly because they depend on fabricated regions that don't match your encoding practices.
  • Rendering that does not work properly. An example is that the name/title of the finding aid may not show up if your <unittitle> element precedes your <origination> element in the top level <did>. See also Title of finding aid does not show up.

For more information on regions used for searching and rendering see

Analysis of your collection

You may be able to analyze your collection prior to running make post and determine what changes you want to make in the fabricated regions. If your analysis misses any changes, you can find this out by using the two previous techniques.

  • Once you have run "make xml", but before you run "make post", start up xpatu running against the newly created indexes:
 xpatu $DLXSROOT/idx/s/samplefa/samplefa.dd

then run the command

 >> {ddinfo regionnames}

This will give you a list of all the XML elements and attributes that are indexed as regions.

Alternatively you can create a file called xpatregions and insert the following text:

{ddinfo regionnames}

Then run this command

$ xpatu $DLXSROOT/idx/s/samplefa/samplefa.dd < xpatregions > regions.out

Then use the "regions.out" file you just created to sort and examine the list of regions that occur in your finding aids, and compare them to the regions referenced by the fabricated region queries in your copy of samplefa.extra.srch.
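
Part of this comparison can be scripted. The following is a minimal shell sketch, not part of DLXS: the function name missing_regions is hypothetical, and it assumes that regions.out holds one region name per line, which this page does not confirm. It only looks at quoted references of the form region "name", so unquoted references (region did) and tag-range queries (region "<c01".."</did>") are skipped, and a fabricated region that is referenced by a later query would also be reported.

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): print region names that are
# referenced as  region "name"  in an *.extra.srch file but are absent
# from a saved region list. Assumes the list has one name per line.
# $1 = *.extra.srch file, $2 = region list (e.g. regions.out)
missing_regions() {
    srch_file=$1
    region_list=$2
    # Drop comment lines, then pull out every quoted region reference.
    grep -v '^[[:space:]]*#' "$srch_file" |
        grep -o 'region "[A-Za-z][A-Za-z0-9-]*"' |
        sed 's/^region "\(.*\)"$/\1/' |
        sort -u |
        while read -r name; do
            # -x matches whole lines only, -F treats the name literally
            grep -qxF "$name" "$region_list" || printf '%s\n' "$name"
        done
}
```

Calling missing_regions samplefa.extra.srch regions.out prints one candidate name per line. Treat the output as a list of things to inspect by hand rather than as a definitive report.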

Step 4: Indexing fabricated regions

 make post

The makefile runs the following commands:

 cp /l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.prepost.dd \
 	/l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd
 touch /l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.init
 /l/local/xpat/bin/xpat -q /l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd \
 	< /l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.extra.srch \
 	| /l1/workshop/test02/dlxs/bin/t/text/output.dd.frag.pl \
 	/l1/workshop/test02/dlxs/idx/w/workshopfa/ \
 	> /l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.extra.dd
 /l1/workshop/test02/dlxs/bin/t/text/inc.extra.dd.pl \
 	/l1/workshop/test02/dlxs/prep/w/workshopfa/workshopfa.extra.dd \
 	/l1/workshop/test02/dlxs/idx/w/workshopfa/workshopfa.dd


If you get an "invalid endpoints" message from "make post", the most likely cause is XML processing instructions or some other corruption. The second "make validate" step should have caught these. Other possible causes of errors during the "make post" step include syntax errors in workshopfa.extra.srch, or the absence of a particular region that is listed in the *.extra.srch file but not present in your collection. For example, if you do not have any <corpname> elements in any of the EADs in your collection and you are using the out-of-the-box samplefa.extra.srch, you will see an error message when xpat tries to index the mainauthor region using this rule:

((region "persname" + region "corpname" + region "famname" + region "name") within (region "origination" within ( region "did" within (region "archdesc")))); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/mainauthor.rgn"}; export; ~sync "mainauthor";

The easiest solution is to modify *.extra.srch to match the characteristics of your collection. An alternative is to include a "dummy" EAD that contains, with no content, all the elements that you expect in your collection.

Warning! If "make post" produces errors, you need to fix them. Otherwise searching and display of your finding aids may produce inconsistent results and crashes of the cgi script. See also Working with Fabricated Regions in Findaid Class.
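
When repeating the edit-and-rerun cycle, it can help to save the "make post" output to a file and scan it for error text. The sketch below is an illustration only: the function name post_log_errors is hypothetical, and the search patterns are taken from the messages quoted on this page, so real xpat output may contain errors worded differently (and harmless lines containing the word "error" would also be listed).

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): print the numbered lines of a
# saved "make post" log that look like errors. The patterns come from
# the messages quoted in this documentation and may not cover all cases.
# Returns 0 when suspicious lines were found, 1 when none were.
# $1 = log file, e.g. created with:  make post > post.log 2>&1
post_log_errors() {
    grep -i -n \
        -e 'error' \
        -e 'invalid endpoints' \
        -e 'no information for region' \
        "$1"
}
```

A clean run of the helper does not prove the index is correct, so the tests in "Testing the index" are still worth doing.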


Testing the index

At this point it is a good idea to do some testing of the newly created index. Strategically, it is good to test this from a directory other than the one you indexed in, to ensure that relative or absolute paths are resolving appropriately. Invoke xpat with the following command

xpatu $DLXSROOT/idx/w/workshopfa/workshopfa.dd

For more information about searching, see the XPAT manual.

Try searching for some likely regions. It's a good idea to test some of the fabricated regions. Here are a few sample queries:

>> region "ead"
  1: 3 matches

>> region "eadheader"
  2: 3 matches

>> region "mainauthor"
  3: 3 matches

>> region "maintitle"
  4: 3 matches

>> region "admininfo"
  5: 3 matches

Working with Fabricated Regions in Findaid Class

Overview

When you use XPAT in combination with xmlrgn and a DTD, you are identifying the elements and attributes in the DTD or tags file as "regions," containers of content rather like fields in a database. These separate regions are built into the regions file (collid.rgn) and are identified in the data dictionary (collid.dd). This is what is happening when you are running xmlrgn.

However, sometimes the things you want to identify collectively aren't so handily identified as elements in the DTD. For example, the Findaid Class search interface can allow the user to search in Names regions. Perhaps for your collection you want Names to include name, persname, corpname, geogname, and famname. By creating an XPAT query that ORs these regions, you can have XPAT index all the regions that satisfy the OR-ed query. For example:

(region "name" + region "persname" + region "corpname" + region "geogname" +
region "famname")

Once you have a query that produces the results you want, you can add an entry to the *.extra.srch file which (when you run the "make post" command) will run the query, create a file for export, export it, and sync it:

{exportfile "$DLXSROOT/idx/c/collid/names.rgn"} export ~sync "names"
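
Putting the query and the export commands together gives a complete entry in the layout used by the samplefa.extra.srch excerpts on this page: the query, exportfile, export, and ~sync, each terminated by a semicolon. The shell sketch below only prints such a line. The function name srch_entry and the index directory /l1/idx/c/collid are illustrative assumptions, and the query shown is the samplefa definition of "names".

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): print one *.extra.srch entry
# in the layout used by the samplefa.extra.srch excerpts:
#   (query); {exportfile "DIR/NAME.rgn"}; export; ~sync "NAME";
# $1 = fabricated region name, $2 = index directory, $3 = XPAT query
srch_entry() {
    printf '(%s); {exportfile "%s/%s.rgn"}; export; ~sync "%s";\n' \
        "$3" "$2" "$1" "$1"
}

# Example: the "names" region; the directory is a placeholder path.
srch_entry names /l1/idx/c/collid \
    'region "corpname" + region "famname" + region "name" + region "persname"'
```

The exportfile path is written out in full here because the samplefa.extra.srch excerpts use absolute paths.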

Why Fabricate Regions?

Why fabricate regions? Why not just put these queries in the map file and call them names? While you could, it's probably worth your time to build these succinctly-named and precompiled regions; query errors are more easily identified in the index building than in the CGI, and XPAT searches can be simpler and quicker for terms within the prebuilt regions.

The middleware for Findaid Class uses a number of fabricated regions in order to speed up xpat queries and simplify coding and configuration.

Findaid Class uses fabricated regions for several purposes:

  1. To share code with Text Class (e.g. region main)
  2. To support searching (e.g. region names)
  3. To produce the table of contents and to display EAD sections as focused regions such as the "Title Page" or "Arrangement" (see Working with the table of contents for more information on the use of fabricated regions for the table of contents)
  4. To support specific PIs (e.g. region maintitle is used by the PI <?ITEM_TITLE_XML?> to display the title of a finding aid at the top of each page)

The fabricated region "main" is set to refer to <ead> in FindaidClass with:

(region ead); {exportfile "/l1/idx/b/bhlead/main.rgn"}; export; ~sync "main";

whereas in TextClass "main" can refer to <TEXT>. Therefore, both FindaidClass and TextClass can share the Perl code, in a higher level subclass, that creates searches for "main".

Other fabricated regions are used for searching such as the maintitle and mainauthor regions.

Fabricated Regions in the UI

All of the search links in the dropdown menu for the basic search (see below) are based on indexes for fabricated regions.

Image:Basic_search.png

These are the default regions used for searching and the names used in the menu:

  • archdesc: Entire Finding Aid
  • names: Names
  • places: Places
  • subjects: Subjects
  • callnum: Call Number
  • maintitle: Collection Title
  • repository: Repository

(The relationship between the region and the name in the menu is set in the map file. See Make Collection Map )


The majority of the fabricated regions for Findaid Class are used for the creation and display of the left hand table of contents in the "outline" view. The findaidclass.cfg file contains a hash called %gSectHeadsHash which is normally loaded into FindaidClass.pm's tocheads hash in the FindaidClass::_initialize method. The elements of the hash and the corresponding fabricated regions are used to create the table of contents and to output the XML for the corresponding section of the EAD when a user clicks on one of the TOC links. The fabricated regions are used so XPAT can have binary indexes ready to use for fast retrieval of these EAD sections. See Customizing Findaid Class: Working with the table of contents for more information on the use of fabricated regions for the table of contents.

Working with extra.srch

Fabricated regions within the Findaid Class can be found in the extra.srch file for the sample collection at $DLXSROOT/prep/s/samplefa/samplefa.extra.srch. As with any other elements used in the interface for a given collection, fabricated regions used in the user interface, such as the names of searches available in the dropdown menu of the search box, must also be represented in the collmgr entry and the map file for that collection.

Some of the more interesting regions extracted from the samplefa.extra.srch file are listed below.

One of these regions is add. It used to be the <ADD> element in the EAD 1.0 DTD, but it is now created from the ead2002 DTD's <descgrp> tag with a type attribute of add.

A number of issues related to varying encoding practices can be resolved by the appropriate edits to the *.extra.srch file (although some of them may require changes to other files as well):

  • If your <unittitle> element precedes your <origination> element in the top level <did>, you will have to modify the "maintitle" fabricated region query in *.extra.srch.
  • If you do not use a <frontmatter> element, you will have to modify various files, including *.extra.srch, to provide an appropriate "Title Page" region based on the <eadheader>.
  • If your encoding practices for <bioghist> differ from the Bentley's, you may need to make changes in the bioghist fabricated region, although changes to other files may be sufficient. The changes might include modifying findaidclass.cfg, creating a subclass of FindaidClass that overrides FindaidClass::GetBioghistTocHead, and/or changing the appropriate XSL files.
  • If you want sections of the finding aid that are not completely within a well-defined element such as <relatedmaterial> or <separatedmaterial> to show up in the table of contents, you may have to create a fabricated region using the appropriate xpat query and then modify findaidclass.cfg and make other modifications to the code.

 
 
 
   (region ead); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/main.rgn"}; export; ~sync "main";
    
     ##
     (((region "<c01".."</did>" + region "<c02".."</did>" + region "<c03".."</did>" + region "<c04".."</did>" + region "<c05".."</did>" + region "<c06".."</did>" + region "<c07".."</did>" + region "<c08".."</did>" + region "<c09".."</did>") not incl ("level=file" + "level=item")) incl "level="); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/c0xhead.rgn"}; export; ~sync "c0xhead";
        ##
     ((region "<origination".."</unittitle>") within ((region did within region archdesc) not within region dsc)); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/maintitle.rgn"}; export; ~sync "maintitle";
     ##
        
     ((region "persname" + region "corpname" + region "famname" + region "name") within (region "origination" within ( region "did" within (region "archdesc")))); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/mainauthor.rgn"}; export; ~sync "mainauthor";
     ##
    
     (region "abstract" within ((region did within region archdesc) not within region "c01")); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/mainabstract.rgn"}; export; ~sync "mainabstract";
        ##
        ((region unitdate incl "encodinganalog=245$f") within ((region did within region archdesc) not within region dsc)); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/colldate.rgn"}; export; ~sync "colldate";
     ##
     
     (region dsc); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/contentslist.rgn"}; export; ~sync "contentslist";
     ##
      ########## admininfo ########
     admininfot = (region "descgrp-T" incl (region "A-type" incl "admin")); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/admininfo-t.rgn"}; export; ~sync "admininfo-t";
     ##
     ## ########## add ######
     addt = (region "descgrp-T" incl (region "A-type" incl "add")); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/add-t.rgn"}; export; ~sync "add-t";
   ## ########## frontmatter/titlepage ########
   frontmattert = region "frontmatter-T"; {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/frontmatter-t.rgn"}; export; ~sync "frontmatter-t";
     ##
     # frontmatter itself not needed as fabricated region since it exists
     # as a regular xml region
     ##
   ## ########## bioghist ########
     bioghist = ((region "bioghist" within region "archdesc") not within region "dsc"); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/bioghist.rgn"}; export; ~sync "bioghist";
     
   ##bioghisthead = ((region "<bioghist" .. "</head>" within region "archdesc") not within region "dsc"); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/bioghisthead.rgn"}; export; ~sync "bioghisthead";
     ##
   ((region did within region archdesc) not within region dsc); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/summaryinfo.rgn"}; export; ~sync "summaryinfo";;
     ##
   ##
   #############################
   (region "subject" + region "corpname" + region "famname" + region "name" + region "persname" + region "geogname"); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/subjects.rgn"}; export; ~sync "subjects";
   (region "corpname" + region "famname" + region "name" + region "persname"); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/names.rgn"}; export; ~sync "names";
   
    
   #(region "odd-T" ^ (region odd not within region dsc)); {exportfile "/l1/workshop/user11/dlxs/idx/s/samplefa/odd-t.rgn"}; export; ~sync "odd-t";  
 

See samplefa.extra.srch for all of the fabricated regions used with the samplefa collection.

Fabricated regions required in Findaid Class

  • main
  • maintitle
  • mainauthor
  • mainabstract
  • colltitle
  • colldate
  • callnum
  • contentslist
  • contentslist-t
  • admininfo
  • admininfo-t
  • frontmatter-t
  • bioghist-t
  • arrangement-t
  • controlaccess-t
  • controlaccess
  • scopecontent-t
  • summaryinfo-t
  • summaryinfo

Fabricated regions commonly found in Findaid Class

  • subjects
  • names

Customizing Findaid Class

Working with the table of contents

The table of contents on the left-hand side of the finding aid display is based on fabricated regions set up in *.extra.srch and configured either in a configuration file or in a subclass of FindaidClass.pm.

If a subclass is not being used to override the FindaidClass::_initialize method, the configuration file will be used. It is:

$DLXSROOT/cgi/f/findaid/findaidclass.cfg 

The configuration file sets up a hash called %gSectHeadsHash. The relevant section of the findaidclass.cfg file is:

# **********************************************************************
# Hash of section heads that XPAT should search for.  A reference to
# this hash is added as member data keyed by 'tocheads' to the
# FindaidClass object at initialization time. Comment out those that
# are missing in your finding aids.
# **********************************************************************
%gSectHeadsHash = (
                  'bioghist-t'      =>  {
                                         'collection' => qq{Biography},
                                         'recordgrp' => qq{History},
                                        },
                  'controlaccess-t' => qq{Subject Terms},
                  'frontmatter-t'   => qq{Title Page},
                  'arrangement-t'   => qq{Arrangement},
                  'scopecontent-t'  => qq{Collection Scope and Content Note},
                  'summaryinfo-t'   => qq{Summary Information},
                  'contentslist-t'  => qq{Contents List},
                  'admininfo-t'     => qq{Access and Use},
                  'add-t'           => qq{Additional Descriptive Data},
                 );


The %gSectHeadsHash is normally read from the configuration file and loaded into a hash called tocheads in the FindaidClass::_initialize method when the FindaidClass object is created. If you wish to change the table of contents on a collection-specific basis, you can override the FindaidClass::_initialize method in a collection-specific subclass.
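
The comment in the configuration file asks you to comment out section heads that are missing from your finding aids. One way to find candidates is to check that each active key of %gSectHeadsHash has a matching ~sync entry in your *.extra.srch file. The following shell sketch is not part of DLXS: the function name unsynced_toc_keys is hypothetical, and it assumes that keys end in "-t" and are written one per line as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): print keys of %gSectHeadsHash
# that have no matching  ~sync "key"  entry in an *.extra.srch file.
# Assumes each active key ends in -t and appears as  'key' =>  at the
# start of a line (after optional whitespace), as in the excerpt.
# $1 = findaidclass.cfg, $2 = *.extra.srch
unsynced_toc_keys() {
    # Extract the quoted keys; commented-out lines do not match.
    sed -n "s/^[[:space:]]*'\([a-z0-9]*-t\)'[[:space:]]*=>.*/\1/p" "$1" |
        while read -r key; do
            # Ignore comment lines in the srch file, then look for the
            # literal ~sync entry for this key.
            grep -v '^[[:space:]]*#' "$2" |
                grep -qF "~sync \"$key\"" || printf '%s\n' "$key"
        done
}
```

A key that is printed is either a section you should comment out of the hash or a fabricated region you still need to define.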

For an example of using a subclass to override the default table of contents see: $DLXSROOT/cgi/f/findaid/FindaidClass/SamplefaFC.pm


Note that the default setting in the Collection Manager for the samplefa collection is to use this subclass:

image of CollMgr setting for subclass of Findaid Class


The diagram below shows the fabricated region and the corresponding EAD element tags for the out-of-the-box table of contents

Image:Tochead2.jpg

Changing the labels in the table of contents

If you want to change the labels for all of your Findaid Class collections, you can change the strings in the %gSectHeadsHash hash in $DLXSROOT/cgi/f/findaid/findaidclass.cfg. If you want to change the labels on a collection by collection basis, you will probably want to subclass and override the FindaidClass::_initialize method as is done in the sample file: $DLXSROOT/cgi/f/findaid/FindaidClass/SamplefaFC.pm

Adding sections to the table of contents

Changing the Bioghist labels to use the appropriate <head> elements

Mounting the Collection Online

These are the final steps in deploying a Findaid Class collection online. Here the Collection Manager will be used to create and edit a Collection Database entry for workshopfa. The Collection Manager will also be used to check the Group Database. Finally, we need to work with the collection map and set up the collection's web directory.


Create and edit an entry in the Collection Database for your collection with CollMgr

Each collection has a record in the collection database that holds collection specific configurations for the middleware. CollMgr (Collection Manager) is a web based interface to the collection database that provides functionality for editing each collection's record. Collections can be checked-out for editing, checked-in for testing, and released to production. In general, a new collection needs to have a CollMgr record created from scratch before the middleware can be used.

Step 1. Create a workshopfa CollMgr entry by copying from samplefa.

A. Log in to CollMgr. The URL should be:

http://path_to_cgi/cgi/c/collmgr 

The CollMgr page is usually set up to use Apache basic authentication. The username and password should have been set up when you set up your virtual host in Apache (see sample apache virtual host).

B. Select Manage Collections:Findaid Class:

Image:collmgr1.png Image:collmgr2.png

C. Select samplefa and click on "copy a collection" (Note: In the image below workshopfa already exists, but in your clean install it will not exist)

alt text

D. Enter your collection id (workshopfa) alt text

E. Change all occurrences of "samplefa" to "workshopfa". For example, in the section below the webdir should be changed from "s/samplefa" to "w/workshopfa". (You also need to copy and rename the appropriate files from $DLXSROOT/web/s/samplefa to $DLXSROOT/web/w/workshopfa.)

WARNING! If you forget to change one of the entries it can lead to very confusing results. For example, if you forget to change the "dd" file entry from "/idx/s/samplefa/samplefa.dd" to "/idx/w/workshopfa/workshopfa.dd", the middleware will try to search the samplefa collection but all the rest of the configuration information will point to workshopfa, which will result in erratic behavior and potentially confusing error messages.

F. Change the entry for the subclassmodule from "/FindaidClass/SamplefaFC" to "FindaidClass". This means that this collection will use the default FindaidClass.pm instead of the SamplefaFC subclass. (If you want to subclass Findaid Class, replace "SamplefaFC" with the name of your collection-specific subclass instead.)

alt text


G. Set the containerdepth field to the depth of containers in your collection

alt text

For example, if you have levels c01 to c05, set the containerdepth to 5. You can use the xpat command {ddinfo regionnames} to look at your data and find the highest c level to determine what number to put here.

xpatu $DLXSROOT/idx/s/samplefa/samplefa.dd
>> {ddinfo regionnames}
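
If you saved the region list to a file as described in "Analysis of your collection" (regions.out), the highest c level can be picked out with a short script. This is a minimal sketch under assumptions: the function name max_container_depth is hypothetical, and the file is assumed to contain one region name per line, which this page does not confirm.

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): print the deepest component
# level (c01, c02, ...) named in a saved region list, as a plain number.
# Assumes one region name per line; prints nothing if no c0x regions.
# $1 = region list (e.g. regions.out)
max_container_depth() {
    grep -x 'c[0-9][0-9]' "$1" |   # keep only names like c01 .. c12
        sed 's/^c0*//' |           # c06 -> 6, c10 -> 10
        sort -n |
        tail -n 1
}
```

The number printed is a candidate value for the containerdepth field. If the function prints nothing, check the format of the region list before concluding that the collection has no component levels.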

If you have containerdepth set to a number that is higher than what is in your data, xpat will try to search for the missing c0x level elements and will produce errors. This can occur whenever xpat tries to query the "c0xhead" fabricated region. For example, we set the container depth to 7 for the samplefa collection (the samplefa collection only has c01-c06) and then got the following error message when we tried to view a kwic (search terms in context) view for the Post Family Papers in our web browser:

Message: Query error in samplefa, samplefa.dd, query=pr.region.c0xhead 
(region "c0xhead" ^ ( region "c07" incl *detailslicesearch ));, 
Error=No information for region c07 in the data dictionary. syntax error before: ))

You will also probably want to edit:

  • fields related to the dynamic browse page (See Create a browse page)
  • fields related to searching and sorting in the user interface: regionsearch, termsearch, sortfields (Note that these need to match the entries in your map file)

More Documentation

Review the Groups Database Entry with CollMgr

Another function of CollMgr allows the grouping of collections for cross-collection searching. Any number of collection groups may be created for Findaid Class. Findaid Class supports a group with the groupid "all". It is not a requirement that all collections be in this group, though that's the basic idea. Groups are created and modified using CollMgr.

Make Collection Map

Collection mapper files exist to identify the regions and operators used by the middleware when interacting with the search forms. Each collection will need one, but most collections can use a fairly standard map file, such as the one in the samplefa collection. The map files for all Findaid Class collections are stored in $DLXSROOT/misc/f/findaid/maps.

You can find an example map file for the sample finding aids collection at $DLXSROOT/misc/f/findaid/maps/samplefa.map. Rather than modifying this file, you should copy it so that you always have an unmodified copy to which to refer.

You can use the following commands to copy the samplefa.map file to use as a basis for your collection:

 cd $DLXSROOT/misc/f/findaid/maps
 cp samplefa.map workshopfa.map


Map files contain mapped items where one term or name for the item is mapped to another term or name. For example, a term used by an HTML form to refer to a searchable region (e.g., "entire finding aid") can be mapped to an XPAT searchable region (e.g., EAD). For more general background on map files, see Working with Map Files


Currently, the format of the map files is XML and each collection map file conforms to a simple DTD (we have considered implementation of other possible ways of mapping terms, such as a database where one could map from one column's data to another). The middleware reads the map file into a TerminologyMapper object after which the CGI program can at any time request of the object the mappings for terms. Each mapped item and its various terms are contained within a <MAPPING> element.

Each mapping element in a map file consists of the following:

  • label: This element determines what will display in the user's browser when constructing searches. It must match the value used in the collmgr. (See step 2.)
  • synthetic: This element contains the variable name as it is used in the cgi.
  • native: The "native" element provides an appropriate XPAT search that the system will use to discover the appropriate content. The search may be simple (e.g., region EADID) or complex (e.g., ((region DID within region ARCHDESC) not within region DSC)).
  • nativeregionname: The element name itself, as it is indexed, without terms used in the XPAT search.

Map files take the language that is used in the forms and translate it into language for the cgi and for XPAT. For example, if you want your users to be able to search within names, you need to add a mapping for how you want headings and categories to appear in the search interface (case is important, as is pluralization!), how the cgi variable is set (usually in all caps, and not stepping on an existing variable), and how XPAT will identify and retrieve this natively (in XPAT search language). The first part of the map file is operator mapping, for the form, the cgi, and XPAT, and the second part is for region mapping. You might note that some of the fields that are defined in the map file correspond to some of the fabricated regions. Note: The larger the map file, the slower your site will run, so you don’t necessarily want to map everything, such as variations of singular and plural fields.

More Documentation


Set Up the Collection's Web Directory

Each collection may have a web directory with custom Cascading Style Sheets, interface templates, graphics, and javascript. The default is for a collection to use the web templates at $DLXSROOT/web/f/findaid. Collection specific templates and other files can be placed in a collection specific web directory, which is necessary if you have any customization at all. DLXS Middleware uses fallback to find HTML related templates, chunks, graphics, js and css files.

For a minimal collection, you will want two files: index.html and findaidclass-specific.css.

 mkdir -p $DLXSROOT/web/w/workshopfa
 cp $DLXSROOT/web/s/samplefa/index.html $DLXSROOT/web/w/workshopfa/index.html
 cp $DLXSROOT/web/s/samplefa/findaidclass-specific.css $DLXSROOT/web/w/workshopfa/findaidclass-specific.css

As always, we'll need to change the collection name and paths. You might want to change the look radically, if your HTML skills are up to it.
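
After editing the copied files, a recursive search can confirm that no reference to the old collection id is left behind. The sketch below is an illustration, not part of DLXS: the function name leftover_refs is hypothetical, and it relies on the -r option of grep, which is available in GNU and BSD grep but is not required by POSIX.

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): list lines in a copied web
# directory that still mention the old collection id.
# $1 = directory to scan, $2 = old collection id (e.g. samplefa)
# Output format is  file:line-number:line ; returns 0 when leftovers
# were found and 1 when the directory is clean.
leftover_refs() {
    grep -rnF -- "$2" "$1"
}
```

For this walkthrough the call would be leftover_refs $DLXSROOT/web/w/workshopfa samplefa. Each line it prints is a place where the collection name or a path may still need to be changed.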

Note that the browse link on the index.html page is hard-coded to go to the samplefa hard-coded browse.html page. You may want to change this to point to a dynamic browse page (see below). The url for the dynamic browse page is ".../cgi/f/findaid/findaid-idx?c=workshopfa;page=browse".

If you would prefer a dynamic home page, you can copy and modify the home.xml and home.xsl files from $DLXSROOT/web/f/findaid/. Note that they are currently set up to be the home page for all finding aids collections, so you will have to do some considerable editing. However they contain a number of PIs that you may find useful. In order to have these pages actually be used by DLXS, they have to be present in your $DLXSROOT/web/w/workshopfa/ directory and there can't be an index.html page in that directory. The easiest thing to do, if you have an existing index.html page is to rename it to "index.html.foobar" or something.

Create a browse page

See the documentation: Setting up Dynamic Browsing

Try It Out

http://path_to_cgi/cgi/f/findaid/findaid-idx?c=workshopfa

Troubleshooting Finding Aids

Linking from Finding Aids Using ID Resolver

Workshop Materials

Working with the User Interface

General user interface customizations, such as changing rendering style (CSS) or making changes to the XSL are covered in Customizing the User Interface. Specific user-interface issues related to Findaid Class are discussed in the following sections:

Findaid Class Graphics Files

Are there findaid class specific graphics files? The existing html docs actually point to a ../t/text/ directory and it appears that the graphics are generic and not at all specific to findaid class.

Findaid Class Processing Instructions

These are some current processing instructions for Finding Aids Class, but the DLXS group will not maintain this section.
