Finding Aids Workshop Quick Reference
Return to main workshop page: http://dev.umdl.umich.edu/d/dlxs/training/workshop200808
Prepare Directories and Copy Files
Set up directories and files for Data Preparation
For more details see: Step by step instructions for setting up Directories for Data Preparation
To check your $DLXSROOT, type the following command at the command prompt:
echo $DLXSROOT
Create your prep and prep/data directories:
mkdir -p $DLXSROOT/prep/w/workshopfa/data
cd $DLXSROOT/prep/w/workshopfa
Copy data to your data directory:
cp $DLXSROOT/prep/s/samplefa/data/*.xml $DLXSROOT/prep/w/workshopfa/data/.
Copy doctype declaration files:
cp $DLXSROOT/prep/s/samplefa/samplefa.ead2002.dcl $DLXSROOT/prep/w/workshopfa/workshopfa.ead2002.dcl
cp $DLXSROOT/prep/s/samplefa/samplefa.concat.ead.dcl $DLXSROOT/prep/w/workshopfa/workshopfa.concat.ead.dcl
Create the obj and bin directories and copy files to your bin directory:
mkdir -p $DLXSROOT/obj/w/workshopfa
mkdir -p $DLXSROOT/bin/w/workshopfa
cp $DLXSROOT/bin/s/samplefa/preparedocs.pl $DLXSROOT/bin/w/workshopfa/preparedocs.pl
cp $DLXSROOT/bin/s/samplefa/Makefile $DLXSROOT/bin/w/workshopfa/Makefile
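The directories above all follow one pattern: the first letter of the collection name, then the name itself. As a minimal sketch of that pattern, the mkdir commands could be wrapped in a small shell function. The function name make_collection_dirs is hypothetical and is not part of DLXS.

```shell
# Hypothetical helper, not part of DLXS: create the prep, obj, and bin
# directories for a collection under a given DLXS root.
make_collection_dirs() {
    root=$1                                   # e.g. "$DLXSROOT"
    name=$2                                   # e.g. workshopfa
    letter=$(printf '%s' "$name" | cut -c1)   # first letter, e.g. w
    mkdir -p "$root/prep/$letter/$name/data" \
             "$root/obj/$letter/$name" \
             "$root/bin/$letter/$name"
}
```

For this workshop it would be called as: make_collection_dirs "$DLXSROOT" workshopfa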
Make sure your copy of the Makefile refers to /w/workshopfa instead of /s/samplefa and that $DLXSROOT is set correctly in it. Change lines 1-3 accordingly:
1 DLXSROOT = /l1
2 NAMEPREFIX = samplefa
3 FIRSTLETTERSUBDIR = s
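If you prefer not to edit the Makefile by hand, the change to the NAMEPREFIX and FIRSTLETTERSUBDIR lines can be sketched with sed. This assumes each variable starts its own line in the "NAME = value" form shown above (the leading 1, 2, 3 being display line numbers only); the function name retarget_makefile is hypothetical. The DLXSROOT line is left alone because its correct value depends on your installation.

```shell
# Hypothetical helper: rewrite the NAMEPREFIX and FIRSTLETTERSUBDIR lines of a
# copied Makefile. Assumes each variable sits at the start of its own line.
retarget_makefile() {
    file=$1                                   # path to the copied Makefile
    name=$2                                   # new collection name
    letter=$(printf '%s' "$name" | cut -c1)   # first letter of the name
    # Write to a temporary file first, then replace the original on success.
    sed -e "s|^NAMEPREFIX[[:space:]]*=.*|NAMEPREFIX = $name|" \
        -e "s|^FIRSTLETTERSUBDIR[[:space:]]*=.*|FIRSTLETTERSUBDIR = $letter|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For this workshop it would be called as: retarget_makefile $DLXSROOT/bin/w/workshopfa/Makefile workshopfa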
Set Up Directories and Files for XPAT Indexing
For more details see: Set Up Directories and Files for XPAT Indexing
mkdir -p $DLXSROOT/idx/w/workshopfa
cp $DLXSROOT/prep/s/samplefa/samplefa.blank.dd $DLXSROOT/prep/w/workshopfa/workshopfa.blank.dd
cp $DLXSROOT/prep/s/samplefa/samplefa.extra.srch $DLXSROOT/prep/w/workshopfa/workshopfa.extra.srch
Both of these files need to be edited to reflect the new collection name and the paths to your particular directories.
cd $DLXSROOT/prep/w/workshopfa
Edit the files to change every occurrence of samplefa to workshopfa and of s/samplefa to w/workshopfa.
After editing the files, you can check to make sure you changed all the "samplefa" strings with the following command:
grep -l "samplefa" $DLXSROOT/prep/w/workshopfa/*
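The editing described above can also be sketched as a sed substitution. This is only one possible approach, and the function name rename_collection is hypothetical. It assumes the directory form always appears with a leading slash (/s/samplefa); keep a backup of the files if you are unsure.

```shell
# Hypothetical helper: rewrite collection paths and names in the given files.
# The path form (/s/samplefa) is replaced first so that the bare-name
# substitution that follows does not leave a stale "s/" directory in place.
rename_collection() {
    for f in "$@"; do
        sed -e 's|/s/samplefa|/w/workshopfa|g' \
            -e 's|samplefa|workshopfa|g' \
            "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
```

For this workshop it would be called from $DLXSROOT/prep/w/workshopfa as: rename_collection workshopfa.blank.dd workshopfa.extra.srch
Running the grep check afterwards should then list no files.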
Data Preparation
Validating and Normalizing Your Data
Step 1: Validating the files individually against the EAD 2002 DTD
cd $DLXSROOT/bin/w/workshopfa
make validateeach
Check for error files by running the following command:
ls -l $DLXSROOT/prep/w/workshopfa/data/*err
If there are any *err files, you can look at them with the following command:
less $DLXSROOT/prep/w/workshopfa/data/*err
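If the validation step happens to leave an error file behind even for documents that validate cleanly (an assumption; this page does not say so), it can help to list only the error files that contain something. A minimal sketch, with the hypothetical function name list_nonempty_err:

```shell
# Hypothetical helper: print the name of every non-empty *err file in a
# directory. Assumes error output lands in files whose names end in "err".
list_nonempty_err() {
    dir=$1
    for f in "$dir"/*err; do
        # -s is true only for files that exist and are larger than zero
        # bytes; it also skips the unexpanded pattern when nothing matches.
        if [ -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For this workshop it would be called as: list_nonempty_err $DLXSROOT/prep/w/workshopfa/data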
Step 2: Concatenating the files into one larger XML file (and running some preprocessing commands)
cd $DLXSROOT/bin/w/workshopfa
make prepdocs
Step 3: Validating the concatenated file against the dlxsead2002 DTD
make validate
Check for errors by running the following command:
ls -l $DLXSROOT/prep/w/workshopfa/workshopfa.errors
If there is a workshopfa.errors file, run the following command to look at the errors reported:
less $DLXSROOT/prep/w/workshopfa/workshopfa.errors
Step 4: Normalizing the concatenated file
make norm
Check for normalization errors:
less $DLXSROOT/prep/w/workshopfa/workshopfa.osgmlnorm.errors
Step 5: Validating the normalized file against the dlxsead2002 DTD
make validate2
Check the resulting error file:
less $DLXSROOT/prep/w/workshopfa/workshopfa.errors2
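Once the individual steps are familiar, the five make targets above can be run in sequence. The sketch below stops at the first target for which make exits with a non-zero status. The function name run_targets and the MAKE_CMD variable are hypothetical. Whether these targets exit non-zero on validation errors is not stated here, so the error files still need to be checked after each step.

```shell
# Hypothetical helper: run the given make targets in order, stopping at the
# first one whose make invocation exits non-zero.
# MAKE_CMD is an assumed override hook (useful for dry runs); default is make.
run_targets() {
    for target in "$@"; do
        echo "== make $target"
        "${MAKE_CMD:-make}" "$target" || return 1
    done
}
```

For this workshop it would be called from $DLXSROOT/bin/w/workshopfa as: run_targets validateeach prepdocs validate norm validate2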
Indexing
Step by Step Instructions for Indexing
Step 1: Indexing the text
Index all the words in the file of concatenated EADs with the following command:
cd $DLXSROOT/bin/w/workshopfa
make singledd
Step 2: Indexing the XML
Index all the elements and attributes listed in the EAD DTD that occur in the file of concatenated EADs by running the following command:
make xml
After running this step, if you wish, you can see the indexed regions by issuing the following commands:
xpatu $DLXSROOT/idx/w/workshopfa/workshopfa.dd
>> {ddinfo regionnames}
>> quit
You can also test the XPAT queries in your workshopfa.extra.srch file. See Testing Fabricated Regions.
Step 3: Configuring fabricated regions
Once you have run "make xml", but before you run "make post", start xpatu against the newly created indexes:
xpatu $DLXSROOT/idx/w/workshopfa/workshopfa.dd
Then run the command:
>> {ddinfo regionnames}
This will give you a list of all the indexed XML elements and attributes.
Step 4: Indexing fabricated regions
Index the fabricated regions specified in your workshopfa.extra.srch that occur in the file of concatenated EADs with the following command:
make post