DLXS to DC
From DLXS Documentation
Converting DLXS data to DC
This page explains how to convert DLXS collections to Dublin Core (DC) so that they can be loaded into the OAI database table for the UMProvider. Depending on your data, the scripts and stylesheets may need some adjustment before they work for you. They can be found in $DLXSROOT/bin/o/oai/provider/ (after release 14).
Extract Headers
ExtractHeaders.pl extracts the headers from XPAT for the given collections and places the extracted data in $DLXSROOT/bin/o/oai/headers/.
# ./ExtractHeaders.pl
USAGE:
    -h  this usage message
    -c  the full path of xml file that contains a list of collections to convert (required).
    -u  update time: only process collections which have been updated since this given time

The xml file is assumed to be in the format <COLLS><COLL>collectionName</COLL></COLLS>
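The collection list passed with -c can be written by hand, but generating it avoids malformed XML. The following is a minimal Python sketch that produces a file in the <COLLS><COLL>collectionName</COLL></COLLS> format described above. The function names and the collection names "moa" and "amverse" are illustrative assumptions, not part of the DLXS release.

```python
import xml.etree.ElementTree as ET


def build_collection_list(collection_names):
    """Return a <COLLS> element with one <COLL> child per collection name."""
    root = ET.Element("COLLS")
    for name in collection_names:
        coll = ET.SubElement(root, "COLL")
        coll.text = name
    return root


def write_collection_list(collection_names, path):
    """Write the collection list to `path` as UTF-8 XML (no XML declaration)."""
    tree = ET.ElementTree(build_collection_list(collection_names))
    tree.write(path, encoding="utf-8")


# Hypothetical collection names, for illustration only.
example = build_collection_list(["moa", "amverse"])
print(ET.tostring(example, encoding="unicode"))
# prints: <COLLS><COLL>moa</COLL><COLL>amverse</COLL></COLLS>
```

The resulting file path is what would be supplied to the -c option.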
ConvertToDc.pl
The ConvertToDc.pl script takes the XSLT stylesheets (one each for Text Class, Bib Class, and Text Class articles), the directory containing the extracted header XML files, and an XML file listing the collections that should be converted.
# ./ConvertToDc.pl
USAGE:
    -h  this usage message
    -t  the full path of the xsl file that does the Text class to dc transformation (required)
    -b  the full path of the xsl file that does the Bib class to dc transformation (required)
    -a  the full path of the xsl file that does the transformation from text class to articles to dc (required)
    -c  the full path of xml file that contains a list of collections to convert (required).
    -d  the directory that contains the header XML files to parse (required).

The collections xml file is assumed to be in the format <COLLS><COLL>collectionName</COLL></COLLS>.
The headers xml file is assumed to be in the format <RSet>..<HEADER>...</HEADER>...</RSet>.
(It will contain extra xpat wrappers between the <RSet> and <HEADER> tags.)

Converted files (in DC) are stored in '$DLXSROOT/prep/o/oai/provider'

Example:
    ./ConvertToDc.pl -c listOfColls.xml -t textClassToDc.xsl -b bibClassToDc.xsl -a articlesToDc.xsl -d /l1/prep/o/oai/headers/
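Before running the conversion it can help to confirm that a header file really has the <RSet>...<HEADER>...</HEADER>...</RSet> shape described above. The Python sketch below pulls out every <HEADER> element regardless of how many wrapper levels sit above it. The sample data is made up: "Wrapper" stands in for the XPAT wrapper elements, whose real names are not given here, and the TITLE child is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Made-up sample; "Wrapper" and "TITLE" are placeholder element names.
SAMPLE = """<RSet>
  <Wrapper><HEADER><TITLE>First</TITLE></HEADER></Wrapper>
  <Wrapper><HEADER><TITLE>Second</TITLE></HEADER></Wrapper>
</RSet>"""


def extract_headers(xml_text):
    """Return all <HEADER> elements found anywhere below the <RSet> root."""
    root = ET.fromstring(xml_text)
    if root.tag != "RSet":
        raise ValueError(f"expected <RSet> root element, found <{root.tag}>")
    # iter() walks all descendants in document order, so wrapper depth is irrelevant.
    return list(root.iter("HEADER"))


headers = extract_headers(SAMPLE)
print(len(headers), "header(s) found")
```

A file that yields zero headers, or that raises an error here, is unlikely to convert cleanly.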
The resulting data should end up in $DLXSROOT/prep/o/oai/provider/. From there you can use the LoadOai.pl script to load that XML data into the OAI tables.
USAGE:
    -h  this usage message
    -c  the full path of xml file that contains a list of collections to convert (required).
    -u  update time: only process collections which have been updated since this given time

The xml file is assumed to be in the format <COLLS><COLL>collectionName</COLL></COLLS>
XSLT
The following XSLT stylesheets are provided as examples:
- textClassToDc.xsl - Stylesheet that does the XML transformation from Text Class to DC.
- bibClassToDc.xsl - Stylesheet that does the XML transformation from Bib Class to DC. This is used for static collections. It takes the series title and puts it in a dc:source tag.
- articlesToDc.xsl - Stylesheet that does the XML transformation from Text Class serial collections to DC.
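To make the kind of mapping these stylesheets perform more concrete, here is a rough Python analogue of a header-to-DC transformation. It is not a translation of any of the stylesheets above: the header paths in FIELD_MAP and the sample header are hypothetical, and the real mappings live in the XSLT files. Only the two namespace URIs (Dublin Core elements and the OAI oai_dc container) are standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Hypothetical mapping from header paths to DC element names.
FIELD_MAP = [
    ("FILEDESC/TITLESTMT/TITLE", "title"),
    ("FILEDESC/TITLESTMT/AUTHOR", "creator"),
    ("FILEDESC/PUBLICATIONSTMT/PUBLISHER", "publisher"),
]


def header_to_dc(header):
    """Build an oai_dc:dc record from a header element using FIELD_MAP."""
    record = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for path, dc_name in FIELD_MAP:
        for node in header.findall(path):
            # Join nested text so inline markup inside a field is not lost.
            text = "".join(node.itertext()).strip()
            if text:
                ET.SubElement(record, f"{{{DC_NS}}}{dc_name}").text = text
    return record


# Made-up header, for illustration only.
SAMPLE_HEADER = ET.fromstring(
    "<HEADER><FILEDESC><TITLESTMT>"
    "<TITLE>An Example Title</TITLE><AUTHOR>Doe, Jane</AUTHOR>"
    "</TITLESTMT></FILEDESC></HEADER>"
)
dc_record = header_to_dc(SAMPLE_HEADER)
for child in dc_record:
    print(child.tag.split("}")[1], "=", child.text)
```

Fields missing from a header are simply skipped, which mirrors how a stylesheet template emits nothing when its match finds no nodes.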