DLXS to DC

From DLXS Documentation

Revision as of 16:09, 5 June 2008

Converting DLXS data to DC

Release_14:

This page explains how to convert DLXS collections to Dublin Core (DC) so that they can be loaded into the OAI database tables for the UMProvider. Depending on your data, these scripts and stylesheets may need some adjustment to work for you. The stylesheets and scripts can be found in $DLXSROOT/bin/o/oai/provider/ (after release 14).

Extract Headers

ExtractHeaders.pl extracts the headers from XPAT for the given collections and places the extracted data in $DLXSROOT/bin/o/oai/headers/.

<pre>
# ./ExtractHeaders.pl
USAGE:
	-h this usage message
	-c the full path of xml file that contains a list of collections to convert (required).
	-u update time: only process collections which have been updated since this given time

The xml file is assumed to be in the format <COLLS><COLL>collectionName</COLL></COLLS>
</pre>

ConvertToDc.pl

The ConvertToDc.pl script takes three XSLT stylesheets (-t, -b, -a), the directory containing the extracted header XML files (-d), and an XML file listing the collections to convert (-c).

<pre>
# ./ConvertToDc.pl
USAGE:
	-h this usage message
	-t the full path of the xsl file that does the Text class to dc transformation (required)
	-b the full path of the xsl file that does the Bib class to dc transformation (required)
	-a the full path of the xsl file that does the transformation from text class to articles to dc (required)
	-c the full path of xml file that contains a list of collections to convert (required).
	-d the directory that contains the header XML files to parse (required).

The collections xml file is assumed to be in the format <COLLS><COLL>collectionName</COLL></COLLS>.
The headers xml file is assumed to be in the format <RSet>..<HEADER>...</HEADER>...</RSet>.
(It will contain extra xpat wrappers between the <RSet> and <HEADER> tags.)
Converted files (in DC) are stored in '$DLXSROOT/prep/o/oai/provider'

Example: ./ConvertToDc.pl -c listOfColls.xml -t textClassToDc.xsl -b bibClassToDc.xsl -a articlesToDc.xsl -d /l1/prep/o/oai/headers/
</pre>

The resulting data should end up in $DLXSROOT/prep/o/oai/provider/. From there you can use the LoadOai.pl script to load that XML data into the OAI tables.
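The steps described above (writing a collection list in the <COLLS><COLL>collectionName</COLL></COLLS> format and calling ConvertToDc.pl with its required options) could be driven from a small helper script. The following is only a sketch in Python, not part of the DLXS distribution: the function names, the collection names and the `xsl_dir` argument are hypothetical, and only the flags shown in the usage text are used.

```python
import os
import xml.etree.ElementTree as ET


def collection_list_xml(collections):
    """Return the collection list as <COLLS><COLL>name</COLL>...</COLLS>."""
    root = ET.Element("COLLS")
    for name in collections:
        ET.SubElement(root, "COLL").text = name
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


def convert_command(colls_xml, headers_dir, xsl_dir, script="./ConvertToDc.pl"):
    """Assemble the ConvertToDc.pl argument list from the documented flags.

    xsl_dir is a hypothetical directory holding the three example stylesheets.
    """
    return [
        script,
        "-c", colls_xml,
        "-t", os.path.join(xsl_dir, "textClassToDc.xsl"),
        "-b", os.path.join(xsl_dir, "bibClassToDc.xsl"),
        "-a", os.path.join(xsl_dir, "articlesToDc.xsl"),
        "-d", headers_dir,
    ]
```

To use it, write the string returned by `collection_list_xml` to a file such as listOfColls.xml, then pass the list from `convert_command` to `subprocess.run(..., check=True)` from the directory that contains the script. Building the arguments as a list avoids shell quoting problems when paths contain spaces.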

XSLT

The following XSLT stylesheets are provided as examples:

  • textClassToDc.xsl - Stylesheet that does the XML transformation from Text Class to DC
  • bibClassToDc.xsl - Stylesheet that does the XML transformation from Bib Class to DC. This is used for static collections. It takes the series title and puts it in a dc:source tag.
  • articlesToDc.xsl - Stylesheet that does the XML transformation from Text Class serial collections to DC.
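
To give a rough idea of the kind of mapping these stylesheets perform, the sketch below turns one header into an oai_dc record. It is written in Python for illustration only; the actual conversion is done by the XSLT files listed above. The header element paths in `FIELD_MAP` (a TEI-style FILEDESC with TITLESTMT and SERIESSTMT) and the function name are assumptions, not taken from the stylesheets. Only the idea of putting a series title into dc:source comes from the description of bibClassToDc.xsl.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI Dublin Core records
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# Assumed header paths (hypothetical) mapped to DC element names
FIELD_MAP = [
    ("FILEDESC/TITLESTMT/TITLE", "title"),
    ("FILEDESC/TITLESTMT/AUTHOR", "creator"),
    ("FILEDESC/SERIESSTMT/TITLE", "source"),  # series title goes to dc:source
]


def header_to_dc(header_xml):
    """Build an oai_dc:dc element from a single <HEADER> string."""
    header = ET.fromstring(header_xml)
    record = ET.Element("{%s}dc" % OAI_DC)
    for path, dc_name in FIELD_MAP:
        for node in header.findall(path):
            # Join nested text so inline markup inside a field is not lost
            text = "".join(node.itertext()).strip()
            if text:
                ET.SubElement(record, "{%s}%s" % (DC, dc_name)).text = text
    return record
```

Calling `ET.tostring(header_to_dc(header_string), encoding="unicode")` serializes the record. Fields that are missing from a header are simply skipped, so the same mapping table can be tried against headers of varying completeness.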