Release Notes for Current DLXS Release

From DLXS Documentation

(Difference between revisions)
(ImageClass)
(Oai)
Line 96: Line 96:
'''cgi/b/bib'''
'''cgi/b/bib'''
-
===Oai===  
+
===Oai===
-
 
+
-
'''cgi/o/oai/'''
+
-
* oai, oai.cfg, oai_conf.xml, sample_config.xml, UMProvider.pm
+
-
** new -- new OAI data provider and accompanying scripts
+
-
 
+
-
'''bin/o/oai/'''
+
-
 
+
-
* ConfirmPublicDomain.pl, loadOai.pl, mbooks_harvest_cron.pl, mbooks_update.pl, oai_conf.xml, OaiList.pl, OaiToDb.pl, RepositoryConfig.xml, revertOaiTbl.pl, updateMbooksOai.pl
+
-
** new -- new OAI data provider and accompanying scripts
+
-
 
+
-
'''bin/o/oai/provider'''
+
-
 
+
-
* AddCollToSource.pl, articlesToDc.xsl, bibClassToDc.xsl, CheckModifiedColls.pl, collmgrFullList.txt, ConvertToDc.pl, ConvertToUTF8.pl, dlxs_to_oaidc_cron.pl, exampleColls.xml, ExtractHeaders.pl, GenerateReport.pl, GetNewCollections.pl, LoadDB.pl, NewOAICollection.jpg, oai_new_coll_history.txt, oai_update_history_text.txt, oai_update.pl, README.txt, RunDlpsOaiConversion.pl, RunOaiConversion.jpg, RunOaiConversion.pl, textClassToDc.xsl, umprovider_flow_diagrams_and_examples.ppt, UMProviderFlow.jpg, updateCollConfig.pl
+
-
** new -- scripts to transform text, bib, and image class collections into oai_dc for use by the OAI data provider. See the README.txt file in this directory for a brief description of each file
+
-
 
+
-
'''bin/o/oai/provider/logs'''
+
-
 
+
-
* This directory houses the logs generated by the scripts in bin/o/oai/provider
+
-
 
+
-
'''bin/o/oai/provider/reports-new-colls'''
+
-
 
+
-
* This directory houses the reports generated by bin/o/oai/provider/GetNewCollections.pl
+
-
 
+
-
'''bin/o/oai/provider/reports-oai-update'''
+
-
 
+
-
* This directory houses the reports generated by bin/o/oai/provider/GenerateReport.pl
+
-
 
+
-
'''bin/i/image/'''
+
-
 
+
-
* image2oai_dc.pl
+
-
** new -- script for loading OAI tables with Image Class data in MySQL
+
-
 
+
-
'''bin/o/oaister/'''
+
-
 
+
-
* RepositoryConfig.cfg
+
-
** new -- contains baseURLs and OAITransFixer.pm subscript calls
+
-
 
+
-
* oaitransform/CollsObj.cfg
+
-
** new -- config for CollsObj.pm
+
-
 
+
-
* oaitransform/CollsObj.pm
+
-
** new -- used for FMPro connection for OAIster
+
-
 
+
-
* oaitransform/DataConditioning.pm
+
-
** Get rid of any remaining stray sp0t tags
+
-
 
+
-
* oaitransform/MODSTransform
+
-
** Fix bug where MODS record was not ignored for re-exposure if no URL in record
+
-
** Changed the tree removing metadata from the re-exposed XML, changed the language code mapping, added setSpec to about container
+
-
** Swap dates in about container and change MODS loc.gov URL
+
-
** Remove NS prefix and NS from metadata element
+
-
** Remove all xmlns attributes as well
+
-
** Add all orig setSpec to provenance element for MODS re-exposure
+
-
** Hack to fix xsi: problem for Northwestern
+
-
** Change to pick up URLs in identifier type=uri
+
-
 
+
-
* oaitransform/OAITransFixer.pm
+
-
** new -- perl subscripts for fixing harvested data
+
-
 
+
-
* oaitransform/OAITransform
+
-
** Added flag to skip DTD validation change, the PostXslt process to be after each small file, not the final large one
+
-
** Added code to regenerate the perl config files from the XML config files each time this is run
+
-
** Comment out LoadRepositoryLookTable() and changed SetCurrentArchive() to use XML config value
+
-
** Move removing attribute to after data conditioning
+
-
** Patch to fix latin1 regex problem
+
-
** Unescape hex and dec entity refs
+
-
** Added fixer call and other minor changes
+
-
** Skip deleted records
+
-
** Add skipped and deleted to repo count
+
-
** Get the return from PreprocessXML bug fix
+
-
** Semi opt for hex and dec ent refs
+
-
** Added CollsDB
+
-
 
+
-
* oaitransform/browse.tpl
+
-
** new -- template for building OAIster browse pages
+
-
 
+
-
* oaitransform/manageCollsDb.pl
+
-
** new -- CGI script for managing FMPro DB for oaister
+
-
 
+
-
* oaitransform/mods-bibclass.xsl
+
-
** Added copyrightDate YR
+
-
** urls in own URL tags
+
-
** Change to pick up URLs in identifier type=uri
+
-
 
+
-
* oaitransform/mods_repositoryNames.pl
+
-
** Fixed amp;
+
-
 
+
-
* oaitransform/oai-bibclass3.xsl
+
-
** Added tests for empty elements
+
-
 
+
-
* oaitransform/repositoryNames.pl
+
-
** Add & fix
+
-
 
+
-
* scripts/Batch_UMHarvest
+
-
** Add skip HTML flag for transform
+
-
** Allow -x for transform
+
-
** Added -a for incremental batch harvesting
+
-
 
+
-
* scripts/ListRecords
+
-
** Error code checking
+
-
** Do not exit on OAI error
+
-
** Fixed log reporting
+
-
** Make list of dirs, not files, to check for replacement records -- speed up incremental
+
-
 
+
-
* scripts/UMHarvester
+
-
** Changed to use the XML config file; requires XML::LibXML now
+
-
** Added wait flag
+
-
** Check for sets cfg before trying to parse it, only loop on sets for list records
+
-
** ID bug fix
+
-
 
+
-
* scripts/browse_check.pl
+
-
** new -- checking FMPro repositories against browse pages
+
-
 
+
-
* scripts/id_check.pl
+
-
** new -- checking FMPro repositories against incremental batch harvest
+
-
 
+
-
* scripts/rc_check.pl
+
-
** new -- checking FMPro repositories against RepositoryConfig.cfg
+
-
 
+
-
* scripts/restoreHarvestedData.pl
+
-
** new -- restores backup copy of repository dir
+
-
 
+
-
* scripts/startIndex.pl
+
-
** new -- runs xpat and multirgn on xml obj files
+
===IdResolver===
===IdResolver===

Revision as of 11:37, 19 October 2010


General Information

TextClass is substantially identical to release 14 except for enhancements and bug fixes as noted below. ImageClass provides new image viewing functionality. FindaidClass improves handling of the EAD DTD and includes subject browsing. BibClass is unchanged and is being de-emphasized.

Release 15 is comprised of:

Known Problems

  • None

Database Installation Notes

MySQL is the supported database type. To run DLXS you need a MySQL server installed. Sample data is delivered as a MySQL dump file that can be imported directly into a MySQL database. The database upgrade script (upgrade_6_7) operates only on a MySQL database. These issues are documented in detail in the installation instructions and the upgrade instructions.

New and Changed Functionality

XPAT

  • No changes.

Lib

TextClass

web/t/text

  • some file name
    • message

ImageClass

CGI/Middleware

Known Problems

  • None known so far.

Enhancements


Data Preparation

Enhancements

BibClass

cgi/b/bib

Oai

IdResolver

cgi/i/idresolver/

  • idresolver
    • CGI script that returns a URL (marked up in XML) for an ID
  • idresolver-nr
    • new -- CGI script that redirects a user from a persistent URL to one defined in the nameresolver database for an ID, can be used instead of cgi/b/bib/bibperm
  • idresolver-srch
    • new -- CGI script that takes a text file of ids (one per line) and checks to see if they exist in the nameresolver database
  • IDResolver.pm
    • Perl module that is used by each of the cgi scripts

bin/i/idresolver/

  • cvstag.idresolver, rdist.class

bin/n/nameresolver/

  • IdParser.pl
    • new -- script used to populate the nameresolver database
  • LoadLLMCIds.pl
    • new -- sample script showing how to use a .csv file to populate the nameresolver database
  • CreateTable.pl, NRTable.sql
    • new -- scripts for creating nameresolver table in database (which is already included in the release)
  • new -- other sample files: DeepBlueShortIds.txt, getUpdatedNRfiles, InsertCrosscIssues.pl, N2TestExecution.txt, NRDevVsProdComparison.xls, README.txt, RemoveDeepBlueConflicts.pl, statusLoadingNRinProd.txt, TestIdParser.pl
    • These files were used for DLPS collections and are provided only as additional resources. They are not necessary to the Idresolver/nameresolver configuration

broker20

  • No changes.

Collmgr

  • Supports version 7.0 database for DLXS release 14.
  • Default browse page for Text and Findaid Class is now set from the first item in the Browsefields list.

FindaidClass

Findaid Class Summary

Prep scripts

Data prep scripts have been reorganized and renamed.

  • New functionality for Makefile and preparedocs.pl
  • New script to setup new collections

New prep script: setup_newcoll

$DLXSROOT/bin/f/findaid/setup_newcoll can be used to set up directories for new collections. For example, to set up the workshopfa collection based on samplefa (assuming your $DLXSROOT variable is set), run this command:

 $DLXSROOT/bin/f/findaid/setup_newcoll -c workshopfa  -s $DLXSROOT/prep/s/samplefa/data 

More information on the setup_newcoll script is available from its man page:

$DLXSROOT/bin/f/findaid/setup_newcoll --man

New options for preparedocs.pl

The $DLXSROOT/bin/s/samplefa/preparedocs.pl script now takes several new arguments.

  ./preparedocs.pl --man   will give details.

Of particular interest is the preparedocs.pl -i inputfilelist option, which lets you specify a file containing the full paths to the finding aids you wish to index. This option can be used instead of the previous default behavior, which was to recursively search the data directory and index all files in that directory.
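As a minimal sketch of preparing such a list, the shell function below collects full paths with find. It assumes the list file holds one absolute path per line and that the finding aids are stored as .xml files; both are assumptions, so check ./preparedocs.pl --man for the actual format. The function name make_ead_list and its arguments are hypothetical and not part of DLXS.

```shell
#!/bin/sh
# Hypothetical helper (not part of DLXS): write the full paths of all
# .xml files under a directory to a list file, one path per line,
# sorted so the output is stable between runs.
# Assumption: preparedocs.pl -i expects one absolute path per line.
make_ead_list() {
    src_dir=$1     # directory holding the finding aids
    list_file=$2   # file to write the list to
    # Resolve to an absolute path so every entry is a full path.
    abs_dir=$(cd "$src_dir" && pwd) || return 1
    find "$abs_dir" -type f -name '*.xml' | sort > "$list_file"
}
```

The resulting file can then be edited by hand to drop any finding aids that should not be indexed before it is passed to the -i option.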

Changes to Makefile

Addition of these new targets:

prepdocslist
Same as prepdocs, but passes the -i inputfilelist option to preparedocs.pl and by default reads the file $DLXSROOT/prep/c/collection/list_of_eads.
allbutprep
Convenience target for use in conjunction with prepdocslist; it runs all the make steps except make prepdocs.
index
Convenience target that runs all 3 indexing steps.
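Going by the target descriptions above, indexing only the finding aids named in list_of_eads might look like the following sketch. The order (prepdocslist first, then allbutprep) is an assumption drawn from those descriptions, and the commands are assumed to be run from the directory containing the collection's Makefile. The run wrapper, the DRY_RUN variable, and index_listed_eads are hypothetical helpers, added only so the sequence can be previewed without executing make.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of DLXS): print each command, and
# execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Assumed order: prepare only the listed finding aids, then run the
# remaining make steps. Stops at the first step that fails.
index_listed_eads() {
    run make prepdocslist && run make allbutprep
}
```

Setting DRY_RUN=1 before calling index_listed_eads prints the two make invocations without running them.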

Changes to prep scripts

$DLXSROOT/bin/s/samplefa
    Moved generic bin files to f/findaid
    Makefile modified
    preparedocs.pl has new options
$DLXSROOT/prep/s/samplefa
    Renamed:
        samplefa.text.inp to samplefa.ead2002.dcl
        samplefa.xml.inp to samplefa.concat.ead.dcl
    Removed:
        samplefa.inp
    Added:
        list_of_eads
$DLXSROOT/bin/f/findaid
    Removed:
        catsourcefiles.pl
        isolat128bit.pl
        validate.pl
    Added/moved from bin/s/samplefa:
        fixdoctype.pl
        stripdoctype.pl
        validateeach.sh
        setup_newcoll

web/f/findaid

  • bookbagitemsstring_debug.xsl
    • For debugging in Oxygen, since the Oxygen debugger uses Saxon and Saxon doesn't understand the extensions
  • browse.xsl
    • Added code for subject browse
  • browseheader.xsl
    • bulk dates labelled
  • htmlhead.xsl
    • emit full path to main XML template for <TemplateName> element
    • changed TemplateName to TemplatePath
  • text.components.xsl
    • Modified the template for processing c0x's so that if there are two containers within a //c0x/did/, such as box/folder, they both show up in the proper column.
    • Highlighting fix. Replaced about 36 instances of <xsl:value-of select="."/> with <xsl:apply-templates select="*|text()"/>. If value-of select gets highlighted text in the context node, such as "text<HIGHLIGHT>text</HIGHLIGHT>text text", it ignores the <HIGHLIGHT> elements and just renders all the text. The apply-templates "*", on the other hand, matches the <HIGHLIGHT> element and triggers the appropriate highlighting template, and the text() nodes are passed to the template that just outputs the text.
    • Fixed code in template match=unittitle mode=SimpleUnittitle because the foreach was messing with the context. Also changed the code for handling notes/scopecontent in match=C01|c02... because it was an xsl:choose, but notes and scopecontent are not mutually exclusive.
    • Changed code for handling notes/scopecontent in match=C01|c02... so that both did/note and note, and both did/scopecontent and scopecontent, are now rendered.
    • Fixed bug in template match="list" where there was a foreach and then value-of select *|text() that needed to be value-of select="."
    • space before unitdate value
    • title styling
    • optional labels for additional descriptive material
    • hide sorting title
    • sponsor
    • Restored ADD to full-text view
    • abstracts in dids
    • Changed template for index mode=add so it doesn't produce 2 copies of any <head> text. The key is to keep the apply-templates that runs after the head has been processed from processing the head again: <xsl:apply-templates select="*[not(self::head)]"/>

bin/f/findaid

  • catsourcefiles.pl
    • No longer used for samplefa (replaced by preparedocs.pl). It also contains bhl-specific code. For bhl, use the copy in $DLXSROOT/bin/b/bhlead, which is in cvs.
  • fixdoctype.pl
    • Moved here from bin/s/samplefa. No collection-specific customization. It should work on any EAD that conforms to the ead2002.dtd.
  • isolat128bit.pl
    • No longer used for findaids, since all findaids should be utf8 encoded
  • setup_newcoll
    • New script to set up directories; run ./setup_newcoll --man for details.
  • stripdoctype.pl
    • This library file is now used by preparedocs.pl and validateeach.sh (through $DLXSROOT/bin/s/samplefa/fixdoctype.pl) to correctly remove multiple-line DOCTYPE declarations and any entity references contained within them. It replaces the one-line perl program previously used by those two programs.
  • validate.pl
    • Removed outdated file that worked on sgml files. FindaidClass is now exclusively xml.
  • validateeach.sh
    • Moved here from bin/s/samplefa. No collection-specific customization. It should work on any EAD that conforms to the ead2002.dtd.
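The stripdoctype.pl entry above concerns DOCTYPE declarations that span several lines, which a line-at-a-time substitution cannot remove cleanly. The sketch below illustrates the general idea only; it is not the code in stripdoctype.pl. It assumes a single DOCTYPE declaration per file and an internal subset that contains no literal "]" character. The function name strip_doctype is hypothetical.

```shell
#!/bin/sh
# Illustrative only: this is NOT the code in stripdoctype.pl.
# Reads XML on stdin and writes it to stdout with the first DOCTYPE
# declaration removed, including an optional internal subset in
# square brackets that may span several lines.
strip_doctype() {
    awk '
        # Collect the whole input so the pattern can match across lines.
        { doc = doc $0 "\n" }
        END {
            # Match the DOCTYPE keyword, everything up to the first
            # opening bracket or closing angle, an optional bracketed
            # internal subset, then the closing angle.
            sub(/<!DOCTYPE[^[>]*(\[[^]]*\])?[^>]*>/, "", doc)
            printf "%s", doc
        }
    '
}
```

Because the whole file is read before the substitution runs, entity declarations inside the bracketed subset are removed together with the declaration that contains them.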

cgi/f/findaid

  • FindaidApp.pm
    • Removed FormatGuideFrame(), obsolete since the change to XML/XSL
  • FindaidAppXsltPIFiller.pm
    • Removed FormatOutlineResult_XML and FormatOutlineFrame_XML as they are not called by any code (or bound to any PIs).
  • FindaidClass.pm
    • highlight hits in layer 1 result items
    • Added highlighting to BuildItemTitle_XML so highlighting will show up in title
    • Fixed bug in FilterAllDaos_XML that would not properly process daos with real hrefs and would result in illegal xml being output when id resolver is turned on.
  • FindaidClass/ClementsmssFC.pm
    • removes "viewtextnote" speedbump
  • FindaidClass/DemofaFC.pm
    • Demo of subclassing
    • add relatedmaterial and separated material TOC heads
    • use <head> tags for bioghist instead of bentley logic for TOC heads
    • change labels for several TOC heads
  • FindaidClass/BioghistfaFC.pm
    • Demo of subclassing to use <head> tags for bioghist instead of bentley logic

XClass

  • No changes.

METS Pageturner and Collection Builder

  • Continuing development work in Pageturner. A new application, Collection Builder, allows users to add items to a personal collection via widgets in Pageturner. These applications are not part of DLXS. They use a different code base, mainly under DLXSROOT/{web,bin,cgi}/m/mdp and DLXSROOT/lib/App. Stub routines are required in Pageturner to abstract the database connections, and an installation of Solr/Lucene is required to support the collection search in Collection Builder. The user interface makes extensive use of Yahoo User Interface toolkit (YUI) functionality.

SRU

  • Added fielded searching to query -- not yet Level 1 or 2, though
