Release Notes for Current DLXS Release

From DLXS Documentation

Revision as of 11:30, 19 October 2010 by Pfarber

General Information

TextClass is substantially identical to release 14 except for enhancements and bug fixes as noted below. ImageClass provides new image viewing functionality. FindaidClass improves handling of the EAD DTD and includes subject browsing. BibClass is unchanged and is being de-emphasized.

Release 15 is comprised of:

Known Problems

  • None

Database Installation Notes

MySQL is the supported database type. In order to run DLXS you will need to have a MySQL server installed. Sample data is delivered in the form of a MySQL dump file which can be directly imported into a MySQL database. The database upgrade script (upgrade_6_7) operates only on a MySQL database. These issues are documented in detail in the installation instructions and the upgrade instructions.

New and Changed Functionality

XPAT

  • No changes.

Lib

TextClass

web/t/text

  • browse.xml
    • added browseextra.xsl include
  • clipviewer.xml
    • added some XSL includes
  • header.xsl
    • added code to allow for print on demand links to amazon.com from podPermittedItems
  • htmlhead.xsl
    • emit full path to main XML template for <TemplateName> element
    • changed TemplateName to TemplatePath
    • If we're viewing a single item, put its title in the HTML head.
    • Same as above, but also for view=trgt and page=root.
  • langmap.en.xml
    • add feature PRF (Preface) to langmaps
  • langmap.fr.xml
    • add feature PRF (Preface) to langmaps
  • pageviewerheader.xsl
    • Stub template for ProcessSerialarticle.
  • results.xsl
    • No more STATUS="hidden".
  • resultsheader.xsl
    • BIBLSCOPE filtering: made like tocheader.xsl (no comma before issuetitle).
    • <xsl:value-of select="$pubinfo"/> --> <xsl:copy-of select="$pubinfo"/> to write child nodes (e.g. <div>) to the HTML.
    • No comma after <div>[issuetitle]</div> for serialissue.
    • minor change to fix missing name search bug
  • scopedivs.xsl
    • TYPE="hidden" allows selective non-display of DIVn (and all descendants) in TOC. (Note distinction from mis-named STATUS="hidden".)
    • STATUS="hidden" now actually hides the DIV. No more TYPE="hidden".
  • text.components.xsl
    • FilterNumberedNotesWithParams: wrap paragraphs after the first in <p>.
    • LG template calls template name="addRend".
    • In <xsl:template match="P">, don't normalize value of @ID.
    • More robust filtering for filterNoteWithParas.
    • Handle CELL/@ROWSPAN.
    • In <xsl:template match="REF">, removed special case for PARENT::ITEM which appeared to be abandoned code.
    • For PBs, Wrap the DIV in an OBJECT so that we still have valid XHTML in the event that we're currently inside a P.
    • OBJECT wrapper around PB text breaks in Safari; better to use span with display: block.
    • Add anchors to ITEMs which have IDs.
    • Create anchors for ID'd Ps in filterNumberedNoteWithParas.
    • In filterNumberedNoteWithParas, check whether we're in NOTE2 (in addition to NOTE1).
    • XML table elements (TABLE, ROW, CELL) get a class applied to them, obscuring any REND styles in the markup. I've changed code to apply the value of @REND to the class value, e.g. <td class="xmltd-rend-center">.
    • L: pass forward all @RENDs, not just ones that start with "line".
  • textclass.css
    • Added rend-plain.
    • Added rend-isub.
    • Added .rend-rightjustify.
    • Changed .pbtext.
    • List style .nomarker.
    • Add margin-bottom to div.lg and the like.
    • rend-aligntop.
  • tocheader.xsl
    • BIBLSCOPE filtering.
  • viewer.utils.xsl
    • minor change to fix missing name search bug
  • viewtextnote.xsl
    • removed the "crash your browser" wording
    • OCR quality
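
The text.components.xsl change that carries @REND into the class value (e.g. <td class="xmltd-rend-center">) can be pictured with a small sketch. This is not the XSL shipped in the release: the element-to-tag mapping, the "xml" prefix for the other elements, and the function names are assumptions extrapolated from the single <td> example in the note above.

```python
# Hypothetical mapping from TextClass table elements to HTML tags.
# Only the CELL -> td pairing is attested by the release note's example.
HTML_TAG = {"TABLE": "table", "ROW": "tr", "CELL": "td"}


def rend_class(element_name, rend=None):
    """Build a class value such as 'xmltd-rend-center' from an element and its @REND."""
    base = "xml" + HTML_TAG[element_name]
    if not rend:
        # Assumption: with no @REND, only the base class is emitted.
        return base
    return f"{base}-rend-{rend.strip()}"


def open_tag(element_name, rend=None):
    """Return the opening HTML tag for a TABLE, ROW, or CELL element."""
    tag = HTML_TAG[element_name]
    return f'<{tag} class="{rend_class(element_name, rend)}">'


print(open_tag("CELL", "center"))
```

Folding the REND value into the class name keeps the generic table styling while letting textclass.css target each REND value with its own rule.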

bin/t/text

  • CER.pm
    • added character entities: dagger2 (&#x2021;), labr (〈), long (&#xAF;), rabr (〉), short (˘)
  • dtdalyzer.pl
    • Added ROWSPAN attribute.
    • Added DOC attr.
  • utf8chars
    • fix error checking and display data

cgi/t/text

  • CVApp.pm
    • added pgseq to cache filename and change default view to pdf
  • DlpsLocalUtils.pm
    • Refactored LocalIdResolver(), moving substantial code into CreatePicklist().
    • CreatePicklist: delete unwanted params from tempCgi that may be hanging around (possibly due to URL hacking) and that we don't want passed forward in picklist links.
  • TextAppXsltPIFiller.pm
    • print on demand links to amazon.com added
  • TextClass.pm
    • Add RemoveXMLPi for Kwic processing
    • Filter_REFsForText: treat <REF TYPE="txt"> same as <REF TYPE="ptr">.
    • GetDateParsePattern: optional hyphens between year-month-day parts of a sortdate.
    • Output FirstPageHref for layer2 serialissue results.
    • minor change to fix missing name search bug - close div at end of div1headbib
    • another change to fix missing NODE param in search error
  • textclass.cfg
    • added config to allow POD db lookups
  • ClipView/mdailyCV.pm
    • new
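
The GetDateParsePattern change listed under TextClass.pm (optional hyphens between the year-month-day parts of a sortdate) amounts to a pattern of roughly the following shape. This is a sketch only: the YYYYMMDD layout, the pattern itself, and the function name are assumptions, not the Perl code in TextClass.pm.

```python
import re

# Assumed sortdate shape: four-digit year, two-digit month, two-digit day,
# with an optional hyphen between the parts.
SORTDATE = re.compile(r"^(\d{4})-?(\d{2})-?(\d{2})$")


def parse_sortdate(value):
    """Return (year, month, day) as integers, or None if the value does not match."""
    match = SORTDATE.match(value.strip())
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return year, month, day


print(parse_sortdate("2010-10-19"))
print(parse_sortdate("20101019"))
```

Both spellings of the same date parse to the same tuple, so sorting behaves identically whichever form the source data uses.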

ImageClass

CGI/Middleware

Known Problems

  • None known so far.

Enhancements

  • Image Viewing
    • Ajax based zooming and panning of imagery. Works with JPEG2000 and MrSID.
    • getimage-idx cgi has been completely rewritten and is backward compatible with previous version.
    • XML, XSL, CSS, and Javascript for entry/image view have changed significantly.
    • Running mediaprep with purge=1 is recommended, but not required. It now stores thumbnail dimensions in the ImageClassMediaFiles table and uses them for drawing zooming reference visuals on the thumbnail adjacent to zoomable images. Image Class now requires ImageMagick and PerlMagick to be installed.
  • Portfolios
    • Longer descriptions are allowed in custom sorting display.
    • Multiple owners/editors.
    • Renaming.
    • Added documentation for end-users.
    • Fixed bug that kept anonymous users from opening their own session-based portfolios.
    • Portfolio (BookBag) IDs are now generated randomly to avoid potential database replication problems.
  • Searching/Browsing
    • Relevance ranking is now an option for Image Class search results. To activate simply add "relevance" to the Collmgr field "sortflds" (first list value should be "none" and second "relevance"). Sorting of results by relevance will be the default for the collection once configured. Relevance ranking is not used when searching multiple collections.
    • Searching has been packaged and can be subclassed. A subclass for integrating search results from the ARTstor XML web service is included and can be used by ARTstor members. Support for other services, databases or database schemas can be added by subclassing the ImageSearch.pm Perl module.
    • Browsing of newly added or updated media items is now possible. Requires reloading metadata for the collection. In Collmgr set brwsadds to "on", add "m_flm:::Recently Added/Updated" to field_labels, and add "m_flm" to sortflds, dfltentryflds, and dfltresentryflds.
    • All collections are selected by default in cross collection search. This is now a configurable option at the class level in imageclass.cfg.
  • Collection Size Calculation
    • Collection size counts are now stored, by image-idx, in Collmgr. See help text for new Collmgr fields "recordcount" and "mediacount".
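
The note on randomly generated Portfolio (BookBag) IDs does not describe the mechanism. One plausible reading is that sequential IDs allocated independently on replicated database servers can collide, whereas random IDs make that unlikely. The sketch below illustrates only the general idea; the function name, ID length, and hex format are assumptions and do not reflect the actual ImageClass code.

```python
import secrets


def new_portfolio_id(existing_ids, nbytes=8):
    """Return a random hex ID that is not already present in existing_ids."""
    while True:
        candidate = secrets.token_hex(nbytes)  # 2 * nbytes hex characters
        if candidate not in existing_ids:
            return candidate


taken = {"0000000000000001"}
print(new_portfolio_id(taken))
```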

Data Preparation

Enhancements

  • It is possible to configure a development MySQL server for loading metadata. The load.pl script will use the production MySQL server to get information from Collmgr, ImageClassMediaFiles, etc., but will populate new tables on the development server. This is not generally necessary, but can help reduce production server load with very large collections (75,000+ records).
  • The mediaprep script now stores pixel dimensions of thumbnail images in the ImageClassMediaFiles table. Pixel dimensions for thumbnails now appear in the relevant XML output of the middleware cgi (image-idx).

BibClass

cgi/b/bib

  • BibApp.pm
    • added size as a common param
    • add size to SID if we have one
  • BibClassPerlFilters.pm
    • BibClassPerlFilters::CollsFilter had a minor bug in the SQL query used to select the collection name from the Collection table: it did not consider the value of the DLPS_DEV environment variable, so it did not always select the right row. It now correctly selects the production, release, or user row.
  • BibClassUtils.pm
    • change openurl sub to be 1.0
    • change wording on openurl link
    • added support for icon in openurl links
  • bibclass.cfg
    • change openurl server

Oai

cgi/o/oai/

  • oai, oai.cfg, oai_conf.xml, sample_config.xml, UMProvider.pm
    • new -- new OAI data provider and accompanying scripts

bin/o/oai/

  • ConfirmPublicDomain.pl, loadOai.pl, mbooks_harvest_cron.pl, mbooks_update.pl, oai_conf.xml, OaiList.pl, OaiToDb.pl, RepositoryConfig.xml, revertOaiTbl.pl, updateMbooksOai.pl
    • new -- new OAI data provider and accompanying scripts

bin/o/oai/provider

  • AddCollToSource.pl, articlesToDc.xsl, bibClassToDc.xsl, CheckModifiedColls.pl, collmgrFullList.txt, ConvertToDc.pl, ConvertToUTF8.pl, dlxs_to_oaidc_cron.pl, exampleColls.xml, ExtractHeaders.pl, GenerateReport.pl, GetNewCollections.pl, LoadDB.pl, NewOAICollection.jpg, oai_new_coll_history.txt, oai_update_history_text.txt, oai_update.pl, README.txt, RunDlpsOaiConversion.pl, RunOaiConversion.jpg, RunOaiConversion.pl, textClassToDc.xsl, umprovider_flow_diagrams_and_examples.ppt, UMProviderFlow.jpg, updateCollConfig.pl
    • new -- scripts to transform text, bib, and image class collections into oai_dc for use by the OAI data provider. See the README.txt file in this directory for a brief description of each file

bin/o/oai/provider/logs

  • This directory houses the logs generated by the scripts in bin/o/oai/provider

bin/o/oai/provider/reports-new-colls

  • This directory houses the reports generated by bin/o/oai/provider/GetNewCollections.pl

bin/o/oai/provider/reports-oai-update

  • This directory houses the reports generated by bin/o/oai/provider/GenerateReport.pl

bin/i/image/

  • image2oai_dc.pl
    • new -- script for loading OAI tables with Image Class data in MySQL

bin/o/oaister/

  • RepositoryConfig.cfg
    • new -- contains baseURLs and OAITransFixer.pm subscript calls
  • oaitransform/CollsObj.cfg
    • new -- config for CollsObj.pm
  • oaitransform/CollsObj.pm
    • new -- used for FMPro connection for OAIster
  • oaitransform/DataConditioning.pm
    • Get rid of any remaining stray sp0t tags
  • oaitransform/MODSTransform
    • Fix bug where MODS record was not ignored for re-exposure if no URL in record
    • Changed the tree removing metadata from the re-exposed XML, changed the language code mapping, added setSpec to about container
    • Swap dates in about container and change MODS loc.gov URL
    • Remove NS prefix and NS from metadata element
    • Remove all xmlns attributes as well
    • Add all orig setSpec to provenance element for MODS re-exposure
    • Hack to fix xsi: problem for Northwestern
    • Change to pick up URLs in identifier type=uri
  • oaitransform/OAITransFixer.pm
    • new -- perl subscripts for fixing harvested data
  • oaitransform/OAITransform
    • Added flag to skip DTD validation; changed the PostXslt process to run after each small file, not the final large one
    • Added code to regenerate the perl config files from the XML config files each time this is run
    • Comment out LoadRepositoryLookTable() and changed SetCurrentArchive() to use XML config value
    • Move removing attribute to after data conditioning
    • Patch to fix latin1 regex problem
    • Unescape hex and dec entity refs
    • Added fixer call and other minor changes
    • Skip deleted records
    • Add skipped and deleted to repo count
    • Get the return from PreprocessXML bug fix
    • Semi opt for hex and dec ent refs
    • Added CollsDB
  • oaitransform/browse.tpl
    • new -- template for building OAIster browse pages
  • oaitransform/manageCollsDb.pl
    • new -- CGI script for managing FMPro DB for oaister
  • oaitransform/mods-bibclass.xsl
    • Added copyrightDate YR
    • urls in own URL tags
    • Change to pick up URLs in identifier type=uri
  • oaitransform/mods_repositoryNames.pl
    • Fixed amp;
  • oaitransform/oai-bibclass3.xsl
    • Added tests for empty elements
  • oaitransform/repositoryNames.pl
    • Add &amp; fix
  • scripts/Batch_UMHarvest
    • Add skip HTML flag for transform
    • Allow -x for transform
    • Added -a for incremental batch harvesting
  • scripts/ListRecords
    • Error code checking
    • Do not exit on OAI error
    • Fixed log reporting
    • Make list of dirs, not files, to check for replacement records -- speed up incremental
  • scripts/UMHarvester
    • Changed to use the XML config file; requires XML::LibXML now
    • Added wait flag
    • Check for sets cfg before trying to parse it, only loop on sets for list records
    • ID bug fix
  • scripts/browse_check.pl
    • new -- checking FMPro repositories against browse pages
  • scripts/id_check.pl
    • new -- checking FMPro repositories against incremental batch harvest
  • scripts/rc_check.pl
    • new -- checking FMPro repositories against RepositoryConfig.cfg
  • scripts/restoreHarvestedData.pl
    • new -- restores backup copy of repository dir
  • scripts/startIndex.pl
    • new -- runs xpat and multirgn on xml obj files
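
The "Unescape hex and dec entity refs" change to oaitransform/OAITransform concerns numeric character references such as &#x2021; or &#175;. A minimal sketch of that kind of conversion follows. It is not the OAITransform code: the function name is hypothetical, and the choice to leave references to XML-special characters escaped is an assumption made so the output stays well-formed.

```python
import re

# Matches hexadecimal (&#x2021;) and decimal (&#175;) character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Assumption: references to these characters stay escaped to keep the XML well-formed.
XML_SPECIAL = {"&", "<", ">"}


def unescape_numeric_refs(text):
    """Replace numeric character references with the characters they denote."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            char = chr(code)
        except (ValueError, OverflowError):
            return match.group(0)  # not a valid code point; leave untouched
        return match.group(0) if char in XML_SPECIAL else char

    return NUMERIC_REF.sub(replace, text)


print(unescape_numeric_refs("dagger &#x2021; macron &#175; amp &#38;"))
```

Named references such as &amp; are not matched by the pattern and pass through unchanged.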

IdResolver

cgi/i/idresolver/

  • idresolver
    • CGI script that returns a URL (marked up in XML) for an ID
  • idresolver-nr
    • new -- CGI script that redirects a user from a persistent URL to one defined in the nameresolver database for an ID, can be used instead of cgi/b/bib/bibperm
  • idresolver-srch
    • new -- CGI script that takes a text file of ids (one per line) and checks to see if they exist in the nameresolver database
  • IDResolver.pm
    • Perl module that is used by each of the cgi scripts

bin/i/idresolver/

  • cvstag.idresolver, rdist.class

bin/n/nameresolver/

  • IdParser.pl
    • new -- script used to populate the nameresolver database
  • LoadLLMCIds.pl
    • new -- this is a sample script of how you can use a .csv file to populate the nameresolver database
  • CreateTable.pl, NRTable.sql
    • new -- scripts for creating nameresolver table in database (which is already included in the release)
  • new -- other sample files: DeepBlueShortIds.txt, getUpdatedNRfiles, InsertCrosscIssues.pl, N2TestExecution.txt, NRDevVsProdComparison.xls, README.txt, RemoveDeepBlueConflicts.pl, statusLoadingNRinProd.txt, TestIdParser.pl
    • These files were used for DLPS collections and are provided only as additional resources. They are not necessary to the Idresolver/nameresolver configuration

broker20

  • No changes.

Collmgr

  • Supports version 7.0 database for DLXS release14.
  • default browse page for Text and Findaid Class now set from first item in Browsefields list

FindaidClass

Findaid Class Summary

Prep scripts

Data prep scripts have been reorganized and renamed.

  • New functionality for Makefile and preparedocs.pl
  • New script to setup new collections

New prep script: setup_newcoll

$DLXSROOT/bin/f/findaid/setup_newcoll can be used to set up directories for new collections. For example, to set up the workshopfa collection based on samplefa (assuming your $DLXSROOT variable is set), run this command:

 $DLXSROOT/bin/f/findaid/setup_newcoll -c workshopfa  -s $DLXSROOT/prep/s/samplefa/data 

More information on the setup_newcoll script is available from its man page:

$DLXSROOT/bin/f/findaid/setup_newcoll --man

New options for preparedocs.pl

The $DLXSROOT/bin/s/samplefa/preparedocs.pl script now takes several new arguments. For details, run:

  ./preparedocs.pl --man

Of particular interest is the preparedocs.pl -i inputfilelist option, which allows you to specify a file containing the full paths to the finding aids you wish to index. This option can be used instead of the previous default behavior, which was to recursively search the data directory and index all files in that directory.
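
An input file list for the -i option could be produced in many ways. The sketch below is one possibility, not part of the release. It assumes the list holds one full path per line and that finding aids end in .xml; neither detail is stated above, and the function name is hypothetical.

```python
import os


def write_ead_list(data_dir, list_path, suffix=".xml"):
    """Write the full path of every matching file under data_dir to list_path."""
    paths = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.lower().endswith(suffix):
                paths.append(os.path.abspath(os.path.join(root, name)))
    paths.sort()
    # Assumption: preparedocs.pl expects one path per line.
    with open(list_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(path + "\n")
    return paths
```

Editing the resulting file by hand (or filtering paths before writing) lets you index a subset of the data directory rather than everything in it.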

Changes to Makefile

Addition of these new targets:

prepdocslist
Same as prepdocs but uses the -i inputfilelist to preparedocs.pl and by default reads a file $DLXSROOT/prep/c/collection/list_of_eads.
allbutprep
convenience target for use in conjunction with prepdocslist that does all the make steps except for the make prepdocs
index
convenience target that runs all 3 indexing steps

Changes to prep scripts

$DLXSROOT/bin/s/samplefa
Moved generic bin files to f/findaid
Makefile modified
preparedocs.pl has new options
$DLXSROOT/prep/s/samplefa
Renamed files
	samplefa.text.inp  to samplefa.ead2002.dcl
	samplefa.xml.inp to samplefa.concat.ead.dcl
Removed:
	samplefa.inp
Added:
   list_of_eads
$DLXSROOT/bin/f/findaid
Removed:
	catsourcefiles.pl	
	isolat128bit.pl
	validate.pl
Added/moved from bin/s/samplefa
	fixdoctype.pl		
	stripdoctype.pl
	validateeach.sh
        setup_newcoll

web/f/findaid

  • bookbagitemsstring_debug.xsl
    • For debugging in Oxygen, since the Oxygen debugger uses Saxon and Saxon doesn't understand the extensions
  • browse.xsl
    • Added code for subject browse
  • browseheader.xsl
    • bulk dates labelled
  • htmlhead.xsl
    • emit full path to main XML template for <TemplateName> element
    • changed TemplateName to TemplatePath
  • text.components.xsl
    • Modified template for processing c0x's so that if there are two containers within a //c0x/did/ such as box/folder they will both show up in the proper column.
    • Highlighting fix. Replaced about 36 instances of <xsl:value-of select="."/> with <xsl:apply-templates select="*|text()"/>. If value-of select gets highlighted text in the context node, such as "text<HIGHLIGHT>text</HIGHLIGHT>text text", it ignores the <HIGHLIGHT> elements and just renders all the text. The apply-templates "*", on the other hand, matches the <HIGHLIGHT> element and triggers the appropriate highlighting template, and the text() nodes are passed to the template that just outputs the text.
    • Fixed code in template match=unittitle mode=SimpleUnittitle because foreach was messing with context. Also changed code for handling notes/scopecontent in match=C01|c02... because it was an xsl:choose, but notes and scopecontent are not mutually exclusive.
    • Changed code for handling notes/scopecontent in match=C01|c02...: now did/note, note, did/scopecontent, and scopecontent will all be rendered.
    • fixed bug in template match="list" where there was a foreach and then value-of select *|text() that needed to be value-of select="."
    • space before unitdate value
    • title styling
    • optional labels for additional descriptive material
    • hide sorting title
    • sponsor
    • Restored ADD to full-text view
    • abstracts in dids
    • Change template for index mode= add so it doesn't produce 2 copies of any <head> text. Key is to limit the apply templates after we already processed the head not to process the head again: <xsl:apply-templates select="*[not(self::head)]"/>
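
The highlighting fix listed under text.components.xsl turns on the difference between taking a node's string value and dispatching its children to templates. The sketch below is a rough Python analogue of that difference, not the XSL itself. The two function names and the "hilite" class are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape


def value_of(node):
    """Analogue of <xsl:value-of select="."/>: string value only, markup discarded."""
    return "".join(node.itertext())


def apply_templates(node):
    """Analogue of <xsl:apply-templates select="*|text()"/>: each child gets its own rule."""
    parts = [escape(node.text or "")]
    for child in node:
        inner = apply_templates(child)
        if child.tag == "HIGHLIGHT":
            # Hypothetical highlighting rule; the real template lives in the XSL.
            parts.append(f'<span class="hilite">{inner}</span>')
        else:
            parts.append(inner)
        parts.append(escape(child.tail or ""))
    return "".join(parts)


unittitle = ET.fromstring(
    "<unittitle>text<HIGHLIGHT>text</HIGHLIGHT>text text</unittitle>"
)
print(value_of(unittitle))
print(apply_templates(unittitle))
```

The first call loses the hit marker entirely, which is the behavior the fix removes. The second keeps it by giving the HIGHLIGHT element a rule of its own.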

bin/f/findaid

  • catsourcefiles.pl
    • No longer used for samplefa (replaced by preparedocs.pl). It also contains bhl-specific code. For bhl, use the copy in $DLXSROOT/bin/b/bhlead, which is in cvs.
  • fixdoctype.pl
    • Moved here from bin/s/samplefa. No collection-specific customization. These should work on any EAD that conforms to the ead2002.dtd.
  • isolat128bit.pl
    • No longer used for findaids, since all findaids should be utf8 encoded
  • setup_newcoll
    • New script to set up directories. Run ./setup_newcoll --man for details.
  • stripdoctype.pl
    • This library file is now used by preparedocs.pl and validateeach.sh (through $DLXSROOT/bin/s/samplefa/fixdoctype.pl) to correctly remove multi-line DOCTYPE declarations and any entity references contained within them. It replaces the one-line Perl program previously used by those two programs.
  • validate.pl
    • Removed outdated file that worked on SGML files. FindaidClass is now exclusively XML.
  • validateeach.sh
    • Moved here from bin/s/samplefa. No collection-specific customization. These should work on any EAD that conforms to the ead2002.dtd.
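
The stripdoctype.pl entry above mentions DOCTYPE declarations that span several lines, which a line-at-a-time substitution cannot remove. The sketch below shows the general technique in Python. It is not a port of stripdoctype.pl: the pattern, function name, and sample document are assumptions, and unusual internal subsets may need more care.

```python
import re

# Matches <!DOCTYPE ...> including an optional [ ... ] internal subset.
# re.DOTALL lets the match run across line breaks.
DOCTYPE = re.compile(r"<!DOCTYPE\s[^\[>]*(?:\[.*?\]\s*)?>", re.DOTALL)


def strip_doctype(xml_text):
    """Remove the first DOCTYPE declaration, with any entity declarations inside it."""
    return DOCTYPE.sub("", xml_text, count=1)


# Hypothetical finding aid header with a multi-line DOCTYPE.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ead PUBLIC "-//Example//DTD ead.dtd//EN" "ead.dtd"
[
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<ead><eadheader/></ead>
"""

print(strip_doctype(SAMPLE))
```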

cgi/f/findaid

  • FindaidApp.pm
    • removed FormatGuideFrame(), obsolete since the change to XML/XSL
  • FindaidAppXsltPIFiller.pm
    • Removed FormatOutlineResult_XML and FormatOutlineFrame_XML as they are not called by any code (or bound to any PIs).
  • FindaidClass.pm
    • highlight hits in layer 1 result items
    • Added highlighting to BuildItemTitle_XML so highlighting will show up in title
    • Fixed bug in FilterAllDaos_XML that would not properly process daos with real hrefs and would result in illegal xml being output when id resolver is turned on.
  • FindaidClass/ClementsmssFC.pm
    • removes "viewtextnote" speedbump
  • FindaidClass/DemofaFC.pm
    • Demo of subclassing
    • add relatedmaterial and separated material TOC heads
    • use <head> tags for bioghist instead of bentley logic for TOC heads
    • change labels for several TOC heads
  • FindaidClass/BioghistfaFC.pm
    • Demo of subclassing to use <head> tags for bioghist instead of bentley logic

XClass

  • No changes.

METS Pageturner and Collection Builder

  • Continuing development work in Pageturner. New application: Collection Builder allows users to add items to a personal collection via widgets in Pageturner. These applications are not part of DLXS. They use a different code base mainly under DLXSROOT/{web,bin,cgi}/m/mdp and DLXSROOT/lib/App. Stub routines are required in Pageturner to abstract the database connections and an installation of Solr/Lucene is required to support the collection search in Collection Builder. The user interface makes extensive use of Yahoo User Interface toolkit (YUI) functionality.

SRU

  • Added fielded searching to query -- not yet Level 1 or 2, though
