Setting up Dynamic Browsing

From DLXS Documentation

Revision as of 14:53, 28 August 2007


Overview

Dynamic browsing is provided by the DLXS middleware through a combination of <a href="colldatabases.html">database</a> tables and <a href="collmgr.html">collmgr</a> field configuration. When the middleware receives the proper CGI-based URL, it looks up item metadata in the database tables, packages some of it in XML, passes the rest through as is (it is already stored as XML), and lets XSLT format the results. No <a href="../xpat/index.html">XPAT</a> queries are needed at run time; only MySQL queries are run against the item browse tables in the <a href="colldatabases.html">dlxs database</a>.

Once the tables have been populated with item metadata by the $DLXSROOT/bin/browse/updatebrowsedb.pl script (the "database populator"), the behavior of dynamic browsing is configured by modifying certain <a href="collmgr.html">collmgr</a> fields.

Provisions will soon be made for creating static HTML browse pages with the database populator script.

Item browse database tables

There are three tables in the dlxs database that are specifically used for dynamic browsing in the middleware:

  • ItemColl
  • ItemBrowse
  • ItemBrowseCounts

The ItemColl table holds, for each item/document and collection combination, one row containing the following columns:

  • the idno
  • the collection id
  • the modification date of the row's information
  • XML metadata about the item: in the case of TextClass, this is simply the DLXSTEXTCLASS/HEADER element that is retrieved from an XPAT query in the "database populator" script. In the case of ImageClass, the Perl subclass used by the database populator grabs information from the MySQL or XPAT data and wraps it in specific XML before filling in this field.

The ItemBrowse table holds, for each browseable field (e.g., author or title) of each item/document:

  • the idno
  • the collection id
  • the field name
  • the value of the field
  • rank (not currently used)

The ItemBrowseCounts table holds, for each collection, a list of rows containing:

  • the collection id
  • the field name
  • the first character or the first two characters of the sortable field's value (sortable title, author's name)
  • the count of items that begin with that first character or those first two characters
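The kind of counting that ItemBrowseCounts stores can be illustrated with a minimal sketch. This is not the code used by updatebrowsedb.pl; the function name browse_counts and the sample titles are hypothetical, and any normalization of the sortable values (case, leading articles, punctuation) is assumed to have happened before this step.

```python
from collections import Counter


def browse_counts(sortable_values, prefix_len):
    """Count values by their first `prefix_len` characters.

    Hypothetical helper: `sortable_values` stands in for the sortable
    field values (e.g., sortable titles) of one collection and one field.
    """
    counts = Counter()
    for value in sortable_values:
        prefix = value[:prefix_len]
        if prefix:  # skip empty values
            counts[prefix] += 1
    return dict(counts)


# Made-up sample data for a single "title" field.
titles = ["Bleak House", "Buddenbrooks", "Burmese Days", "Middlemarch"]

one_char = browse_counts(titles, 1)  # counts keyed by first character
two_char = browse_counts(titles, 2)  # counts keyed by first two characters
print(one_char)
print(two_char)
```

Counts like these are what allow a navigation bar to be built without scanning every item at request time.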

Collmgr fields for configuration

The main fields in the collmgr that need to be set properly are:

  • locale
  • browsenav
  • browsefields
  • browseupdatemodule

See <a href="#Configure">Configure the collmgr fields</a> below for more information.


Preparing collections for browsing

Configure the collmgr fields

Start collmgr and change the following fields:

  • devhost: if you are running the database populator in a development environment (that is, where the <a href="../program/devenvironment.html#workdirs">DLPS_DEV</a> environment variable is set), you can have the middleware use XPAT-indexed data that is on a machine different from the usual host. This can be useful for testing purposes.
  • locale: change this to a UTF-8 locale, e.g., en_US.UTF-8.

  • browseable: a "yes" (case-insensitive) value in this field enables the browse tab in the user interface. If the collection-specific web directory for this collection contains a file named browse.html, that page will be served, but only if browsefields is empty. Fallback is applied to select the correct browse.html file for collection-specific customization. This supports static browse pages. If browse.html is not present, a dynamic browse page will be served based on data from the browse database. When a dynamic browse page is served, the browsenav field value is consulted and must be defined.

  • browsenav: enter 0, 1, or 2. Enter 0 if you want no paging, that is, all items in the collection appear on one HTML page for the user to browse. Enter 1 if you want "one level of browsing": a separate page is created for each first character of the value in question (e.g., title or author), and a navigation bar is built that lets the user jump to each page, for example, to the page listing items whose value begins with "M". Enter 2 for a "two-level browse", in which two navigation bars are created. The first bar lets the user jump to items whose values begin with a particular first character (e.g., the records that begin with "B"). The second bar lets the user jump to items whose values begin with a particular two-character combination (e.g., records that begin with "Bu"). This decision is left to the collection coordinator. In our experience, the appropriate level depends on the total number of items in the collection and therefore on what is a reasonable number of browseable items for a single HTML page.
  • browsefields: list the browseable fields for the collection. For example, some collections may have only title browsing, others may need both title and author, etc. Leave this field empty to enable static browsing.

  • browseupdatemodule: specifies the name of the browse update Perl module that will be used by the updatebrowsedb.pl script to populate the database. This value is analogous to the appmodule and subclassmodule fields. (This field exists as of Release 12a; it supersedes a Perl configuration hash used in Release 12.)

    The module files are located in $DLXSROOT/bin/browse. If a dynamic browse page is to be served, this field must have a value. Specialized behavior can be obtained by subclassing the browse update modules. The currently available browse update module values are as follows:

    • ImageClass: BrowseUpdate/ImageMysqlBU
    • FindaidClass: BrowseUpdate/FindaidBU
    • TextClass, encodingtype = monograph: BrowseUpdate/MonographBU
    • TextClass, encodingtype = serialissue: BrowseUpdate/SerialIssueBU (note that newspapers are serialissue encodingtype)
    • TextClass, encodingtype = serialarticle: BrowseUpdate/SerialArticleBU
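The way the browseable, browsefields, and browsenav fields interact can be summarized in a short sketch. This is only a reading of the field descriptions above, not the middleware's actual logic; the function choose_browse_mode and its return values are hypothetical, and browsefields is treated as an opaque string because its exact format is not specified here.

```python
def choose_browse_mode(browseable, browsefields, browsenav, has_browse_html):
    """Return "disabled", "static", or "dynamic" (hypothetical labels).

    `has_browse_html` stands in for the result of the fallback lookup
    for a browse.html file in the collection-specific web directory.
    """
    # The browse tab is enabled only by a case-insensitive "yes".
    if browseable.strip().lower() != "yes":
        return "disabled"

    # A static page is served only when browse.html exists AND
    # browsefields is empty.
    if has_browse_html and not browsefields.strip():
        return "static"

    # Otherwise a dynamic page is served, and browsenav must be defined.
    # Assumption: anything other than 0, 1, or 2 is treated as an error.
    if str(browsenav).strip() not in ("0", "1", "2"):
        raise ValueError("browsenav must be 0, 1, or 2 for dynamic browsing")
    return "dynamic"


print(choose_browse_mode("Yes", "", "", has_browse_html=True))
print(choose_browse_mode("yes", "title author", "2", has_browse_html=True))
```

Under this reading, a collection with a browse.html file still gets dynamic browsing as soon as browsefields is filled in, which is why that field should be left empty for static browse pages.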

Populating the item browse tables

To populate the item browse tables initially, or to update them later, use the updatebrowsedb.pl script located in $DLXSROOT/bin/browse. Running this program populates or updates the necessary rows in each of the three item browse tables. These tables are queried when the user requests browsing from the middleware.

However, please note: unless you are making changes to updatebrowsedb.pl or need to debug it, you should use the "wrapper" shell script provided in the same subdirectory. This wrapper is called ub and was written to ensure that updatebrowsedb.pl:

  • is run from the "release" directory and not from a particular developer's directory (for more information about a development environment that uses multiple developers' directories and environments, see <a href="../program/devenvironment.html">the development environment documentation</a>)
  • runs with certain environment variables properly set
  • assumes the use of the "production" row, if no row is specified, when setting the host from which data will be read

$DLXSROOT/bin/browse/ub -C class -c collection [ -r row [ -h host ] ] [ -f ] [ -p ]

  • -f : is optional but if supplied, the wrapper will run the updatebrowsedb.pl script without asking for confirmation
  • -p : is used to "purge" all records from the browse tables for a particular collection without re-populating (updating) the collection's browse information
  • -r : the row is optional. Without it, the "production" row will be assumed and used.
  • -h : if row is supplied, host is optional. The script will force updatebrowsedb.pl to use the host given. If no row is supplied, host cannot be supplied.
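The option rules above can be checked before the wrapper is called. The following is a minimal sketch that only assembles an argument list following the synopsis; the helper build_ub_command, the collection id "mycoll", and the host name are hypothetical, and the class value "text" is assumed to be accepted by -C in the same way it is by updatebrowsedb.pl.

```python
def build_ub_command(cls, collection, row=None, host=None,
                     force=False, purge=False, dlxsroot="$DLXSROOT"):
    """Assemble an argument list for the ub wrapper (hypothetical helper).

    `dlxsroot` defaults to the literal string "$DLXSROOT" so the result
    can be displayed; pass a real path if the list is to be executed.
    """
    # -h is only allowed when -r is also given.
    if host is not None and row is None:
        raise ValueError("-h host requires -r row")

    argv = [f"{dlxsroot}/bin/browse/ub", "-C", cls, "-c", collection]
    if row is not None:
        argv += ["-r", row]
        if host is not None:
            argv += ["-h", host]
    if force:
        argv.append("-f")  # run without asking for confirmation
    if purge:
        argv.append("-p")  # purge records without re-populating
    return argv


# Update a collection using the default "production" row.
print(" ".join(build_ub_command("text", "mycoll")))
# Use a specific row and host, skipping the confirmation prompt.
print(" ".join(build_ub_command("text", "mycoll", row="dlxsadm",
                                host="test.example.org", force=True)))
```

Building the list separately from running it makes it easy to review the exact command, which matters for -p because it removes a collection's browse records.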

If, for any reason, you must override the assumptions made by the "wrapper" script, you can always run updatebrowsedb.pl directly by entering:

$DLXSROOT/bin/browse/updatebrowsedb.pl class=AAA c=BBB host=CCC row=DDD

where AAA is either "text" or "image"; BBB is the collection id of the collection you want to create browsing for; CCC is the name of the host on which the XPAT index for the collection resides (this is not relevant to ImageClass, which uses MySQL for all queries); and DDD is the key of the row in the database you wish to use (production, dlxsadm, or an individual developer's id). For example, you may want to point the script at new or test data on a machine that is different from your production machine. You could accomplish this by changing the host or devhost field for the collection in the collmgr. NOTE: If <a href="../program/devenvironment.html#workdirs">DLPS_DEV</a> is set when you invoke updatebrowsedb.pl (without the wrapper ub), the devhost field will be used; otherwise, the host field will be used.
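The rule in the note above, about which collmgr field supplies the data host, can be sketched as follows. This illustrates the rule only and is not code from updatebrowsedb.pl; the function names, the dictionary standing in for a collmgr row, and the host names are hypothetical, and "set" is assumed to mean that the variable is present in the environment.

```python
def host_field_for(environ):
    """Name the collmgr field read when updatebrowsedb.pl runs without ub."""
    # Assumption: DLPS_DEV counts as "set" if it is present at all.
    return "devhost" if "DLPS_DEV" in environ else "host"


def data_host(collmgr_row, environ):
    """Look up the host value in a dict standing in for a collmgr row."""
    return collmgr_row[host_field_for(environ)]


# Hypothetical collmgr values for one collection.
row = {"host": "prod.example.org", "devhost": "test.example.org"}

print(data_host(row, {}))                     # DLPS_DEV not set
print(data_host(row, {"DLPS_DEV": "jdoe"}))   # DLPS_DEV set
```

In a real session the second argument would be the process environment (os.environ in Python); plain dictionaries are used here so that both cases can be shown side by side.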
