The Civil War Diaries: A Case Study

From DLXS Documentation

Revision as of 15:10, 17 June 2010 by Sooty


The Civil War Diaries is a DLXS client web site that we are providing here as a case study.

Leaving aside all the markup issues that are inherent in this project and generally outside the scope of the interface designer (but which nonetheless have major impact on usability), the changes that the client's staff asked for provide a nice primer on things that generally can be altered in the interface.

This project was initially put online with a minimal interface and minimal options. There was no browsing (as we started with one diary, it seemed pointless). The basic search was available within “full text”, with searches limited by author, title, and citation. KWICs were enabled for results, and notes were shown inline.

Add browses, change notes to pop-ups

The client immediately asked that a browse be provided and that notes not show inline. These are simple changes in the collmgr’s browseable/browsenav and displaynotesinline fields. browseable was set to yes and browsenav was set to 0 (as there will only ever be a handful of diaries, the alphabetic rulers seemed like overkill); displaynotesinline was set to no. The browse script (in /l1/bin/browse) was then run to populate the browse table.

Image:civilwar1.jpg

Additional “search within” values

The client next asked that we add subject as an option for “Search in” and “Limit to.” This requires that “subject” be added to the collmgr’s bibsearch and termsearch fields. However, for this to function properly and not simply appear in the pulldown menu, you also need a mapping in the mapfile for the collection to direct the search to the proper elements in the indexed text. civilwar1 initially used philamer.map, but it was clear that this would be a collection with a fair amount of customization and would require a mapfile of its own (/l1/misc/t/text/maps/civilwar1.map). At this point (with only one diary in hand), the subject mapping could safely be defined as

 <mapping>
   <label>subject</label>
   <synthetic>SUBJECT</synthetic>
   <native>region TERM</native>
   <nativeregionname>TERM</nativeregionname>
 </mapping>

However, it seems that the markup in the collection is elaborate enough that TERM elements may eventually appear in places aside from the KEYWORDS element in the HEADER (the canonical location for subject terms), so it is more prudent to provide a more specific mapping.

 <mapping>
   <label>subject</label>
   <synthetic>SUBJECT</synthetic>
   <native>region subject</native>
   <nativeregionname>subject</nativeregionname>
 </mapping>

This required a fabricated region “subject” be created in the extra.srch file:

(region TERM within region KEYWORDS); {exportfile "/l1/idx/c/civilwar1/subject.rgn"}; export; ~sync "subject";

and that the “make post” indexing step be rerun.


More explicit division headings for DIVs with no HEADs

The client also asked if we could do something about the headings appearing in results lists and the TOC view. Generally, the headings are pulled from the “div1head” region et al (depending on how deeply subdivided a text is, you can have div2head, div3head, etc.), which is defined as the HEAD element in a given DIV1 or the tag itself if there is no HEAD. This diary has no HEADs, but the encoders have provided a TYPE of “entry” for each DIV1, along with the ISO date in the N attribute, so the tag looks like:

<DIV1 NODE="USCW0001.0001.001:34" TYPE="entry" N="1862-09-14">

However, general DLXS settings show only the TYPE attribute in the absence of a HEAD:

Image:civilwar2.jpg

In order to locate where such styles are handled, grepping for TYPE within the XSL stylesheets in the /l1/web/t/text directory is almost always fruitful. The proper file in this case is scopedivs.xsl, as what needs to be changed is the label for the “scoped heads” as they are generally referred to in DLXS. A local version of the scopedivs.xsl was created in the web directory for civilwar1. It imports the main scopedivs.xsl and contains only the altered “Divhead” template. The top of the file is as follows:

  <?xml version="1.0" encoding="UTF-8" ?>
  <xsl:stylesheet version="1.0" 
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
   xmlns:func="http://exslt.org/functions" 
   xmlns:dlxs="http://dlxs.org"
   extension-element-prefixes="func dlxs" 
   exclude-result-prefixes="func dlxs">
   
   <xsl:import href="../../t/text/scopedivs.xsl"/>
   
   <xsl:strip-space elements="*"/>
   
   <xsl:template match="BuildDivHeadLinkLabel">

scopedivs.xsl is very complicated as a whole, as it involves printing the labels and building the links whether there are HEAD elements or not, whether we are dealing with serial articles or not, etc. However, the small portion that is relevant here is fairly straightforward, and again, searching for TYPE and realizing that you are concerned with the cases where DIVs have no HEADs will steer you to the proper portion of the template:

Image:Civilwarcode1.tiff


So, if there is no child HEAD and the TYPE is not “entry”, show the value of the TYPE attribute (example 1. above). If there is no child HEAD and the TYPE is “entry”, show the value of the TYPE attribute followed by a colon and space and then the value of the N attribute (example 2. above). The remainder of the template deals with cases where there is neither a HEAD nor a TYPE (print the word “Section”) and those cases where there are HEADs, in which case it will show each HEAD (some DIVs may have more than one) separated by non-breaking spaces (&#160;).
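The decision logic just described can be sketched as a small standalone stylesheet. This is only an illustration under stated assumptions: the template name SketchDivLabel and the demo driver are hypothetical, and the sketch reads DIV elements directly, whereas the real scopedivs.xsl also builds links and handles serial articles.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical named template; not the template used in scopedivs.xsl.
       Context node: a DIV1 (or deeper DIV) element. -->
  <xsl:template name="SketchDivLabel">
    <xsl:choose>
      <!-- One or more HEADs: print each, separated by non-breaking spaces -->
      <xsl:when test="HEAD">
        <xsl:for-each select="HEAD">
          <xsl:value-of select="normalize-space(.)"/>
          <xsl:if test="position() != last()">
            <xsl:text>&#160;</xsl:text>
          </xsl:if>
        </xsl:for-each>
      </xsl:when>
      <!-- No HEAD and TYPE is "entry": TYPE, colon, space, then N -->
      <xsl:when test="@TYPE = 'entry'">
        <xsl:value-of select="concat(@TYPE, ': ', @N)"/>
      </xsl:when>
      <!-- No HEAD and some other TYPE: the TYPE value alone -->
      <xsl:when test="@TYPE">
        <xsl:value-of select="@TYPE"/>
      </xsl:when>
      <!-- Neither HEAD nor TYPE -->
      <xsl:otherwise>
        <xsl:text>Section</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demo driver (an assumption, for running the sketch on its own):
       print one label per DIV1, one per line -->
  <xsl:template match="/">
    <xsl:for-each select="//DIV1">
      <xsl:call-template name="SketchDivLabel"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against a document containing the DIV1 shown earlier (TYPE="entry", N="1862-09-14", no HEAD), the sketch prints entry: 1862-09-14.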

Here is the results list after the scopedivs.xsl for civilwar1 is placed into its web directory:

Image:civilwar3.jpg

Customized phrase-level markup rendering styles

The diary entries have a great deal of markup of individual words and phrases. Place names, personal names, dates, etc., are all wrapped in elements and have attributes expanding or clarifying them (creating a de facto authority file); additions and deletions (using the ADD and DEL elements of the TEI), and switches in handwriting (for example, <HI1 REND="underlined"> or <HI1 REND="superscript">) are all captured as well. Such things are present in the basic DLXS package, but because the client’s encoding practices vary from our standard, more customizations needed to be made. The most straightforward is that the appropriate REND behaviors were not present; things marked as superscript or underlined appeared as plain text.

Rendering of elements gets handled in the XSL as a conversion to HTML that will be treated by CSS. That is, the <HI1 REND="superscript">st</HI1> in the XML gets converted to <span class="rend-superscript">st</span> (essentially, it takes the content of the element, wraps it in a span, and gives it a class of “rend-” followed by the value of the REND attribute). Our CSS files didn’t have a class of “rend-superscript” – rend-sup and rend-super were there, though, since we tend to abbreviate such values – so a textclass-specific.css file was needed for the collection. Again, since it was clear there would be more custom styling to come, it seemed best to copy the whole “rend styles” section of the textclass.css:

Image:Civilwarcode2.tiff
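For reference, the REND-to-class conversion described above might look roughly like the following in XSL. This is a minimal sketch rather than the template DLXS actually ships: it assumes the highlight element is HI1 and that the class name is simply “rend-” plus the REND value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- An HI1 element carrying a REND attribute becomes a span whose
       class is "rend-" followed by the REND value -->
  <xsl:template match="HI1[@REND]">
    <span class="rend-{@REND}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>
</xsl:stylesheet>
```

With this sketch, <HI1 REND="superscript">st</HI1> comes out as <span class="rend-superscript">st</span>, which is why the CSS needs a rule for exactly that class name.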


All of the renderings needed were already present in the CSS; the more verbose class names merely needed to be added to the existing groups. Note superscripting of the dates on the first line:

Image:civilwar4.jpg

Note the phrase “in the morning” in red above, which is encoded with an ADD element. This was not how the client's staff envisioned additions appearing; they also wanted deletions to be shown with a strike-through, which had been considered but rejected as possibly too illegible by previous DLPS interface designers. Phrase-level markup is generally handled in text.components.xsl, and as with the changes to the div1heads, a custom version of the stylesheet was created for the collection and placed in /l1/web/c/civilwar1. Here is the beginning of the file:

   <?xml version="1.0" encoding="UTF-8" ?>
   <xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:func="http://exslt.org/functions"
   xmlns:dlxs="http://www.umdl.umich.edu/dlxs"
   extension-element-prefixes="func"
   exclude-result-prefixes="func dlxs">
   
   <xsl:import href="../../t/text/text.components.xsl"/>

The new definitions for the conversion of the XML to HTML were added to the collection-specific stylesheet:

  <xsl:template match="DEL">
      <S><xsl:apply-templates/></S>
  </xsl:template>
   
  <xsl:template match="ADD">
      <SUP><xsl:apply-templates/></SUP>
  </xsl:template>

Now added text is shown as superscript:

Image:civilwar5.jpg

Removal of redundant page labels

Because the client's staff had provided page labels in their PB attribute values, links to page images had redundant descriptive language:

Image:civilwar6.jpg

Labels like this are contained in the langmap file found in /l1/web/t/text. The label is defined there as <Item key="headerutils.str.page">Page </Item>. To override the labels available for all of Text Class in general, a langmapextra.en.xml file was placed in /l1/web/c/civilwar1 containing the following empty descriptor:

<Item key="headerutils.str.page"> </Item>

This provides a slightly cleaner link:

Image:civilwar7.jpg

Pop-up the regularized forms of names

As mentioned before, the client's staffers have encoded many names – of battles, people, and places – to specifically identify them by providing regularized versions of the names, like

<NAME TYPE="place" REG="Jeffersonville (Ind.)"> Jeffersonville</NAME> and <NAME TYPE="person" REG="Davis, Jefferson Columbus, 1828-1879">Genl.<LB/>Jeff C. Davis</NAME>.

They wanted the regularized forms to “pop up” when the user clicked on the names. Because of our institutional commitment to minimize the use of javascript, we decided to try using CSS “tooltips” to provide this functionality. In the interface, the NAMEs are underscored with a dashed line and a text box containing the normalized value pops up when the user mouses over the NAME; the name itself is highlighted. In the example below, the third NAME is being moused over; you can see that there are two other normalized names in the entry. (The word “wagon” is highlighted in yellow because it was the search term that led to this entry.)

Image:civilwar8.jpg

Implementing the tooltips is a two-step process. First, a template for NAME needed to be added to the text.components.xsl in /l1/web/c/civilwar1:

   <xsl:template match="DIV1//NAME">
   <a class="info" href="#"><xsl:value-of select="."/><span><xsl:value-of select="@REG"/></span></a>
   </xsl:template>

This wraps the content of the NAME element in an <a> tag and places the content of the REG attribute in a <span> tag inside it. The examples from above become

<a class="info" href="#"> Jeffersonville<span>Jeffersonville (Ind.)</span></a> and <a class="info" href="#">Genl.<LB/>Jeff C. Davis<span>Davis, Jefferson Columbus, 1828-1879</span></a>.

Then, the following styles need to be added to the textclass-specific.css file in /l1/web/c/civilwar1:

   a.info{
       position:relative; /*this is the key*/
       z-index:24;
       color:#000;
       border-bottom:1px dashed #000;
       text-decoration:none}
   
   a.info:hover{z-index:25; background-color: #dad1b2;}
   
   a.info span{display: none}
   
   a.info:hover span{ /*the span will display just on :hover state*/
       display:block;
       position:absolute;
       top:2em; left:2em; width:15em;
       border:1px solid #dad1b2;
       background-color: #f5f5dc; color:#000;
       text-align: center;
       text-decoration:none}

Customized browses in addition to author/title

In addition to regularized forms of names, the client's staff had identified topics within the diary entries. This had been done using SEGs for topics. For example:

<SEG TYPE="transportation">crossed on boats</SEG>

in addition to the NAMEs mentioned above:

<NAME TYPE="battles" ID="battles4" REG="Perryville, Battle of, Perryville, Ky., 1862">battleground</NAME>.

They wanted to have indexes of these values that users could browse, as well as the usual browse list of diary authors and titles. This meant that customized browse pages needed to be built. This hybrid approach makes use of the automatic browse building, plus additional hand-coded HTML pages, with links to the other browse pages either coded in the HTML or supplied by a collection-specific browse.xsl in /l1/web/c/civilwar1.

So, in addition to the first customization steps (setting browseable to yes and browsenav to 0 in collmgr), I created three HTML pages: browse.html, browsetopic.html, and browsename.html. browse.html links to all the browse options, including the cgi-driven option:

<A HREF="/cgi/t/text/text-idx?page=browse;c=civilwar1">Browse the Civil War Diaries by author/title</A>

browsename.html and browsetopic.html are basically canned searches that will take users to the diary entries containing those names or topics. They were created by using xpatu to pull the values out of the collection and then wrapping them in the necessary HTML and cgi values. All the data is as it was provided in the collection, with no change of capitalization, pluralization, etc.
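As an illustration of the "pull out the values and wrap them in HTML" step, here is a minimal XSLT 1.0 sketch that builds a name list from the REG attributes. It is an alternative to the xpatu approach actually used, not a description of it. The searchprefix parameter and its default value are assumptions: the case study does not show the query string of the canned searches, so the real value would have to be supplied.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Hypothetical parameter: everything in the canned-search URL that
       precedes the search term. The default is an assumption, not a
       documented text-idx query; pass the real value when transforming. -->
  <xsl:param name="searchprefix"
             select="'/cgi/t/text/text-idx?c=civilwar1;q1='"/>

  <!-- Index NAME elements by regularized form (Muenchian grouping) -->
  <xsl:key name="name-by-reg" match="NAME[@REG]" use="@REG"/>

  <xsl:template match="/">
    <html>
      <head><title>Browse by Name</title></head>
      <body>
        <ul>
          <!-- Visit only the first NAME for each distinct REG value,
               sorted by REG exactly as encoded in the collection -->
          <xsl:for-each select="//NAME[@REG][generate-id() =
                                generate-id(key('name-by-reg', @REG)[1])]">
            <xsl:sort select="@REG"/>
            <li>
              <!-- XSLT 1.0 has no URL-encoding function; how spaces and
                   punctuation in REG are escaped is left to the processor's
                   html output method, so check the generated links -->
              <a href="{$searchprefix}{@REG}">
                <xsl:value-of select="@REG"/>
              </a>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The same pattern would apply to browsetopic.html by keying on SEG elements and their TYPE attribute instead of NAME and REG.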

Image:Civilwarcode4.tiff

Image:Civilwarcode5.tiff


The links to the other browse options are part of the HTML code (circled in the HTML provided above).

Image:civilwar9.jpg

In the browse by author/title page generated by running the browsebuilder, links to the hand-coded browse options are provided by placing a collection-specific browse.xsl into /l1/web/c/civilwar1:

   <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:exsl="http://exslt.org/common">
   <xsl:import href="../../t/text/browse.xsl"/>
   
   <xsl:template name="collSpecificText">
   <xsl:text> | </xsl:text><xsl:element name="a">
   <xsl:attribute name="href"><xsl:text>/c/civilwar1/browsename.html</xsl:text></xsl:attribute>
   Browse by Name</xsl:element><xsl:text> | </xsl:text>
   <xsl:element name="a">
   <xsl:attribute name="href"><xsl:text>/c/civilwar1/browsetopic.html</xsl:text></xsl:attribute>
   Browse by Topic </xsl:element></xsl:template>
   </xsl:stylesheet>

Image:civilwar10.jpg

Banners, tab colors, and customized TOC view

The client staffers wanted to make aesthetic changes to the site, with a graphical banner instead of the plain text “Civil War Diaries” and matching the tabs to the color scheme of the banner. Additionally, because the diaries are previously unpublished, many of the labels that come “out of the box” in DLXS were not quite appropriate.

Changing the banner was very simple – the banner (named banner.jpg) they provided was placed into the directory /l1/web/c/civilwar1/graphics and the collmgr primarytitle was changed to read graphic:banner.jpg.

The colors of the navigation tabs were changed in the textclass-specific.css file.

   /* STYLES FOR NAVIGATION AND MENUS   */
   td.mainnavcell {
   background-color: #A2A0AB;
   padding-left:20px;
   padding-right:20px;
   border-bottom: 1px solid #666666;}
   
   .navcolor { background-color: #8A7B90; }

Because their banner was taller than the 60 px maximum we budget for in the page layout, we needed to add information about the optimum frame spacing in pageviewer, or the navigation bars get hidden by the page images. In the directory /l1/web/c/civilwar1/ is a file called pageviewerextra.xml which contains the following information:

 <Viewer>
   <Frameset>
     <Rows>
       <Pdf>300</Pdf>
       <NonPdf>200</NonPdf>
     </Rows>
   </Frameset>
 </Viewer>

Here is the look with the new banner and color scheme:

Image:civilwar11.jpg

The client wanted to suppress some existing metadata (Print source) and show additional pieces of the metadata in the header, so a collection-specific version of tocheader.xsl was placed in the /l1/web/c/civilwar1 directory. In the place of Print source, which shows metadata from the SOURCEDESC, they wanted to display their notes. A template was added for notes:

   <xsl:template match="NOTE">
       <xsl:apply-templates/>
   </xsl:template>

which was called instead of the SOURCEDESC, after AVAILABILITY:

   <xsl:variable name="availability">
     <xsl:copy-of select="HEADER/FILEDESC/PUBLICATIONSTMT/AVAILABILITY/P"/>
   </xsl:variable>
   <xsl:variable name="notesstmt">
     <xsl:copy-of select="HEADER/FILEDESC/NOTESSTMT/NOTE"/>
   </xsl:variable>

Labels for various metadata sections were also changed, as they wanted Publisher instead of Publication Info and Rights instead of Availability. This was done in the langmapextra.en.xml file, which had been created earlier to remove the redundant page label.

   <ColLookupTables>
      <Lookup id="headerutils">
   <Item key="headerutils.str.page"> </Item>
   <Item key="headerutils.str.publicationinfo">Publisher</Item>
   <Item key="headerutils.str.22">Rights</Item>
   <Item key="civil.str.notes">Notes</Item>
     </Lookup>
   </ColLookupTables>
