Working with Map Files

From DLXS Documentation

Current revision (17:20, 14 August 2007)

Overview

This document describes maps, also called map files. A map file contains mapped items, each of which maps one term or name for an item to another. For example, a term used by an HTML form to refer to a searchable region (e.g., "entire text"; see LABEL below) can be mapped to an XPAT searchable region (e.g., TEXT; see NATIVEREGIONNAME below).

Currently, map files are written in SGML, and each collection map file conforms to a simple DTD. Other ways of mapping terms are possible and have been considered for implementation, such as a database in which one column's data maps to another's. The middleware reads the map into a TerminologyMapper object at run time, after which the CGI program can ask the object for the mappings of a term at any time. Each mapped item and its various terms are contained within a <MAPPING> element.

Semantic Contexts

There are two semantic contexts for MAPPINGs currently implemented.

  1. Mapping a set of terms to one another
  2. Mapping and ordering the terms for an HTML form selection's option elements

Mapping a set of terms to one another

Collection map files exist to identify the regions and operators used by the middleware and XPAT in four ways, each way represented by one of four terms:

  1. LABEL: by the term that is used in the collection database and interface
  2. SYNTHETIC: by the variable name that is used in the CGI program
  3. NATIVE: by the language that is used by the search engine
  4. NATIVEREGIONNAME: by the element name that is indexed

Mapping terms for XPAT operators

The first part of the map (by convention rather than by DTD enforcement) contains the mappings for the boolean and proximity operators. In versions of DLXS prior to Release 10, operator mappings tended to appear twice, once with the label in all lower case and once in mixed case, to cover likely interface option scenarios. Only one mapping per operator is now permitted, so older map files must be updated to remove the unused "duplicate" operator mappings. Here is an example of an operator mapping:

  <mapping>
  <label>and</label>
  <synthetic>AND</synthetic>
  <native>^</native>
  </mapping>

(^ is the symbol used in the XPAT query language to indicate an intersection.)
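
To make the lookup concrete, here is a minimal Python sketch of how a label such as "and" could be translated to its native XPAT symbol. It is an illustration only, not the TerminologyMapper implementation: the names parse_mappings and map_term are hypothetical, and the sketch assumes every child element of a mapping is written with explicit start and end tags, as in the example above.

```python
import re

# Sample map content copied from the operator mapping example;
# a real map file would contain many more mappings.
SAMPLE_MAP = """
<mapping>
<label>and</label>
<synthetic>AND</synthetic>
<native>^</native>
</mapping>
"""

# Assumption: elements are never nested inside a term and always have end tags.
MAPPING_RE = re.compile(r"<mapping>(.*?)</mapping>", re.DOTALL | re.IGNORECASE)
FIELD_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL | re.IGNORECASE)


def parse_mappings(text):
    """Return one dict per <mapping> element, keyed by lowercased element name."""
    mappings = []
    for body in MAPPING_RE.findall(text):
        fields = {name.lower(): value.strip() for name, value in FIELD_RE.findall(body)}
        mappings.append(fields)
    return mappings


def map_term(mappings, from_field, value, to_field):
    """Return to_field of the first mapping whose from_field equals value exactly."""
    for fields in mappings:
        if fields.get(from_field) == value:
            return fields.get(to_field)
    return None


operator_mappings = parse_mappings(SAMPLE_MAP)
native_and = map_term(operator_mappings, "label", "and", "native")
print(native_and)
```

The comparison in map_term is deliberately exact, so a lookup for "And" would find nothing when the map defines only "and".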

Mapping terms for regions

The second part of the map file contains region mappings, which identify the SGML elements, encoded or fabricated, that are used by the middleware and in the HTML, either as labels in pulldown menus or as rgn variables in links to text from results lists. These are the labels stored in the collection manager fields termsearch, regionsearch, and bibsearch. The mapping labels and the collmgr entries must match exactly in spelling, number, and case; if they do not, the middleware will fail. Every collection needs, at a minimum, entries with SYNTHETIC mappings for MAIN_SEARCHABLE, IDNO, BIBL, and NODE (used by the CGI program); LABEL mappings for full text, works, and citation (used as labels in the HTML search pages); and a NATIVEREGIONNAME mapping for DIV1 (used to build a link to divisions from results lists). There should also be mappings for all the divisions in a given collection. Here is an example of a region mapping:

  <mapping>
  <label>full text</label>
  <synthetic>MAIN_SEARCHABLE</synthetic>
  <native>region TEXT</native>
  <nativeregionname>TEXT</nativeregionname>
  </mapping>

Note: In BibClass, SYNTHETIC and NATIVEREGIONNAME are not used, but SUMMARYLABEL is. See Mounting a Bib Class Collection.
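
Because a mismatch between collmgr entries and map labels makes the middleware fail, it can be useful to check the two against each other before mounting a collection. The following Python sketch shows one way such a check might look. It is not part of DLXS: find_unmapped_labels is a hypothetical helper, and the field values are made-up examples represented as plain Python lists rather than in the collmgr's own storage format.

```python
# Labels defined by <label> elements in a hypothetical collection map file.
map_labels = {"full text", "works", "citation"}

# Hypothetical collmgr entries, already split into individual labels.
collmgr_fields = {
    "termsearch": ["full text"],
    "regionsearch": ["full text", "Works"],  # differs from the map in case
    "bibsearch": ["citations"],              # differs from the map in number
}


def find_unmapped_labels(fields, labels):
    """Return (field, label) pairs whose label has no exact match in the map."""
    missing = []
    for field, entries in fields.items():
        for entry in entries:
            # Exact comparison: spelling, number, and case must all agree.
            if entry not in labels:
                missing.append((field, entry))
    return missing


problems = find_unmapped_labels(collmgr_fields, map_labels)
for field, label in problems:
    print("{}: no mapping with label {!r}".format(field, label))
```

Here "Works" and "citations" are reported, since neither matches a map label exactly.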

Mapping and ordering the terms for an HTML form selection's option elements

This section of the map file is not needed in all collections, but may be needed for a specific collection if its markup supports specialized restrictions such as date of publication, genre, period, or gender. In general, the maps support label values, native values, and the order in which the restrictions should be presented in pulldown menus. The existence of these maps is indicated in the metadata database. Here are the genre mappings for the Chadwyck-Healey Yeats collection, which divides works into four categories:

  <mapping>
  <genrelabel>Prose Fiction</genrelabel>
  <genreorder>1</genreorder>
  <genrenative>FICT</genrenative>
  </mapping>
  <mapping>
  <genrelabel>Prose Non-fiction</genrelabel>
  <genreorder>2</genreorder>
  <genrenative>NONFICT</genrenative>
  </mapping>
  <mapping>
  <genrelabel>Drama</genrelabel>
  <genreorder>3</genreorder>
  <genrenative>PLAY</genrenative>
  </mapping>
  <mapping>
  <genrelabel>Poetry</genrelabel>
  <genreorder>4</genreorder>
  <genrenative>POEM</genrenative>
  </mapping>
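
As an illustration of how the order values might drive a pulldown menu, the Python sketch below sorts the four genre mappings by genreorder and renders them as HTML option elements. The function build_options is hypothetical, and using the native value as the option's value attribute is an assumption made for the example, not a description of what the DLXS middleware emits.

```python
from html import escape

# The four genre mappings from the example above, deliberately out of order.
genre_mappings = [
    {"genrelabel": "Poetry", "genreorder": "4", "genrenative": "POEM"},
    {"genrelabel": "Prose Fiction", "genreorder": "1", "genrenative": "FICT"},
    {"genrelabel": "Drama", "genreorder": "3", "genrenative": "PLAY"},
    {"genrelabel": "Prose Non-fiction", "genreorder": "2", "genrenative": "NONFICT"},
]


def build_options(mappings):
    """Return <option> strings sorted by the numeric value of genreorder."""
    ordered = sorted(mappings, key=lambda m: int(m["genreorder"]))
    return [
        '<option value="{}">{}</option>'.format(
            escape(m["genrenative"], quote=True),  # assumed to be the submitted value
            escape(m["genrelabel"]),               # text shown to the user
        )
        for m in ordered
    ]


options = build_options(genre_mappings)
print("\n".join(options))
```

Converting genreorder to an integer before sorting matters once a map has ten or more entries, because a plain string sort would place "10" before "2".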

Under the basic middleware architecture, collection maps are stored in $DLXSROOT/misc/c/class/maps/ and are named collid.map (for example, moa.map or ampo20.map for the Making of America and 20th Century American Poetry collections, respectively).
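
The naming convention can be expressed as a small path helper. This Python sketch is illustrative only: map_file_path is a hypothetical function, and the root directory /l1 in the usage line is a made-up value standing in for whatever $DLXSROOT is set to on a given installation.

```python
import os
import posixpath


def map_file_path(collid, dlxsroot=None):
    """Build the conventional map file path for a collection ID."""
    # Fall back to the DLXSROOT environment variable when no root is passed in.
    root = dlxsroot if dlxsroot is not None else os.environ["DLXSROOT"]
    return posixpath.join(root, "misc", "c", "class", "maps", collid + ".map")


# "/l1" is a hypothetical root used only for this example.
example_path = map_file_path("moa", dlxsroot="/l1")
print(example_path)
```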
