Working with Map Files
From DLXS Documentation
Overview
This document describes what we call maps or map files. Map files contain mapped items where one term or name for an item is mapped to another term or name. For example, a term used by an HTML form to refer to a searchable region (e.g., "entire text"; see LABEL below) can be mapped to an XPAT searchable region (e.g., TEXT; see NATIVEREGIONNAME below).
Currently, the map files are SGML, and each collection map file conforms to a simple DTD. (Other ways of mapping terms, such as a database in which one column's data is mapped to another's, are possible and have been considered for implementation.) The map is read into a TerminologyMapper object while the middleware is running, after which the CGI program can ask the object for the mappings of a term at any time. Each mapped item and its various terms are contained within a <MAPPING> element.
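As a rough illustration of the lookup idea described above, here is a minimal Python sketch. It is not the TerminologyMapper module and does not reflect its interface; the function names parse_mappings and translate are hypothetical, and the sketch assumes the map text is simple enough (no omitted end tags, no entities) to be read with regular expressions rather than a real SGML parser.

```python
import re

# Sample text in the shape of the mapping examples shown in this document.
SAMPLE_MAP = """
<mapping> <label>and</label> <synthetic>AND</synthetic> <native>^</native> </mapping>
<mapping> <label>full text</label> <synthetic>MAIN_SEARCHABLE</synthetic> <native>region TEXT</native> <nativeregionname>TEXT</nativeregionname> </mapping>
"""

# Assumption: every element has an explicit end tag and mappings are not nested.
MAPPING_RE = re.compile(r"<mapping>(.*?)</mapping>", re.IGNORECASE | re.DOTALL)
FIELD_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)


def parse_mappings(text):
    """Return one dict per <mapping> element, keyed by lower-cased child element name."""
    mappings = []
    for body in MAPPING_RE.findall(text):
        fields = {name.lower(): value.strip() for name, value in FIELD_RE.findall(body)}
        mappings.append(fields)
    return mappings


def translate(mappings, from_term, value, to_term):
    """Return the to_term of the first mapping whose from_term equals value, else None."""
    for mapping in mappings:
        if mapping.get(from_term) == value:
            return mapping.get(to_term)
    return None


mappings = parse_mappings(SAMPLE_MAP)
print(translate(mappings, "label", "full text", "nativeregionname"))
```

With the sample text, a lookup from the LABEL "full text" to NATIVEREGIONNAME yields TEXT, mirroring the example in the first paragraph of this section.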
Semantic Contexts
There are two semantic contexts for MAPPINGs currently implemented.
- Mapping a set of terms to one another
- Mapping and ordering the terms for an HTML form selection's option elements
Mapping a set of terms to one another
Collection map files exist to identify the regions and operators used by the middleware and XPAT in four ways, each way represented by one of four terms:
- LABEL: by the term that is used in the collection database and interface
- SYNTHETIC: by the variable name that is used in the CGI program
- NATIVE: by the language that is used by the search engine
- NATIVEREGIONNAME: by the element name that is indexed
Mapping terms for XPAT operators
The first part of the map (by convention rather than by DTD enforcement) contains the mappings for the boolean and proximity operators. In versions of DLXS prior to Release 10, mappings for operators tended to appear twice, with labels in all lower case and with mixed case, to cover likely interface option scenarios. Only one mapping per operator is now permitted; older map files must be updated to eliminate unused "duplicate" operator mappings. Here is an example of an operator mapping:
<mapping> <label>and</label> <synthetic>AND</synthetic> <native>^</native> </mapping>
(^ is the symbol used in the XPAT query language to indicate an intersection.)
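When updating an older map file, it may help to list operator labels that differ only in case. The following Python sketch is one way to do that, not a DLXS tool; it assumes the operator mappings have already been read into dictionaries, and the function name find_duplicate_operators and the sample "or" entry are hypothetical.

```python
from collections import defaultdict


def find_duplicate_operators(mappings):
    """Group operator labels case-insensitively and return groups with more than one label."""
    labels_by_folded = defaultdict(list)
    for mapping in mappings:
        labels_by_folded[mapping["label"].casefold()].append(mapping["label"])
    return {key: labels for key, labels in labels_by_folded.items() if len(labels) > 1}


# Hypothetical pre-Release 10 operator mappings; NATIVE values are omitted
# because only the labels matter for this check.
old_style = [
    {"label": "and", "synthetic": "AND"},
    {"label": "And", "synthetic": "AND"},
    {"label": "or", "synthetic": "OR"},
]

for folded, labels in find_duplicate_operators(old_style).items():
    print(f"operator '{folded}' is mapped more than once: {labels}")
```

Which of the reported labels to keep depends on the label the interface actually uses, so the sketch only reports candidates and removes nothing.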
Mapping terms for regions
The second part of the map file contains region mappings, which identify the SGML elements, encoded or fabricated, that are used by the middleware and in the HTML, either as labels in pulldown menus or as rgn variables in links to text from results lists. These are the labels stored in the collection manager fields termsearch, regionsearch, and bibsearch. The mapping labels and the collmgr entries must match exactly in spelling, number, and case. If they do not, the middleware will fail. For any collection, there will be, at a minimum, entries with SYNTHETIC mappings for MAIN_SEARCHABLE, IDNO, BIBL, and NODE (used by the CGI); with LABEL mappings for full text, works, and citation (used as labels in the HTML search pages); and with NATIVEREGIONNAME mappings for DIV1 (used to build a link to divisions from results lists). There should of course be maps for all the divisions in a given collection. Here is an example of a region mapping:
<mapping> <label>full text</label> <synthetic>MAIN_SEARCHABLE</synthetic> <native>region TEXT</native> <nativeregionname>TEXT</nativeregionname> </mapping>
Note: In BibClass, SYNTHETIC and NATIVEREGIONNAME are not used, but SUMMARYLABEL is. See Mounting a Bib Class Collection.
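Because a mismatch between map labels and collmgr entries makes the middleware fail, a quick comparison before deployment can be useful. The Python sketch below shows the idea only. It assumes the labels from termsearch, regionsearch, and bibsearch have already been extracted into a plain list of strings (how they are stored in collmgr is not modeled here), and the function names are hypothetical.

```python
def missing_labels(collmgr_labels, map_labels):
    """Return collmgr labels that have no exact, case-sensitive match in the map."""
    known = set(map_labels)
    return [label for label in collmgr_labels if label not in known]


def near_misses(collmgr_labels, map_labels):
    """For each unmatched collmgr label, list map labels that differ only in case or outer whitespace."""
    by_folded = {}
    for label in map_labels:
        by_folded.setdefault(label.strip().casefold(), []).append(label)
    result = {}
    for label in missing_labels(collmgr_labels, map_labels):
        key = label.strip().casefold()
        if key in by_folded:
            result[label] = by_folded[key]
    return result


# Map labels taken from this section; the collmgr entries are made-up examples
# containing one case error ("Works") and one number error ("citations").
map_labels = ["full text", "works", "citation"]
collmgr_labels = ["full text", "Works", "citations"]

print(missing_labels(collmgr_labels, map_labels))
print(near_misses(collmgr_labels, map_labels))
```

The case error is reported with its likely intended map label, while the singular/plural error appears only in the list of missing labels, since the sketch makes no attempt to guess at spelling or number.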
Mapping and ordering the terms for an HTML form selection's option elements
This section of the map file is not needed in all collections, but may be needed for a specific collection if its markup supports specialized restrictions such as date of publication, genre, period, or gender. In general, the maps support label values, native values, and the order in which the restrictions should be presented in pulldown menus. The existence of these maps is indicated in the metadata database. Here are the genre mappings for the Chadwyck-Healey Yeats collection, which divides works into four categories:
<mapping> <genrelabel>Prose Fiction</genrelabel> <genreorder>1</genreorder> <genrenative>FICT</genrenative> </mapping>
<mapping> <genrelabel>Prose Non-fiction</genrelabel> <genreorder>2</genreorder> <genrenative>NONFICT</genrenative> </mapping>
<mapping> <genrelabel>Drama</genrelabel> <genreorder>3</genreorder> <genrenative>PLAY</genrenative> </mapping>
<mapping> <genrelabel>Poetry</genrelabel> <genreorder>4</genreorder> <genrenative>POEM</genrenative> </mapping>
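To show how label, order, and native values could combine into a pulldown menu, here is a minimal Python sketch using the four genre mappings above. It is an illustration under assumptions, not the middleware's rendering code: the function name render_options is hypothetical, and using the native value as the option's value attribute is an assumption about how the form is built.

```python
from html import escape

# The Yeats genre mappings from above, deliberately listed out of order.
GENRE_MAPPINGS = [
    {"genrelabel": "Poetry", "genreorder": "4", "genrenative": "POEM"},
    {"genrelabel": "Prose Fiction", "genreorder": "1", "genrenative": "FICT"},
    {"genrelabel": "Drama", "genreorder": "3", "genrenative": "PLAY"},
    {"genrelabel": "Prose Non-fiction", "genreorder": "2", "genrenative": "NONFICT"},
]


def render_options(mappings, prefix="genre"):
    """Sort mappings by their numeric order term and return one <option> string per mapping."""
    ordered = sorted(mappings, key=lambda m: int(m[prefix + "order"]))
    return [
        '<option value="{}">{}</option>'.format(
            escape(m[prefix + "native"], quote=True),  # assumed to be the submitted value
            escape(m[prefix + "label"]),               # text shown to the user
        )
        for m in ordered
    ]


for option in render_options(GENRE_MAPPINGS):
    print(option)
```

The order value is compared as an integer so that a tenth entry would sort after the ninth rather than after the first.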
Under the basic middleware architecture, collection maps are stored in $DLXSROOT/misc/c/class/maps/ and are named collid.map (for example, moa.map or ampo20.map for the Making of America and 20th Century American Poetry collections, respectively).