Working with Fabricated Regions

From DLXS Documentation

Revision as of 09:37, 15 September 2007 by Cboulay (Talk | contribs)

Overview

When you use XPAT in combination with xmlrgn or sgmlrgn and a DTD, or multirgn and a tags file, you are identifying the elements and attributes in the DTD or tags file as "regions," containers of content rather like fields in a database. These regions are built into the regions file (collid.rgn) and identified in the data dictionary (collid.dd). This is what happens when you run sgmlrgn or xmlrgn.

However, sometimes the things you want to identify collectively aren't so handily identified as elements in the DTD. For example, suppose you want to search within specific features of a book, such as a chapter, that can occur at different hierarchical levels in different volumes. Moreover, the element isn't even called CHAPTER; it's a numbered division with a type attribute telling you that it's a chapter.

To fabricate a region containing all the divisions in books that are chapters, for example, you can first find all of those divisions with a query:

(region DIV1 incl (region "DIV1-T" incl "type=chapter"))+ (region DIV2 incl (region "DIV2-T" incl "type=chapter"))
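The query above repeats one clause per division level, so it can be generated rather than typed by hand when chapters may occur at more than two levels. The following is a minimal Python sketch, not part of DLXS: the function name typed_division_query and its parameters are hypothetical, and it assumes that deeper levels follow the same DIVn and "DIVn-T" naming pattern as DIV1 and DIV2.

```python
def typed_division_query(max_level: int, div_type: str = "chapter") -> str:
    """Build an XPAT query uniting DIV1..DIV<max_level> regions of one type.

    Assumption: every level follows the DIVn / "DIVn-T" naming shown for
    DIV1 and DIV2 in the example query.
    """
    if max_level < 1:
        raise ValueError("max_level must be at least 1")
    clauses = [
        f'(region DIV{n} incl (region "DIV{n}-T" incl "type={div_type}"))'
        for n in range(1, max_level + 1)
    ]
    # Clauses are combined with "+", spaced as in the example query.
    return "+ ".join(clauses)


if __name__ == "__main__":
    print(typed_division_query(2))
```

Called with a maximum level of 2, the sketch reproduces the query shown above; check how many division levels your DTD actually defines before raising the number.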

Alternatively, you could write a more complex query that treats attributes as regions instead of text strings; it is functionally the same:

(region DIV1 incl (region "DIV1-T" incl (region "A-TYPE" incl chapter)))+ (region DIV2 incl (region "DIV2-T" incl (region "A-TYPE" incl chapter)))

Finally, once you have a query that produces the results you want, create a file for export, export it, and sync it:

{exportfile "$DLXSROOT/idx/c/collid/chapter.rgn"}
export
~sync "chapter"
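If you fabricate several regions, the three export lines differ only in the collection ID and the region name. Here is a minimal Python sketch that assembles them; the function export_commands is hypothetical, and it assumes that the single-letter directory in the path (the "c" in idx/c/collid) is the first letter of the collection ID, as the example paths on this page suggest.

```python
def export_commands(collid: str, region_name: str,
                    dlxsroot: str = "$DLXSROOT") -> str:
    """Return the exportfile/export/sync lines for one fabricated region.

    Assumption: the index directory is <dlxsroot>/idx/<first letter of
    collid>/<collid>/, matching the example path for "collid".
    """
    if not collid or not region_name:
        raise ValueError("collid and region_name must be non-empty")
    rgn_path = f"{dlxsroot}/idx/{collid[0]}/{collid}/{region_name}.rgn"
    return "\n".join([
        f'{{exportfile "{rgn_path}"}}',
        "export",
        f'~sync "{region_name}"',
    ])


if __name__ == "__main__":
    print(export_commands("collid", "chapter"))
```

The sketch only builds text; the query that selects the region still has to precede these lines when they are given to XPAT.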

Why Fabricate Regions?

Why not just put these queries in the map file and call them chapters? You could, but it's probably worth your time to build these succinctly named, precompiled regions: query errors are more easily identified during index building than in the CGI, and XPAT searches can be simpler and quicker for terms within the prebuilt regions.

Examples of fabricated regions for Text Class can be found in the extra.srch file for the sample collection at $DLXSROOT/prep/s/sampletc_utf8/sampletc_utf8.extra.srch. As with any other elements used in the interface for a given collection, any fabricated regions you use must also be represented in the collmgr entry and the map file for that collection.

Fabricated regions required in Text Class

  • main
  • mainheader
  • maintitle
  • div1head

Fabricated regions commonly found in Text Class

  • mainauthor
  • maindate
  • page (for collections with page images)
  • id (for collections with a number of different IDNO elements)
  • divxhead (for collections with divisions nested below DIV1)
