LanguagePractices
Note: Summary of best practices added. Jenn Riley 10/12/05
Recording Language of Content in OAI Records
Summary of Best Practices
- Supply a language element when relevant to the resource.
- Format the value of the language element according to the rules of the metadata format in use.
- Express multiple languages in repeated fields.
- Supply the language of the metadata record only in a metadata element specifically designed for this purpose.
Choice of Language Values
Best practice is to include the language of a resource in the metadata when it is appropriate to do so. For textual materials, for example, language can be an important access point for end users and therefore should be recorded. For graphic materials, however, the material itself may have no obvious connection to a specific language, so none should be recorded.
Example of an item for which language data is probably not relevant:
<dc:title>Houses in Eureka valley San Francisco from Collingwood near 22nd St.</dc:title>
<dc:creator>Cushman, Charles Weever, 1896-1972</dc:creator>
<dc:date>1955-03-14</dc:date>
<dc:type>Cityscape photographs</dc:type>
<dc:type>StillImage</dc:type>
<dc:identifier>http://purl.dlib.indiana.edu/iudl/archives/cushman/P07697</dc:identifier>
See DC IU Charles Cushman Collection 1 for the complete record from which this example was taken.
Example of an item for which language data provides a useful access point:
<dc:title>Die russischen Schwestern : eine Geschichte von der Belagerung von Sebastopel / von Robert Russell.</dc:title>
<dc:creator>Russell, Robert.</dc:creator>
<dc:publisher>Baltimore : A.R. Orton,</dc:publisher>
<dc:date>1856.</dc:date>
<dc:type>text</dc:type>
<dc:format>text/sgml</dc:format>
<dc:identifier>http://purl.dlib.indiana.edu/iudl/wright2/wright2-2144A</dc:identifier>
<dc:language>German</dc:language>
See DC IU Wright American Fiction 1 for the complete record from which this example was taken.
Format of Language Values
Metadata schemas frequently indicate that language values should use either a code from a specified standard or a term from a controlled list. Language values in OAI records should appear in the format dictated by the metadata schema in use. If the metadata format allows the controlled vocabulary or encoding scheme to be specified, data providers should do so.
Two frequently used content standards for language values are ISO 639-2, Codes for the representation of names of languages, Part 2: Alpha-3 code, which supplies both terms and codes and for which the Library of Congress serves as the registration authority, and RFC 3066, Tags for the Identification of Languages, an Internet Best Current Practice document (BCP 47). The designation "iso639-2b" refers to the bibliographic code set of ISO 639-2. RFC 3066 allows either the original 2-letter ISO 639 (1988) codes or the 3-letter ISO 639-2 (1998) codes, which may be especially convenient for data providers who have already been using the 2-letter codes.
Example from a MODS record with a controlled vocabulary in use and specified:
<language>
<languageTerm authority="iso639-2b" type="code">eng</languageTerm>
</language>
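For comparison, the following is a minimal sketch of how an encoding scheme might be declared in a Dublin Core record. It assumes a qualified Dublin Core format whose schema permits the xsi:type attribute, which the simple oai_dc format may not. The wrapping record element is hypothetical and is there only to carry the namespace declarations.

```xml
<!-- Sketch only: the <record> wrapper is a hypothetical container,
     not part of any OAI or Dublin Core schema. -->
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- German, as a 3-letter ISO 639-2 bibliographic code -->
  <dc:language xsi:type="dcterms:ISO639-2">ger</dc:language>
  <!-- The same language as an RFC 3066 tag would instead read:
       <dc:language xsi:type="dcterms:RFC3066">de</dc:language> -->
</record>
```

Either form tells a harvester which standard the value comes from, so the value does not have to be guessed from its shape.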
When multiple languages apply to a resource, encode each language value in a separate language field in your target metadata schema.
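A minimal sketch of repeated language fields in MODS follows. The resource is hypothetical, assumed to contain text in both English and German, and each language gets its own language element rather than sharing one.

```xml
<mods xmlns="http://www.loc.gov/mods/v3">
  <!-- Hypothetical resource with text in both English and German:
       one <language> element per language -->
  <language>
    <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
  </language>
  <language>
    <languageTerm authority="iso639-2b" type="code">ger</languageTerm>
  </language>
</mods>
```

In Dublin Core the same principle would apply by repeating the dc:language element once per language.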
Language of Metadata Record
Record the language of the metadata record when the metadata format used provides a specific element or attribute for this purpose.
Example from a MODS record:
<abstract lang="eng">Broadside advertising a funeral ceremony
commemorating assassinated president Abraham Lincoln, held in Elgin,
Illinois, on April 19, 1865. It details the route of the procession, the
order of local official participants in the procession, and the order of
service for the ceremony to be held in the Academy Hall.</abstract>
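A comparable sketch for a simple Dublin Core record is shown below. It assumes the metadata format's schema allows the xml:lang attribute on its elements, as the oai_dc schema is understood to do. The dc:language value is an assumption added for illustration, since the source record does not state the language of the broadside.

```xml
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- xml:lang describes the language of the metadata text itself -->
  <dc:description xml:lang="en">Broadside advertising a funeral ceremony
  commemorating assassinated president Abraham Lincoln, held in Elgin,
  Illinois, on April 19, 1865.</dc:description>
  <!-- dc:language describes the resource, not the record;
       the value here is assumed for illustration -->
  <dc:language>eng</dc:language>
</oai_dc:dc>
```

Keeping the two apart means that a description written in English for a German-language text carries xml:lang="en", while the language element for the resource records German.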
