Annotated Corpora
From lingwiki
An annotated corpus combines several elements: a lexicon, morphological and syntactic analysis, and a text corpus. Developing a large multilingual corpus, together with tools to analyze and navigate the data, is a project with far-reaching potential for linguistic research.
Online Corpora
Here are links to a few existing corpora which include some additional analysis of the texts:
- Perseus Digital Library -- includes texts from a wide variety of sources, including classical Greek and Latin texts.
- The Vergil Project -- marked-up text of Vergil's Aeneid. The text is hand-annotated; each word form is described in terms of its root, its inflectional morphology, and a rough English gloss. Annotation is incomplete (but possibly ongoing).
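
The word-level annotation described for the Vergil Project (root, inflectional morphology, rough gloss) can be pictured as one record per word form. The following is only a minimal sketch: the `Token` class, its field names, the `forms_of` helper, and the sample values are all hypothetical and hand-written for illustration; they are not taken from the Vergil Project or Perseus, whose actual data formats may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Token:
    """One annotated word form (hypothetical schema, for illustration only)."""
    form: str                      # the word as it appears in the text
    root: str                      # dictionary headword (lemma)
    morphology: Dict[str, str] = field(default_factory=dict)  # inflectional features
    gloss: str = ""                # rough English gloss


def forms_of(tokens: List[Token], root: str) -> List[str]:
    """Return every surface form in the token list that shares the given root."""
    return [t.form for t in tokens if t.root == root]


# Hand-written sample for the opening words of the Aeneid ("Arma virumque cano").
# The annotations are illustrative, not copied from any existing corpus.
tokens = [
    Token("Arma", "arma", {"case": "accusative", "number": "plural"}, "arms"),
    Token("virumque", "vir", {"case": "accusative", "number": "singular"}, "and the man"),
    Token("cano", "cano", {"person": "1", "number": "singular", "tense": "present"}, "I sing"),
]

print(forms_of(tokens, "vir"))
```

Keeping the root separate from the surface form is what lets a navigation tool collect every inflected occurrence of a lexicon entry, which is the link between the lexicon and the text corpus mentioned in the introduction.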