
15. Lexical Content and All-Inclusive Lexicon. TEI Lexicon.

Lexical Content

Once the lexicon has been defined and its functions and linguistic importance considered, the question arises of what linguistic information it should contain. In practice, this depends on the purpose for which the lexicon was built. Chomsky suggests that ‘the lexicon presents, for each lexical item its abstract phonological form and all the semantic properties associated with it’. He maintains that ‘the lexicon is a set of lexical elements’. It must specify, for each element, the phonetic, semantic and syntactic properties which are idiosyncratic to it. In addition, depending on the sophistication of the overall grammar, the lexicon should contain information about the subcategory of the word (such as whether a particular verb is transitive or intransitive). Other syntactic properties, such as the gender of a noun in a language that makes gender distinctions, may also be recorded.
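
The kinds of information listed above can be pictured as fields of a single record. The following is only a minimal sketch: the class name `LexicalEntry`, its field names and the two sample entries are illustrative assumptions, not a format proposed by Chomsky or by any particular lexicon.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LexicalEntry:
    """One lexical item with its idiosyncratic properties (hypothetical layout)."""
    lemma: str
    phonological_form: str                       # abstract phonological form
    semantic_properties: List[str] = field(default_factory=list)
    category: str = ""                           # e.g. "verb", "noun"
    subcategory: Optional[str] = None            # e.g. "transitive" / "intransitive"
    gender: Optional[str] = None                 # only for languages with gender


# Illustrative entries; the transcriptions and semantic labels are assumptions.
sleep = LexicalEntry("sleep", "/sliːp/", ["state"], "verb",
                     subcategory="intransitive")
tisch = LexicalEntry("Tisch", "/tɪʃ/", ["furniture"], "noun",
                     gender="masculine")     # German noun with grammatical gender

print(sleep.subcategory, tisch.gender)
```

Optional fields default to `None`, which reflects the point that how much is recorded depends on the sophistication of the grammar and on the language described.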

All-Inclusive Lexicon

The conception of a richer lexicon leads to the combination and integration of phonological, morphological, collocational, syntactic, semantic and pragmatic information in various ways. In 1995 R. Hudson introduced a checklist for an All-Inclusive (A-I) lexicon. This A-I lexicon would state the distinction between the lexicon and the grammar, reflecting the trend towards lexicalism. The checklist of information is as follows:

Phonology

  • underlying segment structure or several such structures if allomorphs are stored in the computer;

  • prosodic patterns of the word (to the extent that there are no rules for computing these), i.e. mainly word stress or tone.

Morphology

  • structure in terms of morphemes;

  • irregular morphological structures linked to particular morpho-syntactic features (i.e. irregular inflections);

  • partial similarities to other words (i.e. derived words and compounds).

Syntax

  • general word-class (e.g. ‘verb’);

  • sub-class (e.g. ‘auxiliary’);

  • obligatory morpho-syntactic features (e.g. to be);

  • valency.

Context

  • restrictions relating to immediate social structure (e.g. solidarity markers);

  • restrictions relating to style (formal, slang);

  • restrictions relating to discourse structures (topic-change markers).

Spelling

  • normal orthography;

  • standard abbreviations;

  • inflectional irregularities in spelling.

Etymology and Language

  • the language to which the word belongs;

  • the language from which it was borrowed;

  • the word on which it is based;

  • the date when it was borrowed.

Usage

  • frequency and familiarity;

  • age of acquisition;

  • particular occasions on which the word was used;

  • cliches containing the word;

  • taboos.

Specifications:

  • In this checklist there is no clear dividing line between linguistic knowledge and encyclopedic knowledge.

  • Nowadays, however, linguists generally hold that, while lexical and world knowledge must be distinguished, the two cannot be discretely separated.
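
The checklist above can be read as a template that each entry fills in to a greater or lesser degree. The sketch below is one possible encoding, not Hudson's own notation: the section keys, the helper `missing_sections` and the sample entry for "café" (including its transcription) are assumptions made for illustration.

```python
# The seven headings of the All-Inclusive checklist, in the order given above.
CHECKLIST = ("phonology", "morphology", "syntax", "context",
             "spelling", "etymology", "usage")


def missing_sections(entry):
    """Return the checklist sections that are absent or empty in an entry."""
    return [section for section in CHECKLIST if not entry.get(section)]


# Hypothetical entry; only some sections are filled in.
cafe = {
    "phonology": {"segments": "/kæˈfeɪ/", "stress": "final"},
    "morphology": {"morphemes": ["café"]},
    "syntax": {"word_class": "noun", "sub_class": "common"},
    "spelling": {"orthography": "café", "variants": ["cafe"]},
    "etymology": {"language": "English", "borrowed_from": "French"},
}

print(missing_sections(cafe))   # ['context', 'usage']
```

Keeping every section optional mirrors the idea that the checklist describes what an entry may contain; a real lexicon would need a far more detailed schema for each heading.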

The TEI Lexicon

The problem remains: what should go into the lexicon? In response to the need for standardization, in August 1991 the Computational Lexicon Working Group created the Text Encoding Initiative (TEI) program. Its primary task was to survey existing lexicons and to produce standards for interchanging electronic documents of various types, i.e. to create lexical databases intended for use by natural language processing systems of all sorts. The TEI Lexicon should include the following types of information:

Nouns:

  • entity nouns (apple, book);

  • relational nouns (speed, age, father);

  • abstract nouns (courage, love);

  • mass nouns (wine, sand);

  • proper names (John, IBM);

  • complement-taking properties (factive nouns such as story).

Pronouns (I, he, she) and bound anaphors (myself, himself, each other).

Verbs:

  • a wide variety of valency classes:

  1. intransitive;

  2. transitive;

  3. ditransitive;

  4. clausal complement taking;

  5. infinitival complement taking;

  6. small clause taking including bare infinitive.
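
The six valency classes listed above could be stored as a closed set of labels attached to each verb. This is a sketch under stated assumptions: the names `Valency`, `VERB_FRAMES` and `verbs_with` are hypothetical, and the frames assigned to the sample verbs are illustrative and not exhaustive.

```python
from enum import Enum


class Valency(Enum):
    """The six valency classes from the list above."""
    INTRANSITIVE = 1
    TRANSITIVE = 2
    DITRANSITIVE = 3
    CLAUSAL_COMPLEMENT = 4
    INFINITIVAL_COMPLEMENT = 5
    SMALL_CLAUSE = 6        # including bare infinitive


# A verb may belong to several classes, so each verb maps to a set of frames.
VERB_FRAMES = {
    "sleep": {Valency.INTRANSITIVE},
    "give": {Valency.TRANSITIVE, Valency.DITRANSITIVE},
    "say": {Valency.TRANSITIVE, Valency.CLAUSAL_COMPLEMENT},
    "want": {Valency.TRANSITIVE, Valency.INFINITIVAL_COMPLEMENT},
    "see": {Valency.TRANSITIVE, Valency.SMALL_CLAUSE},
}


def verbs_with(frame):
    """Return, in alphabetical order, the verbs that allow the given frame."""
    return sorted(verb for verb, frames in VERB_FRAMES.items() if frame in frames)


print(verbs_with(Valency.TRANSITIVE))   # ['give', 'say', 'see', 'want']
```

Using a set per verb rather than a single label allows for verbs such as "give", which occur in more than one frame.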

Modals and Auxiliaries.

Prepositions:

  • indication of subclasses of prepositions (case-marking prepositions, semantically contentful prepositions).

Adjectives:

  • complement-taking properties (proud of, likely to);

  • semantic classes of adjectives;

  • the position in which an adjective can appear (prenominal, postnominal, predicate).

Determiners and other similar nominal modifiers (articles, quantifiers, demonstratives, etc.).

Multi-word lexical entries.

Inflected categories of noun, verb and adjective: how irregular forms, inflectional paradigm, and other morphological information are stored.
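
One common way to store this information is to list irregular forms explicitly and to compute regular forms by rule. The sketch below illustrates that division only; it is not a TEI format. The table `IRREGULAR`, the functions `regular` and `inflect`, and the deliberately simplified English rules (no consonant doubling, no y/ies alternation) are all assumptions.

```python
# Irregular forms are stored as exceptions, keyed by (lemma, feature).
IRREGULAR = {
    ("go", "past"): "went",
    ("child", "plural"): "children",
    ("good", "comparative"): "better",
}


def regular(lemma, feature):
    """Apply a simplified regular paradigm for verbs, nouns and adjectives."""
    if feature == "past":
        return lemma + ("d" if lemma.endswith("e") else "ed")
    if feature == "plural":
        sibilant = lemma.endswith(("s", "x", "z", "ch", "sh"))
        return lemma + ("es" if sibilant else "s")
    if feature == "comparative":
        return lemma + ("r" if lemma.endswith("e") else "er")
    raise ValueError(f"unknown feature: {feature}")


def inflect(lemma, feature):
    """Look up a stored irregular form first, else fall back to the rule."""
    key = (lemma, feature)
    if key in IRREGULAR:
        return IRREGULAR[key]
    return regular(lemma, feature)


print(inflect("go", "past"), inflect("walk", "past"), inflect("box", "plural"))
```

The lookup-then-rule order means that only idiosyncratic forms take up space in the database, which is the usual motivation for separating irregular forms from the inflectional paradigm.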

Conclusion: word classes differ in the linguistic content they must specify and therefore require different treatment when they are encoded in a computer database.