Lexeme

Validate against: http://json-schema.org/schema#

Schema ID: http://cdn.digitallinguistics.io/schemas/Lexeme-1.0.0.json

Description

A lexeme is the set of all forms that share the same meaning. DLx uses the term broadly to refer to any collection of related senses and forms, whether the item is an individual word, a morpheme, or even a fully inflected Phrase. In other words, DLx lexeme objects can describe anything that constitutes a lexical unit or construction. A lexeme will often have multiple senses or meanings, which are listed in the senses field. It is up to the linguist to decide whether two meanings are related, and therefore part of the same lexeme, or belong in different lexemes. A lexeme will also often have multiple variants. For example, the English lexeme run has two base forms: run and ran. The form run is listed as the headword or lemma, and ran is listed as a past tense variant. The variants field should not be used to list all the inflectional forms of a Lexeme.

Type: object

Required Properties

  • lemma
  • senses

Additional properties: true

Dependencies

  • If the property variantType is present, the following properties must also be present:

    • variantOf
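
The required properties and the dependency above can be checked with a few lines of Python. This is a minimal sketch that uses only the standard library and is not a full JSON Schema validator. The function name check_lexeme_shape and the sample objects are hypothetical and not part of the schema.

```python
def check_lexeme_shape(lexeme):
    """Return a list of problems with the top-level shape of a lexeme object."""
    problems = []
    # "lemma" and "senses" are the two required properties.
    for name in ("lemma", "senses"):
        if name not in lexeme:
            problems.append("missing required property: " + name)
    # Dependency: variantType may only appear together with variantOf.
    if "variantType" in lexeme and "variantOf" not in lexeme:
        problems.append("variantType requires variantOf")
    return problems


# The schema's default value satisfies both rules, so no problems are reported.
print(check_lexeme_shape({"lemma": {}, "senses": []}))

# Hypothetical entry that sets variantType without variantOf.
print(check_lexeme_shape({"lemma": {}, "senses": [], "variantType": {}}))
```

The check looks only at which properties are present. It does not inspect their values, which the referenced schemas (MultiLangString, LexemeReference) constrain separately.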

Properties

  • Allomorphs: "allomorphs"

    Description

    A list of allomorphs of this Lexeme.

    Type: array

    Min items: 1

    Unique items: true

    Items

    Allomorph

    Description

    An allomorph of this Lexeme.

    Type: object

    Required Properties

    • environments
    • transcription

    Additional properties: true

    Properties

    • Transcription: "transcription"

      Description

      A transcription of this allomorph, optionally in multiple orthographies. Do not include any leading or trailing tokens (e.g. hyphens, equal signs).

      Must be an instance of the MultiLangString schema.

    • Environments: "environments"

      Description

      A list of (morpho)phonological environments in which this allomorph occurs. May be an empty array.

      Type: array

      Unique items: true

      Items

      Environment

      Description

      A formalization of a (morpho)phonological environment, e.g. _k.

      Type: string
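
The constraints on allomorphs can be illustrated with a short Python sketch. It assumes that a MultiLangString is a JSON object keyed by language or orthography, which this document does not spell out. The key "eng", the sample transcriptions, and the function name check_allomorphs are hypothetical.

```python
def check_allomorphs(allomorphs):
    """Return a list of problems found in an "allomorphs" array."""
    problems = []
    # Min items: 1
    if len(allomorphs) < 1:
        problems.append("allomorphs needs at least one item")
    for index, allomorph in enumerate(allomorphs):
        # Both properties are required on every allomorph.
        for name in ("environments", "transcription"):
            if name not in allomorph:
                problems.append("allomorph %d: missing %s" % (index, name))
        environments = allomorph.get("environments", [])
        # Each environment is a string, and the list may not repeat one.
        if not all(isinstance(env, str) for env in environments):
            problems.append("allomorph %d: environments must be strings" % index)
        elif len(set(environments)) != len(environments):
            problems.append("allomorph %d: duplicate environment" % index)
    return problems


# Hypothetical data: the "eng" key and the transcriptions are illustrative only.
sample = [
    {"transcription": {"eng": "s"}, "environments": ["_k"]},
    {"transcription": {"eng": "z"}, "environments": []},  # an empty list is allowed
]
print(check_allomorphs(sample))
```

The sketch does not check that the allomorph objects themselves are unique, nor does it validate the transcription against the MultiLangString schema.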

  • Citation Form: "citationForm"

    Description

    The citation form of a lexeme is the form given when spoken in isolation, which may be different from its lemma form. For example, in English the citation form of a verb is typically the infinitive, e.g. to run, even though run is typically used as its lemma form. The citation form usually serves as the headword in a dictionary as well. It may be represented in multiple orthographies. Do not include leading or trailing tokens (e.g. hyphens, equal signs) in this field.

    Must be an instance of the MultiLangString schema.

  • Components: "components"

    Description

    A list of the morphemes or other lexical entries contained within the current form. For example, the form gentlemen in an English lexicon might have references to the form gentle, and the form men within the lexeme for man. Components may reference either an entire lexeme or a specific variant. Components do not have to be unique (useful when the same morpheme appears twice in a word).

    Type: array

    Min items: 1

    Unique items: false

    Items

    Referenced Lexeme

    Description

    The referenced component.

    Must be an instance of the LexemeReference schema.

  • Date Created: "dateCreated"

    Description

    The date and optionally time that this lexeme was created, in internet date-time format.

    Must be an instance of the DateCreated schema.

  • Date Modified: "dateModified"

    Description

    The date and optionally time that this lexeme was last modified, in internet date-time format.

    Must be an instance of the DateModified schema.

  • Examples: "examples"

    Description

    A collection of examples illustrating this lexeme in use. Each example is a phrase from a Text.

    Type: array

    Unique items: true

    Items

    Example Phrase

    Must be an instance of the Phrase schema.

  • Features: "features"

    Description

    A set of inflectional features for this morpheme (used primarily with grammatical morphemes). Each property should be the name of a feature type (e.g. case, person, number, gender, nounClass, etc.), and its value should be the value for that feature, as a string (e.g. nominative, 1, singular, masculine, etc.). Features may be written more than once, in a different Language. For example, a morpheme may have the feature "case": "accusative" (English) as well as "caso": "acusativo" (Spanish).

    Validates Against (allOf)

    This schema must validate against all of the following schemas:

    • Tags

      Must be an instance of the schema.
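
As an illustration, the feature from the description can be written as a plain object with one entry per language. The sketch below checks only what the description states, namely that feature names and values are strings. It does not implement the Tags schema, and the helper name features_are_strings is hypothetical.

```python
import json

# The feature from the description, written once in English and once in Spanish.
features = {"case": "accusative", "caso": "acusativo"}


def features_are_strings(features):
    """True when every feature name and every feature value is a string."""
    return all(
        isinstance(name, str) and isinstance(value, str)
        for name, value in features.items()
    )


print(features_are_strings(features))
print(json.dumps({"features": features}))
```

Note that a value such as first person is written as the string "1", not the number 1, so {"person": 1} would fail this check.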

  • Included In: "includedIn"

    Description

    A list of references to lexemes or variants that this item is included in. For example, the lexeme ‑s (English plural for nouns) would have a reference to the lexeme pants, among others.

    Type: array

    Unique items: true

    Items

    Lexicon Reference

    Must be an instance of the LexemeReference schema.

  • Lexeme Key: "key"

    Description

    A human-readable key that uniquely identifies this lexeme or variant within its Lexicon. Best practice is for the key to consist of the lemma form of the word in the default orthography, and if the word is a homonym, the homonym number. However, any value is acceptable as long as it is unique within the Lexicon. (Keys do not need to be unique across lexicons.)

    Type: string

    Regular expression pattern: ^[^ ]+$
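
The pattern above accepts any non-empty string that contains no space character. A minimal Python sketch of the same test follows. The helper name is_valid_key and the sample keys are hypothetical, and uniqueness within the Lexicon is not checked here.

```python
import re

# The pattern from the schema: one or more characters, none of them a space.
KEY_PATTERN = re.compile(r"^[^ ]+$")


def is_valid_key(key):
    """True when key is a string that matches the schema's key pattern."""
    return isinstance(key, str) and KEY_PATTERN.search(key) is not None


for candidate in ("run", "run2", "to run", ""):
    print(repr(candidate), is_valid_key(candidate))
```

The pattern excludes only the space character (U+0020). Other whitespace, such as a tab, is not rejected by the pattern itself.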

  • Lemma: "lemma"

    Description

    A lemma is the particular form conventionally used to represent a particular Lexeme. It may differ drastically from the citation form or headword form. For example, the form be is typically used as the lemma form of the English verb to be, with its variants am, are, is, etc. Lemmas may be represented in multiple orthographies. Do not include any leading or trailing tokens (e.g. hyphens, equal signs).

    Must be an instance of the MultiLangString schema.

  • Lexical Relations: "lexicalRelations"

    Description

    A list of the lexical relations that this lexeme has to other lexemes or variants.

    Type: array

    Unique items: true

    Items

    Lexical Relation (Lexeme Reference)

    Description

    A lexical relation between two lexemes or senses. A LexicalRelation object is just a LexemeReference object, but with the relation property required.

    Must be an instance of the LexemeReference schema.

  • Literal Meaning: "literalMeaning"

    Description

    The literal meaning of the lexeme, optionally in multiple orthographies.

    Must be an instance of the MultiLangString schema.

  • Morpheme Type: "morphemeType"

    Description

    The type of morpheme or complex construction that this lexeme is. Examples: root, stem, bipartite stem, enclitic, prefix, inflected word, phrase, circumfix.

    Must be an instance of the MultiLangString schema.

  • Notes: "notes"

    Description

    A collection of notes about this Lexeme.

    Type: array

    Unique items: true

    Items

    Note

    Description

    A note about this lexeme, optionally in multiple orthographies.

    Must be an instance of the Note schema.

  • Bibliographic References: "references"

    Description

    A collection of bibliographic references relating to this lexeme or variant. For example, a particular lexeme may have been discussed in detail in a published article.

    Type: array

    Unique items: true

    Items

    Reference

    Description

    A bibliographic Reference.

    Must be an instance of the Reference schema.

  • Sources: "sources"

    Description

    A list of attested sources for this lexeme or variant.

    Type: array

    Unique items: true

    Items

    Source

    Description

    An attested source for this lexeme or variant. This will often be the initials of a speaker, but could also be the abbreviation of the story the lexeme was found in, or other types of sources.

    Type: string

  • Syllable Structure: "syllableStructure"

    Description

    An abstract representation of the syllable structure of this form, e.g. CVC.

    Type: string

  • Tags: "tags"

    Description

    A collection of tags for this lexeme or variant.

    Must be an instance of the Tags schema.

  • Tone: "tone"

    Description

    An abstract representation of the tonal pattern of this lexeme or variant. Examples: HLH, 323, etc.

    Type: string

  • URL: "url"

    Description

    The URL where this lexeme or variant is located.

    Must be an instance of the Url schema.

  • Variant Of: "variantOf"

    Description

    When this lexeme is a variant of another lexeme, this field should contain a reference to the other Lexeme. Lexemes may only be variants of one other Lexeme.

    Must be an instance of the LexemeReference schema.

  • Variants: "variants"

    Description

    A list of variants of this Lexeme.

    Type: array

    Unique items: true

    Items

    Referenced Variant (Lexeme Reference)

    Description

    A reference to the variant of this lexeme. A Variant is simply a Lexeme Reference object, but with the variantType required.

    Must be an instance of the LexemeReference schema.

  • Variant Type: "variantType"

    Description

    If this lexeme is a variant of another lexeme or sense, this field can be used to specify the type of variant. Possible values might be a person's name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc.

    Must be an instance of the MultiLangString schema.

  • Senses: "senses"

    Description

    A collection of senses for this Lexeme.

    Type: array

    Unique items: true

    Items

    Sense

    Description

    One of the meanings for this Lexeme. For example, the lexeme run might have two senses: one with a definition of 'run in a race', and the other with a definition 'run water in a sink'.

    Type: object

    Required Properties

    • gloss

    Additional properties: true

    Properties

    • Argument Structure: "argumentStructure"

      Description

      An abstract representation of the argument structure for this sense.

      Type: string

    • Lexical Category (Part of Speech, Morphosyntactic Class, etc.): "category"

      Description

      The lexical category, part of speech, or morphosyntactic class for this Lexeme. If the current lexeme is an affix or other grammatical morpheme, this field should be used to describe the category that the morpheme attaches to. For example, the English verb suffix ‑s would have this property set to verb, and the English derivational suffix ‑ize would have this property set to noun.

      Must be an instance of the MultiLangString schema.

    • Definition: "definition"

      Description

      The definition for this particular sense, optionally in multiple languages.

      Must be an instance of the MultiLangString schema.

    • Derived Category: "derives"

      Description

      If this lexeme is a derivational morpheme, this field indicates the type of lexical category, part of speech, or morphosyntactic class that is derived when this morpheme is applied. For example, the English derivational suffix ‑er would have this property set to noun.

      Must be an instance of the MultiLangString schema.

    • Examples: "examples"

      Description

      A collection of examples of this sense in use. Each example is a Phrase.

      Type: array

      Unique items: true

      Items

      Example Phrase

      Must be an instance of the Phrase schema.

    • Gloss: "gloss"

      Description

      A Leipzig-style gloss for this sense.

      Must be an instance of the MultiLangString schema.

    • Inflectional Class: "inflectionClass"

      Description

      If this lexeme is a root or stem, this field indicates the inflectional class that the sense takes. If this lexeme is an inflectional morpheme, this field indicates the inflectional class that the morpheme belongs to. If this lexeme is a derivational morpheme, this field indicates the inflectional class of the derived form.

      Must be an instance of the MultiLangString schema.

    • Lexical Relations: "lexicalRelations"

      Description

      A collection of lexical relations between this sense and other senses in this lexicon or other lexicons.

      Type: array

      Unique items: true

      Items

      Lexical Relation (Lexeme Reference)

      Description

      A lexical relation between two lexemes or senses. A LexicalRelation object is just a LexemeReference object, but with the relation property required.

      Must be an instance of the LexemeReference schema.

    • Notes: "notes"

      Description

      A collection of notes about this sense.

      Type: array

      Unique items: true

      Items

      Note

      Must be an instance of the Note schema.

    • Bibliographic References: "references"

      Description

      A collection of bibliographic references about this particular sense.

      Type: array

      Unique items: true

      Items

      Reference

      Must be an instance of the Reference schema.

    • Scientific Name: "scientificName"

      Description

      The scientific name for this item.

      Type: string

    • Sources: "sources"

      Description

      A list of attested sources for this sense.

      Type: array

      Unique items: true

      Items

      Source

      Description

      An attested source for this sense. This will often be the initials of a speaker, but could also be the abbreviation of the story the lexeme was found in, or other types of sources.

      Type: string

    • Tags: "tags"

      Description

      A collection of tags for this sense.

      Must be an instance of the Tags schema.

    • Usages: "usages"

      Description

      A list of the appropriate usages for this sense. Examples include formal, medicinal, informal, etc.

      Type: array

      Unique items: true

      Items

      Usage

      Must be an instance of the MultiLangString schema.

    • Variant Of: "variantOf"

      Description

      If this sense is a variant of another sense, a reference to the other sense should go here. For example, sometimes two speakers may use the same word with a slightly different set of senses. In American English, for instance, Coke is a specific brand of soda for most speakers, but a generic term for soda for other speakers. The generic sense would therefore be listed as a dialectal variant of the specific sense.

      Must be an instance of the LexemeReference schema.

    • Variants: "variants"

      Description

      A list of variants of this sense.

      Type: array

      Unique items: true

      Items

      Referenced Variant (Lexeme Reference)

      Description

      A reference to the variant of this sense. A Variant is simply a Lexeme Reference object, but with the variantType required.

      Must be an instance of the LexemeReference schema.

    • Variant Type: "variantType"

      Description

      If this sense is a variant of another sense, this field can be used to specify the type of variant. Possible values might be a person's name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc.

      Must be an instance of the MultiLangString schema.

Default Value

{
  "lemma": {},
  "senses": []
}
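
Starting from the default value, a lexeme can be filled in step by step. The sketch below builds the run example from the Sense description. It assumes that a MultiLangString is an object keyed by language or orthography. The default value shows only that lemma is an object, so the key "eng", the gloss values, and the helper name senses_missing_gloss are hypothetical.

```python
import json

# Start from the schema's default value.
lexeme = {"lemma": {}, "senses": []}

# Assumption: "eng" is a hypothetical language/orthography key.
lexeme["lemma"]["eng"] = "run"
lexeme["key"] = "run"  # contains no spaces, as the key pattern requires

# Two senses taken from the Sense description; the gloss values are illustrative.
lexeme["senses"].append(
    {"gloss": {"eng": "run"}, "definition": {"eng": "run in a race"}}
)
lexeme["senses"].append(
    {"gloss": {"eng": "run"}, "definition": {"eng": "run water in a sink"}}
)


def senses_missing_gloss(lexeme):
    """Return the indexes of senses that lack the required gloss property."""
    return [i for i, sense in enumerate(lexeme["senses"]) if "gloss" not in sense]


print(json.dumps(lexeme, indent=2))
print(senses_missing_gloss(lexeme))
```

The default value itself has an empty senses array, which is allowed because the senses field sets no minimum number of items.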