Lexeme

Validate against: http://json-schema.org/schema#

Description

A lexeme is an abstract entity that represents all the various forms of a word. In DLx, a lexeme refers broadly to any bundle of related senses and forms, whether the item is an individual word, morpheme, idiom, etc.; that is, anything that constitutes a semantic unit. Examples of lexemes in English might include be, run up, and ‑ing. A lexeme will typically have multiple senses or meanings, and those are listed in the senses property. It is up to the linguist to decide when two meanings are related, and therefore belong to the same lexeme, or when they belong to different lexemes.

A lexeme often also has multiple base forms, such as suppletive forms, irregular forms, or morphologically-conditioned forms. For example, the lexeme be has the base forms be, am, is, etc. These are listed in the forms field. The forms field should not be used to list all the regularly-inflected forms of a word. Individual base forms may have phonologically-conditioned allomorphs, and these are listed in the allomorphs field of the form.

The lexeme and its forms and senses may also have variations, such as dialectal and idiolectal variants, rapid speech variants, register-based variants, variations in meaning, or even spelling variants. These are listed in the variants fields.

By convention, one of the forms of a lexeme is typically chosen as a representative headword or lemma, and this is indicated by the lemma field. For example, the form man is typically used as the lemma/headword for the lexeme that includes the forms man and men. Note that the DLx Lexeme does not represent a lexical entry in a dictionary. Dictionaries typically list each base form of a lexeme as a separate lexical entry. The DLx Lexeme instead puts each of these lexical entries together in the forms field.

Type: object

Required Properties

  • forms
  • lemma
  • senses

Additional properties: true
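
As a rough illustration of the required properties listed above, the following Python sketch reports which of them are missing from a lexeme stored as a plain dict. The function name missing_required and the placeholder values in the sample lexeme are hypothetical; the actual shapes of Transcription, LexemeForm, and Sense objects are defined by their own schemas and are not reproduced here.

```python
# Required property names, as listed in this schema.
REQUIRED = ("forms", "lemma", "senses")


def missing_required(lexeme):
    """Return the required property names that are absent from a lexeme dict."""
    return [name for name in REQUIRED if name not in lexeme]


# Hypothetical sample: the nested values are placeholders only, not valid
# Transcription, LexemeForm, or Sense objects.
sample = {
    "type": "Lexeme",
    "lemma": {},
    "forms": [{}],
    "senses": [{}],
    # Extra properties are tolerated because additional properties are allowed.
    "customField": "project-specific data",
}

print(missing_required(sample))         # []
print(missing_required({"lemma": {}}))  # ['forms', 'senses']
```

This only checks for presence; it does not validate the contents of each property.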

Dependencies

  • If the property variantType is present, the following properties must also be present:

    • variantOf
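
The dependency above can be sketched as a small Python check. This is a minimal illustration rather than a full validator; the function name dependency_errors is hypothetical, and the empty dicts stand in for MultiLangString and DatabaseReference values whose shapes are defined elsewhere.

```python
def dependency_errors(lexeme):
    """Return messages for property dependencies that the lexeme violates."""
    errors = []
    # variantType only makes sense when the lexeme points at the lexeme
    # it is a variant of, so variantOf must accompany it.
    if "variantType" in lexeme and "variantOf" not in lexeme:
        errors.append("variantType is present, so variantOf is required")
    return errors


# Placeholder values ({}), used only to show presence or absence.
print(dependency_errors({"variantType": {}}))                   # one error
print(dependency_errors({"variantType": {}, "variantOf": {}}))  # []
print(dependency_errors({"variantOf": {}}))                     # []
```

Note that the dependency runs in one direction only: variantOf may appear without variantType.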

Properties

  • Type: "type"

    Description

    The type of object. Must be set to Lexeme.

    Type: string

  • ID: "id"

    Description

    A unique database identifier for this Lexeme

  • Lexeme Key: "key"

    Description

    A human-readable key that uniquely identifies this lexeme or variant within the language. Best practice is for the key to consist of a representation of the lemma form of the word without diacritics, and, if the word is a homonym, the homonym number. However, any value is acceptable as long as it is unique for the language. (Keys do not need to be unique across languages.)

    Type: string

    Regular expression pattern: ^[^ ]+$
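
One possible way to follow the best practice described above is sketched below in Python: diacritics are stripped from the lemma, an optional homonym number is appended, and the result is checked against the pattern. The function names make_key and is_valid_key are hypothetical, and replacing spaces with hyphens is an assumption made here so that multi-word lemmas satisfy the pattern; the schema itself does not prescribe how keys are built.

```python
import re
import unicodedata

# Pattern given in this schema: one or more characters, none of them a space.
KEY_PATTERN = re.compile(r"^[^ ]+$")


def make_key(lemma, homonym_number=None):
    """Build a candidate key from a lemma string (illustrative only)."""
    # Decompose characters so that diacritics become separate combining marks.
    decomposed = unicodedata.normalize("NFD", lemma)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: spaces are replaced with hyphens to satisfy the pattern.
    key = stripped.replace(" ", "-")
    if homonym_number is not None:
        key = f"{key}{homonym_number}"
    return key


def is_valid_key(key):
    """Check a key against the schema's regular expression pattern."""
    return KEY_PATTERN.search(key) is not None


print(make_key("café"))        # cafe
print(make_key("run up"))      # run-up
print(make_key("bank", 2))     # bank2
print(is_valid_key("run up"))  # False
```

Uniqueness within the language still has to be checked against the rest of the database.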

  • Citation Form: "citationForm"

    Description

    The citation form of a lexeme is the form given when spoken in isolation, which may be different from its lemma form. For example, in Swahili the citation form of a verb is typically the infinitive, e.g. kuandika 'to write', even though ‑andika is typically used as its lemma form. It may be represented in multiple orthographies. Do not include leading or trailing tokens (e.g. hyphens, equal signs) in this field.

    Must be an instance of the Transcription schema.

  • Date Created: "dateCreated"

    Description

    The date and optionally time that this lexeme was created

    Type: string

    Validates Against (oneOf)

    This schema must validate against one and only one of the following schemas:

    • Format: date

    • Format: date-time

  • Date Modified: "dateModified"

    Description

    The date and optionally time that this lexeme was last modified

    Type: string

    Validates Against (oneOf)

    This schema must validate against one and only one of the following schemas:

    • Format: date

    • Format: date-time
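
The oneOf rule used by dateCreated and dateModified can be approximated as follows. This Python sketch assumes that the date and date-time formats follow the RFC 3339 full-date and date-time productions, and that the validator in use actually enforces formats (some validators treat format as an annotation only, in which case both branches would accept any string). The function name classify_timestamp is hypothetical.

```python
import re
from datetime import date

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
DATE_TIME_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?([Zz]|[+-]\d{2}:\d{2})"
)


def _is_calendar_date(text):
    """True if text (YYYY-MM-DD) names a real calendar date."""
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def classify_timestamp(value):
    """Return "date", "date-time", or None if the value matches neither."""
    if not isinstance(value, str):
        return None
    if DATE_RE.fullmatch(value) and _is_calendar_date(value):
        return "date"
    match = DATE_TIME_RE.fullmatch(value)
    if match and _is_calendar_date(match.group(1)):
        hour, minute, second = (int(match.group(i)) for i in (2, 3, 4))
        # A seconds value of 60 is tolerated to allow for leap seconds.
        if hour < 24 and minute < 60 and second <= 60:
            return "date-time"
    return None


print(classify_timestamp("2021-03-04"))            # date
print(classify_timestamp("2021-03-04T10:20:30Z"))  # date-time
print(classify_timestamp("2021-02-30"))            # None
```

Because the two patterns cannot both match the same string, a value that is classified at all satisfies exactly one branch, as oneOf requires.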

  • Examples: "examples"

    Description

    A collection of examples illustrating this lexeme in use. Each example is an Utterance from a Text. The Utterance number should be indicated in the index field of the Database Reference object. If using a full Utterance object rather than a Database Reference object, the key field should be included. For precision's sake, it is recommended that examples be given for individual senses and forms rather than the entire lexeme when possible.

    Type: array

    Unique items: true

    Items

    Example Utterance (Database Reference)

    Description

    A database reference to an Utterance object

    Must be an instance of the DatabaseReference schema.

  • Features: "features"

    Description

    A set of inflectional features for this lexeme (used primarily with grammatical morphemes). Each property should be the name of a feature type (e.g. case, person, number, gender, nounClass, etc.), and its value should be the value for that feature, as a string (e.g. nominative, 1, singular, masculine, etc.). Features may be written more than once, in different languages. For example, a morpheme may have the feature case: accusative (English) as well as caso: acusativo (Spanish).

    Type: object

    Validates Against (allOf)

    This schema must validate against all of the following schemas:

    • Tags

      Description

      The Features object must be a Tags object

      Must be an instance of the Tags schema.
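
A features object along the lines described above might look like the following Python dict, with the same feature written in English and in Spanish. The helper non_string_features is hypothetical and checks only the recommendation that feature values be strings; it does not reproduce whatever the Tags schema itself allows.

```python
def non_string_features(features):
    """Return the names of features whose values are not strings."""
    return sorted(
        name for name, value in features.items() if not isinstance(value, str)
    )


# The same feature given in two languages, plus a person feature.
features = {
    "case": "accusative",
    "caso": "acusativo",
    "person": "1",  # a string, not the integer 1
}

print(non_string_features(features))                               # []
print(non_string_features({"person": 1, "number": "singular"}))    # ['person']
```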

  • Lexeme Base Forms: "forms"

    Description

    A collection of base forms for this lexeme, i.e. the different forms that this lexeme or morpheme may take, exclusive of its regular inflectional variants. Each base form typically corresponds to a lexical entry in a dictionary. For example: the lexeme man would include the forms man and men; the lexeme run would include the forms run and ran, but not runs or running, because these are regularly-inflected and therefore predictable; the lexeme be would include am, are, is, etc., because these are irregular / suppletive forms, but would not include being.

    Type: array

    Unique items: true

    Items

    Lexeme Base Form

    Description

    One of the base forms of this lexeme

    Must be an instance of the LexemeForm schema.

  • Language (DatabaseReference): "language"

    Description

    The language of this Lexeme. This property is most useful when working with lexical data from multiple languages.

    Must be an instance of the DatabaseReference schema.

  • Lemma: "lemma"

    Description

    A lemma is the form of a lexeme conventionally used to represent that Lexeme. It may differ drastically from the citation form. For example, the form be is typically used as the lemma form of the English verb forms am, are, is, etc. Lemmas may be represented in multiple orthographies. Do not include any leading or trailing tokens (e.g. hyphens, equal signs).

    Must be an instance of the Transcription schema.

  • Lexical Relations: "lexicalRelations"

    Description

    A list of the lexical relations that this lexeme has to other lexemes. Each item is a Database Reference, and must also have a property called relation, indicating the type of lexical relation. For precision's sake, lexical relations should be specified for individual senses rather than the entire lexeme whenever possible.

    Type: array

    Unique items: true

    Items

    Validates Against (allOf)

    This schema must validate against all of the following schemas:

    • Lexeme (Database Reference)

      Description

      A database reference representing a lexical relation between two lexemes or senses. Note: The database reference must also have a relation property specified, indicating the type of lexical relation.

      Must be an instance of the DatabaseReference schema.

    • Required Properties

      • relation

      Properties

      • Relation Type

        Description

        The type of lexical relation that holds between the current item and the referenced Lexeme. Can also be used for general cross-references (a compare relation) or historical relationships (a derivedFrom or originOf relation). Examples: antonym, synonym, cognate, derivedFrom, originOf, compare, partOf, hypernymOf, hyponymOf.

        Must be an instance of the schema.
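
The requirement that each lexical relation carry a relation property can be sketched as below. This is an illustration under assumptions: the id field used to identify the referenced lexeme is hypothetical (the DatabaseReference schema defines the real fields), and relation is shown as a plain string even though its schema is not named in this document.

```python
def relations_missing_type(lexical_relations):
    """Return the positions of items that lack the required relation property."""
    return [
        index
        for index, item in enumerate(lexical_relations)
        if "relation" not in item
    ]


relations = [
    {"id": "lexeme-1", "relation": "antonym"},  # "id" is a hypothetical field
    {"id": "lexeme-2"},                         # invalid: no relation given
    {"id": "lexeme-3", "relation": "derivedFrom"},
]

print(relations_missing_type(relations))  # [1]
```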

  • Link: "link"

    Description

    A URL where a presentational format for this resource may be viewed

    Type: string

    Format: uri

  • Literal Meaning: "literalMeaning"

    Description

    The literal meaning of the lexeme, optionally in multiple languages

    Must be an instance of the MultiLangString schema.

  • Media: "media"

    Description

    Media items associated with this lexeme, such as recordings of the citation form of the word, pictures of the item this word refers to, or videos of the action being performed. If a media item pertains to a specific sense or form, it should be placed in that sense or form's media field instead.

    Type: array

    Unique items: true

    Items

    Media Item (Database Reference)

    Description

    A database reference to a media item associated with this lexeme

    Must be an instance of the DatabaseReference schema.

  • Notes: "notes"

    Description

    A collection of notes about this lexeme. Each Note object must have its noteType property specified. Notes with a note type of private are not intended for publication in dictionaries, while other types of notes are. For precision's sake, it is recommended that notes be attached to specific forms or senses whenever possible.

    Type: array

    Unique items: true

    Items

    Validates Against (allOf)

    This schema must validate against all of the following schemas:

    • Note

      Description

      A note about this lexeme

      Must be an instance of the Note schema.

    • Required Properties

      • noteType

      Properties

      • Note Type

        Description

        The type of note about this lexeme

        Type: string

        Allowed Values (enum)

        • private
        • general
        • anthropology
        • discourse
        • encyclopedic
        • grammar
        • phonology
        • semantics
        • sociocultural
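
The note type rules above can be sketched in Python as follows: one helper flags notes whose noteType is missing or outside the allowed values, and another keeps only the notes intended for publication. Both function names are hypothetical, as is the text field used in the sample notes; the Note schema defines the real structure.

```python
# Allowed values for noteType, as enumerated in this schema.
NOTE_TYPES = {
    "private", "general", "anthropology", "discourse", "encyclopedic",
    "grammar", "phonology", "semantics", "sociocultural",
}


def invalid_note_types(notes):
    """Return positions of notes with a missing or disallowed noteType."""
    return [
        index
        for index, note in enumerate(notes)
        if note.get("noteType") not in NOTE_TYPES
    ]


def publishable_notes(notes):
    """Keep notes that are not private, i.e. those intended for publication."""
    return [note for note in notes if note.get("noteType") != "private"]


notes = [
    {"noteType": "grammar", "text": "Takes an irregular plural."},  # "text" is hypothetical
    {"noteType": "private", "text": "Recheck with the speaker."},
    {"noteType": "etymology", "text": "Not an allowed type."},
]

print(invalid_note_types(notes))       # [2]
print(len(publishable_notes(notes)))   # 2
```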

  • Bibliographic References: "references"

    Description

    A collection of bibliographic references relating to this lexeme. For example, a particular lexeme may have been discussed in detail in a published article.

    Type: array

    Unique items: true

    Items

    Reference

    Description

    A bibliographic Reference about this lexeme

    Must be an instance of the BibliographicReference schema.

  • Senses: "senses"

    Description

    A collection of meanings or senses for this lexeme. It is up to the linguist to decide whether two uses of a lexeme are distinct enough to be considered separate senses.

    Type: array

    Min items: 1

    Unique items: true

    Items

    Sense

    Description

    A sense or meaning for this lexeme.

    Must be an instance of the Sense schema.

  • Sources: "sources"

    Description

    A list of attested sources for this lexeme, such as a citation to a published text where it appears, the key of an Utterance in the database, or the initials of the speaker who provided it. For precision's sake, sources should be listed for specific senses or forms of a lexeme whenever possible.

    Type: array

    Unique items: true

    Items

    Source

    Description

    An attested source for this lexeme. This will often be the initials of a speaker, but could also be the abbreviation of the story the lexeme was found in, or a citation to a published text in which the lexeme appears.

    Type: string

    Min length: 1
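
The constraints on sources (an array of unique, non-empty strings) can be checked with a sketch like the one below. The function name source_problems and the sample values are hypothetical.

```python
def source_problems(sources):
    """Return a description of each item that breaks the sources constraints."""
    problems = []
    seen = set()
    for index, source in enumerate(sources):
        if not isinstance(source, str) or len(source) < 1:
            problems.append(f"item {index}: must be a non-empty string")
            continue
        if source in seen:
            problems.append(f"item {index}: duplicates an earlier source")
        seen.add(source)
    return problems


# Hypothetical sample: speaker initials, a repeated entry, and an empty string.
print(source_problems(["AB", "Story 3", "AB", ""]))
# ['item 2: duplicates an earlier source', 'item 3: must be a non-empty string']
```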

  • Tags: "tags"

    Description

    A set of tags for this lexeme

    Must be an instance of the Tags schema.

  • URL: "url"

    Description

    A URL where a JSON representation of this lexeme may be retrieved

    Type: string

    Format: uri

  • Variant Of: "variantOf"

    Description

    If this lexeme is a variant of another lexeme, this field should contain a reference to the other Lexeme. Lexemes may only be variants of one other Lexeme.

    Must be an instance of the DatabaseReference schema.

  • Variants: "variants"

    Description

    A list of variants of this lexeme. This field should be used for dialectal and idiolectal variants, rapid and careful speech variants, register-based variants, variations in meaning, spelling variants, etc. It should not be used for phonologically-conditioned variants (use the allomorphs field of a specific form instead) or morphologically-conditioned variants (use the forms field instead). Each variant should have its variantType property specified.

    Type: array

    Unique items: true

    Items

    Validates Against (allOf)

    This schema must validate against all of the following schemas:

    • Variant (Database Reference)

      Description

      A database reference to a variant of this lexeme. Note: The Database Reference object must have a variantType property, indicating the type of variant.

      Must be an instance of the DatabaseReference schema.

    • Required Properties

      • variantType

      Properties

      • Variant Type

        Description

        This field is used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. May be in multiple languages.

        Must be an instance of the MultiLangString schema.

  • Variant Type: "variantType"

    Description

    If this lexeme is a variant of another lexeme or sense, this field can be used to specify the type of variant. Possible values might be a person’s name (representing an idiolectal variant), or simply idiolectal, or dialectal (or the name of the dialect), or rapid speech, etc. Optionally in multiple languages.

    Must be an instance of the MultiLangString schema.