Phrase

Validate against: http://json-schema.org/schema#

Schema ID: http://cdn.digitallinguistics.io/schemas/Phrase-1.0.0.json

Description

The term phrase is intentionally ambiguous, and refers to any unit of a text above the word level. The DLx framework imposes no requirements regarding the size of this unit or how the text should be segmented into units. The user may choose to segment a text based on intonation units, turns, sentences, or any other appropriate subdivision. A DLx phrase consists minimally of a transcription, a translation, and an array of words (though the words array may be empty).

Type: object

Required Properties

  • transcription
  • translation
  • words

Additional properties: true
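
As an illustration, the required properties can be checked with a few lines of Python. This is a minimal sketch, not the DLx validator: the helper name missing_required is hypothetical, and the shape used for the transcription and translation values (objects keyed by a language or orthography code) is an assumption, since the MultiLangString schema is not reproduced here.

```python
REQUIRED = ("transcription", "translation", "words")


def missing_required(phrase: dict) -> list[str]:
    """Return the required Phrase properties that are absent from the object."""
    return [name for name in REQUIRED if name not in phrase]


# Hypothetical minimal phrase. The words array may be empty, and extra
# properties are allowed because "Additional properties" is true.
phrase = {
    "transcription": {"spa": "hola"},  # assumed MultiLangString shape
    "translation": {"eng": "hello"},   # assumed MultiLangString shape
    "words": [],
    "custom": "extra properties are permitted",
}

print(missing_required(phrase))          # []
print(missing_required({"words": []}))   # ['transcription', 'translation']
```

The check only tests for presence; it does not validate the values against their own schemas.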

Dependencies

  • If the property startTime is present, the following properties must also be present:

    • endTime
  • If the property endTime is present, the following properties must also be present:

    • startTime

Properties

  • End Time: "endTime"

    Description

    The time at which the speaker finishes producing this phrase, measured within the media file(s) associated with this Text. The timestamp should be given as a number in SS.MMM format (seconds and milliseconds).

    Type: number

  • Key: "key"

    Description

    A key which uniquely identifies this phrase within the Text. The key for a phrase consists of the abbreviation of the text, a period, and then the number of this phrase within the text (index starts at 1). For example, the third phrase of a text with the abbreviation A would be A.3. Keys should be unique within a corpus.

    Type: string

    Regular expression pattern: ^[(a-z)|(A-Z)|(0-9)]+\.[0-9]{1,3}$
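
The pattern can be tried out in Python as a rough check of which keys it accepts. This is a sketch under assumptions: the helper name is_valid_key is hypothetical, and Python's re module is used as an approximation of the regular expression dialect a JSON Schema validator would apply. The pattern is copied verbatim from the line above.

```python
import re

# Pattern copied verbatim from the schema documentation.
KEY_PATTERN = re.compile(r"^[(a-z)|(A-Z)|(0-9)]+\.[0-9]{1,3}$")


def is_valid_key(key: str) -> bool:
    """Test a phrase key against the documented pattern (search semantics)."""
    return KEY_PATTERN.search(key) is not None


print(is_valid_key("A.3"))       # True
print(is_valid_key("A.1000"))    # False: at most three digits after the period
print(is_valid_key("(A|B).12"))  # True: see the note below
```

Because the parentheses and pipe characters sit inside square brackets, they are treated as literal characters rather than as grouping and alternation. The pattern as written therefore also accepts abbreviations containing (, ), and |, and it limits phrase numbers to three digits.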

  • Language: "language"

    Description

    The key for the language used in this phrase, e.g. spa or eng. If the text is labeled with a language, all its phrases are assumed to be the same language unless labeled otherwise. Likewise, if a phrase is given a language, all its words are assumed to be the same language unless the word is labeled otherwise.

    Must be an instance of the Abbreviation schema.
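
The inheritance rule described above can be sketched as a lookup that prefers the most specific label. This is an illustrative sketch only: the function name effective_language is hypothetical, and it assumes that texts and words carry their language label in a "language" property, as phrases do.

```python
from typing import Optional


def effective_language(
    text: dict, phrase: dict, word: Optional[dict] = None
) -> Optional[str]:
    """Return the most specific language label: word, then phrase, then text."""
    for item in (word, phrase, text):
        if item is not None and "language" in item:
            return item["language"]
    return None  # nothing is labeled at any level


text = {"language": "spa"}
print(effective_language(text, {}))                           # spa
print(effective_language(text, {"language": "eng"}))          # eng
print(effective_language(text, {"language": "eng"}, {"language": "fra"}))  # fra
```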

  • Notes: "notes"

    Description

    A collection of notes about this Phrase.

    Type: array

    Unique items: true

    Items

    Note

    Description

    A note about this phrase, optionally in multiple orthographies.

    Must be an instance of the Note schema.

  • Speaker: "speaker"

    Description

    The abbreviation of the person who produced (uttered, signed, spoke, or sang) this Phrase. The value of this field must match the abbreviation of one of the persons listed in the contributors array of the Text. If the Text has a single contributor with the role of speaker, that speaker is assumed to be the speaker for all phrases in the Text. If the Text includes multiple contributors with a speaker role, each phrase must have its speaker attribute specified.

    Must be an instance of the Abbreviation schema.
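
The speaker rules above can be sketched as a small resolution function. This is an illustration under stated assumptions, not part of the schema: the function name resolve_speaker is hypothetical, and each contributor is assumed to be an object with "abbreviation" and "role" properties, which this document does not define.

```python
from typing import Optional


def resolve_speaker(phrase: dict, contributors: list[dict]) -> Optional[str]:
    """Work out the speaker of a phrase from its own label or the Text's contributors."""
    known = {c["abbreviation"] for c in contributors}
    speakers = [c["abbreviation"] for c in contributors if c.get("role") == "speaker"]

    explicit = phrase.get("speaker")
    if explicit is not None:
        # An explicit label must match one of the listed contributors.
        if explicit not in known:
            raise ValueError(f"unknown speaker abbreviation: {explicit}")
        return explicit

    if len(speakers) == 1:
        return speakers[0]  # the single speaker applies to every phrase
    if len(speakers) > 1:
        raise ValueError("multiple speakers: each phrase must specify its speaker")
    return None  # no speaker information available


contributors = [
    {"abbreviation": "JD", "role": "speaker"},
    {"abbreviation": "MK", "role": "transcriber"},
]
print(resolve_speaker({}, contributors))  # JD
```

Returning None when the Text lists no speakers is a choice made for this sketch; the description above does not say how that case should be handled.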

  • Start Time: "startTime"

    Description

    The time at which the speaker begins producing this phrase, measured within the media file(s) associated with this Text. The timestamp should be given as a number in SS.MMM format (seconds and milliseconds).

    Type: number

  • Tags: "tags"

    Must be an instance of the Tags schema.

  • Transcription: "transcription"

    Description

    The transcriptions for this phrase, optionally in multiple orthographies.

    Must be an instance of the MultiLangString schema.

  • Translation: "translation"

    Description

    The translations for this phrase, optionally in multiple orthographies. Also includes an optional type attribute, for specifying things like free or literal translation.

    Validates Against (allOf)

    This schema must validate against all of the following schemas:

    • Must be an instance of the schema.

    • Properties

      • Translation Type

        Description

        The type of translation. Typical values are free or literal, but other values may be supplied.

        Type: string

  • URL: "url"

    Description

    The URL where this phrase can be retrieved in JSON format.

    Must be an instance of the Url schema.

  • Words: "words"

    Description

    A collection of the word tokens contained in this Phrase.

    Type: array

    Unique items: false

    Items

    Word

    Must be an instance of the Word schema.