Published March 27, 2025 | Version v1
Poster | Open Access

Reading, listening to and watching concordances of audiovisual interaction corpora

  • 1. Leibniz Institute for the German Language
  • 2. Musical Bits GmbH
  • 3. University of Duisburg-Essen

Description

KWIC (Key Word in Context) concordancers have become a standard feature of every corpus analysis platform. Our contribution is concerned with extensions of the standard KWIC that are useful or even necessary for working with audiovisual interaction corpora. We do this from the perspective of research paradigms such as conversation analysis, which strive to integrate or supplement their established qualitative micro-analysis methods with corpus linguistic approaches.

Audiovisual interaction corpora consist of audio and/or video recordings, their transcriptions (possibly with additional annotations), and metadata about participants and the interaction situation. As a first approximation, corpus queries can be carried out on the transcript text and visualised as a KWIC with basic elements such as highlighted hits, left and right context, and links to metadata. As an overview of current platforms providing access to interaction corpora (Frick/Schmidt, submitted) reveals, this is the minimum functionality that all platforms cater for in some form or other.
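The basic KWIC idea described above can be illustrated with a minimal sketch. This is purely illustrative and not the implementation used in EXAKT, DGD, or ZuMult; the function name and parameters are assumptions for the example:

```python
# Minimal sketch of a basic KWIC view over a tokenised transcript.
# Illustrative only -- not the actual EXAKT/DGD/ZuMult implementation.

def kwic(tokens, keyword, context=3):
    """Return (left, hit, right) triples for every match of `keyword`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - context):i])
            right = " ".join(tokens[i + 1:i + 1 + context])
            lines.append((left, token, right))
    return lines

transcript = "ja genau das habe ich ja auch gesagt".split()
for left, hit, right in kwic(transcript, "ja"):
    print(f"{left:>20} | {hit} | {right}")
```

In a real concordancer, each line would additionally link back to its position in the transcript and to the time-aligned recording, which is what the extensions discussed below build on.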

However, researchers working with audiovisual interaction data often require additional features, arising both from the nature of the data themselves and from the way they are typically analysed. Most importantly, it must be possible:

- to display the search result represented in a KWIC line in a larger transcript context. This feature is fundamental for understanding the conversational context.

- to play back the corresponding part of the audio or video recording underlying a line in the KWIC. This is vital for accessing information not represented in the transcript, e.g. prosodic cues that help interpret spoken utterances.

- to manually select or deselect individual lines of the KWIC. This is important because queries on interaction corpora are often initially over-inclusive (“fuzzy”) and consequently may contain false positives that need to be removed.

- to manually add annotations to KWICs in order to further categorise search results analytically.

- to access, in addition to metadata, supplementary materials (such as a diagram that speakers refer to) that may be crucial for interpreting an utterance.

 Last but not least, multimodal analyses of video data will profit from methods for integrating visual information (e.g. still images) into a KWIC in a compact manner, allowing researchers to get an idea of regularities and peculiarities of visible features of interaction in the same way that a “classical” KWIC enables the discovery of patterns in verbal behaviour. 
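The selection and annotation requirements in the list above can be sketched as a simple data model. The class and field names here are hypothetical, chosen only to illustrate how a KWIC line might carry selection state and manual analytic categories alongside a link to the recording:

```python
# Hypothetical sketch of KWIC lines carrying selection state and manual
# analytic annotations (names are illustrative, not a real platform API).

from dataclasses import dataclass, field

@dataclass
class KwicLine:
    left: str
    hit: str
    right: str
    media_offset: float          # seconds into the recording, for playback
    selected: bool = True        # deselect to drop false positives
    tags: dict = field(default_factory=dict)  # manual analytic categories

lines = [
    KwicLine("das habe ich", "ja", "auch gesagt", media_offset=12.4),
    KwicLine("na", "ja", "gut", media_offset=30.1),
]

lines[1].selected = False                 # remove a false positive
lines[0].tags["function"] = "agreement"   # categorise the hit analytically

kept = [line for line in lines if line.selected]
```

Keeping selection and annotation state on the concordance line itself, rather than in the underlying corpus, reflects the workflow described above: an initially over-inclusive query result is progressively refined without modifying the source data.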

Based on our experience in developing EXAKT[1] (EXMARaLDA Analysis and Concordance Tool), DGD[2] (Database of Spoken German), and ZuMult[3] (Framework for object-oriented architecture of spoken corpora), our poster will provide an overview of the desiderata for KWIC concordances for research on audiovisual interaction data. It will also facilitate discussions on future developments, such as collaborative work with KWIC results, and address challenges such as how to display a KWIC when a hit involves a multiword sequence realized by more than one speaker.


Files

Poster_Frick-Schmidt_02-2025-3.pdf
