The Canadian Vichy Intercepts is a digital humanities project that provides public access to a corpus of 13,848 diplomatic telegrams intercepted by the Examination Unit of Canada during World War II. The documents, held on microfilm reels T-17425 to T-17429 at Bibliothèque et Archives Canada, cover Vichy France communications (September 1941 – March 1945) and France libre communications (April 1943 – July 1945). The project develops and documents an AI-powered pipeline that uses large language model-based OCR (Mistral AI) to transcribe the corpus, structure it, and extract its metadata at scale. All outputs (transcriptions, structured datasets, processing code, and performance reports) are published as open data in accordance with the FAIR principles. The web interface runs on an Omeka Classic platform with IIIF integration and is available at https://omeka.uottawa.ca/examination-unit/. This community archives the project's datasets and code across successive pipeline versions.

Subjects