Published July 8, 2013 | Version v1
Conference paper Open

Classical art semantics information extraction

  • 1. University of South Wales, UK


Abstract. The paper discusses the application of Natural Language Processing (NLP) to the semantic annotation of classical art texts, using rule-based Information Extraction (IE) techniques combined with ontological and domain vocabulary input. CASIE (Classical Art Semantics Information Extraction) was a pilot collaborative project between the Hypermedia Research Unit (University of South Wales) and the Beazley Archive (Oxford University). It aimed to automatically extract information about cultural objects from scholarly texts on classical art and to represent this information in terms of the ISO metadata standard for cultural heritage, the International Council of Museums' CIDOC Conceptual Reference Model (CRM). In total, 12 documents (fascicules, which are high-quality catalogues) were processed. They originate from the Corpus Vasorum Antiquorum (CVA) collection, which contains over 350 such catalogues of mostly ancient Greek painted pottery, illustrating more than 100,000 vases. The extracted information was expressed as interoperable RDF graphs consistent with the CLAROS project format. The CIDOC CRM is central to enabling semantic interoperability across the range of datasets that contribute to CLAROS. The CASIE pilot enabled a complementary exploitation of terminological and ontological resources via rule-based information extraction techniques, delivering semantic annotation with respect to the CRM in the broader field of digital humanities.
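
The general idea summarised in the abstract (vocabulary-driven, rule-based extraction whose output is expressed as CRM-typed RDF) can be illustrated with a minimal sketch. This is not the CASIE pipeline: the vocabulary entries, the `example.org` namespace, the function names and the single matching rule are all hypothetical, and the CRM namespace and the class and property names (`E22_Man-Made_Object`, `P2_has_type`) are assumed from older CRM releases and should be checked against the version actually in use.

```python
import re

# Hypothetical domain vocabulary (gazetteer): surface term -> concept path.
SHAPE_VOCAB = {
    "amphora": "shape/amphora",
    "kylix": "shape/kylix",
    "lekythos": "shape/lekythos",
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CRM = "http://www.cidoc-crm.org/cidoc-crm/"   # assumed CRM namespace
EX = "http://example.org/casie-sketch/"       # placeholder namespace for instances

# One illustrative rule: a vocabulary term matched as a whole word, ignoring case.
SHAPE_RULE = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SHAPE_VOCAB) + r")\b",
    re.IGNORECASE,
)


def extract_triples(text, doc_id):
    """Return (subject, predicate, object) IRI triples for each matched shape term."""
    triples = []
    for i, match in enumerate(SHAPE_RULE.finditer(text), start=1):
        obj = f"{EX}{doc_id}/object/{i}"
        concept = EX + SHAPE_VOCAB[match.group(1).lower()]
        # Type the mention as a physical object and link it to its vocabulary concept.
        triples.append((obj, RDF_TYPE, CRM + "E22_Man-Made_Object"))
        triples.append((obj, CRM + "P2_has_type", concept))
    return triples


def to_ntriples(triples):
    """Serialise IRI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    sample = "Red-figure Kylix with komos scene; fragment of an amphora."
    print(to_ntriples(extract_triples(sample, "doc1")))
```

A real system would rely on much richer rules (context patterns, negation, linking of dimensions and findspots) and on a curated thesaurus rather than a three-entry dictionary; the sketch only shows how a vocabulary match can be turned into an ontology-typed statement.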


