
Impulse


Antoine Isaac

Europeana Foundation

Data modeling at Europeana and DM2E

Europeana has developed a data model of its own, the Europeana Data Model (EDM, http://pro.europeana.eu/edm-documentation). Its aim is to make it possible to harvest and disseminate metadata from libraries, archives and museums all over Europe.

The model is, however, much less monolithic than it sounds. The result of years of collaborative work in the cultural heritage community, EDM is not built from scratch. It incorporates data patterns from existing vocabularies (OAI-ORE, SKOS, CIDOC-CRM) and often directly re-uses their elements. In other words, EDM applies at the metadata vocabulary level the very principles it is supposed to enable at the data level, i.e. re-using and connecting data on the web. Indeed, EDM is designed to encourage adherence to Linked Data principles within the cultural community and, ultimately, to let Europeana plug seamlessly into the Linked Data paradigm.
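As an illustration of this re-use, the following minimal sketch describes a single object with elements drawn from several vocabularies at once. The item, aggregation and concept identifiers under example.org are hypothetical, and the record is far smaller than a real EDM description; only the namespace URIs and the class and property names (edm:ProvidedCHO, ore:Aggregation, edm:aggregatedCHO, skos:Concept, skos:prefLabel, dc:title, dc:subject) are taken from the vocabularies themselves.

```python
from collections import Counter

# Namespaces of the vocabularies combined in the record.
NS = {
    "edm": "http://www.europeana.eu/schemas/edm/",
    "ore": "http://www.openarchives.org/ore/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

# Hypothetical identifiers, for illustration only.
ITEM = "http://example.org/item/1"
AGGREGATION = "http://example.org/aggregation/1"
CONCEPT = "http://example.org/concept/maps"

# A tiny record as (subject, predicate, object) triples.
# Terms written as prefix:name are vocabulary elements.
TRIPLES = [
    (ITEM, "rdf:type", "edm:ProvidedCHO"),
    (ITEM, "dc:title", '"Map of Berlin"'),
    (ITEM, "dc:subject", CONCEPT),
    (AGGREGATION, "rdf:type", "ore:Aggregation"),
    (AGGREGATION, "edm:aggregatedCHO", ITEM),
    (CONCEPT, "rdf:type", "skos:Concept"),
    (CONCEPT, "skos:prefLabel", '"Maps"'),
]


def expand(term):
    """Expand a prefix:name term into a full URI."""
    prefix, local = term.split(":", 1)
    return NS[prefix] + local


def vocabularies_used(triples):
    """Count how many vocabulary elements each namespace contributes."""
    counts = Counter()
    for _, predicate, obj in triples:
        for term in (predicate, obj):
            prefix = term.split(":", 1)[0]
            if prefix in NS:  # skips full URIs and literals
                counts[prefix] += 1
    return counts


if __name__ == "__main__":
    print(expand("edm:ProvidedCHO"))
    print(dict(vocabularies_used(TRIPLES)))
```

In this sketch only two of the vocabulary elements come from the EDM namespace itself; the remaining ones are borrowed from RDF, Dublin Core, OAI-ORE and SKOS.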

EDM also shows flexibility in that Europeana partners can, and are encouraged to, create their own extensions: EDM is designed as a ‘roof’ spanning various communities and explicitly inviting their specialisations. Just as Europeana itself assembled its model, partners can and should extend it for their own purposes. For example, the DM2E project (http://dm2e.eu) is geared towards aggregating data and developing richer services for the digital humanities. It created its own extension to EDM so as to better serve its specific needs. The result is a framework where minimal standardization helps to represent the data fundamentals coherently across the board. As a consequence, DM2E may succeed in making Europeana one of the prime corpus sources for the Digital Humanities. The uptake of this approach by major initiatives such as Perseus (cf. Crane et al. 2012) is a clear indicator of this potential.
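One common way to build such an extension, sketched here under stated assumptions, is to declare specialised properties as sub-properties of more general ones, so that generic consumers can still read the data at the general level. The `ext:` properties below are hypothetical and are not taken from the actual DM2E model; the mapping stands in for rdfs:subPropertyOf statements.

```python
# Hypothetical extension: each specialised property points to the more
# general property it refines (the role of rdfs:subPropertyOf).
SUB_PROPERTY_OF = {
    "ext:scribe": "dc:contributor",
    "ext:mainScribe": "ext:scribe",
}


def super_properties(prop):
    """Return all more general properties of `prop`, nearest first."""
    found = []
    while prop in SUB_PROPERTY_OF and SUB_PROPERTY_OF[prop] not in found:
        prop = SUB_PROPERTY_OF[prop]
        found.append(prop)
    return found


def generalise(triples):
    """Keep the specialised triples and add their general counterparts."""
    result = list(triples)
    for subject, predicate, obj in triples:
        for general in super_properties(predicate):
            inferred = (subject, general, obj)
            if inferred not in result:
                result.append(inferred)
    return result


if __name__ == "__main__":
    # Hypothetical manuscript and person identifiers.
    data = [("ms:1", "ext:mainScribe", "person:7")]
    for triple in generalise(data):
        print(triple)
```

A service that only knows the general property can then still find the manuscript through `dc:contributor`, while a specialised service keeps the finer distinction.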

Yet other projects and (sub-)domains are given the freedom to devise their own vocabularies, sometimes even standards. The re-use of shared vocabularies is also expected to facilitate interoperability with other metadata frameworks, such as the schema.org initiative sponsored by the main search engines. Europeana can gain from other efforts that connect their data to schema.org or request extensions to it, e.g. for library-specific data. In the same way, Europeana partners will hopefully benefit from Europeana’s own interoperability with schema.org or other comparable initiatives, as long as they comply with the Linked Data paradigm.

Re-using and extending vocabularies is, however, still an art. Choosing vocabularies is not easy: community uptake, a crucial guiding element, is often not visible. Semantic redundancies across vocabularies remain extremely difficult to identify. The ability to share and compare data is crucial, which makes initiatives like Linked Open Data even more useful. As a consequence, proper vocabulary hosting, documentation and processes are needed to ensure that vocabularies are re-used, and means for aligning vocabularies become essential as well. W3C is setting the scene, as witnessed by the vocabularies recently released by the Government Linked Data Working Group (DCAT, ORG, ...) and by new initiatives to host and help develop vocabularies. However, projects with domain expertise are still expected to drive efforts, and there is room for standardization organizations to facilitate discussions in their communities. To help new partners create their own EDM extensions and refinements, Europeana itself is starting to gather existing efforts and documentation, a first step towards making useful best practices available.
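To make the notion of vocabulary alignment concrete, here is a minimal sketch that assumes alignments are recorded as pairs of equivalent terms, in the spirit of skos:exactMatch, which SKOS defines as symmetric and transitive. The three vocabularies and their terms are hypothetical, and producing such pairs in the first place remains the hard part.

```python
from collections import defaultdict

# Hypothetical alignment statements between three vocabularies,
# each pair read as "term A skos:exactMatch term B".
EXACT_MATCHES = [
    ("vocA:painting", "vocB:paintings"),
    ("vocB:paintings", "vocC:Painting"),
    ("vocA:map", "vocB:maps"),
]


def build_index(pairs):
    """Index the pairs in both directions (exactMatch is symmetric)."""
    index = defaultdict(set)
    for left, right in pairs:
        index[left].add(right)
        index[right].add(left)
    return index


def equivalents(term, index):
    """Follow matches transitively; return every term equivalent to `term`."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for neighbour in index.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    seen.discard(term)
    return seen


if __name__ == "__main__":
    index = build_index(EXACT_MATCHES)
    print(sorted(equivalents("vocA:painting", index)))
```

With such an index, a record indexed with a term from one vocabulary can also be retrieved through the equivalent terms of the others.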

Short biography

Antoine Isaac (Europeana Foundation) works as scientific coordinator for Europeana.eu. He has been researching and promoting the use of Semantic Web and Linked Data technology in culture since his PhD studies at Paris-Sorbonne and the Institut National de l'Audiovisuel. He has especially worked on the representation and interoperability of collections and their vocabularies. Besides his work on SKOS in the Semantic Web Deployment group, he has served in other related W3C efforts, for example on Library Linked Data or Open Annotation.