ALIADA aims to develop a usable tool implementing a novel approach to automatically publish, under the Linked Data paradigm, the contents hosted by library or collection management software.

Progress beyond the state-of-the-art

The main technologies addressed by ALIADA, and the corresponding progress beyond the state of the art, are:

Bibliographic description

ISBDs (International Standard Bibliographic Descriptions) focus exclusively on abstract features of publications and contain no prescription as to how the particular features of an individual copy should be documented. MARC formats were primarily developed to share information about publications. The five MARC 21 communication formats (MARC 21 Format for Bibliographic Data, MARC 21 Format for Authority Data, MARC 21 Format for Holdings Data, MARC 21 Format for Classification Data, and MARC 21 Format for Community Information) are widely used standards for the representation and exchange of bibliographic, authority, holdings, classification, and community information data in machine-readable form.

In the museum world the following standards coexist. CIDOC CRM is a reference ontology for the interchange of cultural heritage information and, as such, is an ISO standard (ISO 21127:2006), providing a high-level reference model for museums. CIDOC CRM is well placed to become an important information standard and reference model for Semantic Web initiatives, and serves as a guide for data or database modelling. The most recent implementation of the CIDOC CRM is the Erlangen CRM. SPECTRUM is a collection management standard; it can serve as the basis for the data format and the workflow management of Collection Management Systems. LIDO is an output format aimed at information interchange. Specified as an XML Schema, LIDO is the result of a joint effort of the CDWA Lite, museumdat, SPECTRUM and CIDOC CRM communities. With a suitable choice of terminology, LIDO maps to a CIDOC CRM compatible form. CDWA Lite and museumdat are two formats that are no longer developed.

In the museum world, LIDO is the obvious choice of input format for ALIADA, because it is the exchange format of the museum domain. For libraries, MARC 21 is the most widespread format, and for that reason ALIADA uses it as an input format, specifically through MARCXML, the XML Schema for MARC 21 records.
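
To make the MARCXML input format more concrete, the following minimal Python sketch reads a few fields from a single MARCXML record using only the standard library. It is an illustration and not ALIADA's own code: the sample record, the helper names (subfield, read_record) and the choice of fields (001 as identifier, 100$a as author, 245$a as title) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC 21 slim") documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, for illustration only.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">demo-0001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Cervantes Saavedra, Miguel de</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Don Quijote de la Mancha</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first subfield `code` in datafield `tag`, or None."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text if node is not None else None


def read_record(xml_text):
    """Extract identifier, author and title from one MARCXML <record> element."""
    record = ET.fromstring(xml_text)
    control = record.find("marc:controlfield[@tag='001']", MARC_NS)
    return {
        "id": control.text if control is not None else None,
        "author": subfield(record, "100", "a"),
        "title": subfield(record, "245", "a"),
    }


if __name__ == "__main__":
    print(read_record(SAMPLE_MARCXML))
```

A real catalogue export would typically wrap many records in a collection element and use far more fields; the sketch only shows the general shape of the data.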

Ontologies in the bibliographic and museum world

In recent years, several international initiatives have attempted to convert library and museum data into web languages. As an example, the popular Dublin Core approach assumes that the relevant relationships can be represented by common attribute values. This does allow selecting, for instance, paintings and books created by the same painter, or objects of a particular category together with books specialized in that category.

The best-known ontologies, together with newer models and approaches, in the bibliographic world are:

  • The CIDOC Conceptual Reference Model (CRM). It is a core ontology aiming to integrate cultural heritage information.
  • The FRBR model (‘Functional Requirements for Bibliographic Records’) was designed as an entity-relationship model by a study group appointed by the International Federation of Library Associations and Institutions (IFLA).
  • FRBRoo is an approach to harmonizing FRBR with the CIDOC CRM model (International Council of Museums).
  • The Bibliographic Ontology (BIBO) describes citations and bibliographic references (e.g. quotes, books, articles) on the Semantic Web.
  • The BIBFRAME model is the library community’s formal entry point for becoming part of a much larger web of data and for determining a transition path from the MARC 21 exchange format to more Web-based, Linked Data standards.

Among these ontological resources, it was decided to reuse the Erlangen FRBRoo ontology to implement the ALIADA ontology, because it includes FRBR for libraries and CIDOC CRM for museums.

Other ontologies reused in the ALIADA ontology to model the ALIADA knowledge domain are: FOAF, SKOS-XL, WGS84 and OWL-Time.

The choice of these ontologies makes it possible for ALIADA to expose the data of museums and libraries (offered using standards such as MARCXML and LIDO, or even through a proprietary relational database) under the linked data paradigm. This enables connections with other linked open data sets that host bibliographic and cultural heritage data, and favours the development of new applications that make use of the interoperability and inference capabilities of this technology.

Linked data deployment in libraries and museums

The current approach to publishing datasets from libraries and museums in the Linked Data Cloud consists of individual and independent efforts. The datasets to be published are selected directly from the databases, when possible: in general, library and museum management software vendors lock access to the databases, so consultancy must be hired. A cleaning phase is then needed to solve problems like:

  • incomplete information
  • use of different formats
  • ambiguous or poorly explained meanings
  • lack of correspondence between names used in the physical schema and actual data
  • poor or incomplete application of cataloguing rules and standards.

Afterwards, the conversion of cataloguing standards or database tables into RDF is carried out case by case (selecting different ontologies as needed), as is the linking to other linked datasets, using ad hoc developments that are not (or cannot be) reused. Once the data is converted, a server must be configured to host the dataset and the corresponding SPARQL endpoint. Usually these efforts are one-way, that is, from the cataloguing databases to datasets ready to be installed in a triple store.
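
As a rough illustration of the conversion step described above, the following Python sketch serializes one catalogue record as N-Triples lines that could be loaded into a triple store. Several things here are assumptions for the sake of the example: the base URI (http://example.org/...), the record layout, the function names, and the use of Dublin Core terms as a simple stand-in vocabulary; a real conversion would select its ontologies case by case, as noted above.

```python
from urllib.parse import quote

DCTERMS = "http://purl.org/dc/terms/"
BASE_URI = "http://example.org/id/resource/"  # hypothetical namespace


def escape_literal(value):
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record):
    """Turn a dict with 'id', 'title' and 'author' keys into N-Triples lines."""
    subject = f"<{BASE_URI}{quote(record['id'], safe='')}>"
    lines = []
    # Map record fields to Dublin Core properties (assumed mapping).
    for field, prop in (("title", "title"), ("author", "creator")):
        value = record.get(field)
        if value:  # skip missing or empty values
            lines.append(f'{subject} <{DCTERMS}{prop}> "{escape_literal(value)}" .')
    return lines


if __name__ == "__main__":
    sample = {
        "id": "demo-0001",
        "title": "Don Quijote de la Mancha",
        "author": "Cervantes Saavedra, Miguel de",
    }
    print("\n".join(record_to_ntriples(sample)))
```

Skipping empty values keeps incomplete records from producing empty literals, one of the cleaning problems listed earlier.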

ALIADA provides a user interface to select the contents to be published and, without human intervention, converts the catalogues to RDF datasets. These RDF datasets include automatically created links to other published datasets (in the Linked Data Cloud), and at the end of the process a SPARQL endpoint is deployed. The automatically created links can be checked by a professional user when desired.
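
The idea of automatically created links that a professional user can later check can be sketched as follows. This text does not describe how ALIADA discovers links, so the example uses a deliberately simple stand-in technique: exact matching of normalized labels, producing candidate owl:sameAs triples. All URIs, labels and function names are hypothetical.

```python
import unicodedata

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(label):
    """Case-fold, strip diacritics and collapse whitespace so labels compare loosely."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def propose_links(local, external):
    """Return candidate (local_uri, owl:sameAs, external_uri) triples.

    `local` and `external` map URIs to human-readable labels. A candidate is
    proposed only when the normalized labels are identical.
    """
    index = {normalize(label): uri for uri, label in external.items()}
    links = []
    for uri, label in local.items():
        target = index.get(normalize(label))
        if target is not None:
            links.append((uri, OWL_SAME_AS, target))
    return links


if __name__ == "__main__":
    local = {"http://example.org/id/agent/1": "Federico  García Lorca"}
    external = {"http://example.com/authority/42": "federico garcia lorca"}
    for triple in propose_links(local, external):
        print(triple)
```

Label matching alone will produce false positives (for example, homonymous authors), which is one reason a review step by a professional user remains valuable.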