Checkpoint 3 (max 15 pts)

Goal

The last checkpoint combines the datasets by integrating their models. To prove that the datasets are interconnected, you will formulate a non-trivial SPARQL query that answers the scientific question.

Deliverable

  • create a mapping between the domain ontology of the conceptual model and the ontologies of the datasets, using OWL, SPARQL, and/or SHACL, and upload it to the repository,
  • prove the interconnection of the data by a sample integration of knowledge across the datasets, with a high number of mutually interlinked resources from all the datasets, delivered as RDF file(s) or a SPARQL query; the interconnections shall be done using
  • design non-trivial SPARQL queries over the datasets you created, showing the integration capabilities of the integrated datasets by answering the question given in Checkpoint 1 (include the query run times, the results, and their interpretation),
  • and include them in the description (a 1-2 page extension of the report from Checkpoint 2). In the description, also summarize the design decisions you made, the pros and cons of the ontology, the description and evaluation of the SPARQL queries, and the conclusions you draw from the semestral work.
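
Since the deliverable asks for query run times, one possible way to measure them is sketched below. This is only an illustration in plain Python: `run_query` is a hypothetical callable that stands in for whatever executes your SPARQL query (a triple store client, an HTTP request to an endpoint, etc.), and the stand-in at the bottom returns fixed rows instead of querying anything.

```python
import statistics
import time


def time_query(run_query, repeats=5):
    """Run the query several times; return the last result and the median duration in seconds."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = run_query()
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations)


def fake_query():
    # Hypothetical stand-in for a real SPARQL execution; returns fixed rows.
    return [("site-1", 21.5), ("site-2", 48.0)]


rows, median_seconds = time_query(fake_query, repeats=3)
print(f"{len(rows)} rows, median {median_seconds:.6f} s over 3 runs")
```

Reporting a median over several runs, together with the number of returned rows, is usually more informative than a single timing, because the first run often includes warm-up costs.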

In the last tutorial, everyone gives a 5-minute presentation of the results.

Details

Annotated and well-described data are integrated by interconnecting the ontologies that describe them. At the beginning of this checkpoint, you have separate ontologies describing the datasets and one or more ontologies describing the domain. Now it is time to interconnect them, using OWL statements and rules, SHACL shapes, and/or SPARQL queries.

We recommend creating the mappings in a standalone RDF file that imports all the ontologies. To interconnect concepts, you may use simple relations such as subClassOf, subPropertyOf, and sameAs (think at least twice before using sameAs, and remember that it is symmetric), but you may also need to set up rules or even create new classes and properties (e.g. a class that infers, using OWL, all instances of some classes with a specific attribute value). You may create mappings even at the dataset ontology level, but take care not to get lost.
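
To illustrate what a subClassOf mapping achieves, here is a toy sketch in plain Python, with triples modelled as tuples. It is not a replacement for an OWL reasoner; it only applies the single RDFS rule "if x has type C and C is a subclass of D, then x has type D". All names (`ds1:Station`, `ds2:Sensor`, `dom:MeasurementSite`, and the instances) are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Mapping: classes of two dataset ontologies become subclasses of one domain class.
# All identifiers here are made up for the example.
mapping = {
    ("ds1:Station", SUBCLASS_OF, "dom:MeasurementSite"),
    ("ds2:Sensor", SUBCLASS_OF, "dom:MeasurementSite"),
}

# Instance data coming from the two datasets.
data = {
    ("ds1:s42", RDF_TYPE, "ds1:Station"),
    ("ds2:a7", RDF_TYPE, "ds2:Sensor"),
}


def infer_types(triples):
    """Apply (x type C), (C subClassOf D) => (x type D) until nothing new is derived."""
    inferred = set(triples)
    while True:
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        new = {
            (x, RDF_TYPE, parent)
            for x, p, cls in inferred if p == RDF_TYPE
            for child, parent in subclass_pairs if cls == child
        } - inferred
        if not new:
            return inferred
        inferred |= new


def instances_of(triples, cls):
    """Return the sorted subjects that are typed with the given class."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == cls)


graph = infer_types(mapping | data)
sites = instances_of(graph, "dom:MeasurementSite")
print(sites)
```

Without the mapping, asking for instances of the domain class returns nothing; with it, resources from both datasets answer the same class-level question. Note that subClassOf is directional, whereas sameAs works in both directions and merges everything stated about the two resources.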

Once the ontologies are interconnected, prove the interconnection of the data by returning a large number of resources with knowledge from the various datasets. Do that either by consolidating the data in the triple store or with a SPARQL query. Export the output into an RDF serialization and deliver it (upload it to the repository).
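
As a rough illustration of what "resources with knowledge from the various datasets" can mean, the following plain-Python sketch tags each statement with the dataset it came from, selects the resources described by at least two datasets, and writes their statements as N-Triples lines. The IRIs, the dataset labels `ds1` and `ds2`, and the simplified literal handling are assumptions made for the example; in practice a triple store or an RDF library would do this work.

```python
from collections import defaultdict

# (subject, predicate, object, source dataset); all IRIs and values are hypothetical.
# Objects starting with a double quote are treated as literals, everything else as IRIs.
quads = [
    ("http://example.org/site/1", "http://example.org/dom/name", '"Site one"', "ds1"),
    ("http://example.org/site/1", "http://example.org/dom/pm10", '"21.5"', "ds2"),
    ("http://example.org/site/2", "http://example.org/dom/name", '"Site two"', "ds1"),
]


def integrated_resources(quads, min_sources=2):
    """Return the subjects that have statements from at least min_sources datasets."""
    sources = defaultdict(set)
    for subject, _predicate, _obj, source in quads:
        sources[subject].add(source)
    return {s for s, srcs in sources.items() if len(srcs) >= min_sources}


def to_ntriples(quads, subjects):
    """Serialize the statements about the selected subjects as N-Triples text."""
    lines = []
    for subject, predicate, obj, _source in quads:
        if subject in subjects:
            term = obj if obj.startswith('"') else f"<{obj}>"
            lines.append(f"<{subject}> <{predicate}> {term} .")
    return "\n".join(sorted(lines))


linked = integrated_resources(quads)
print(f"{len(linked)} of {len({q[0] for q in quads})} resources combine several datasets")
print(to_ntriples(quads, linked))
```

The ratio printed in the first line is the kind of figure worth reporting: it shows how many resources actually carry knowledge from more than one dataset.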

Now formulate a non-trivial SPARQL query, or a combination of non-trivial queries, over an OWL/RDF representation of the integrated datasets to answer the given question. Remember, do not query the data itself, but the concepts they represent. In practice, the query shall contain no identifiers of specific data; instead, ask for the instances of a specific class, possibly with connections to other classes or with specific attribute values.
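
The difference between querying concepts and querying specific data can be sketched as follows. The SPARQL text in `QUERY` asks only for instances of domain classes with a certain attribute value, and the small Python helper is a rough heuristic that flags IRIs from instance namespaces inside a query. The `dom:` vocabulary, the threshold, and the namespaces in `DATA_NAMESPACES` are all hypothetical; adapt them to your own ontologies.

```python
import re

# Class-level query: no identifier of a specific resource appears in it.
# The dom: vocabulary is made up for this example.
QUERY = """
PREFIX dom: <http://example.org/dom/>
SELECT ?site ?value WHERE {
  ?site a dom:MeasurementSite ;
        dom:hasObservation ?obs .
  ?obs a dom:Observation ;
       dom:value ?value .
  FILTER (?value > 50)
}
"""

# Hypothetical namespaces under which the datasets mint their instance IRIs.
DATA_NAMESPACES = (
    "http://example.org/ds1/resource/",
    "http://example.org/ds2/resource/",
)


def hardcoded_instances(query, data_namespaces):
    """Return IRIs in the query (including PREFIX targets) that lie in an instance namespace."""
    iris = re.findall(r"<([^<>\s]+)>", query)
    return [iri for iri in iris if iri.startswith(tuple(data_namespaces))]


print(hardcoded_instances(QUERY, DATA_NAMESPACES))
```

A query such as `SELECT ?v WHERE { <http://example.org/ds1/resource/s42> ?p ?v }` would be flagged by this check, because it addresses one specific resource instead of a class of resources.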

The presentation of the answer is also rated (up to 5 points). Come up with a creative way of presenting it, e.g. a web application that lets users set parameters and run queries, results shown on a map, or a nice visualization of the data. Remember, the customer does not care how difficult the data are to process, but how the final presentation looks.

courses/b4m36osw/cp3.txt · Last modified: 2023/09/21 12:41 by medmicha