CourseWare Wiki
Warning: This page is located in the archive. Go to the latest version of this course's pages, or go to the latest version of this page.
Goal
Create a semi-automated data pipeline based on the data sources from checkpoint 0.
Deliverable
data pipeline (Scrapy, SPARQL, etc.) for reconstructing the data set;
the data set;
UML diagrams of the data set schema
Details
the data pipeline should transform the data sources into RDF (i.e. the target data set)
choose any tools you like (e.g. any programming language you are familiar with) to create the data pipeline. However, in most cases one of the following two alternatives should be sufficient:
GraphDB (OpenRefine+SPARQL) for processing CSV files, triplifying them, and manipulating resulting RDF
Scrapy + GraphDB (OpenRefine+SPARQL) for scraping web pages, triplifying them, and manipulating the resulting RDF
the resulting data set should contain all data relevant to the integration task, unified in a single format (RDF)
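To illustrate what "triplifying" a CSV source means, here is a minimal sketch that uses only the Python standard library rather than the OpenRefine route mentioned above. The namespace `http://example.org/`, the function names `triplify` and `escape_literal`, and the sample data are all hypothetical and chosen for illustration only. Each CSV row becomes one resource, and each non-empty cell becomes one triple with a plain string literal, serialized as N-Triples.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; replace it with the one used by your own data set.
BASE = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def triplify(csv_text: str, id_column: str, rdf_class: str) -> list[str]:
    """Turn each CSV row into N-Triples lines: one resource per row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}resource/{quote(row[id_column], safe='')}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{BASE}ontology/{rdf_class}> .")
        for column, value in row.items():
            # Skip the identifier column, surplus cells, and empty cells.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}ontology/{quote(column.strip(), safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample data standing in for a real CSV data source.
SAMPLE = "id,name,city\n1,Alice,Prague\n2,Bob,\n"

if __name__ == "__main__":
    for line in triplify(SAMPLE, "id", "Person"):
        print(line)
```

The resulting N-Triples text could then be loaded into a triple store such as GraphDB and reshaped further with SPARQL. Typed literals, language tags, and links between resources are deliberately left out of this sketch.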
courses/osw/cp1.txt · Last modified: 2017/11/22 23:02 by blaskmir