Checkpoint 0 (max 5 pts)


The goal of this checkpoint is to define the topic of your data integration task.


Submit a PDF of 1-2 pages describing

  1. the topic you created, your motivation for it, and its possible use cases and beneficiaries,
  2. a list of the relevant data sources you found, with basic information about each one (how complex its schema is, how much data it contains, what kind of data it contains, etc.),
  3. the data sources from this list that you will use for your semestral work.


You should create a topic (e.g. "Public Transportation delays in the Czech Republic") and briefly describe its motivation, use cases, and purpose. Your topic might span (but is not limited to) the following areas:

  • civil engineering constructions (buildings, construction components, construction failures),
  • urban planning (land use, cadastral data, address points, RUIAN, etc.),
  • town infrastructure (roads, utilities),
  • budgets and other economic data for municipalities.

Next, look for existing public data sources that are relevant to the topic (existing datasets, ontologies, books, web pages). Choose at least three related datasets that are provided by different parties (organizations). Two datasets are related if they overlap in topic but are not technically integrated (they do not share the same identifiers). For example, the Safety Accident Database by the Air Investigation Institute and the Aircraft register by the Czech Civil Aviation Authority are two related datasets.
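
The notion of datasets that are related but not technically integrated can be illustrated with a minimal Python sketch. The records, column names (report_id, registration, aircraft_type), and the helper shared_values below are all invented for illustration and are not taken from any real data source; the sketch only shows one possible way to check that two datasets have no identifiers in common while still overlapping on a topical attribute.

```python
def shared_values(records_a, key_a, records_b, key_b):
    """Return the values of key_a in records_a that also occur as key_b in records_b."""
    values_a = {row[key_a] for row in records_a}
    values_b = {row[key_b] for row in records_b}
    return values_a & values_b


# Invented toy records (hypothetical, NOT real data from any organization).
accidents = [
    {"report_id": "ACC-001", "aircraft_type": "Cessna 172"},
    {"report_id": "ACC-002", "aircraft_type": "Zlin Z-142"},
]
register = [
    {"registration": "REG-A", "aircraft_type": "Cessna 172"},
    {"registration": "REG-B", "aircraft_type": "Piper PA-28"},
]

# Identifier columns: each dataset uses its own scheme, so nothing matches.
common_ids = shared_values(accidents, "report_id", register, "registration")

# Topical attribute: both datasets describe aircraft types, so values overlap.
common_types = shared_values(accidents, "aircraft_type", register, "aircraft_type")

print(common_ids)    # set()
print(common_types)  # {'Cessna 172'}
```

An empty identifier intersection combined with a non-empty topical overlap is a rough indication that the datasets are related in the sense above; real datasets will usually need value normalization before such a comparison is meaningful.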

