HW 5 – Wide-column data stores: Cassandra

(5 points)

  1. Design the table schemas for the analytical tasks (see HW1) that are to be solved with Cassandra.
    • The schemas must strictly follow Cassandra’s query-based modeling principles: correct selection of partition keys and clustering columns, and denormalization.
    • Materialized views are not allowed.
    • ALLOW FILTERING is forbidden. All queries must be supported by schema design.
  2. Insert the data into Cassandra using the CSV/JSON generated for the project (see HW0).
    • Load only the data required for the analytical tasks. Avoid unnecessary attributes or tables.
  3. Implement and run the queries (see HW1).
    • Provide efficient CQL queries that leverage the partitioning and clustering design.
    • Demonstrate how the schema design optimizes each query, i.e. that it runs without filtering scans.
    • For time-series data, show the partitioning and clustering strategies that keep the design scalable.
  4. Prepare a detailed report including:
    • The text from HW0 and HW1 (edited if necessary);
    • Full schema creation statements with a brief explanation of design decisions;
    • The data import commands/scripts;
    • All analytical queries:
      • Task number (as in HW1);
      • Description of the analytical question;
      • Full CQL query with a short explanation of how the schema supports it;
      • Screenshot of the query result.
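
The loading and time-series points above can be illustrated with a short Python sketch. It is only one possible approach under stated assumptions, not the required solution: the table `readings_by_sensor_day`, its columns, the CSV header names, and the choice of a one-day bucket are all hypothetical and should be replaced by whatever the HW1 tasks actually need. The sketch keeps only the columns a query needs, derives the bucket that forms part of the partition key, and returns parameter tuples for a prepared INSERT.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical table, for illustration only (not part of the assignment).
# Partition key (sensor_id, day) bounds partition growth to one day per sensor;
# the clustering column ts keeps rows inside a partition ordered by time.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS readings_by_sensor_day (
    sensor_id text,
    day date,
    ts timestamp,
    value double,
    PRIMARY KEY ((sensor_id, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC);
"""

# Prepared-statement text; the ? markers are bound by the driver.
INSERT_ROW = (
    "INSERT INTO readings_by_sensor_day (sensor_id, day, ts, value) "
    "VALUES (?, ?, ?, ?);"
)

# Restricts the full partition key and a range on the clustering column,
# so it reads a single partition and needs no ALLOW FILTERING.
SELECT_RANGE = (
    "SELECT ts, value FROM readings_by_sensor_day "
    "WHERE sensor_id = ? AND day = ? AND ts >= ? AND ts < ?;"
)


def day_bucket(ts):
    """Return the UTC calendar day used as the time bucket for ts."""
    return ts.astimezone(timezone.utc).date()


def rows_from_csv(text):
    """Turn CSV text into (sensor_id, day, ts, value) tuples.

    Columns other than sensor_id, ts and value are ignored, so only the
    data needed by the (hypothetical) analytical task is loaded.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        ts = datetime.fromisoformat(record["ts"])
        if ts.tzinfo is None:
            # Assumption: timestamps without an offset are already in UTC.
            ts = ts.replace(tzinfo=timezone.utc)
        rows.append(
            (record["sensor_id"], day_bucket(ts), ts, float(record["value"]))
        )
    return rows
```

With a client driver, each returned tuple would be bound to a prepared version of `INSERT_ROW`. A finer or coarser bucket (hour, month) may suit the data volume better; the point is that the bucket is part of the partition key and is always supplied by the query.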

Important:

Don’t forget to run the homework submission script!

Submission:

Sunday 7. 12. 2025 until 23:59

Execution:

cqlsh -u $username -p $password -k $KeyspaceName -f $ScriptFile
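
Because the script file passed to `cqlsh -f` must not use ALLOW FILTERING or materialized views, a small check before submission may help. The following is a minimal sketch in Python, not an official tool: the function names are hypothetical, and the comment handling is deliberately naive (it strips `--` and `//` line comments but does not understand string literals or `/* */` block comments).

```python
import re

# Constructs the assignment forbids, matched case-insensitively and
# tolerant of extra whitespace between the two words.
FORBIDDEN = {
    "ALLOW FILTERING": re.compile(r"\bALLOW\s+FILTERING\b", re.IGNORECASE),
    "MATERIALIZED VIEW": re.compile(r"\bMATERIALIZED\s+VIEW\b", re.IGNORECASE),
}


def strip_line_comment(line):
    """Drop a trailing `--` or `//` comment (naive: ignores quoting)."""
    return re.split(r"--|//", line, maxsplit=1)[0]


def find_forbidden(script_text):
    """Return (line_number, construct) pairs for each forbidden construct."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        code = strip_line_comment(line)
        for name, pattern in FORBIDDEN.items():
            if pattern.search(code):
                hits.append((number, name))
    return hits
```

A typical use would be to read the script file, call `find_forbidden` on its text, and fix any reported lines before running the script. An empty list only means the two forbidden constructs were not found; it does not show that the queries are efficient.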