HW 5 – Wide-column data stores. Cassandra

(5 points)

Design the table schemas based on the analytical tasks (see HW1) that are intended to be solved with Cassandra.
- The schemas must strictly follow Cassandra’s query-based modeling principles: correct selection of partition keys and clustering columns, and denormalization.
- Materialized views are not allowed.
- ALLOW FILTERING is forbidden. Every query must be supported by the schema design.
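
As a rough illustration of these rules, the sketch below assumes a hypothetical task "list the most recent orders of one customer". The table name, the columns, and the UUID value are all invented for the example and are not taken from HW1; your own tables should be derived from your own tasks.

```cql
-- Hypothetical task: "list the 10 most recent orders of one customer".
-- All names below are illustrative assumptions, not part of the assignment.
CREATE TABLE IF NOT EXISTS orders_by_customer (
    customer_id   uuid,       -- partition key: one partition per customer
    order_time    timestamp,  -- clustering column: rows sorted by time
    order_id      uuid,       -- clustering column: keeps rows unique
    customer_name text,       -- denormalized copy, so no join is needed
    total_price   decimal,
    PRIMARY KEY ((customer_id), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time DESC, order_id ASC);

-- The query restricts the full partition key and relies on the stored
-- clustering order, so it needs neither ALLOW FILTERING nor a view.
SELECT order_time, order_id, total_price
FROM orders_by_customer
WHERE customer_id = 123e4567-e89b-12d3-a456-426614174000
LIMIT 10;
```

The design choice here is one table per query: the WHERE clause dictates the partition key, and the required sort order dictates the clustering columns.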
Insert the data into Cassandra using the CSV/JSON generated for the project (see HW0).
- Load only the data required for the analytical tasks. Avoid unnecessary attributes or tables.
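
One possible way to load a CSV file is the cqlsh COPY command, sketched below. It assumes a hypothetical table orders_by_customer with the listed columns and a CSV file that has already been trimmed to exactly those columns, in that order, with a header row. The file path is an assumption and is resolved on the machine where cqlsh runs.

```cql
-- COPY is a cqlsh command, not a server-side CQL statement.
-- Table name, column names, and file path are hypothetical.
COPY orders_by_customer (customer_id, order_time, order_id, customer_name, total_price)
FROM 'data/orders_by_customer.csv'
WITH HEADER = TRUE AND DELIMITER = ',';
```

JSON data could instead be loaded with INSERT ... JSON statements generated by a small script; which route fits best depends on the files produced in HW0.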
Implement and run the queries (see HW1).
- Provide efficient CQL queries that leverage the partitioning and clustering design.
- Demonstrate how the schema design makes each query efficient: every query should be answered from the partition key and clustering columns, without filtering scans.
- For time-series data, show partitioning and clustering strategies for scalability.
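
For the time-series point, a common pattern is to add a time bucket to the partition key so that no partition grows without bound. The sketch below assumes hypothetical sensor readings bucketed by day; the table, the columns, and the literal values are illustrative only, and the right bucket size depends on your data volume.

```cql
-- Hypothetical time-series table: one partition per sensor per day.
CREATE TABLE IF NOT EXISTS readings_by_sensor_day (
    sensor_id   text,
    day         date,        -- time bucket, part of the partition key
    read_time   timestamp,   -- clustering column: orders rows in the bucket
    temperature double,
    PRIMARY KEY ((sensor_id, day), read_time)
) WITH CLUSTERING ORDER BY (read_time DESC);

-- Both partition key columns are fixed by equality; the range restriction
-- applies to the clustering column, so this is a single-partition slice.
SELECT read_time, temperature
FROM readings_by_sensor_day
WHERE sensor_id = 's-001'
  AND day = '2025-11-30'
  AND read_time >= '2025-11-30 08:00:00+0000'
  AND read_time <  '2025-11-30 12:00:00+0000';
```

A query that spans several days would issue one such SELECT per bucket (or use IN on the day column), which is the price paid for bounded partition sizes.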
Prepare a detailed report including:
- The text from HW0 and HW1 (edited if necessary);
- Full schema creation statements with a brief explanation of design decisions;
- The data import commands/scripts;
- All analytical queries:
  - Task number (as in HW1);
  - Description of the analytical question;
  - Full CQL query with a short explanation of how the schema supports it;
  - Screenshot of the query result.

Important:

If certain analytical tasks are not feasible under Cassandra’s data model, replace them with other meaningful tasks and update earlier work (HW0/HW1).
Use denormalization and query-driven schema design to avoid ALLOW FILTERING.
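
A minimal sketch of this idea, assuming two hypothetical access paths over the same purchase data (by customer and by product). All names and values are invented for illustration.

```cql
-- First access path: purchases of one customer.
CREATE TABLE IF NOT EXISTS purchases_by_customer (
    customer_id int,
    bought_at   timestamp,
    product_id  int,
    quantity    int,
    PRIMARY KEY ((customer_id), bought_at, product_id)
);

-- A lookup by product on that table touches a non-key column and is
-- rejected unless ALLOW FILTERING is added:
--   SELECT * FROM purchases_by_customer WHERE product_id = 42;

-- Second access path: the same facts stored again, keyed by product.
CREATE TABLE IF NOT EXISTS purchases_by_product (
    product_id  int,
    bought_at   timestamp,
    customer_id int,
    quantity    int,
    PRIMARY KEY ((product_id), bought_at, customer_id)
);

-- Now the product query is a plain partition lookup.
SELECT bought_at, customer_id, quantity
FROM purchases_by_product
WHERE product_id = 42;
```

The trade-off is that every purchase has to be written to both tables during import, in exchange for reads that never scan beyond one partition.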

Submission:

Submit the HW5.docx file to the BRUTE system.
Submit the hw5.cql file to the NoSQL server (nosql.felk.cvut.cz), containing Cassandra CQL commands with brief explanatory comments.

Execution:

cqlsh -u $username -p $password -k $KeyspaceName -f $ScriptFile

$KeyspaceName is the name of the keyspace to use (it must already exist), e.g. f241_login.
$ScriptFile is the file with the CQL statements to execute, e.g. script.cql.

Tools:
- Apache Cassandra 4.1.6 (installed on the NoSQL server)

References:
- The Cassandra Query Language (CQL)

Server: nosql.felk.cvut.cz

Don’t forget to run the homework submission script!

Deadline: Sunday 7. 12. 2025, 23:59