====== HW 7 – Optimization and Comparative Analysis ======

**(10 points)**

  - For each NoSQL DBMS (MongoDB, Cassandra, Neo4j), after completing the target analytical queries:
    * Analyze the performance of **one** of the queries (execution time and volume of data scanned).
    * Implement **one** technique to improve performance (for example: creating an index, changing key structures/partitioning, optimizing the storage schema, reducing the volume of returned data, etc.).
    * Compare the result **before and after** the optimization and explain the difference.
    * Briefly describe in the report **what you did, why this technique is applicable** specifically to your DBMS and your task, and **how it affected** the result.
  - **Measurement protocol**:
    * **Dataset**: use the Stretch dataset.
    * **DBMS-specific instrumentation** (minimal Python sketches for each tool are given in the appendix at the end of this page):
      * MongoDB: **explain("executionStats")** on find/aggregate.
      * Cassandra: **TRACING ON** (capture the query trace; note the consistency level if relevant).
      * Neo4j: **PROFILE** (you may use **EXPLAIN** to inspect the plan without executing it).
    * **Report "before/after"**: time and read volume/plan hits (documents/keys scanned, partitions touched, DB hits).

**Examples (guidance)**

  * MongoDB
    * Add indexes on key fields (e.g., user_id, date, action type).
    * Compare execution time with and without the index (record the results).
    * Assess the impact of document nesting vs. a flatter schema (multikey index costs).
    * Prefer the compound-index order equality → range → sort; use covered queries where possible.
  * Cassandra
    * Review the partitioning (how partition and clustering keys affect latency).
    * Compare latency across key designs (the partition key must drive the queries).
    * Assess the effect of batch size on inserts (prefer batches to the same partition).
  * Neo4j
    * Evaluate how graph size and traversal depth affect response time.
    * Use indexes or uniqueness constraints where applicable.
    * Compare a simple query vs. MATCH + WHERE on an indexed property.
    * Capture **DB Hits** and avoid NodeByLabelScan/CartesianProduct where possible.

**Comparative Analysis**

  * Compare how easily, quickly, and conveniently your queries were executed in each DBMS, and which **limitations or surprises** you encountered.
  * Analyze the **optimization techniques** applied.
  * For each task, explain **why this query/type of analysis is not implemented (or is implemented poorly) in the other DBMSs**. //Example: relationship traversal in Neo4j, and why it is impossible (or inefficient) in Cassandra/Redis.//
  * Create a **mini-table** (for the queries you actually implemented):

^ DBMS ^ Query / task ^ Optimization technique ^ Median before ^ Median after ^ Read volume before ^ Read volume after ^

**Scaling, schema changes, and "what if it doesn't fit?"**

  * Briefly describe, for each implemented query or analytic, **what limitations, compromises, or potential problems** may arise when scaling, changing the schema, or increasing the data volume. If a new, unexpected business query becomes difficult or impossible with the chosen design, state **what you would do** (e.g., add a query-driven table, adjust the partition keys, precompute rollups, or introduce a specialized store).

**Submission Instructions**

  * Submit to the BRUTE system:
    * the **HW7.docx** file.

**Deadline**

  * Sunday **28. 12. 2025** until 23:59
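
====== Appendix – Measurement Sketches (Python) ======

The sketches below show one possible way to capture the "before/after" numbers from each driver; they are guidance, not a required implementation. All connection settings, collection/table names, labels, and fields used here (events, events_by_user, User, user_id, ts, name) are placeholder assumptions; substitute the ones from your own schema.

**Median latency.** The mini-table asks for medians, so wall-clock timings should be repeated rather than taken from a single run. A minimal helper that works with any of the three drivers:

<code python>
import statistics
import time

def median_ms(run_query, repeats=11):
    """Run a zero-argument query function repeatedly and return the
    median wall-clock latency in milliseconds."""
    run_query()  # warm-up run: fills caches so the median is steadier
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)

# Usage (PyMongo example; list() forces the cursor to be fully consumed):
#   before = median_ms(lambda: list(db.events.find({"user_id": 42})))
#   ... apply the optimization ...
#   after = median_ms(lambda: list(db.events.find({"user_id": 42})))
</code>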
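
**MongoDB: explain("executionStats") before and after an index.** A minimal PyMongo sketch, assuming a hypothetical events collection queried by user_id and ts; it captures executionStats, creates a compound index in equality → range order, and captures the stats again for comparison.

<code python>
from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
db = client["hw7"]                                 # placeholder database

# Hypothetical query: events of one user within a date range.
find_cmd = {
    "find": "events",
    "filter": {"user_id": 42, "ts": {"$gte": "2025-01-01"}},
}

def print_execution_stats(label):
    # The explain command with executionStats verbosity reports
    # executionTimeMillis, totalDocsExamined and totalKeysExamined.
    out = db.command({"explain": find_cmd, "verbosity": "executionStats"})
    es = out["executionStats"]
    print(label, es["executionTimeMillis"], "ms,",
          es["totalDocsExamined"], "docs examined,",
          es["totalKeysExamined"], "keys examined")

print_execution_stats("before:")  # expect COLLSCAN: docs examined ~ collection size

# Compound index in equality -> range order: user_id (equality), ts (range).
db.events.create_index([("user_id", ASCENDING), ("ts", DESCENDING)])

print_execution_stats("after:")   # expect IXSCAN: far fewer documents examined
</code>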
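
**Cassandra: query tracing from the driver.** TRACING ON is a cqlsh command; from Python, the same server-side trace is requested per statement. A minimal sketch, assuming a hypothetical events_by_user table whose partition key is user_id, so the SELECT touches a single partition:

<code python>
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])   # placeholder contact point
session = cluster.connect("hw7")   # placeholder keyspace

query = "SELECT * FROM events_by_user WHERE user_id = 42"

# trace=True asks the coordinator to record a trace for this statement,
# the driver-side equivalent of cqlsh's TRACING ON.
result = session.execute(query, trace=True)
trace = result.get_query_trace()

print("total duration:", trace.duration)
for event in trace.events:
    # Each event names the node, the elapsed time within the request,
    # and the step performed (e.g. sstables read, tombstones scanned).
    print(event.source, event.source_elapsed, event.description)

cluster.shutdown()
</code>

A single trace can be noisy, so repeat the query, report the median duration, and note the number of partitions and sstables the trace events show being touched.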
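
**Neo4j: PROFILE and DB Hits from the driver.** A minimal sketch, assuming a hypothetical User label with a name property; it profiles a MATCH + WHERE query, sums DB Hits across the profiled plan, then creates an index on the property and profiles again:

<code python>
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))  # placeholder auth

def total_db_hits(plan):
    # The profiled plan is a tree of operators; depending on the driver
    # version, DB Hits appear as plan["dbHits"] or plan["args"]["DbHits"].
    hits = plan.get("dbHits", plan.get("args", {}).get("DbHits", 0))
    return hits + sum(total_db_hits(c) for c in plan.get("children", []))

def profile(query):
    with driver.session() as session:
        summary = session.run("PROFILE " + query).consume()
        print(query, "->", total_db_hits(summary.profile), "db hits")

query = "MATCH (u:User) WHERE u.name = 'alice' RETURN u"

profile(query)  # before: expect NodeByLabelScan over all :User nodes

with driver.session() as session:
    session.run("CREATE INDEX user_name IF NOT EXISTS "
                "FOR (u:User) ON (u.name)").consume()

profile(query)  # after: expect NodeIndexSeek with far fewer DB Hits

driver.close()
</code>

If the second run still shows a label scan, the index may still be populating; running CALL db.awaitIndexes() before the second PROFILE waits for it to come online.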