HW 3 – Document data stores. MongoDB

(5 points)

  1. Design the data structures for the analytical tasks (see HW1) that you intend to solve with MongoDB (primary and alternative 1).
    • The structures must be designed so that the target queries are feasible.
    • Avoid overly nested documents.
      • Denormalize only when it benefits performance; do not create overly deep nesting that complicates queries.
  2. Insert the data into MongoDB using the CSV/JSON generated for the project (see HW0). Use the Base dataset as defined in HW0.
    • Load only what is needed for tasks/comparisons. Do not import unnecessary data.
    • In the .js file, insert several dozen documents directly (if needed, across multiple collections) as examples, placed before any queries.
    • Using a separate, idempotent script, generate and insert the required dataset of the assignment’s target size (see HW0).
  3. Implement and run the queries (see HW1). Supporting indexes for the analytical queries are recommended but not mandatory.
    • Use the Aggregation Pipeline; do not use `$where`, server-side JavaScript, or `mapReduce`.
    • Apply appropriate time filters, groupings, and KPIs as required by your HW1 tasks (do not hardcode specific fields like “country/action” if your domain differs).
    • All queries must return a non-empty result on the sample data inserted in the .js file.
  4. Prepare a report and include:
    • the tasks from both HW0 and HW1 related to MongoDB (edited if necessary);
    • the collections creation and sample document-insertion commands, with explanations of why this structure was chosen;
    • the queries used to solve the analytical tasks. For each task, provide:
      • the task number (as in the previous work);
      • the verbal description of the task;
      • the query code with explanations;
      • a screenshot of the query result;
    • the data import script.
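
As an illustration of steps 1 to 3, here is a minimal sketch of what a fragment of hw3.js might look like. The `orders` collection, its fields, and the "revenue per customer" task are hypothetical and only stand in for your own HW1 domain. The `typeof db` guard is also an assumption of this sketch: it lets the same file be parsed outside mongosh, where the `db` object does not exist.

```javascript
// Hypothetical domain (orders); replace the names and fields with your own HW1 tasks.

// Step 1: one order per document. Line items are embedded one level deep,
// because the example query always reads them together with the order.
const sampleOrders = [
  { _id: "o-001", customerId: "c-01", status: "paid",
    createdAt: new Date("2025-01-05T10:00:00Z"),
    items: [{ sku: "A1", qty: 2, price: 10 }, { sku: "B2", qty: 1, price: 5 }] },
  { _id: "o-002", customerId: "c-02", status: "paid",
    createdAt: new Date("2025-01-12T09:30:00Z"),
    items: [{ sku: "C3", qty: 1, price: 40 }] },
  { _id: "o-003", customerId: "c-01", status: "cancelled",
    createdAt: new Date("2025-01-20T16:45:00Z"),
    items: [{ sku: "A1", qty: 1, price: 10 }] },
];

// Step 3 (hypothetical task): revenue and number of paid orders per customer
// in January 2025, using only Aggregation Pipeline stages.
const revenuePerCustomer = [
  { $match: {
      status: "paid",
      createdAt: { $gte: new Date("2025-01-01T00:00:00Z"),
                   $lt: new Date("2025-02-01T00:00:00Z") } } },
  { $unwind: "$items" },
  { $group: {
      _id: "$customerId",
      revenue: { $sum: { $multiply: ["$items.qty", "$items.price"] } },
      orderIds: { $addToSet: "$_id" } } },
  { $project: { revenue: 1, orderCount: { $size: "$orderIds" } } },
  { $sort: { revenue: -1 } },
];

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.orders.drop();                                     // start from a clean sample collection
  db.orders.insertMany(sampleOrders);                   // step 2: sample documents before any query
  db.orders.createIndex({ status: 1, createdAt: 1 });   // supports the $match stage
  printjson(db.orders.aggregate(revenuePerCustomer).toArray());
}
```

Two of the three sample orders pass the `$match` filter, so the query returns a non-empty result on the sample data. The compound index lists the equality field (`status`) before the range field (`createdAt`), a common ordering for this kind of filter.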

Note. If some analytical tasks turn out to be infeasible or meaningless, you may replace them with others and propagate the corresponding changes to the earlier work (HW0/HW1).
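
The idempotency requirement for the bulk-load script in step 2 can be sketched as follows. This is only an illustration of the idea in mongosh JavaScript; the assignment's loader is hw3_bulk_load.py, where PyMongo offers the equivalent `ReplaceOne(filter, replacement, upsert=True)` with `bulk_write`. The helpers `makeOrder` and `buildUpserts`, the `orders` collection, and all field names are hypothetical.

```javascript
// Deterministic generator: the same index always yields the same document,
// so re-running the load rewrites existing documents instead of adding new ones.
function makeOrder(i) {
  return {
    _id: "o-" + String(i).padStart(6, "0"),            // stable key derived from the index
    customerId: "c-" + String(i % 50).padStart(2, "0"),
    status: i % 4 === 0 ? "cancelled" : "paid",
    createdAt: new Date(Date.UTC(2025, 0, 1 + (i % 28))),
    total: 10 + (i % 7) * 5,
  };
}

// One upsert per document, keyed on the stable _id.
function buildUpserts(count) {
  const ops = [];
  for (let i = 0; i < count; i++) {
    const doc = makeOrder(i);
    ops.push({
      replaceOne: { filter: { _id: doc._id }, replacement: doc, upsert: true },
    });
  }
  return ops;
}

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.orders.bulkWrite(buildUpserts(1000), { ordered: false });
  print(db.orders.countDocuments({}));
}
```

Because every document has a stable `_id` and is written with an upsert, running the load a second time should leave the document count unchanged. A plain `insertMany` with generated ObjectIds would instead duplicate the data on each run.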

Submit to the BRUTE system:

  • HW3.docx file with the report; hw3.js file with queries.

Submit the hw3.js file with MongoDB commands to the NoSQL server (nosql.felk.cvut.cz), with a short comment for each query.

  • Submission:
    • hw3.js: file with MongoDB database commands;
    • hw3_bulk_load.py: separate idempotent script for data generation;
    • data/: a folder with a subset of HW0 dataset for MongoDB tasks (only necessary fields).
  • Execution:
    • Execute the following shell command to evaluate the whole MongoDB script:

mongosh --port 42222 -u $login -p $password $database $file

    • $login is your username, e.g. f25_login
    • $password is your password (the same one you received for your account at the beginning of the semester)
    • $database is the database to connect to (same as your login)
    • $file is the file with MongoDB queries to be executed, i.e. hw3.js
    • Note the double dash before port (--port)
  • Tools:
    • MongoDB 7.0.14 (installed on the NoSQL server)
  • References:
  • Server: nosql.felk.cvut.cz
    • Do not forget to execute the homework submission script!
  • Deadline: Sunday 30. 11. 2025 until 23:59 (extended from 23. 11. 2025)
courses/b4m36ds2/homework/hw3.txt · Last modified: 2025/11/22 15:19 by prokoyul