HW 4 – Redis Cache for MongoDB Queries

(2 points)

Develop and demonstrate a simple caching system for MongoDB query results using Redis as an external cache. Evaluate the impact of caching on query performance and present findings in a well-structured report.

  1. Analyze the analytical queries you designed in HW3.
    • Identify scenarios where Redis caching meaningfully reduces response time and server load, and distinguish them from cases where caching provides no real benefit.
    • If any of the HW3 queries would benefit from caching, justify your choice and use them here.
    • If none benefit, justify why caching would not help and propose two new queries that would benefit from caching.
    • Examples of queries that benefit from caching:
      • “Top-10 courses completed yesterday” — computed once per day, requested by many users, heavy aggregation.
      • “Daily active users (DAU) for the last 7 days” — fixed time window, repetitive dashboard query.
      • “Total sales per product category for the current week” — periodic updates, shown on dashboards.
    • Examples of queries that do NOT benefit from caching:
      • “All activities of user X over a custom date range” — highly personalized; low repeat rate.
      • “Search for an order by unique ID” — primary key lookup is already fast.
      • “Ad-hoc instructor report” — rare and highly variable.
  2. Implement a Python script/application that:
    • Checks if query results are available in Redis.
    • Returns cached results immediately if present.
    • On a cache miss, executes the MongoDB query and stores the result in Redis with a TTL.
  3. Design an explicit cache key schema based on query name and parameters; keys must be deterministic, extensible, and readable.
  4. Assign appropriate TTL values per query type (for example, 300 seconds; longer for daily aggregates).
  5. Implement functionality to clear or refresh the cache by key or by prefix.
  6. Experimental evaluation:
    • Execute identical queries multiple times (at least 5 runs).
    • Compare response times for the first request (cold cache) and subsequent requests (warm cache).
    • Present a small table or chart with cold vs warm timings and a short analysis of the gain.
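
Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the `hw4:v1:` key prefix, the query names, the TTL values, and the helper names `make_cache_key` and `cached_query` are all assumptions made for the example. The `client` argument is assumed to behave like a redis-py `redis.Redis` object (only `get` and `setex` are used), and `run_query` stands in for whatever executes the MongoDB query.

```python
import json
from typing import Any, Callable

# Hypothetical per-query TTLs in seconds; tune the values for your own queries.
TTL_SECONDS = {"top_courses_yesterday": 24 * 3600, "dau_last_7_days": 3600}
DEFAULT_TTL_SECONDS = 300


def make_cache_key(query_name: str, params: dict[str, Any], version: str = "v1") -> str:
    """Build a key such as 'hw4:v1:<query_name>:<param>=<value>:...'.

    Parameters are sorted by name, so the same query and parameters
    always produce the same key regardless of argument order.
    """
    parts = [f"{name}={params[name]}" for name in sorted(params)]
    return ":".join(["hw4", version, query_name, *parts])


def cached_query(client: Any, query_name: str, params: dict[str, Any],
                 run_query: Callable[[], Any]) -> tuple[Any, bool]:
    """Cache-aside lookup; returns (result, was_cache_hit)."""
    key = make_cache_key(query_name, params)
    raw = client.get(key)  # None on a cache miss
    if raw is not None:
        return json.loads(raw), True

    result = run_query()  # cold path: execute the MongoDB query
    # default=str turns values JSON cannot encode (ObjectId, datetime) into strings.
    payload = json.dumps(result, default=str)
    client.setex(key, TTL_SECONDS.get(query_name, DEFAULT_TTL_SECONDS), payload)
    # Return the JSON round-tripped value so cold and warm results have the same shape.
    return json.loads(payload), False
```

A `get_<query_name>()` function would then be a thin wrapper that passes its own parameters and query callable to `cached_query`. The sketch is synchronous for brevity; with motor the query is a coroutine, so the wrapper would need to await it.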

Implementation Requirements

  • Language: Python.
  • Libraries: motor (MongoDB), redis-py (Redis).
  • Minimal API:
    • `get_<query_name>()` — executes the query with caching logic.
    • `purge_cache(…)` — clears cache entries (by key or prefix).
    • A simple test runner to measure timings.
  • Set a TTL on every cache key.
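
One possible shape for `purge_cache` and the test runner is sketched below. The signatures, the `time_runs` and `summarize` helpers, and the choice of the median for warm runs are assumptions for illustration. `client` is again assumed to behave like a redis-py `redis.Redis` object; `scan_iter` is used for prefix matching because it walks the keyspace incrementally instead of blocking the server the way `KEYS` can.

```python
import statistics
import time
from typing import Any, Callable, Optional


def purge_cache(client: Any, key: Optional[str] = None,
                prefix: Optional[str] = None) -> int:
    """Delete one exact key, or every key starting with `prefix`.

    Returns the number of keys removed.
    """
    if key is not None:
        return client.delete(key)
    if prefix is None:
        raise ValueError("pass either key or prefix")
    deleted = 0
    # The prefix becomes a glob pattern, so it should not itself contain
    # glob characters such as '*', '?' or '['.
    for name in client.scan_iter(match=prefix + "*"):
        deleted += client.delete(name)
    return deleted


def time_runs(fn: Callable[[], Any], runs: int = 5) -> list[float]:
    """Call fn `runs` times and return each wall-clock duration in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings_ms: list[float]) -> dict[str, float]:
    """Treat the first run as cold and the remaining runs as warm."""
    if len(timings_ms) < 2:
        raise ValueError("need one cold run and at least one warm run")
    cold = timings_ms[0]
    warm = statistics.median(timings_ms[1:])
    speedup = cold / warm if warm > 0 else float("inf")
    return {"cold_ms": cold, "warm_median_ms": warm, "speedup": speedup}
```

For a clean cold measurement, the relevant key would be purged before calling `time_runs`, so that the first run misses the cache and the rest hit it.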

Submission

  • To BRUTE:
    • HW4.docx — report with design and results.
    • hw4.py — Python script.
  • To the server:
    • hw4.py — upload and ensure it runs in the target environment.
  • To connect to your Redis instance remotely, create an SSH tunnel first:
    • `ssh -L $Port:127.0.0.1:$Port $Username@nosql.felk.cvut.cz`
    • Then, from your local machine, connect to Redis on localhost (127.0.0.1) using your port and password, e.g.:
    • `redis-cli -h 127.0.0.1 -p $Port -a $Password`
  • Your script will be tested on the server, so run it there before submitting and confirm that it works.
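
With the tunnel open, a Python script can reach Redis the same way `redis-cli` does. The sketch below assumes the port and password are supplied through two environment variables, `REDIS_PORT` and `REDIS_PASSWORD`; those names and both helper functions are hypothetical, chosen only to keep credentials out of the source file.

```python
import os
from typing import Any, Mapping


def redis_settings(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Collect connection settings for a Redis reachable on 127.0.0.1.

    REDIS_PORT and REDIS_PASSWORD are assumed variable names, not course conventions.
    """
    return {
        "host": "127.0.0.1",
        "port": int(env["REDIS_PORT"]),
        "password": env["REDIS_PASSWORD"],
    }


def connect() -> Any:
    """Open a redis-py client using the settings above."""
    # Imported here so redis_settings() can be used without redis-py installed.
    import redis

    return redis.Redis(**redis_settings())
```

Calling `connect().ping()` is a quick way to confirm that the tunnel and the password work before running the timing experiments.
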
courses/be4m36ds2/homework/hw4.txt · Last modified: 2025/09/21 18:42 by prokoyul