Genre-Based Movie Recommendations by Average Rating
Build a movie recommendation function. You are given a set of users, a set of movie genres, and a table of ratings where each rating ties a user, a movie, the movie's genre, and a numeric score. Given a target user and a target genre, return the list of recommended movies in that genre, ordered from highest to lowest by their average rating across all users.
Apply these rules exactly:
- If the requested
useris not among the known users, or thegenreis not among the known genres, return an empty list. - Do not recommend any movie the target user has already rated.
- Do not recommend any movie that no one has rated (a movie with zero ratings).
- Order the recommendations by each movie's average rating (over all users who rated it), highest first.
Constraints
- A movie's average rating is the mean of all scores it received, not just the target user's.
- Movies the target user has rated are excluded even if they would otherwise rank highly.
Examples
Known genres include Action. User alice has already rated Movie A. Among Action movies, Movie B averages 4.5 and Movie C averages 4.0, both rated by others. Recommendation for (alice, Action) -> [Movie B, Movie C].
For an unknown user zzz -> [] (rule 1).
Hints
- This is a filter-then-sort, not a machine-learning model. Reduce it to: which movies survive the rules, and in what order.
- Compute each movie's average over ALL users who rated it, not just the target user. A movie with no ratings has no average and must be dropped (rule 3).
- Apply the exclusions in order: first the guard clauses (unknown user or genre return an empty list), then remove movies the user already rated (rule 2), then sort the survivors by average rating descending.
Worked Solution
How to Think About It: Strip away the framing and this is a filter-then-sort. The recommendation list is "all movies in the genre" minus "movies the user already saw" minus "movies nobody rated", sorted by average score descending. The only things that trip people up are the guard clauses (unknown user or genre returns empty) and the exclusion rules, each of which corresponds to one filtering pass. Get the order of operations right and the logic is short.
Algorithm: 1. Guard: if user not in the known users or genre not in the known genres, return []. 2. Collect the set of movies the target user has already rated. 3. Group ratings by movie to compute each movie's average rating and rating count. 4. Candidate movies = movies in the requested genre that (a) have at least one rating and (b) are NOT in the user's already-rated set. 5. Sort candidates by average rating, descending. Return the movie list.
Code:
```python from collections import defaultdict
def recommend(user, genre, users, genres, ratings): # ratings: list of dicts {user, movie, genre, score} if user not in users or genre not in genres: return []
seen = {r["movie"] for r in ratings if r["user"] == user}
sums = defaultdict(float) counts = defaultdict(int) movie_genre = {} for r in ratings: m = r["movie"] sums[m] += r["score"] counts[m] += 1 movie_genre[m] = r["genre"]
candidates = [] for m, c in counts.items(): if c == 0: continue # rule 3 (defensive) if movie_genre[m] != genre: continue # wrong genre if m in seen: continue # rule 2 candidates.append((m, sums[m] / c))
candidates.sort(key=lambda t: t[1], reverse=True) return [m for m, _ in candidates] ```
Complexity: $O(R + M \log M)$ where $R$ is the number of ratings and $M$ the number of candidate movies (the sort dominates).
Answer: Guard against unknown user or genre, build the user's already-rated set, compute each movie's average from all ratings, then return genre movies that are rated and unseen, sorted by average descending.
Intuition
The hidden conditions are the whole point of this problem -- the happy-path 'sort genre movies by average rating' is trivial, and what an interviewer is actually testing is whether you read the spec carefully and handle the exclusions and guard clauses. The most commonly missed rule is excluding movies the target user has already rated, because it requires a second pass over the data keyed by user rather than by movie, and a single unfiltered test case will fail silently. The broader lesson, true of most 'recommender' interview prompts, is that a popularity-by-average baseline is a legitimate and expected first cut -- you are not expected to build collaborative filtering, you are expected to be precise about set membership and ordering.