Designing a City Rent Prediction System
You are designing an end-to-end machine learning system that predicts rental prices across a city from structured listing data, text descriptions, images, and geospatial features.
Address the following:
(a) **Data collection and cleaning** -- What data sources would you use? What cleaning steps matter most?
(b) **Feature engineering** -- How would you extract useful signals from text descriptions, listing photos, and geospatial data?
(c) **Model selection** -- What model architectures would you consider, and how would you combine structured and unstructured features?
(d) **Validation** -- How should you split data to avoid leakage, given that listings have both temporal and geographic structure?
(e) **Fairness and robustness** -- What biases are you worried about, and how do you detect and mitigate them?
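
To make the leakage concern in part (d) concrete, here is a minimal sketch of one possible split that holds out both a time period and a set of geographic cells. It is an illustration under stated assumptions, not a reference answer: the `Listing` fields, the `grid_cell` helper, the 0.01-degree cell size, and the `spatiotemporal_split` function are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from datetime import date
import math


@dataclass(frozen=True)
class Listing:
    # Hypothetical minimal schema; a real listing would carry many more fields.
    listing_id: str
    lat: float
    lon: float
    listed_on: date
    rent: float


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to a coarse grid cell.

    0.01 degrees of latitude is about 1.1 km; the cell size is an assumption
    and should be tuned to how far price correlation extends in the city.
    """
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def spatiotemporal_split(listings, cutoff, held_out_cells, cell_deg=0.01):
    """Split so the test set is unseen in both time and space.

    train: listed before `cutoff` and outside the held-out cells
    test:  listed on or after `cutoff` and inside the held-out cells
    Listings that match only one of the two conditions are dropped, which
    acts as a buffer between the two sets.
    """
    train, test = [], []
    for item in listings:
        in_held_out = grid_cell(item.lat, item.lon, cell_deg) in held_out_cells
        is_future = item.listed_on >= cutoff
        if is_future and in_held_out:
            test.append(item)
        elif not is_future and not in_held_out:
            train.append(item)
    return train, test
```

Dropping the mixed cases costs data, so a softer variant might keep them in a separate validation pool. Cell-level holdout also does not stop the same unit from being relisted under a new ID, which would need deduplication before splitting.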