5 fare that takes 15 minutes. The driver's scarce resource is time, so the target variable must be a rate, not a total. The second challenge is that this is a sequential decision problem: after dropping off a passenger, the driver must decide where to go next, and that choice affects future opportunities.
Key Insight: The right target variable is profit per hour, not total profit per trip. This accounts for both trip duration and dead time between fares.
The Method:
Step 1: Define the target variable.
$\text{Profit per hour} = \frac{\text{fare} + \text{tip} - \text{fuel cost}}{\text{trip time} + \text{wait time for next fare}}$
The denominator is critical -- it includes the time spent waiting or cruising for the next passenger. Without it, the target is biased toward long airport runs that look profitable per trip but leave the driver with long dead time afterward.
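To make the effect of the denominator concrete, here is a minimal sketch of the formula above. The function name and all dollar and minute values are hypothetical, chosen only to illustrate the point.

```python
def profit_per_hour(fare, tip, fuel_cost, trip_minutes, wait_minutes):
    """Net earnings divided by total time committed, in dollars per hour."""
    total_hours = (trip_minutes + wait_minutes) / 60.0
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return (fare + tip - fuel_cost) / total_hours


# Hypothetical trips, for illustration only.
airport_run = profit_per_hour(fare=50.0, tip=8.0, fuel_cost=6.0,
                              trip_minutes=45, wait_minutes=40)
short_hop = profit_per_hour(fare=9.0, tip=1.5, fuel_cost=0.5,
                            trip_minutes=10, wait_minutes=5)

# The same trips with wait time ignored (the biased version of the target).
airport_no_wait = profit_per_hour(50.0, 8.0, 6.0, trip_minutes=45, wait_minutes=0)
short_no_wait = profit_per_hour(9.0, 1.5, 0.5, trip_minutes=10, wait_minutes=0)

print(f"with wait:    airport {airport_run:.2f}/h, short hop {short_hop:.2f}/h")
print(f"without wait: airport {airport_no_wait:.2f}/h, short hop {short_no_wait:.2f}/h")
```

With these made-up numbers the airport run wins when wait time is ignored and loses once it is included, which is the bias the denominator is meant to remove.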
Step 2: Feature engineering.
- Spatial features: Pickup location (grid cell, neighborhood cluster, or lat/long bucket), dropoff location, distance to key landmarks (airports, stadiums, convention centers).
- Temporal features: Hour of day, day of week, month, holiday indicator, minutes since last major event ended.
- Demand proxies: Historical pickup density at this location/time, lagged demand from the last 15/30/60 minutes.
- Supply features: Number of available taxis nearby (if available), recent dropoff density (indicating supply glut).
- Weather: Rain, snow, temperature -- rain typically increases demand and can increase tips.
- Traffic: Expected trip duration given current conditions, which affects both the denominator and the driver's ability to chain trips.
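A few of the feature groups above can be sketched with the standard library alone. This is an illustrative sketch, not a full pipeline: the function names, the 0.01-degree grid size, and the landmark coordinates (an approximate airport location) are all assumptions, and supply, weather, and traffic features are omitted because they depend on external data sources.

```python
import math
from datetime import datetime, timedelta

# Approximate coordinates of one landmark, used purely as an example.
AIRPORT_LAT_LON = (40.6413, -73.7781)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def lagged_count(now, past_pickups, minutes):
    """Number of past pickups that fall within the last `minutes` minutes."""
    window = timedelta(minutes=minutes)
    return sum(1 for t in past_pickups if timedelta(0) <= now - t < window)


def build_features(pickup_time, lat, lon, past_pickups, cell_deg=0.01):
    """Spatial, temporal, and lagged-demand features for one candidate pickup."""
    return {
        "grid_row": math.floor(lat / cell_deg),
        "grid_col": math.floor(lon / cell_deg),
        "km_to_airport": haversine_km(lat, lon, *AIRPORT_LAT_LON),
        "hour": pickup_time.hour,
        "day_of_week": pickup_time.weekday(),  # Monday is 0
        "is_weekend": int(pickup_time.weekday() >= 5),
        "pickups_last_15": lagged_count(pickup_time, past_pickups, 15),
        "pickups_last_30": lagged_count(pickup_time, past_pickups, 30),
        "pickups_last_60": lagged_count(pickup_time, past_pickups, 60),
    }


example = build_features(
    datetime(2024, 3, 9, 17, 30),
    40.6413, -73.7781,
    past_pickups=[
        datetime(2024, 3, 9, 17, 20),
        datetime(2024, 3, 9, 16, 50),
        datetime(2024, 3, 9, 15, 0),
    ],
)
print(example)
```

The lagged counts only look backward from the pickup time, so the features stay consistent with the no-future-data rule used for validation.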
Step 3: Modeling approach.
Use gradient boosted trees (XGBoost or LightGBM) for the prediction model. These handle non-linear interactions well (e.g., airport pickups are great at 5 PM but terrible at 2 AM) and are robust to mixed feature types.
Train on historical trip data with time-series cross-validation -- never train on data from after the validation period. Use expanding-window validation: train on months 1-3, validate on month 4; train on 1-4, validate on 5; etc.
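The expanding-window scheme can be sketched as a small generator. The function name and the `min_train` parameter are hypothetical; month labels stand in for whatever time index the trip data uses.

```python
def expanding_window_splits(months, min_train=3):
    """Yield (train_months, validation_month) pairs in chronological order.

    Every training window ends strictly before its validation month,
    so no fold is trained on data from the future.
    """
    ordered = sorted(months)
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


splits = list(expanding_window_splits([1, 2, 3, 4, 5, 6]))
for train, valid in splits:
    print(f"train on {train}, validate on {valid}")
```

In practice a library splitter such as scikit-learn's `TimeSeriesSplit` follows the same idea on row indices; the hand-written version is shown only to make the fold boundaries explicit.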
Step 4: Decision layer.
After predicting profit/hour for candidate actions (drive to location A, B, C, or wait in place), choose the action with the highest predicted rate. This is a greedy one-step-ahead policy. Reinforcement learning (Q-learning or policy gradient) can optimize multi-step decisions and may perform better, but the greedy policy is a strong baseline.
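A minimal sketch of the greedy policy follows. The `Candidate` fields and all numbers are hypothetical stand-ins for model outputs; the sketch also assumes that repositioning time counts toward the denominator, since driving to another zone uses the same scarce resource as waiting.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    expected_profit: float        # predicted fare + tip - fuel, in dollars
    reposition_minutes: float     # empty drive to reach the zone
    expected_wait_minutes: float  # predicted wait for a fare once there
    expected_trip_minutes: float  # predicted duration of the fare


def predicted_rate(c):
    """Predicted profit per hour, counting repositioning and waiting as time spent."""
    total_hours = (c.reposition_minutes + c.expected_wait_minutes + c.expected_trip_minutes) / 60.0
    return c.expected_profit / total_hours


def choose_action(candidates):
    """Greedy one-step-ahead policy: pick the highest predicted rate."""
    return max(candidates, key=predicted_rate)


# Hypothetical predictions for three actions.
options = [
    Candidate("stay", 12.0, 0, 20, 15),
    Candidate("midtown", 14.0, 8, 4, 15),
    Candidate("airport", 45.0, 30, 45, 40),
]
best = choose_action(options)
print(best.name, round(predicted_rate(best), 2))
```

The policy ignores where the chosen trip is likely to end, which is exactly the limitation a multi-step method would address.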
Step 5: Evaluation.
- Offline: Backtest on held-out time periods. Compare cumulative daily revenue under the model's recommendations vs. actual driver behavior vs. random routing.
- Online: A/B test with a subset of drivers. Measure total daily earnings, trips per shift, and dead miles (miles driven without a passenger).
- Key metrics: Average profit per hour, utilization rate (fraction of time with a passenger), total shift revenue.
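The key metrics can be computed from a shift log along these lines. The log format (a list of per-trip dictionaries), the function name, and the values are assumptions made for illustration.

```python
def shift_metrics(trips, shift_minutes):
    """Summarize one shift: total profit, profit per hour, and utilization.

    `trips` is a list of dicts with hypothetical keys "profit" (dollars)
    and "trip_minutes" (time with a passenger on board).
    """
    if shift_minutes <= 0:
        raise ValueError("shift_minutes must be positive")
    total_profit = sum(t["profit"] for t in trips)
    occupied_minutes = sum(t["trip_minutes"] for t in trips)
    return {
        "total_profit": total_profit,
        "profit_per_hour": total_profit / (shift_minutes / 60.0),
        "utilization": occupied_minutes / shift_minutes,
    }


# Hypothetical 8-hour shift with three trips.
log = [
    {"profit": 20.0, "trip_minutes": 30},
    {"profit": 10.0, "trip_minutes": 15},
    {"profit": 50.0, "trip_minutes": 45},
]
metrics = shift_metrics(log, shift_minutes=480)
print(metrics)
```

Dividing by the full shift length rather than by occupied time keeps the evaluation metric aligned with the training target, since dead time is charged against the driver in both.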
Practical Considerations:
- Cold start: New locations or events have no history. Use spatial smoothing or transfer from similar neighborhoods.
- Competition effects: If all drivers follow the same model, they flood the recommended zones, destroying the edge. The model needs to account for supply or be personalized.
- Stationarity: Demand patterns shift over months (new buildings, road closures, pandemic). The model needs regular retraining.
- Fairness: Optimizing profit per hour may route drivers away from underserved areas. This is a business/regulatory consideration.
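For the cold-start point, one simple form of spatial smoothing is to shrink a cell's observed rate toward the average of its neighbors, trusting the cell's own data more as trips accumulate. This is an assumed technique chosen for illustration (a count-weighted shrinkage estimate), and `prior_strength` is a hypothetical tuning parameter.

```python
def smoothed_rate(cell_mean, cell_count, neighbor_mean, prior_strength=20):
    """Blend a cell's observed profit/hour with its neighbors' average.

    With no trips the estimate equals the neighbor average; as the trip
    count grows the estimate approaches the cell's own mean.
    """
    if prior_strength <= 0:
        raise ValueError("prior_strength must be positive")
    weight = cell_count / (cell_count + prior_strength)
    return weight * cell_mean + (1 - weight) * neighbor_mean


new_cell = smoothed_rate(cell_mean=0.0, cell_count=0, neighbor_mean=30.0)
some_history = smoothed_rate(cell_mean=40.0, cell_count=20, neighbor_mean=30.0)
print(new_cell, some_history)
```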
Answer: Build a gradient boosted model predicting profit per hour (including wait time in the denominator) using spatial, temporal, demand, and weather features. Evaluate with time-series backtesting on daily revenue. For multi-step optimization, layer a reinforcement learning policy on top of the point predictions.
Intuition
This problem tests whether you can frame a real-world optimization as a machine learning pipeline. The most common mistake is treating it as a simple regression (predict fare from features) without thinking about what the driver actually needs to optimize. A driver does not care about the fare of a single trip in isolation -- they care about their earning rate across the entire shift. This means the target variable, the feature set, and the evaluation metric all need to reflect the time dimension.
The deeper lesson is about the gap between prediction and decision-making. A great fare prediction model is useless if it does not tell the driver what to do next. The decision layer -- whether greedy or RL-based -- is where the value is created. In quant finance, this is analogous to the difference between a return forecast and a portfolio optimizer: the forecast is necessary but not sufficient.