A per-attraction time-series forecasting engine for all four Walt Disney World parks. One Facebook Prophet model per attraction, trained on 2+ years of hourly wait data from BigQuery, with 50+ external regressors: weather (temperature, precipitation, humidity, thunderstorm probability), rolling windows at 1h/2h/4h, precipitation lags, park operating hours, holidays, and time decomposition features. 24-hour ahead forecasts written back to BigQuery.
prophet_ready_data_view unified feature view
2023-01-01 → 2025-07-01 training window (2+ years)
Joined sources: historical wait CSVs (2015-2021), NWS weather, park schedules, holiday metadata
Target variable y: actual wait time in seconds
Partition key: entity_id (one row per attraction per hour)
50+ columns: time, weather, rolling stats, park context, one-hot categoricals
entity_id — no global model
Prophet config: additive seasonality, daily + weekly + yearly, linear growth
changepoint_prior_scale=0.05 — moderate trend flexibility
All 50+ numerical features added via m.add_regressor()
Categorical features excluded (land_name, attraction_name) — not directly supported as regressors
Rides with zero observations still trained (MIN_OBSERVATIONS=0)
weather_forecasts BQ table → temperature, precipitation, dewpoint, humidity, cloud cover, wind, thunderstorm prob
Park hours: park_open_status_hourly_v1 BQ table → per-park is_park_open for 24hr horizon
Historical regressors (rolling means, lags) imputed to 0 — cannot be computed without actuals
Time-based features re-derived from ds: day_of_week, hour_of_day, month, quarter, is_weekend, is_holiday
m.predict(future_ride) per ride — returns yhat + confidence interval (yhat_lower, yhat_upper)
24-hour horizon at hourly resolution
Output annotated with ride_entity_id, park_name, prediction_generated_at UTC
Rides with missing critical regressors skipped with error log
prophet_predicted_wait_times table — WRITE_APPEND disposition
Schema: prediction_timestamp (TIMESTAMP), ride_entity_id (STRING), predicted_wait_time_seconds (FLOAT64), yhat_lower, yhat_upper, prediction_generated_at
Downstream: feed to dashboard, alerts, and routing recommendations
All features were computed in BigQuery and materialized into prophet_ready_data_view
before training. The feature set grew from ~10 in v2 to 50+ by v8 as each iteration
added a new data source or transformation.
All data was ingested to a GCP BigQuery data warehouse under the
park-whisperer.park_data_warehouse project. The training view
joined four independent datasets into a single query surface.
long_term_historical_ride_data.parquet → loaded to BQ
130+ attraction CSV files across 4 parks ingested via preprocess.py
historical_weather.py → NWS hourly observations → BQ
Forecast: weather_forecasts BQ table (live NWS feed)
Computed in BQ: adverse_heavy_rain_pred_1hr, adverse_thunderstorm_prob_3hr, weather_condition_category
park_open_status_hourly_v1 — hourly is_park_open flag per park
Derived from park schedule data: MKOPEN/MKCLOSE, EPOPEN/EPCLOSE, HSOPEN/HSCLOSE, AKOPEN/AKCLOSE
is_park_operating_hour distinguishes scheduled hours from actual operation
prophet_ready_data_view — SQL view joining all sources
Rolling windows and lag features computed in SQL (y_rolling_mean_*hr, precip_lag_*min)
One-hot encoding done in SQL: CASE WHEN park_name = 'Magic Kingdom' THEN 1 ELSE 0
Training query reads from view, filters by date range, orders by ds, entity_id
prophet_predicted_wait_times — BigQuery Output Table| Column | Type | Description |
|---|---|---|
| prediction_timestamp | TIMESTAMP | Hourly timestamp for which the wait is predicted (Eastern Time, as UTC) |
| ride_entity_id | STRING | Attraction identifier matching entity_id in training data (e.g., "seven-dwarfs-mine-train") |
| predicted_wait_time_seconds | FLOAT64 | Prophet yhat — point forecast in seconds. Divide by 60 for minutes. |
| yhat_lower | FLOAT64 | 80% confidence interval lower bound |
| yhat_upper | FLOAT64 | 80% confidence interval upper bound |
| prediction_generated_at | TIMESTAMP | UTC timestamp when the prediction batch was computed. WRITE_APPEND — each run adds rows. |
prophet_ready_data_view.lightgbm_model.joblib.By v5, the most predictive features were the rolling window statistics: y_rolling_mean_1hr, y_rolling_mean_2hr, y_rolling_mean_4hr. These captured autocorrelation in wait times — if a ride has been running 60-minute waits for the past two hours, that's a strong signal for the next hour. In training, these values are computed from actual historical data and are highly accurate.
At forecast time, there is no "actual" data for the future. The pipeline imputed these to 0, which is demonstrably wrong — 0 means the ride has had zero wait time for the past four hours, a state that doesn't exist during park operating hours. The model trained with real values and predicted with zeroes. Every prediction was effectively generated without the features the model weighted most heavily. Adding more weather features didn't compensate for this structural gap.
Prophet is computationally heavy: fitting a full Prophet model with 50 regressors and 2+ years of hourly data takes 30–120 seconds per attraction. With 130+ attractions across 4 parks, a single training run was measured in hours on commodity hardware. Retraining every night to keep models fresh on new data was not practical without significant infrastructure — a Cloud Run or Dataflow job with parallelism across all rides.
A global model (one model for all rides with entity_id as a categorical feature) was considered but Prophet doesn't natively support entity-level models — that's more LightGBM/XGBoost territory, which is why the v7 LightGBM parallel track existed. Neither approach reached production readiness.
The BigQuery warehouse, the NWS weather ingestion, the historical CSV preprocessing pipeline, and the prophet_ready_data_view feature view all survived the Prophet experiment and fed directly into the next architecture. The feature categories that proved most important — weather condition, time-of-day category, park operating hours — became the basis for the knowledge extractor taxonomy in the current PostgreSQL RAG system.
Instead of serving predictions from a live model, the current system precomputes day-of-week and hour-of-day aggregates from the same historical dataset and stores them as knowledge documents. "What's the typical wait for Space Mountain on a Saturday morning?" is answered directly from those precomputed aggregates — no real-time inference required, and the answer is based on 7+ years of actual data rather than a forecast from a model with imputed zero-value features.
The imputation problem is solvable — but not with Prophet. A model architecture that treats the rolling features as autoregressive inputs natively (LSTM, N-BEATS, or a tree-based model with an explicit lag feature pipeline) would handle the forecast horizon correctly. The BigQuery feature store is largely ready; the main engineering work would be:
selloutEventv2 / wait-time records) at inference time rather than from precomputed training dataprophet_predicted_wait_times just needs a better upstream producer