Learning to Bet: Reinforcement Learning for Live Wagering in Major League Baseball
This project investigates how reinforcement learning (RL) algorithms can optimize betting strategies in Major League Baseball (MLB) live wagering. Traditional sports betting models focus on pre-game predictions, but live betting introduces unique complexities—and opportunities—by allowing dynamic, strategic decision-making throughout a game.
Using historical MLB play-by-play data spanning 25 years (2000–2024) and bookmaker odds from recent seasons, the team created a realistic environment simulating the sequential decision-making process faced by bettors. Due to sparse historical odds data, they built an ensemble regression model (combining XGBoost, LightGBM, and Random Forest) to predict missing live odds with high accuracy (MAE = 0.04).
They evaluated four prominent RL algorithms—A2C, PPO, TRPO, and DSAC—under different behavioral variants and reward shaping conditions. PPO with reward shaping emerged as the standout model, achieving positive returns in simulation but facing challenges generalizing fully to real-world bookmaker odds.
The findings emphasize the potential—and limitations—of reinforcement learning in dynamic financial markets like sports betting, highlighting critical areas such as market inefficiencies and bookmaker advantages. Future work will focus on enhancing simulation realism and improving generalization to real-world scenarios.