Beating MLB 2021 MARCEL Projections

October 2023
MARCEL projections are about the most basic form of predictive modeling: they weight a player's past three seasons to predict the next season's performance, with an age adjustment, a reliability score, and regression toward the mean built in. I set out to beat these projections by looking for metrics that correlate with OPS, and I settled on batted ball data, specifically exit velocity.
Exit velocity shows how well a player is hitting the ball. Even a ball hit at a high exit velocity can go unrewarded if it is hit right at the defense, so results alone can hide the quality of contact. I believed that the trajectory of a player's exit velocity could help predict his OPS. OBP is part of OPS, and OBP factors in walks, which have nothing to do with batted balls. Exit velocity is still relevant, though: it correlates with extra-base hits (which raise SLG), and it can plausibly raise walk rate as well, because pitchers tend to pitch around hitters who make hard contact.
The project prompt was to predict each player's 2021 OPS using any past data. I started by pulling Statcast data from Baseball Savant. To train the model, I joined the Statcast data with player data (slash lines) from the past four years, using player ID as the join key, and built the model from that table with Statcast data from 2018-2021. I wanted a 'trajectory' rate that tells us how a player is trending, that is, how much better or worse his contact (exit velocity) is getting. After computing those rates, I fit a linear regression model. I then used the exit velocity data from 2019-2020 to predict each player's 2021 OPS.
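The trajectory and regression steps described above can be sketched roughly as follows. This is only an illustration, not the original project code: it assumes the 'trajectory' rate is the relative change in average exit velocity between two seasons, and that the regression uses a single trend-adjusted exit velocity feature. The function names (`trajectory_rate`, `trended_ev`, `fit_line`, `project_ops`) and the training rows are made up for the example.

```python
from typing import List, Tuple


def trajectory_rate(ev_prev: float, ev_curr: float) -> float:
    """Relative change in average exit velocity from one season to the next.

    Assumption: this is one plausible definition of the 'trajectory' rate.
    """
    return (ev_curr - ev_prev) / ev_prev


def trended_ev(ev_prev: float, ev_curr: float) -> float:
    """Extend the latest exit velocity one more season along its trend."""
    return ev_curr * (1.0 + trajectory_rate(ev_prev, ev_curr))


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up training rows: (avg EV two seasons ago, avg EV last season, next-season OPS).
TRAIN_ROWS = [
    (87.0, 88.0, 0.720),
    (89.5, 89.0, 0.745),
    (90.0, 91.5, 0.830),
    (91.0, 92.0, 0.860),
    (86.5, 86.0, 0.680),
]

xs = [trended_ev(prev, curr) for prev, curr, _ in TRAIN_ROWS]
ys = [ops for _, _, ops in TRAIN_ROWS]
slope, intercept = fit_line(xs, ys)


def project_ops(ev_prev: float, ev_curr: float) -> float:
    """Rough OPS projection from two seasons of average exit velocity."""
    return intercept + slope * trended_ev(ev_prev, ev_curr)


# Example: a hypothetical hitter whose average EV went from 89.0 to 90.5 mph.
print(round(project_ops(89.0, 90.5), 3))
```

With real data, the rows would come from the joined Statcast and slash-line table, and a library regression could replace the hand-rolled `fit_line`.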
Once I had a rough projection, I adjusted it using each player's average OPS over the last two seasons to balance out the margins of error. After running several error scripts to measure how close the final predicted OPS came to the actual values, it was evident that my predicted 2021 OPS numbers were closer to the actual 2021 OPS numbers than the MARCEL projections were.
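A minimal sketch of the adjustment and the error check might look like the following. The post does not state the blending weight or the error metrics, so the equal 50/50 weight and the choice of MAE and RMSE are assumptions. All of the numbers are invented to show the mechanics and are not the project's results.

```python
import math
from typing import List


def blend_projection(model_ops: float, ops_prev: float, ops_curr: float,
                     weight: float = 0.5) -> float:
    """Mix the model's OPS with the player's two-season average OPS.

    Assumption: `weight` is the share given to the model; 0.5 is a guess.
    """
    recent_avg = (ops_prev + ops_curr) / 2.0
    return weight * model_ops + (1.0 - weight) * recent_avg


def mae(predicted: List[float], actual: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted: List[float], actual: List[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Invented values for three hypothetical players.
actual_ops = [0.810, 0.702, 0.755]
model_ops = [0.840, 0.690, 0.730]                         # raw regression output
last_two_ops = [(0.780, 0.800), (0.720, 0.700), (0.760, 0.770)]
baseline_ops = [0.770, 0.735, 0.790]                      # stand-in for MARCEL

final_ops = [
    blend_projection(m, prev, curr)
    for m, (prev, curr) in zip(model_ops, last_two_ops)
]

# Compare both sets of projections against the actual values.
for name, preds in (("blended model", final_ops), ("baseline", baseline_ops)):
    print(name, round(mae(preds, actual_ops), 4), round(rmse(preds, actual_ops), 4))
```

Reporting more than one metric is useful here because RMSE penalizes a few large misses more heavily than MAE does, so a projection system can look better on one and worse on the other.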
The Code



