How Does Swing Percentage Correlate with Batting Average?
September 2022 - December 2022
A key question we can ask is: can we predict a player's batting average from his overall swing percentage, or from his swing percentage on pitches in the zone? To answer this, we can take each player's swing percentage, both overall and on pitches in the zone, and measure how strongly each one relates to batting average. My hypothesis was that players who are more aggressive at the plate (higher swing percentages) will have higher batting averages than players with a more selective approach (lower swing percentages).

To test this hypothesis, I built two separate models with similar functions and lines of code: Model 1 uses overall swing %, and Model 2 uses swing % on pitches in the zone. First, I calculated log_avg, the log of the batting average of each player in the database. I then split players into two groups around a target batting average of .275: players above the target are group 1L, and players below it are group 2L. With this setup, I used the quap() function to fit each model, look at the standard deviations of the estimates, and understand how swing percentage (both overall and inside the zone) affects a player's batting average. On top of this, I used WAIC to measure the out-of-sample predictive accuracy of each model.
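The preprocessing described above can be sketched in a few lines. This is a Python illustration under stated assumptions, not the original analysis code: the function name `prepare`, the dictionary keys, and the sample numbers are all hypothetical, and the choice to place a player at exactly .275 in group 2 is an assumption, since the text only defines "above" and "below" the target.

```python
import math

TARGET_AVG = 0.275  # target batting average named in the text


def prepare(players):
    """Add log_avg and a group index (1 or 2) to each player record.

    `players` is a list of dicts with hypothetical keys:
    'name', 'avg', 'swing_pct', 'z_swing_pct'.
    'avg' must be greater than 0, because log(0) is undefined.
    """
    rows = []
    for p in players:
        rows.append({
            "name": p["name"],
            "log_avg": math.log(p["avg"]),
            # Group 1: above the target. Group 2: everyone else
            # (assumption: a player at exactly .275 lands in group 2).
            "group": 1 if p["avg"] > TARGET_AVG else 2,
            "swing_pct": p["swing_pct"],      # predictor for Model 1
            "z_swing_pct": p["z_swing_pct"],  # predictor for Model 2
        })
    return rows


# Made-up records purely for illustration, not real player data.
sample = [
    {"name": "Player A", "avg": 0.300, "swing_pct": 0.50, "z_swing_pct": 0.70},
    {"name": "Player B", "avg": 0.240, "swing_pct": 0.44, "z_swing_pct": 0.62},
]
for row in prepare(sample):
    print(row["name"], row["group"], round(row["log_avg"], 3))
```

Each model would then use log_avg as the outcome, the group index to distinguish the two sets of players, and one of the two swing columns as the predictor.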

The Code




Summary
After running both models, it is clear that swing percentage on pitches inside the zone (Model 2) has more of an effect on a player's batting average than overall swing percentage (Model 1). On the surface, we see that more players fall on the right side of the red x = 0 line in Model 2 than in Model 1. To bring a specific player to the forefront, Greg Allen's dwaic in Model 1 is -3.45; in Model 2, it jumps to 1.16. As for Pablo Reyes, his dwaic is 3.7 in Model 1 but drops to 1.1 in Model 2. Overall, we can conclude that a player's overall swing percentage does not play a large role in his batting average. However, a player's aggressiveness on pitches inside the zone can be considered a bigger factor in his overall batting average.
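To make the per-player dwaic values quoted in the summary more concrete, here is a minimal sketch of how a pointwise WAIC, and a pointwise difference between two models, can be computed from posterior log-likelihood draws. It uses the standard WAIC definition (log pointwise predictive density minus a variance penalty, on the deviance scale). The function names, the input layout, and the sign convention of the difference (first model minus second) are assumptions; the write-up does not state which two quantities its dwaic compares.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def pointwise_waic(log_lik):
    """Return one WAIC value per observation (here, per player).

    log_lik[s][i] is the log-likelihood of observation i under
    posterior draw s. At least two draws are required.
    """
    n_obs = len(log_lik[0])
    result = []
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        lppd_i = log_mean_exp(column)   # log pointwise predictive density
        penalty_i = variance(column)    # effective-parameter penalty
        result.append(-2.0 * (lppd_i - penalty_i))
    return result


def pointwise_dwaic(log_lik_a, log_lik_b):
    """Per-observation WAIC difference, model A minus model B (assumed sign)."""
    waic_a = pointwise_waic(log_lik_a)
    waic_b = pointwise_waic(log_lik_b)
    return [a - b for a, b in zip(waic_a, waic_b)]
```

Under this sign convention, a negative value for a player means model A predicts that player better than model B, since lower WAIC indicates better expected out-of-sample accuracy.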