Sabermetrics: The Empirical Analysis of Baseball
For most of baseball’s history, scouting departments used observation and logic to judge a player’s potential for success. In his book ‘Moneyball’, Michael Lewis describes how MLB teams originally relied on the skills of their scouts to find and evaluate players. Scouts focused on easily observed statistics like Stolen Bases, Runs Batted In and Batting Average. Although these categories can be valuable, further analysis has shown that they contribute less to winning games than other, less readily available statistics. Starting with Earnshaw Cook’s 1964 book, ‘Percentage Baseball’, and continuing with the annual baseball abstracts of writer Bill James, the idea of sabermetrics (or SABRmetrics), the science of baseball statistics, was popularized and brought into the mainstream.
Advancing Traditional Measurements
As noted above, the traditional measurement for evaluating a player’s hitting was batting average. Batting average is calculated by dividing a player’s total number of hits by their total number of at bats (H/AB). Bill James and other early believers in sabermetrics found this statistic to be defective because it ignores the other ways a player can reach base. Therefore, the statistic OBP (On-Base Percentage) was created. It adds up the number of times a player gets a hit, walks, or is hit by a pitch, and divides that sum by the total number of at bats, walks, hit by pitches and sacrifice flies ((H+BB+HBP)/(AB+BB+HBP+SF)). Another issue with batting average is that it does not distinguish between different types of hits: a single, double, triple and home run all count the same. The statistic slugging percentage (or SLG) was created to weight hits by type. To calculate slugging percentage, the total number of bases from all hits (single = 1, double = 2, triple = 3, and home run = 4) is divided by the total number of at bats (((1B) + (2B*2) + (3B*3) + (HR*4))/AB). These two improved statistics can be combined to create the statistic OPS (On-Base plus Slugging), which is the sum of a player’s on-base percentage and slugging percentage. As a rule of thumb, an OPS of .900 or higher puts a player in the upper echelon of hitters.
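The formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the season line are hypothetical and do not describe any real player.

```python
def batting_average(h, ab):
    """AVG = H / AB."""
    return h / ab if ab else 0.0


def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = ab + bb + hbp + sf
    return (h + bb + hbp) / denominator if denominator else 0.0


def slugging_percentage(singles, doubles, triples, hr, ab):
    """SLG = (1B + 2*2B + 3*3B + 4*HR) / AB."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab if ab else 0.0


def ops(obp, slg):
    """OPS = OBP + SLG."""
    return obp + slg


# Hypothetical season line, invented purely for illustration.
singles, doubles, triples, hr = 90, 30, 5, 25
ab, bb, hbp, sf = 500, 60, 5, 5
hits = singles + doubles + triples + hr  # 150 hits in total

avg = batting_average(hits, ab)
obp = on_base_percentage(hits, bb, hbp, ab, sf)
slg = slugging_percentage(singles, doubles, triples, hr, ab)

print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops(obp, slg):.3f}")
# prints: AVG 0.300  OBP 0.377  SLG 0.530  OPS 0.907
```

In this made-up example a .300 hitter clears the .900 OPS mark only because of the walks and extra-base hits that batting average leaves out.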
For pitchers, statistics like earned run average (ERA) were traditionally used to evaluate a pitcher’s abilities. This statistic is calculated by dividing the total number of earned runs a pitcher allows by the number of innings pitched and multiplying the result by 9 (ER/IP * 9), which gives the earned runs allowed per nine innings. The issue with ERA is that it is partially dependent on the fielders behind the pitcher. Additionally, other traditional measures of pitchers like Total Wins and Winning Percentage were shown to be flawed because they do not account for the abilities of the pitcher’s teammates to field and score runs. One popular statistic that was developed through sabermetrics is WHIP ((Walks + Hits)/IP). Although this statistic is somewhat reliant on a pitcher’s fielders, it is a fairly accurate measure of a pitcher’s ability to keep players from reaching base. More recently, statisticians have worked to create DIPS (Defense-Independent Pitching Statistics) through efforts led by Voros McCracken. Statistics associated with this work include BABIP (batting average on balls in play) and FIP (fielding independent pitching).
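As a minimal sketch of the two pitching formulas given above, the following Python computes ERA and WHIP. The function names and the season totals are hypothetical. Innings pitched is assumed to be a true decimal value (6 and 1/3 innings is 6.333, not the box-score notation 6.1).

```python
def era(earned_runs, innings_pitched):
    """ERA = ER / IP * 9 (earned runs allowed per nine innings)."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return earned_runs / innings_pitched * 9


def whip(walks, hits, innings_pitched):
    """WHIP = (BB + H) / IP (baserunners allowed per inning)."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical season totals, invented purely for illustration.
earned_runs, walks, hits_allowed, innings = 60, 45, 153, 180.0

print(f"ERA  {era(earned_runs, innings):.2f}")          # prints: ERA  3.00
print(f"WHIP {whip(walks, hits_allowed, innings):.2f}")  # prints: WHIP 1.10
```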
Additional Sabermetrics Statistics
Two statistics that attempt to summarize a player’s overall value to his team are WAR and VORP.
- WAR (Wins Above Replacement) is a statistic that evaluates a player’s overall contributions to his team in terms of games won. WAR compares a player to a replacement-level player in order to determine the number of additional wins the player provides to his team.
- VORP (Value over Replacement Player) is a statistic similar to WAR. It shows how much a player contributes to his team compared to a hypothetical replacement-level player, one who performs below the league average. VORP differs from WAR in that it is expressed in runs rather than wins: it measures how many more runs a player creates for his team than a replacement-level player would.
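The difference in units between the two statistics can be illustrated with a deliberately simplified sketch. It is not the actual WAR or VORP calculation, which varies by provider and includes adjustments not covered here. The function names, the run totals, and the conversion factor of 10 runs per win (a commonly quoted rule of thumb) are all assumptions made for illustration.

```python
# Assumption: roughly 10 runs equal one win (a common rule of thumb,
# not a value taken from this article or any specific WAR formula).
RUNS_PER_WIN = 10.0


def runs_above_replacement(player_runs, replacement_runs):
    """VORP-style value: extra runs compared with a replacement-level player."""
    return player_runs - replacement_runs


def wins_above_replacement(runs_above, runs_per_win=RUNS_PER_WIN):
    """WAR-style value: the same surplus expressed in wins instead of runs."""
    return runs_above / runs_per_win


# Hypothetical run totals, invented purely for illustration.
player_runs, replacement_runs = 95.0, 60.0

surplus_runs = runs_above_replacement(player_runs, replacement_runs)
surplus_wins = wins_above_replacement(surplus_runs)

print(f"Runs above replacement: {surplus_runs:.1f}")  # prints: 35.0
print(f"Wins above replacement: {surplus_wins:.1f}")  # prints: 3.5
```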
Recent Advances
In recent years, MLB and its scouts have attempted to further advance the empirical analysis of baseball. Some of the newer measures that have grown out of sabermetrics and become popular include launch angle and exit velocity for hitters and spin rate for pitchers. If you tune into an MLB broadcast on any given night, you are likely to see references to all of these measures.
In addition, sites like FanGraphs have become popular for their research and in-depth analysis of advanced baseball statistics, as well as for publishing graphics that track and evaluate the performance of players and teams. As time goes on, baseball scouts and affiliated data scientists should continue to find innovative ways to measure the success of baseball players and teams through the use of sabermetrics.