Jacob Bleiweis

A Simple Outline of Advanced Analytics in Baseball


(Denis Poroy/Getty Images)


The use of analytics has made its way into every major American sport, changing the way players play, coaches coach, and fans watch their respective sports. In basketball, analytics has led to a surge in three-pointers at the expense of the mid-range shot. In football, it has led people to question the value of the running back position and how best to utilize it. (These are not the only applications of analytics, just two examples.)


However, baseball pioneered the use of analytics, finding new and improved (to some people) ways to value on-field performance. I put “to some people” in that sentence because although the use of analytics has spread quickly throughout the league, its merits are still heavily debated in the baseball world. In this article, I’m going to explore the value of analytics, illustrating how advanced statistics are being used in baseball, as well as outline some of the main arguments against the use of analytics.


The way people value a player’s performance has changed drastically in the past decade. Initially, basic stats, such as batting average, were used, but they have given way to more complete stats, such as wOBA.


Batting average has been found to be less effective because it does not take into account walks or the type of hit. That has led to the widespread use of OPS and OPS+. OPS (on-base plus slugging) adds a player’s on-base percentage and slugging percentage. This helps mitigate some of the shortcomings of batting average.
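
To make the difference concrete, here is a minimal Python sketch of batting average next to OPS. The two hitters and their stat lines are made up for illustration, and the function names are my own. Both hitters bat .300, but only OPS separates them.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat. Ignores walks and treats every hit the same."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Share of plate appearances in which the hitter reaches base."""
    times_on_base = hits + walks + hit_by_pitch
    return times_on_base / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat, so extra-base hits count for more."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp, slg):
    """On-base plus slugging."""
    return obp + slg


# Hypothetical hitter A: 150 singles and 20 walks in 500 at-bats.
avg_a = batting_average(150, 500)
ops_a = ops(on_base_percentage(150, 20, 0, 500, 0),
            slugging_percentage(150, 0, 0, 0, 500))

# Hypothetical hitter B: also 150 hits in 500 at-bats, but with
# 30 doubles, 5 triples, 25 home runs, and 80 walks.
avg_b = batting_average(150, 500)
ops_b = ops(on_base_percentage(150, 80, 0, 500, 0),
            slugging_percentage(90, 30, 5, 25, 500))

print(f"A: AVG {avg_a:.3f}, OPS {ops_a:.3f}")
print(f"B: AVG {avg_b:.3f}, OPS {ops_b:.3f}")
```

Batting average calls these two hitters identical, while OPS credits hitter B for the walks and the extra-base hits.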


An even more comprehensive number than OPS is OPS+. This takes a hitter’s OPS and adjusts it for outside factors, such as ballpark. It then normalizes it so that 100 is league average. This means that an OPS+ of 125 is 25% better than league average.
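
As a rough sketch of the scaling, the commonly published form of OPS+ compares a hitter's OBP and SLG to the league's figures separately. The version below assumes the league averages passed in have already been park-adjusted, so it does not do the ballpark adjustment itself. The league and hitter numbers are invented for the example.

```python
def ops_plus(obp, slg, league_obp, league_slg):
    """OPS+ on a scale where 100 is league average.

    Assumes league_obp and league_slg are already adjusted for the
    hitter's ballpark; this sketch does not compute park factors.
    """
    return 100 * (obp / league_obp + slg / league_slg - 1)


# Hypothetical league: .320 OBP and .400 SLG.
average_hitter = ops_plus(0.320, 0.400, 0.320, 0.400)
good_hitter = ops_plus(0.360, 0.450, 0.320, 0.400)

print(round(average_hitter))  # league-average hitter
print(round(good_hitter))     # hitter 12.5% above league in both OBP and SLG
```

In this made-up league, the second hitter comes out at about 125, which is the "25% better than league average" reading described above.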


Another commonly used stat to evaluate a hitter is wOBA (weighted on-base average). This stat starts from a hitter’s OBP but takes into account how they reached base, assigning a numerical value to each event based on run expectancy.
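
Here is a minimal sketch of how that weighting works. The weights below are illustrative round numbers in the range that public sources publish; the real weights are recalculated every season, so treat these values (and the function name) as assumptions rather than official figures.

```python
# Illustrative linear weights only. Real weights change each season.
ILLUSTRATIVE_WEIGHTS = {
    "unintentional_walk": 0.69,
    "hit_by_pitch": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "home_run": 2.10,
}


def woba(unintentional_walks, hit_by_pitch, singles, doubles, triples,
         home_runs, at_bats, sac_flies, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average: each way of reaching base gets its own value."""
    numerator = (
        weights["unintentional_walk"] * unintentional_walks
        + weights["hit_by_pitch"] * hit_by_pitch
        + weights["single"] * singles
        + weights["double"] * doubles
        + weights["triple"] * triples
        + weights["home_run"] * home_runs
    )
    # Intentional walks are left out of both the numerator and denominator.
    denominator = at_bats + unintentional_walks + sac_flies + hit_by_pitch
    return numerator / denominator


# Two hypothetical hitters who each reach base 30 times in 100 at-bats.
all_singles = woba(0, 0, 30, 0, 0, 0, 100, 0)
all_homers = woba(0, 0, 0, 0, 0, 30, 100, 0)
print(f"{all_singles:.3f} vs {all_homers:.3f}")
```

OBP would rate these two hitters the same, since both reach base 30% of the time. wOBA does not, because a home run is worth far more runs than a single.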


Weighted runs created plus (wRC+), another commonly used stat, adjusts the existing stat runs created to include other factors, again such as ballpark. Like OPS+, it is also scaled so that 100 is league average.


This is not a comprehensive list of offensive statistics, but from the quick explanation I provided, it should be easy to see why they provide additional value beyond what basic numbers can produce. However, many people still reject their value, which is bizarre to me.


A large portion of the critics of analytics are former players and coaches who did not have access to them for all or most of their careers. These are smart people who are very knowledgeable about baseball and had much longer MLB careers than the data scientists who evaluate them using advanced analytics, so they should see their benefit. A number that is like batting average but includes walks and values the type of hit seems like an obviously better and more complete stat. The doubters must not understand what goes into these numbers; otherwise, they would see their value.


There are advanced statistics to evaluate pitchers as well. Because so many factors are out of a pitcher’s control, ERA and pitcher wins have been deemed insufficient for evaluating a pitcher.


Instead, people use FIP and xFIP, among many other quality advanced stats. FIP (Fielding Independent Pitching) takes into account only outcomes that a pitcher can control, eliminating results on balls hit in play. xFIP is the same, but uses the league-average HR/FB rate, since an individual pitcher’s HR/FB rate can fluctuate from season to season. FIP and xFIP are clearly better tools to evaluate a pitcher.
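
For readers who want to see the arithmetic, here is a sketch of the commonly published FIP and xFIP formulas. The constant that puts FIP on the ERA scale is recalculated each season; the 3.10 used here is just a typical value and should be read as an assumption, as should the pitcher's stat line and the 10% league HR/FB rate.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching.

    Uses only home runs, walks, hit batters, and strikeouts.
    `innings` is a decimal number (200 1/3 innings is 200.333, not 200.1).
    `constant` varies by season; 3.10 is an assumed typical value.
    """
    return (13 * home_runs + 3 * (walks + hit_by_pitch)
            - 2 * strikeouts) / innings + constant


def xfip(fly_balls, league_hr_per_fb, walks, hit_by_pitch, strikeouts,
         innings, constant=3.10):
    """FIP with home runs replaced by fly balls times the league HR/FB rate."""
    expected_home_runs = fly_balls * league_hr_per_fb
    return fip(expected_home_runs, walks, hit_by_pitch, strikeouts,
               innings, constant)


# Hypothetical pitcher: 200 IP, 30 HR on 200 fly balls, 50 BB, 5 HBP, 200 K.
pitcher_fip = fip(30, 50, 5, 200, 200.0)
# Assumed league HR/FB rate of 10%, so xFIP expects 20 HR instead of 30.
pitcher_xfip = xfip(200, 0.10, 50, 5, 200, 200.0)
print(f"FIP {pitcher_fip:.2f}, xFIP {pitcher_xfip:.2f}")
```

This pitcher allowed home runs on 15% of his fly balls, which is above the assumed league rate, so his xFIP comes out lower than his FIP.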


Errors, putouts, and assists have been the traditional stats used to evaluate fielders. Instead, Defensive Runs Saved (DRS) and Ultimate Zone Rating (UZR) are now used. Like most advanced stats, DRS and UZR have their flaws. But, also like most advanced stats, DRS and UZR are more complete metrics than the basic numbers that have been used for years.


DRS and UZR take into account a fielder’s range, arm, double play ability, and play difficulty, among other factors that are not included in errors. A shortstop who has superb range will get to batted balls that other shortstops do not have the ability to field. However, they are more likely to make an error because of the difficulty of the play. This is not to say that a shortstop with more errors always has better range, but it shows how the number of errors is not a great evaluation tool. Willy Adames, the shortstop for the Rays, has the fourth most errors among shortstops. However, he has the fifth highest DRS among shortstops.


Probably the most highly debated advanced stat is Wins Above Replacement (WAR). WAR does exactly what its name suggests: it finds how many more wins a player adds compared to a replacement-level player at their position. However, not everyone agrees on its credibility.


A while ago, a tweet from @MLBStats stated that Mike Trout has already surpassed future Hall of Famer Derek Jeter in career WAR. Trout is in his ninth season, and Derek Jeter played 20 seasons. This ignited the New York Yankees fanbase in defense of their longtime shortstop who won five World Series rings. The replies ranged from “let me know when Trout has 5 rings” to “World Series > WAR” to “not exactly sure what WAR is and really don’t care” to calling Trout a “HOF stat padder.”

In my opinion, as is the case with many advanced numbers, people denounce stats because they do not understand them. WAR is a very complicated stat, but simply put, you add up a player’s batting, fielding, and baserunning runs, adjust the total for position and league, and then scale it. Each of those calculations is really complicated, though. Not only would it help to understand the numbers, but people need to realize that WAR is not meant to be a precise number, but an approximation of a player’s value.
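
To show the shape of that calculation, here is a deliberately simplified sketch for a position player. It is not any site's actual formula. The run values are invented, the replacement-level term is something public WAR formulas include even though I did not list it above, and the 10 runs per win is a common rule of thumb that I am assuming rather than computing.

```python
def war_estimate(batting_runs, baserunning_runs, fielding_runs,
                 positional_adjustment, league_adjustment,
                 replacement_runs, runs_per_win=10.0):
    """Rough position-player WAR: total runs above replacement, scaled to wins.

    Each input is itself the output of a complicated model. `runs_per_win`
    is an assumed rule-of-thumb value and changes with the run environment.
    """
    runs_above_replacement = (
        batting_runs + baserunning_runs + fielding_runs
        + positional_adjustment + league_adjustment + replacement_runs
    )
    return runs_above_replacement / runs_per_win


# Hypothetical season: strong bat, decent glove, slight positional penalty.
example = war_estimate(
    batting_runs=30.0,
    baserunning_runs=2.0,
    fielding_runs=5.0,
    positional_adjustment=-7.5,
    league_adjustment=0.5,
    replacement_runs=20.0,
)
print(example)
```

Every input carries its own uncertainty, which is why the result is best read as an approximation and not a precise measurement.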


For people who do understand WAR, the issues include how it measures defense and the positional adjustment. However, it needs to be understood that these factors and adjustments are not arbitrary numbers that someone pulled out of a hat and inserted into the WAR formula. They are heavily researched and calculated. Also, WAR is not an all-encompassing number, even though that is its intent. It’s supposed to be used along with other statistics to paint a clearer picture of a player.


WAR is still not a perfect measurement of a player’s value, but that does not mean that it is not a very useful stat and more complete than the traditional numbers, such as batting average or RBI. It may be difficult to differentiate a 6.4 WAR player and a 6.1 WAR player, but you can confidently conclude that a 6 WAR player is more valuable than a 4 WAR player. When it is used correctly, WAR can provide incredible insight. (For more information on WAR click here.)


The most contentious issue regarding analytics is how they have changed the way teams and players approach the game of baseball, specifically regarding the surge in home runs and shifts.


(Baseball Savant)


Understanding why teams shift is pretty simple: if you know a player hits the ball to a certain part of the field the majority of the time, you are more likely to get an out if you have a fielder there. The above image is Joey Gallo's career slice chart, showing where he tends to hit the ball. If you know this information, it would make sense to shift to the pull side to limit the damage. This is shown in the data. Last season, Gallo had a significantly higher wOBA without the shift (.644) than with the shift (.390).


But because this strategy deviates from traditional baseball, older fans disapprove. Former manager Lou Piniella illustrated this perfectly when he said, “I managed 3,400 games in the big leagues, and never once did I put on a full shift on anybody. Not once. And I think I won a few games without having to shift.”


Just because Piniella won 1,835 games without shifting, that doesn’t make it an ineffective strategy. It was not a thing when he managed, so of course he never shifted. However, according to Baseball Savant, the Dodgers and Astros, the teams with the two best records in the league, shift more than any other team in baseball. They were also at the top of the league in DRS (the Dodgers led the league with 136 and the Astros were fourth with 90).


Shifting doesn’t guarantee a good defense. The Orioles shift the third most in baseball but have the worst DRS in baseball. That is due to the fact that they simply don’t have good players, so a shift doesn’t have the same impact.


Teams shift to take away a hitter’s strength just like a basketball defender will shade towards an opponent’s dominant hand. But some MLB fans condemn shifting. The only explanation for that is it’s a change. People like Lou Piniella disagree with shifting because it wasn’t around when they played or coached. Their pride interferes with them understanding the advantages of new strategies derived from analytics.


The same is true for the home run craze that has consumed baseball recently. However, with more home runs come more strikeouts, which people claim is ruining the sport. Just listen to the hit king, Pete Rose. He said that “it’s home run derby every night, and if that’s what they want, that’s what they’re going to get. But they have to understand something ... Home runs are up. Strikeouts are up. But attendance is down. I didn’t go to Harvard or one of those Ivy League schools, but that’s not a good thing.”


It doesn’t take an Ivy League education to know that correlation does not necessarily mean causation. Attendance is down around baseball — it would take a whole other article to explore why — but there is no evidence it is because of the increase in home runs and strikeouts.


Home runs are up because data shows that a home run has a higher run probability than any other outcome (walk, single, double, triple, etc.), which should not be surprising. In order to hit more home runs, players have increased their launch angle, which means they are hitting the ball in the air more.


This isn’t done just to hit more home runs; it also increases the number of hits and the quality of hits. A batter is more likely to get a hit if they hit the ball in the air, over the infield. It is also more likely that the ball goes into the gap, resulting in an extra-base hit. It is significantly harder to hit a double, triple, or home run on a ground ball.


Teams that have a heavy emphasis on analytics include the Astros, A’s, Rays, Cubs, Red Sox, and Cardinals. All of these teams have had a tremendous amount of success recently, including multiple World Series wins. The A’s and Rays, who do not have the financial abilities of large market teams, have been very successful by using advanced analytics to maximize the production they can receive from their limited payroll.


It seems clear that advanced analytics are superior to the basic stats that have been used in baseball for decades even though many fans, former players, and coaches disagree. If teams are reluctant to utilize advanced analytics, they are just digging their own grave and hurting their chances of competing for a World Series.
