# Moneyball and Business Analytics

Brad Pitt in the movie *Moneyball*, standing in front of UT Professor Mary Leitnaker's book *Power of Statistical Thinking*

The book *Moneyball* and the 2011 movie of the same name tell the story of Oakland A's general manager Billy Beane and Paul DePodesta, a Harvard economics graduate on Beane's staff. Through the use of analytics, the A's consistently fielded a competitive team on a budget about a third the size of those of big-market teams such as the New York Yankees. They did so by exploiting inefficiencies in the market for baseball players: some players were vastly overpriced, while others were underpriced. The Moneyball story illustrates three important concepts in business analytics:

- The need to look at the big picture and ask the right question.
- The use of data analysis to answer that question and the importance of subject matter expertise in guiding that analysis.
- The need to communicate the results in language that non-technical listeners can understand.

## Use big-picture thinking to ask the right question

In selecting players, the entire baseball industry had focused on "How can we get the 'best quality' players per dollar?" This led to a focus on the "five tools" in evaluating players: fielding, throwing, foot speed, hitting and hitting for power.

But the goal is to win, not to field the best athletes. DePodesta observed that the question should be "How can we get the most wins per dollar?" The currency of wins is runs: runs scored and runs given up to the opponent. Thus a player's contribution to the team should be measured by how he affects runs.

Ultimately, the A's discovered that on-base percentage measures a player's offensive contribution better than traditional statistics such as batting average, runs batted in, and stolen bases. On-base percentage is simply the percentage of plate appearances in which a player reaches base, through either a hit or a walk. It measures the probability that the player will not make an out.
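The difference between batting average and on-base percentage can be sketched in a few lines of Python. This is a minimal illustration that follows the simplified definition above (hits plus walks per plate appearance); the official statistic also counts hit-by-pitches and sacrifice flies. The `BattingLine` class and both hitters are made up for the example and do not describe real players.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one hitter (hypothetical, simplified)."""
    at_bats: int
    hits: int
    walks: int


def batting_average(line: BattingLine) -> float:
    """Hits per at-bat. Walks are ignored entirely."""
    if line.at_bats == 0:
        return 0.0
    return line.hits / line.at_bats


def on_base_percentage(line: BattingLine) -> float:
    """Simplified OBP: times on base via hit or walk, per plate appearance.

    Assumption: plate appearances = at-bats + walks. The official formula
    also includes hit-by-pitches and sacrifice flies.
    """
    plate_appearances = line.at_bats + line.walks
    if plate_appearances == 0:
        return 0.0
    return (line.hits + line.walks) / plate_appearances


# Two made-up hitters with identical batting averages but different walk totals.
free_swinger = BattingLine(at_bats=500, hits=150, walks=20)
patient_hitter = BattingLine(at_bats=500, hits=150, walks=100)

for name, line in [("free swinger", free_swinger), ("patient hitter", patient_hitter)]:
    print(f"{name}: AVG={batting_average(line):.3f}  OBP={on_base_percentage(line):.3f}")
```

Both hitters look the same by batting average, but the patient hitter makes outs far less often, which is the difference that on-base percentage captures.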

A lineup with a high on-base percentage will score a lot of runs. The rest of the baseball industry had overlooked this insight, which meant that the A's could find underpriced players with high on-base percentages.

Three examples featured in the movie were players with high on-base percentages who were undervalued because each had a perceived defect that made him unattractive to other teams. That made their "runs per dollar" attractively high. Before the 2002 season, the A's signed Scott Hatteberg to play first base. Hatteberg was a former Red Sox catcher whose career as a catcher was over because of an injury to his throwing arm. For designated hitter they signed former Yankee outfielder David Justice, who, at 36 years old, was not being sought by other teams. They also signed outfielder Jeremy Giambi, who had two perceived defects: a reputation for trouble off the field and slow foot speed.

## Use analytics, guided by process knowledge, to answer the question

The analysis that yielded these insights was not uninformed number crunching, but an analysis guided by process knowledge. The initial analysis focused not on outcomes such as hits and runs, but on process variables, such as a batter hitting the ball to a specific location at a specific speed and trajectory.

The outcome from such a batter's actions is subject to many factors, including luck. For example, the same line drive could sometimes result in a single, a double or an out. But it is possible to estimate the probabilities of these outcomes and, in turn, how these outcomes affect the "state" of the inning.

The state of the inning specifies the number of outs and the positions of runners on the bases. The expected number of runs for a given state can be estimated from the thousands of times this state has been observed in the past. For example, the expected number of runs in an inning increases if the state changes from no outs and no runners on base to no outs and a runner on second, that is, when the leadoff batter hits a double.
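The idea of estimating expected runs per state can be sketched as a simple average over past observations. The code below is an illustrative sketch, not the A's actual method: the `State` encoding, the function name, and the eight toy observations are all assumptions made for the example, and a real table would be built from thousands of innings.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# A state is (outs, occupied_bases), e.g. (0, (2,)) = no outs, runner on second.
State = Tuple[int, Tuple[int, ...]]


def run_expectancy(observations: Iterable[Tuple[State, int]]) -> Dict[State, float]:
    """Average runs scored in the rest of the inning, grouped by state.

    Each observation pairs a state with the runs that scored from that
    point until the end of the inning.
    """
    totals: Dict[State, int] = defaultdict(int)
    counts: Dict[State, int] = defaultdict(int)
    for state, runs_rest_of_inning in observations:
        totals[state] += runs_rest_of_inning
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in counts}


EMPTY_NO_OUTS: State = (0, ())
RUNNER_ON_SECOND_NO_OUTS: State = (0, (2,))

# Made-up observations for illustration only; not real game data.
toy_observations = [
    (EMPTY_NO_OUTS, 0),
    (EMPTY_NO_OUTS, 0),
    (EMPTY_NO_OUTS, 1),
    (EMPTY_NO_OUTS, 1),
    (RUNNER_ON_SECOND_NO_OUTS, 0),
    (RUNNER_ON_SECOND_NO_OUTS, 1),
    (RUNNER_ON_SECOND_NO_OUTS, 1),
    (RUNNER_ON_SECOND_NO_OUTS, 2),
]

table = run_expectancy(toy_observations)

# Value of a leadoff double: expected runs after the play minus expected runs before.
# No run scores on the play itself, so the change in state is the whole value.
leadoff_double_value = table[RUNNER_ON_SECOND_NO_OUTS] - table[EMPTY_NO_OUTS]
print(f"Leadoff double adds {leadoff_double_value:.2f} expected runs (toy data)")
```

Valuing each event by the change in expected runs it produces puts hits, walks, and outs on a common scale of runs.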

This analysis revealed that on-base percentage best measures a player's impact on the number of runs scored over a season. For example, after the 2001 season the A's lost Jason Giambi, the 2000 American League MVP, to the Yankees. In replacing Giambi, the A's focused on replacing his on-base percentage of .477, not his 38 home runs or even his .342 batting average. The 47.7 percent of the time that he got on base contributed far more runs over the course of a season than the 38 home runs that he hit. His on-base percentage was also a more complete measure of his impact on runs than his batting average.

## Communicate in language everyone can understand

The communication of the results focused on the central point: the importance of on-base percentage. It did not dwell on the complexities of the analysis that revealed this insight.

In explaining the importance of on-base percentage as a measure of a player's contributions, *Moneyball* quotes baseball writer Eric Walker: "...three outs define an inning... Anything that increases the offense's chances of making an out is bad; anything that decreases the chance of making an out is good. And what is on-base percentage? It is the probability that the batter will not make an out." (p. 58)

The A's communicated their finding in even simpler terms: "Runs lead to wins and getting on base leads to runs."

In response to the question: "Why do we want that player?" the answer was simply, "Because he gets on base."

## The Philosophy of the UT Business Analytics Program

The book and the movie illustrate an effective model for putting analytics into practice, one that reflects the philosophy of the Business Analytics program at the University of Tennessee. UT alum Dave Clark, VP of North American Fulfillment at Amazon.com, summarized that philosophy:

> People who can do high level math are practically a commodity. People who can figure out which problem is the right one to solve and then apply high level math are both expensive and elusive. Those who can communicate effectively the answer in such a way managers can understand, priceless.