Exceptions and Results of IBB Usage
John F. Jarvis
In the October 13, 1923 issue of Collier's, Walter Camp writes:
I don't like the IBB either. I greatly prefer the excitement of the irresistible hitter against the immovable pitcher in a situation crucial to the outcome of the game. In this study I offer new evidence that the IBB is exceedingly overused.
Intentional base on balls (IBB) usage is summarized by a new measure, IWFR ("Intentional base on balls (Walk) FRaction"), defined as the number of IBBs given expressed as a fraction of IBB situations. The critical task, deciding whether a plate appearance (PA) warrants an IBB (that is, whether it is an IBB situation), is accomplished with a neural net that evaluates ten numerical parameters describing the context of the PA in the game. The neural net, trained using the 419,571 UBBs and 42,721 IBBs in the currently available play-by-play record (1963, 1967 and 1968 AL; both leagues 1969, 1972-2002), is an extension of an earlier version I described in a BRJ article (Volume 29 (2000), p107) and at the SABR 30 Convention (2000). The sources of the play-by-play data are Retrosheet and Gary Gillette/Pete Palmer/Ray Kerby. The IBB analysis presented is based on the 32 seasons 1969 and 1972-2002.
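As a minimal sketch of the IWFR definition, the measure is a simple ratio of two counts. The function name and the sample counts below are hypothetical, chosen only for illustration; they are not taken from the tables.

```python
def iwfr(ibb_given: int, ibb_situations: int) -> float:
    """Intentional Walk FRaction: IBBs given as a fraction of IBB situations."""
    if ibb_situations <= 0:
        raise ValueError("need at least one IBB situation")
    return ibb_given / ibb_situations


# Hypothetical counts, for illustration only.
example = iwfr(ibb_given=30, ibb_situations=200)
print(f"IWFR = {example:.3f}")
```

Because the denominator counts situations identified by the neural net rather than all plate appearances, IWFR depends on how an IBB situation is defined.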
Two categories of bases on balls are referred to throughout this presentation. The intentional kind are referred to by the standard acronym IBB, while the unintentional variety are referred to by the slightly nonstandard term UBB. BB by itself is used to refer to all of them. I believe the other statistical terms and acronyms used are well known.
In this presentation I give a brief description of the neural net technique used to characterize each plate appearance as an IBB situation or not, describe long-term IBB usage, and examine run production following the first IBB situation in an inning. The AL's stubborn insistence on using a DH forces much of this analysis to be done separately for the two leagues.
Long-term trends in IBB usage are shown by tabulating IWFR by league and season. Overall, the IBB does not appear to be a winning strategy. Examining the first IBB situation in an inning indicates that giving an IBB results in an average of 0.14 more runs per inning being scored than when the batter is allowed to hit. The team and individual player data presented question the conventional wisdom about the use of the IBB except in cases of extreme hitting ability.
The 2002 season had three players in the 32 season IWFR top 10 list (Table 6): Barry Bonds, Vladimir Guerrero and Ichiro Suzuki. Bonds' 2002 season ranks as the highest on this list and perhaps is justified although his 2001 home run record season isn't among the top 10. Suzuki's overall #3 ranking for his 2002 season seems strangely high.
DEFINING AN IBB SITUATION: THE NEURAL NET
I define an IBB situation as a game situation where an IBB can be considered a tactical option for the defense. The problem with this definition is identifying these IBB situations in the play-by-play record. The play-by-play record for MLB documents what happened, not why it happened or what might have happened. Fortunately, there are a number of statistical techniques that can be used to construct a decision procedure based on actual data. I have chosen one of these, a particular kind of neural network, as the method for deciding whether a game situation warrants an IBB or not.
The term "neural net" is unfortunate, implying that some sort of cognitive process is being developed or used by a computing system. In reality, the neural net is a well defined numerical computation that converts the ten game parameters describing the context of a PA (their values at the time of the PA) into a single number that is compared to a decision value. If the value calculated by the neural net is greater than the decision value, the game situation is called an IBB situation. A training procedure using game data establishes the behavior of the neural net.
When presented with a particular PA context, the trained neural net generates a number between +1 (most IBB-like) and -1 (most UBB-like). The decision point for declaring a particular PA context UBB or IBB is set midway between the average of the neural net output values for the IBB events in the training set and the average of the neural net outputs for the UBB events. Repeating the training using other randomly chosen groups of BBs produces essentially the same results. Splitting the data by league or chronologically also produces equivalent results.
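The decision rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the output values are made up rather than taken from the trained net.

```python
from statistics import mean


def decision_point(ibb_outputs, ubb_outputs):
    """Midpoint between the mean net output for IBB events and for UBB events."""
    return (mean(ibb_outputs) + mean(ubb_outputs)) / 2.0


def is_ibb_situation(net_output, threshold):
    """A PA is called an IBB situation when the net output exceeds the threshold."""
    return net_output > threshold


# Made-up net outputs for a handful of training events (illustration only).
ibb_out = [0.9, 0.7, 0.8]
ubb_out = [-0.8, -0.6, -0.7]

threshold = decision_point(ibb_out, ubb_out)
print("decision point:", round(threshold, 3))
print("output 0.30 is an IBB situation:", is_ibb_situation(0.30, threshold))
print("output -0.20 is an IBB situation:", is_ibb_situation(-0.20, threshold))
```

Placing the threshold midway between the two class averages, rather than at zero, compensates for any offset in the net's outputs caused by the very unequal numbers of UBBs and IBBs.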
Table 1 lists the ten parameters used to characterize a PA in a particular game situation and indicates their relative importance using the results of individual linear regressions for each parameter. The IBB or UBB value is the dependent variable. In these calculations an IBB is represented as a +1 value while the UBB is given a -1. STD is the standard deviation for the regression. R is the correlation coefficient. A negative R indicates anticorrelation; that is, increasing the variable in question reduces the chance for an IBB. As expected, a runner on first decreases the chance for an IBB, as does a higher batting average for the on-deck hitter. The square of R indicates the amount of variance "explained" by the variable. The most important indicator for the IBB (largest R squared) is a runner on second base, confirming conventional opinion. The batter evaluator used is the current slugging average, computed as a running average over the two weeks preceding the PA in question. Similarly, the on-deck batter evaluator is the current batting average. This choice of hitting evaluators led to the most accurate neural net. Use of the current values for these two parameters rather than season average values provides a very small improvement in the neural net classification accuracy. More surprising is the relatively low importance of the hitting measures to the neural net. The R squared (R2) column values total more than 1, indicating that the parameters used are not linearly independent. The neural net does not require linear independence of its inputs. Creating a set of 10 neural nets, each with one of the parameters removed, leads to an estimate of the importance of the individual parameters similar to that shown by the linear regressions.
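For a regression on a single variable, R is the ordinary (Pearson) correlation coefficient between that variable and the +1/-1 outcome. The sketch below shows the calculation for one parameter. The helper name and the eight sample events are hypothetical and are not drawn from Table 1.

```python
from math import sqrt


def correlation_r(xs, ys):
    """Pearson correlation coefficient between a parameter and the +1/-1 outcome."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up events: 1 if a runner is on second, 0 otherwise.
runner_on_second = [1, 1, 1, 0, 0, 0, 1, 0]
# Outcome coding from the text: IBB = +1, UBB = -1.
outcome = [1, 1, -1, -1, -1, -1, 1, -1]

r = correlation_r(runner_on_second, outcome)
print(f"R = {r:.3f}, R squared = {r * r:.3f}")
```

A positive R here means the parameter raises the chance of an IBB; R squared is the fraction of variance that the single parameter accounts for.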
Table 2 displays the results of the
neural net classification for all BBs in the play-by-play
record. The results for the two leagues are essentially identical. IBBs
are 9.2% of all BBs. The fraction of UBBs in IBB situations, 9.7%, is
slightly higher than this fraction. Perhaps this slightly elevated
number of UBBs occurring in IBB situations is evidence for the
"pitching around" phenomenon. The fraction of HPs in each category, IBB
and UBB, is very close to the total for all BBs.
Additional details of this neural net are given in the Appendix.
IWFR - Intentional Walks Fraction
Table 3 presents several of the statistics used in this presentation as a function of the batting order separately for the two leagues. The practice of placing power hitters in the third, fourth and fifth positions shows up very clearly in the slugging averages. About half of the substantially higher total value of IWFR in the National League is from the large number of IBBs given to the eighth place hitter in the order. The particularly weak hitting in the ninth place that gives rise to the eighth place IBB totals is also apparent.
Tables 4 and 5
tabulate a number of IBB related quantities. Table 4
lists the league averages while Table 5 gives team
averages for the 2002 season. The season IWFR data from Table
4 is plotted in Figure 1. The straight lines are
the results of linear regression applied to each league. The downward
trend for IWFR indicates a decreasing use of the IBB. This is evident
in both leagues, but is more pronounced in the NL.
SCORING IN IBB SITUATIONS
Does the IBB save runs? I have developed a convincing answer to this crucial question by tabulating runs scored in an inning following the first IBB situation in the inning in two categories. The first is for the IBB not being given, the batter being challenged by the pitcher. The second is for the batter being given the IBB. About 55% of IBBs are given in the first IBB situation of an inning. First IBB situations in an inning are 67% of all IBB situations. IBB situations are indicated in 7.9% of all PA.
To make a comparison as meaningful as possible, it is necessary to select situations that differ only by the presence or absence of the particular event being studied, the IBB. The neural net identifies all PA where an IBB is an option. I have additionally limited the analysis to the first IBB situation in an inning as a way of minimizing the problems that would occur if the analysis had to deal separately with the many things, including additional IBB situations, that can follow in such innings. With these restrictions the differences in scoring can be attributed to the IBB.
Tables 4 and 5 contain the overview IBB situational data for both leagues for the entire period analyzed. Three columns, RUNS, NUMB and R/N, are given for both categories, IBB not given and IBB given. RUNS is the sum of all runs scored from the plate appearance that is considered the first IBB situation, inclusive, to the end of the inning. NUMB is the number of first-in-inning IBB situations, and R/N is the ratio of the two preceding columns, that is, the average number of runs scored from the first IBB situation to the end of the inning. These headings are also used in Tables 6-9, which summarize individual batter outcomes in IBB situations. Table 4 lists season average values of this data by league. Table 5 provides the same information by team for the 2002 season. For both leagues, there is higher scoring after an IBB than when the batter is challenged. The difference is larger in the NL.
Following the TOTALS lines in Tables 4 and 5, standard deviations (STD) are presented. The STD values for IWFR are the STDs of the IWFR values presented in the table and are calculated from the linear regressions used to create Figure 1. Since the independent variable in the regression is time (season), the STD values are missing in Table 5. The STDs for the R/N columns are not based on the averages given in the columns but on the actual numbers of runs occurring from the defining IBB situation. That these STD values are greater than the averages indicates that multiple-run scoring occurs often.
The "Law of Small Numbers" must be respected in any statistical study. The accuracy of averages or any other quantity computed from a series of measurements or events is highly dependent on the number of samples. In general the accuracy of averages increases as the square root of the number of samples: to achieve twice the accuracy requires four times the number of samples. Comparing the runs/inning in the two categories stated, IBB given or not given, requires comparing two such averages. A common technique for assessing the significance of differences in these quantities is Student's t-test (Numerical Recipes in C, 2nd Ed, pp 616-617). The t-test returns the probability that a difference at least as large as the one observed would arise if the null hypothesis were true, that is, if the two averages being compared were drawn from the same distribution of events. Probabilities below a threshold in the range 0.01 to 0.05 are conventionally taken to indicate that the differences are significant. The result of the t-test is given for the comparison of the two R/N columns in Tables 4 and 5 in the T-TEST column. Examining Table 4 suggests that, for the number of IBB situations tabulated for league seasons, a difference in the averages of 0.1 to 0.2 is significant. For the smaller number of events when comparing team season totals, differences of 0.5 to 0.6 are needed to ensure significance. Overall, the difference between the IBB given and not given categories is significant, although several seasons of data must be used to show this clearly.
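The comparison of two R/N averages can be sketched as follows. This is an illustration under stated assumptions, not the calculation used for the tables: it uses the unequal-variance (Welch) form of the t statistic, and it replaces Student's distribution with a normal approximation that is reasonable only for large samples. The function names and the runs-per-inning samples are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(sample_a, sample_b):
    """t statistic for the difference of two means, allowing unequal variances."""
    na, nb = len(sample_a), len(sample_b)
    std_err = sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    return (mean(sample_a) - mean(sample_b)) / std_err


def two_sided_p_normal(t):
    """Large-sample approximation to the two-sided probability for statistic t."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Made-up runs scored to the end of the inning, one value per IBB situation.
ibb_not_given = [0, 0, 1, 0, 2, 0, 0, 1, 0, 3] * 50
ibb_given = [0, 1, 1, 0, 2, 0, 0, 1, 0, 4] * 50

t = welch_t(ibb_not_given, ibb_given)
p = two_sided_p_normal(t)
print(f"t = {t:.2f}, approximate probability = {p:.3f}")
```

The standard error in `welch_t` shrinks as the square root of the sample sizes, which is why the same difference in averages can be significant for a league season yet not for a single team or player season.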
During the 32 seasons used in this study there have been 5630 player seasons having at least 400 plate appearances. Table 6 clearly shows that the frequency of IBBs, expressed as IWFR, does not correspond to the highest SA. This group of hitters, having the 10 highest season values for IWFR of the 5630, shows a higher rate of scoring runs after an IBB than when they are allowed to hit. Table 7 lists all batters with SA >= 0.700, coincidentally 10 of them; thus its total line is the last line in Table 8. Overall, in this hitter selection the IBB reduces runs scored, although there are exceptions. Individual batter season values in Tables 6, 7 and 9 can be expected to show high variability because of the small numbers of IBB related events in a single player season. The t-test for the TOTALS line in Table 6 is 0.43, suggesting that the increase in runs for this group of player seasons is not significant. The somewhat larger magnitude of the decrease in Table 7 has a t-test value of 0.04, indicating that some significance can be attached to the difference.
Table 8 tabulates the 5630 player
seasons by SA ranges. This data is plotted in Figure 2
and provides the strongest evidence that there is a value of batter SA
that seems to justify giving the IBB. The IBB not given curve is
roughly linear in SA displaying the obvious that allowing a good batter
to hit produces more runs than a weaker hitter will. The extra base
runner due to the IBB clearly provides a greater chance for scoring for
all but the very best hitters. There is a drop off in runs scored
following an IBB for batters with SA > 0.60 and for the lowest SA
group. The decrease in runs scored without an IBB in the lowest SA
group is not unexpected. The best hitters tend to be "protected" by
strong hitters behind them in the order. I do not have an explanation
for the drop off in runs following an IBB for the best hitters. Since
the t-test value for the highest SA range indicates marginal
significance perhaps an explanation is not needed.
Table 9 displays the IBB given and
not given results for Barry Bonds' career. While his career shows a
slightly higher scoring following an IBB the t-test applied to his
career is 0.47, suggesting that even for a hitter of his caliber there
is insufficient data to demonstrate the efficacy of the IBB applied to
him. Mark McGwire's career hitting is similarly inconclusive regarding the IBB.
While it is not possible to know what a manager was thinking in a particular situation, using suitable statistical techniques the decision making about offering an IBB can be approximated. IBB situations can be accurately recognized as indicated by the 97% accuracy for the offering of IBBs in neural net defined IBB situations. However, managers don't call for the IBB in every possible situation it could be offered. This component of their decision making has not been modeled.
Tables 6 and 7 list the top 10 of the 5630 player seasons having at least 400 plate appearances. Table 6 is ordered by IWFR while Table 7 is ordered by SA. Only Barry Bonds in 2002 appears on both of these lists. In the highest SA group, fewer runs are scored following the first IBB situation when the IBB is given than when it is not. The lack of consistency and judgment in using the IBB is seen in the high IWFR group, which shows an increase in runs scored following the IBB compared to allowing the batter to hit. In both Tables 5 and 6 the "Law of Small Numbers" is evident. Individual players do not always follow the average results. Notable small, but not statistically significant, deviations in the opposite direction include Bonds in 2001 (Table 7) and Suzuki in 2002 (Table 6).
I have shown (the incriminating data is in Table 8 and Figure 2) that the IBB creates runs when the batter receiving it has an SA less than 0.600. Only about 4% of the IBBs tabulated in the period covered in the study were given to batters during seasons in which they had an SA greater than 0.600. Giving the benefit of statistical doubt to the managers, at least 96% of the IBBs were offered in situations that increase the chances for a run being scored. There is some evidence that the use of the IBB is diminishing, but it is still greatly overused.
Barry Bonds might not be pleased with these results but Ichiro Suzuki should be.
Figure 1. Season IWFR for the AL and NL
Figure 2. Average Runs Scored to End
of Inning Following First IBB Situation
Appendix - More on the Neural Net
For those technically inclined, the neural net used
in this project is a standard back propagation net with one layer of
hidden units. Ten input units are used (corresponding to each parameter
in the game situation with each base given a separate input unit) and
eleven hidden units are used. A hidden unit is a weighted sum over the
input unit values plus a constant. The hidden unit computation has the
same form and complexity as a linear regression formula evaluation. The
result of this sum is passed through an activation function, sigma(x),
to generate the hidden unit output. The activation function in this
case is sigmoidal, having asymptotic values of +1 and -1 and a slope of
1.0 at x=0. The output unit is a sum of a constant term and the
weighted outputs of the hidden units. Subjecting the output unit sum to
the same activation function completes the neural net calculation. This
neural net requires about twelve times the computation of evaluating a
linear regression equation for the same input parameters in addition to
the eleven activation function evaluations. It contains 133
coefficients which, unlike a linear regression formula, do not have an
interpretation in terms of the input parameters. There are no
requirements for the linear independence of the parameters or the range
of values used.
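The calculation just described can be sketched as a short forward pass. This is a minimal illustration under stated assumptions: tanh is used as the activation because it has asymptotes of +1 and -1 and a slope of 1.0 at x=0, though the text does not name the function, and the random weights merely stand in for trained coefficients. The function names are hypothetical.

```python
import math
import random

N_INPUTS = 10   # one input unit per game parameter
N_HIDDEN = 11   # hidden units in the single hidden layer


def make_net(seed=0):
    """Random placeholder coefficients; a real net would get these from training."""
    rng = random.Random(seed)
    # Each hidden unit: one constant term followed by one weight per input.
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS + 1)]
              for _ in range(N_HIDDEN)]
    # Output unit: one constant term followed by one weight per hidden unit.
    output = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN + 1)]
    return hidden, output


def forward(net, inputs):
    """Weighted sums passed through tanh (assumed), giving a value in (-1, +1)."""
    if len(inputs) != N_INPUTS:
        raise ValueError("expected %d input values" % N_INPUTS)
    hidden, output = net
    h = [math.tanh(w[0] + sum(wi * x for wi, x in zip(w[1:], inputs)))
         for w in hidden]
    return math.tanh(output[0] + sum(wi * hi for wi, hi in zip(output[1:], h)))


def n_coefficients(net):
    """Total number of weights and constant terms in the net."""
    hidden, output = net
    return sum(len(w) for w in hidden) + len(output)


net = make_net()
print("coefficients:", n_coefficients(net))
print("output for an all-zero input:", forward(net, [0.0] * N_INPUTS))
```

Counting the terms reproduces the figure in the text: 11 hidden units with 10 weights and a constant each give 121, and the output unit adds 11 weights and a constant, for 133 in all. The twelve weighted sums (eleven hidden, one output) also account for the "about twelve times" estimate of the computation.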
A training procedure using the available data determines the value of each coefficient by minimizing the sum of the squared errors for all the events in the training data set.
The neural net as a whole can be given a geometric interpretation. Consider a neural net with two input units (parameters). Each hidden unit has the form F(X,Y) = AX + BY + C, with A, B, and C the values that are determined for a hidden unit by the training procedure. Each hidden unit will have a unique set of values for A, B and C. F(X,Y)=0 is the equation for a line in a plane. For any pair of values for X and Y, F(X,Y) is proportional to the signed distance from the line to the point (X,Y), and equals that distance when A and B are scaled so that A^2 + B^2 = 1. Using three such hidden units (equations) defines three lines. A triangle, the simplest figure enclosing an area, is formed from three lines. The output unit weights the values of the three hidden units to define the final boundary separating the two categories.
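The relation between a hidden unit's sum and distance from its line can be checked with a small sketch. The coefficients and points are made up for illustration, and the function names are hypothetical. Dividing F(X,Y) by the length of (A, B) turns it into a true signed distance.

```python
from math import hypot


def hidden_sum(a, b, c, x, y):
    """The weighted sum F(X,Y) = AX + BY + C computed by one hidden unit."""
    return a * x + b * y + c


def signed_distance(a, b, c, x, y):
    """Signed distance from the line F(X,Y) = 0 to the point (x, y)."""
    return hidden_sum(a, b, c, x, y) / hypot(a, b)


# Made-up line 3X + 4Y - 10 = 0, for which hypot(3, 4) is 5.
A, B, C = 3.0, 4.0, -10.0

for point in [(2.0, 1.0), (6.0, 3.0), (0.0, 0.0)]:
    f = hidden_sum(A, B, C, *point)
    d = signed_distance(A, B, C, *point)
    print(point, "F =", f, "distance =", d)
```

Points on opposite sides of the line give values of opposite sign, which is what lets each hidden unit act as one edge of the enclosing figure.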
Figure 3 provides an example of using a three hidden unit back propagation neural net to create a decision "surface" separating points in a plane. The two categories of points, "A" and "B", are generated using a random number generator and have no significance beyond a simple demonstration of the neural net technique. The "A" points are uniform in a statistical sense over the entire area. The "B" points are concentrated in the center of the area. The curve enclosing the "B" points clearly shows the three lines corresponding to the three hidden units and the blending of them into a single boundary (green) by the output unit. There are 1000 data points in each category and the neural net correctly classified 1847 of them.
Moving to three dimensions, the hidden units describe planes and give the distance of three dimensional points to the plane. The simplest solid object composed from planes is the four sided tetrahedron. Four hidden units are appropriate when there are three input units. While I can't visualize higher dimensions, the equations have the same form. This discussion provides the primary reason for using eleven hidden units in the IBB situation classification neural net. In practice the classification error improves, but only slightly, as the number of hidden units is increased.
Many different kinds of neural nets have been invented. There is no assurance that the back propagation net is the optimum method for this task. The theory of this particular form of the neural net used is covered in “Neural Networks”, Laurene Fausett, Prentice Hall, 1994, and other textbooks.
Figure 3. Neural Net with Two Parameters
Copyright 2003, John F. Jarvis