Trends, Exceptions and Results of IBB Usage
John F. Jarvis

INTRODUCTION
In the October 13, 1923 issue of Colliers, Walter Camp
writes:
I don't like the IBB either. I greatly prefer the excitement of the
irresistible hitter against the immovable pitcher in a situation
crucial to the outcome of the game. In this study I offer new evidence
that the IBB is exceedingly overused.
Intentional base on balls (IBB) usage is summarized by a new measure,
IWFR "Intentional base on balls (Walk) FRaction"
defined as the number of IBBs given expressed as a fraction of IBB
situations. The critical task, deciding whether a plate appearance (PA)
warrants an IBB (that is, whether it is an IBB situation), is accomplished
with a neural net evaluating ten numerical parameters that describe the
context of the PA in the game. The neural net, trained using the
419,571 UBBs and 42,721 IBBs in the currently available
play-by-play record (1963, 1967 and 1968 AL; both leagues
1969 and 1972-2002), is an extension of an earlier version I described in a
BRJ article (Volume 29 (2000), p. 107) and at
the SABR 30 Convention
(2000). The sources of the play-by-play data are Retrosheet and Gary Gillette/Pete
Palmer/Ray Kerby. The IBB analysis presented is based on the 32
seasons 1969 and 1972-2002.
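As a minimal sketch of the IWFR definition, the ratio can be written as a small Python function. The function name and the counts in the example are hypothetical and are not taken from any table in this study.

```python
def iwfr(ibb_given: int, ibb_situations: int) -> float:
    """Intentional walk fraction: IBBs given divided by IBB situations."""
    if ibb_situations <= 0:
        raise ValueError("need at least one IBB situation")
    return ibb_given / ibb_situations


# Made-up counts for illustration only: 12 IBBs in 80 IBB situations.
print(round(iwfr(12, 80), 3))  # 0.15
```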
Two categories of bases on balls are referred to throughout this
presentation. The intentional kind are referred to by the standard
acronym IBB while the unintentional variety are referred to by the
slightly nonstandard term, UBB. BB by itself is used to refer to all
of them. I believe the other statistical terms and acronyms used are
well known.
In this presentation I give a brief description of the neural net
technique used to characterize each plate appearance as an IBB
situation or not, describe long term IBB usage and examine run
production following the first IBB situation in an inning. The AL's
stubborn insistence on using a DH forces much of this analysis to
be done separately for the two leagues.
Long term trends in IBB usage are shown by tabulating IWFR by league
and season. Overall, the IBB does not appear to be a winning strategy.
Examining the first IBB situation in an inning indicates that giving an
IBB results in an average of 0.14 runs per inning more being scored
than when the batter is allowed to hit. The team and individual player
data presented question the conventional wisdom about the use of the
IBB except in cases of extreme hitting ability.
The 2002 season had three players in the 32 season IWFR top 10 list (Table 6): Barry Bonds, Vladimir Guerrero and Ichiro
Suzuki. Bonds' 2002 season ranks as the highest on this list and
perhaps is justified although his 2001 home run record season isn't
among the top 10. Suzuki's overall #3 ranking for his 2002 season seems
strangely high.
DEFINING AN IBB SITUATION: THE NEURAL NET
I define an IBB situation as a game situation where an IBB can be
considered a tactical option for the defense. The problem with this
definition is identifying these IBB situations in the play-by-play
record. The play-by-play record for MLB documents what happened, not
why it happened or what might have happened. Fortunately, there are a number of
statistical techniques that can be used to construct a decision
procedure based on actual data. I have chosen one of these, a
particular kind of neural network, as the method for making the
decision whether a game situation warrants an IBB or not.
The term "neural net" is unfortunate, implying that some sort of
cognitive process is being developed or used by a computing system. In
reality, the neural net is a well defined numerical computation that
converts the values of the ten game parameters describing the context
of a PA into a single
number that is compared to a decision value. If the neural net
calculated value is greater than the decision value the game situation
is called an IBB situation. A training procedure using game data
establishes the behavior of the neural net.
When presented with a particular PA context the trained neural net
generates a number between +1, most IBB-like, and -1, most UBB-like.
The decision point in declaring a particular PA context UBB or
IBB is set midway between the average of the neural net output values
for the training set IBB events and the average of the output values
for the UBB events. Repeating the training using other randomly chosen groups
of BBs produces essentially the same results. Splitting the data by
league or chronologically also produces equivalent results.
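The decision rule just described can be sketched in a few lines of Python. The function names and the sample output values are hypothetical; only the rule itself (threshold midway between the two class averages, IBB situation when the output exceeds it) comes from the text.

```python
from statistics import mean


def decision_point(ibb_outputs, ubb_outputs):
    """Midpoint between the mean net output for IBBs and the mean for UBBs."""
    return 0.5 * (mean(ibb_outputs) + mean(ubb_outputs))


def is_ibb_situation(net_output, threshold):
    """A PA is called an IBB situation when the net output exceeds the threshold."""
    return net_output > threshold


# Hypothetical net outputs in the range -1 to +1; these are not real data.
ibb_out = [0.9, 0.7, 0.8]
ubb_out = [-0.9, -0.8, -0.7, -0.6]
threshold = decision_point(ibb_out, ubb_out)
print(is_ibb_situation(0.4, threshold), is_ibb_situation(-0.3, threshold))
```

With these made-up values the class averages are 0.8 and -0.75, so the threshold sits just above zero.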
Table 1 lists the ten parameters used to characterize
a PA in a particular game situation and indicates their relative
importance using the results of individual linear regressions for each
parameter. The IBB or UBB value is the dependent variable. In these
calculations an IBB is represented as a +1 value while the UBB is given
a -1. STD is the standard deviation for the regression. R is the
correlation coefficient. A negative R indicates anticorrelation; that
is, increasing the variable in question reduces the chance for an IBB.
As expected, a runner on first decreases the chance for an IBB as does
a higher batting average for the on deck hitter. The square of R
indicates the amount of variance "explained" by the variable. The most
important indicator, largest R squared, for the IBB is a runner on
second base, confirming conventional opinions. The batter evaluator
used is the current slugging average computed from a two week running
average preceding the PA in question. Similarly, the on deck batter
evaluator is the current batting average. This choice of hitting
evaluators led to the most accurate neural net. Use of the current
values for these two parameters rather than season average values
provides a very small improvement in the neural net classification
accuracy. More surprising is the relatively low importance of the
hitting measures to the neural net. The R squared column values
total more than 1, indicating that the parameters used are not linearly
independent. The neural net does not require linear independence of its
inputs. Creating a set of 10 neural nets, each with one of the
parameters removed, leads to a similar estimate for the importance of
the individual parameters as is shown by the linear regressions.
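For a regression on a single parameter, R is the ordinary correlation coefficient between that parameter and the +1/-1 outcome. The sketch below computes it for one parameter; the eight-event sample (a runner-on-second flag against the outcome) is invented purely to show the arithmetic and is not drawn from Table 1.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy sample: 1 if a runner is on second, else 0.
runner_on_second = [1, 1, 1, 0, 0, 0, 0, 1]
# Outcome coded as in the text: +1 for an IBB, -1 for a UBB.
outcome = [1, 1, -1, -1, -1, -1, -1, -1]

r = correlation(runner_on_second, outcome)
print(round(r, 3), round(r * r, 3))  # R and R squared for this parameter
```

Repeating the calculation for each of the ten parameters in turn would give one R and one R squared value per parameter.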
Table 2 displays the results of the
neural net classification for all BBs in the play-by-play
record. The results for the two leagues are essentially identical. IBBs
are 9.2% of all BBs. The fraction of UBBs in IBB situations, 9.7%, is
slightly higher than this fraction. Perhaps this slightly elevated
number of UBBs occurring in IBB situations is evidence for the
"pitching around" phenomenon. The fraction of HPs in each category, IBB
and UBB, is very close to the total for all BBs.
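The 9.2% figure can be checked from the UBB and IBB totals quoted in the introduction, assuming those are the same totals tabulated in Table 2.

```python
ubb = 419_571  # UBBs in the play-by-play record, as quoted in the introduction
ibb = 42_721   # IBBs in the play-by-play record, as quoted in the introduction

ibb_share = ibb / (ubb + ibb)
print(f"{100 * ibb_share:.1f}%")  # 9.2%
```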
Additional details of this neural net are given in an
Appendix.
IWFR - Intentional Walks Fraction
Table 3 presents several of the statistics used in
this presentation as a function of the batting order separately for the
two leagues. The practice of placing power hitters in the third, fourth
and fifth positions shows up very clearly in the slugging averages.
About half of the substantially higher total value of IWFR in the
National League is from the large number of IBBs given to the eighth
place hitter in the order. The particularly weak hitting in the ninth
place that gives rise to the eighth place IBB totals is also apparent.
Tables 4 and 5
tabulate a number of IBB related quantities. Table 4
lists the league averages while Table 5 gives team
averages for the 2002 season. The season IWFR data from Table
4 is plotted in Figure 1. The straight lines are
the results of linear regression applied to each league. The downward
trend for IWFR indicates a decreasing use of the IBB. This is evident
in both leagues, but is more pronounced in the NL.
SCORING IN IBB SITUATIONS
Does the IBB save runs? I have developed a convincing answer to this
crucial question by tabulating runs scored in an inning following the
first IBB situation in the inning in two categories. The first is for
the IBB not being given, the batter being challenged by the pitcher.
The second is for the batter being given the IBB. About 55% of IBBs are
given in the first IBB situation of an inning. First IBB situations in
an inning are 67% of all IBB situations. IBB situations are indicated
in 7.9% of all PA.
To make a comparison as meaningful as possible it is necessary to
select situations that only differ by the presence or absence of the
particular event being studied, the IBB. The neural net identifies all
PA where an IBB is an option. I have additionally limited the analysis
to the first IBB situation in an inning as a way of minimizing the
problems that would occur if the analysis had to deal separately with
the many things including additional IBB situations that can follow in
such innings. With these restrictions the differences in scoring can be
attributed to the IBB.
Tables 4 and 5 contain the
overview IBB situational data for both leagues for the entire period
analyzed. Three columns, RUNS, NUMB and R/N are given for both
categories, IBB not given and IBB given. RUNS is the sum of all runs
scored from the plate appearance considered the first IBB
situation, inclusive, to the end of the inning. NUMB is the number of
first-in-inning IBB situations, and R/N is the ratio of the two preceding
columns, that is, the average number of runs scored from the first IBB
situation to the end of the inning. These headings are also used in Tables 6-9
that summarize individual batter outcomes in IBB situations. Table 4 lists season average values of this data by
league. Table 5 provides the same information by
team for the 2002 season. For both leagues, there is higher scoring
after an IBB than when the batter is challenged. The difference is
larger in the NL.
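The RUNS, NUMB and R/N columns can be sketched as a simple tabulation. The record layout (a flag for whether the IBB was given, plus the runs scored from that plate appearance to the end of the inning) and the sample values are assumptions made for illustration; the actual play-by-play processing is not shown here.

```python
def tabulate(first_situations):
    """Return {ibb_given: (RUNS, NUMB, R/N)} for first-in-inning IBB situations.

    first_situations is an iterable of (ibb_given, runs_to_end_of_inning).
    """
    totals = {False: [0, 0], True: [0, 0]}  # [RUNS, NUMB] for each category
    for ibb_given, runs in first_situations:
        totals[ibb_given][0] += runs
        totals[ibb_given][1] += 1
    return {
        given: (runs, numb, runs / numb if numb else float("nan"))
        for given, (runs, numb) in totals.items()
    }


# Invented sample: four innings where the batter hit, two where he was walked.
sample = [(False, 0), (False, 2), (False, 1), (False, 0), (True, 1), (True, 3)]
table = tabulate(sample)
print("IBB not given:", table[False])  # (3, 4, 0.75)
print("IBB given:    ", table[True])   # (4, 2, 2.0)
```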
Following the TOTALS lines in Tables 4 and 5 standard deviations (STD) are presented. The STD
values for IWFR are the STDs of the IWFR values presented in the table
and are calculated from the linear regressions used to create Figure 1.
Since the independent variable in the regression is time (season), the
STD values are missing in Table 5. The STDs for the
R/N columns are not based on the averages given in the columns but are
based on the actual numbers of runs occurring from the defining IBB
situation. That these STD values are greater than the averages indicates
that multiple-run scoring occurs often.
The "Law of Small Numbers" must be respected in any statistical study.
The accuracy of averages or any other quantity computed from a series
of measurements or events is highly dependent on the number of samples.
In general the accuracy of averages increases as the square root of the
number of samples: to achieve twice the accuracy requires four times
the number of samples. Comparing the runs/inning in the two categories
stated, IBB given or not given, requires comparing two such averages. A
common technique for assessing the significance of differences in these
quantities is Student's t-test (Numerical Recipes in C, 2nd Ed, pp
616-617). The t-test returns the probability that a difference at least
as large as the one observed would arise if the null hypothesis were true,
that is, if the two averages being
compared were drawn from the same distribution of events. Probabilities
less than 0.01 to 0.05 indicate the differences are significant. The
result of the t-test is given for the comparison of the two R/N columns
in Tables 4 and 5 in the T-TEST
column. Examining Table 4 suggests that for the
number of IBB situations tabulated for league seasons a difference in
the averages of 0.1 to 0.2 is significant. For the smaller number of
events when comparing team season totals differences of 0.5 to 0.6 are
needed to ensure significance. Overall, the difference between the IBB
given and not given categories is significant although several seasons
of data must be used to show this clearly.
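The comparison of two averages can be sketched with the pooled-variance form of Student's t statistic. This is an illustration under stated assumptions, not the routine used for the tables. The exact probability requires the t distribution, which the Python standard library does not provide, so the helper below substitutes the normal approximation, which is reasonable only when the number of events is large. A statistics package such as SciPy (`scipy.stats.ttest_ind`) gives the exact value.

```python
from math import erfc, sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t statistic and degrees of freedom for two samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df


def two_sided_p_large_sample(t):
    """Normal approximation to the two-sided probability (large samples only)."""
    return erfc(abs(t) / sqrt(2.0))


# Tiny invented samples, used only to check the arithmetic of the statistic.
t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)  # t is -1.0 with 8 degrees of freedom

# With many events, |t| near 1.96 corresponds to a probability near 0.05.
print(round(two_sided_p_large_sample(1.96), 3))
```

Because the denominator shrinks as the square root of the sample sizes, the same difference in averages yields a larger t, and a smaller probability, when more events are available.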
During the 32 seasons used in this study there have been 5630 player
seasons having at least 400 plate appearances. Table 6
clearly shows that the frequency of IBBs, expressed as IWFR, does not
correspond to highest SA. This group of hitters having the 10 highest
season values for IWFR of the 5630 shows a higher rate of scoring runs
after an IBB than when they are allowed to hit. Table 7
lists all batters with SA >= 0.700, coincidentally 10 of them; its
total line is therefore the last line in Table 8. Overall,
in this hitter selection the IBB reduces runs scored although there are
exceptions. Individual batter season values in Tables 6, 7 and 9 can be
expected to show high variability because of small numbers of IBB
related events in a single player season. The t-test for the TOTALS
line in Table 6 is 0.43 suggesting that the increase
in runs for this group of player seasons is not significant. The
somewhat larger magnitude of the decrease in Table 7
has a t-test value of 0.04 indicating some significance can be attached
to the difference.
Table 8 tabulates the 5630 player
seasons by SA ranges. This data is plotted in Figure 2
and provides the strongest evidence that there is a value of batter SA
that seems to justify giving the IBB. The IBB not given curve is
roughly linear in SA, displaying the obvious: allowing a good batter
to hit produces more runs than allowing a weaker hitter to hit. The extra base
runner due to the IBB clearly provides a greater chance for scoring for
all but the very best hitters. There is a drop off in runs scored
following an IBB for batters with SA > 0.60 and for the lowest SA
group. The decrease in runs scored without an IBB in the lowest SA
group is not unexpected. The best hitters tend to be "protected" by
strong hitters behind them in the order. I do not have an explanation
for the drop off in runs following an IBB for the best hitters. Since
the t-test value for the highest SA range indicates marginal
significance perhaps an explanation is not needed.
Table 9 displays the IBB given and
not given results for Barry Bonds' career. While his career shows a
slightly higher scoring following an IBB the t-test applied to his
career is 0.47, suggesting that even for a hitter of his caliber there
is insufficient data to demonstrate the efficacy of the IBB applied to
him. Mark McGwire's career hitting is similarly inconclusive regarding
the IBB.
CONCLUSIONS
While it is not possible to know what a manager was thinking in a
particular situation, using suitable statistical techniques the
decision making about offering an IBB can be approximated. IBB
situations can be accurately recognized as indicated by the 97%
accuracy for the offering of IBBs in neural net defined IBB situations.
However, managers don't call for the IBB in every possible situation it
could be offered. This component of their decision making has not been
modeled.
Tables 6 and 7 list the top 10 of
the 5630 player seasons having at least 400 plate appearances. Table 6 is ordered by IWFR while Table 7 is ordered by
SA. Only Barry Bonds in 2002 appears on both of these lists. In the
highest SA group, fewer runs are scored following the first IBB situation
when the IBB is given. The lack of consistency and judgment in
using the IBB is seen in the high IWFR group which shows an increase in
runs scored following the IBB compared to allowing the batter to hit.
In both Tables 5 and 6 the "Law
of Small Numbers" is evident. Individual players do not always follow
the average results. Notable small, but not statistically significant,
deviations in the opposite direction include Bonds in 2001 (Table 7) and Suzuki in 2002 (Table 6).
I have shown (the incriminating data is in Table 8
and Figure 2) that the IBB creates runs when the batter
receiving it has an SA less than 0.600. Only about 4% of the IBBs
tabulated in the period covered in the study were given to batters
during seasons where they had an SA greater than 0.600. Giving the
benefit of statistical doubt to the managers, at least 96% of the IBBs
were offered in situations that increase the chances for a run being
scored. There is some evidence that the use of the IBB is diminishing
but it is still greatly overused.
Barry Bonds might not be pleased with these results but Ichiro Suzuki
should be.
Figure 1. Season IWFR for the AL and NL
Figure 2. Average Runs Scored to End of Inning Following First IBB Situation
Appendix - More on the Neural Net
For those technically inclined, the neural net used
in this project is a standard back propagation net with one layer of
hidden units. Ten input units are used (corresponding to each parameter
in the game situation with each base given a separate input unit) and
eleven hidden units are used. A hidden unit is a weighted sum over the
input unit values plus a constant. The hidden unit computation has the
same form and complexity as a linear regression formula evaluation. The
result of this sum is passed through an activation function, sigma(x),
to generate the hidden unit output. The activation function in this
case is sigmoidal, having asymptotic values of +1 and -1 and a slope of
1.0 at x=0. The output unit is a sum of a constant term and the
weighted outputs of the hidden units. Subjecting the output unit sum to
the same activation function completes the neural net calculation. This
neural net requires about twelve times the computation of evaluating a
linear regression equation for the same input parameters in addition to
the eleven activation function evaluations. It contains 133
coefficients which, unlike a linear regression formula, do not have an
interpretation in terms of the input parameters. There are no
requirements for the linear independence of the parameters or the range
of values used.
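The calculation described above can be sketched as follows. The hyperbolic tangent is used as the activation because it has asymptotes of +1 and -1 and a slope of 1.0 at x=0; that it is the exact function used in this project is an assumption. The weights are random placeholders, not the trained coefficients, and the input vector is a dummy.

```python
import math
import random

N_INPUTS, N_HIDDEN = 10, 11


def n_coefficients(n_inputs, n_hidden):
    """Weights plus constants for the hidden layer and for the output unit."""
    return n_hidden * (n_inputs + 1) + (n_hidden + 1)


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One evaluation of the net: weighted sums passed through tanh."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


rng = random.Random(0)  # placeholder weights only; a trained net would load its own
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-0.5, 0.5)

x = [0.0] * N_INPUTS  # dummy PA context; real inputs are the ten game parameters
y = forward(x, w_hidden, b_hidden, w_out, b_out)
print(n_coefficients(N_INPUTS, N_HIDDEN))  # 133
```

The count works out as 11 x (10 + 1) = 121 coefficients for the hidden units plus 11 + 1 = 12 for the output unit.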
A training procedure using the available data determines the value of
each coefficient by minimizing the sum of the squared errors for all
the events in the training data set.
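A single training step can be sketched for a small net with two inputs and three hidden units. It computes the sum of squared errors, back propagates the error to obtain the gradient, and moves every coefficient a short distance downhill. The learning rate, the starting weights and the four data points are all hypothetical, and tanh is again assumed as the activation function. Actual training repeats such steps many times over the full data set.

```python
import math
import random


def forward(params, x):
    """Return (hidden unit outputs, net output) for one input vector."""
    w_hid, b_hid, w_out, b_out = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hid, b_hid)]
    return hidden, math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def sse(params, data):
    """Sum of squared errors over (input, target) pairs; targets are +1 or -1."""
    return sum((forward(params, x)[1] - t) ** 2 for x, t in data)


def gradient_step(params, data, rate):
    """One full-batch gradient descent step on the SSE via back propagation."""
    w_hid, b_hid, w_out, b_out = params
    g_w_hid = [[0.0] * len(row) for row in w_hid]
    g_b_hid = [0.0] * len(b_hid)
    g_w_out = [0.0] * len(w_out)
    g_b_out = 0.0
    for x, t in data:
        hidden, y = forward(params, x)
        d_out = 2.0 * (y - t) * (1.0 - y * y)  # derivative of tanh is 1 - tanh^2
        g_b_out += d_out
        for j, h in enumerate(hidden):
            g_w_out[j] += d_out * h
            d_hid = d_out * w_out[j] * (1.0 - h * h)  # error passed back to unit j
            g_b_hid[j] += d_hid
            for i, xi in enumerate(x):
                g_w_hid[j][i] += d_hid * xi
    return (
        [[w - rate * g for w, g in zip(row, g_row)] for row, g_row in zip(w_hid, g_w_hid)],
        [b - rate * g for b, g in zip(b_hid, g_b_hid)],
        [w - rate * g for w, g in zip(w_out, g_w_out)],
        b_out - rate * g_b_out,
    )


rng = random.Random(1)  # arbitrary starting weights
params = (
    [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)],
    [rng.uniform(-0.5, 0.5) for _ in range(3)],
    [rng.uniform(-0.5, 0.5) for _ in range(3)],
    rng.uniform(-0.5, 0.5),
)
# Invented points: +1 near the center, -1 away from it.
data = [((0.0, 0.0), 1.0), ((0.1, -0.1), 1.0), ((0.9, 0.8), -1.0), ((-0.8, 0.9), -1.0)]

before = sse(params, data)
after = sse(gradient_step(params, data, rate=0.01), data)
print(before, after)  # the error after the step should be slightly smaller
```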
The neural net as a whole can be given a geometric interpretation.
Consider a neural net with two input units (parameters). Each hidden
unit has the form F(X,Y) = AX + BY + C with A, B, and C the values that
are determined for a hidden unit by the training procedure. Each hidden
unit will have a unique set of values for A, B and C. F(X,Y)=0 is the
equation for a line in a plane. For any pair of values for X and Y,
F(X,Y) is proportional to the signed distance from the line to the
point (X,Y), and equals that distance when A^2 + B^2 = 1. Using
three such hidden units (equations) defines three lines. A triangle,
the simplest figure enclosing an area, is formed from three lines. The
output unit weights the values of the three hidden units to define the
final area defining the two categories.
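The triangle construction can be illustrated with hand-picked coefficients rather than trained ones. In the sketch below the three (A, B, C) triples, the gain applied inside each hidden unit and the output unit weights are all hypothetical values chosen so that the three lines enclose the origin. A positive output marks a point inside the triangle.

```python
import math

# Hypothetical hand-picked lines, not trained values. Each row is (A, B, C)
# with A^2 + B^2 = 1, so A*x + B*y + C is the signed distance to the line,
# positive on the side that contains the origin.
LINES = [
    (0.0, -1.0, 0.5),
    (math.sqrt(3) / 2, 0.5, 0.5),
    (-math.sqrt(3) / 2, 0.5, 0.5),
]
GAIN = 10.0  # assumed steepness of each hidden unit


def classify(x, y):
    """Net output: positive inside the triangle, negative outside."""
    hidden = [math.tanh(GAIN * (a * x + b * y + c)) for a, b, c in LINES]
    # The output unit weights the hidden units equally; the constant -2
    # means all three must be near +1 for the result to be positive.
    return math.tanh(3.0 * (sum(hidden) - 2.0))


for point in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]:
    print(point, "inside" if classify(*point) > 0 else "outside")
```

Only the origin is reported as inside; each of the other three points lies on the wrong side of at least one line. A trained net arrives at a comparable boundary by adjusting the same kinds of coefficients.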
Figure 3 provides an example of using a three hidden
unit back propagation neural net to create a decision "surface"
separating points in a plane. The two categories of points, "A" and
"B", are generated using a random number generator and have no
significance beyond a simple demonstration of the neural net technique.
The "A" points are uniform in a statistical sense over the entire area.
The "B" points are concentrated in the center of the area. The curve
enclosing the "B" points clearly shows the three lines corresponding to
the three hidden units and the blending of them into a single boundary
(green) by the output unit. There are 1000 data points in each category
and the neural net correctly classified 1847 of the 2000.
Moving to three dimensions, the hidden units describe planes and give
the distance of three dimensional points to the plane. The simplest
solid object composed from planes is the four sided tetrahedron. Four
hidden units are appropriate when there are three input units. While I
can't visualize higher dimensions, the equations have the same form.
This discussion provides the primary reason for using eleven hidden
units in the IBB situation classification neural net. In practice the
classification error improves, but only slightly, as the number of
hidden units is increased.
Many different kinds of neural nets have been invented. There is no
assurance that the back propagation net is the optimum method for this
task. The theory of this particular form of neural net is
covered in “Fundamentals of Neural Networks”, Laurene Fausett, Prentice Hall, 1994, and
other textbooks.

Figure 3. Neural Net with Two Parameters
Copyright 2003, John F. Jarvis