Trends, Exceptions and Results of IBB Usage
John F. Jarvis

2003 SABR Convention Presentation, Denver, CO, July 15, 2003


INTRODUCTION


In the October 13, 1923 issue of Collier's, Walter Camp wrote:

“Ruth, saving baseball with his terrific hitting and by inspiring the Hornsbys, the Walkers and the Williamses to go and do likewise, also made acute a baseball evil, an evil that must be destroyed if the rulers of the game mean to play fair with the fans. I mean, of course, the evil of the intentional pass. The daily patrons of the game have demonstrated that they like long hitting, and it is manifestly unfair to them when a pitcher deliberately passes a man like Ruth when the fans have come to see him hit.”


I don't like the IBB either. I greatly prefer the excitement of the irresistible hitter against the immovable pitcher in a situation crucial to the outcome of the game. In this study I offer new evidence that the IBB is exceedingly overused.

Intentional base on balls (IBB) usage is summarized by a new measure, IWFR ("Intentional base on balls (Walk) FRaction"), defined as the number of IBBs given expressed as a fraction of IBB situations. The critical task, deciding whether a plate appearance (PA) warrants an IBB and is therefore an IBB situation, is accomplished with a neural net that evaluates ten numerical parameters describing the context of the PA in the game. The neural net was trained using the 419,571 UBBs and 42,721 IBBs in the presently available play-by-play record (1963, 1967 and 1968 AL; both leagues 1969 and 1972-2002). It is an extension of an earlier version I described in a BRJ article (Volume 29 (2000), p107) and at the SABR 30 Convention (2000). The sources of the play-by-play data are Retrosheet and Gary Gillette/Pete Palmer/Ray Kerby. The IBB analysis presented is based on the 32 seasons 1969 and 1972-2002.

Two categories of bases on balls are referred to throughout this presentation. The intentional kind are referred to by the standard acronym IBB, while the unintentional variety are referred to by the slightly nonstandard term UBB. BB by itself refers to all of them. I believe the other statistical terms and acronyms used are well known.

In this presentation I give a brief description of the neural net technique used to characterize each plate appearance as an IBB situation or not, describe long term IBB usage and examine run production following the first IBB situation in an inning. The AL's stubborn insistence on using a DH forces much of this analysis to be done separately for the two leagues.

Long term trends in IBB usage are shown by tabulating IWFR by league and season. Overall, the IBB does not appear to be a winning strategy. Examining the first IBB situation in an inning indicates that giving an IBB results in an average of 0.14 more runs per inning being scored than when the batter is allowed to hit. The team and individual player data presented question the conventional wisdom about the use of the IBB except in cases of extreme hitting ability.

The 2002 season had three players in the 32 season IWFR top 10 list (Table 6): Barry Bonds, Vladimir Guerrero and Ichiro Suzuki. Bonds' 2002 season ranks as the highest on this list and perhaps is justified although his 2001 home run record season isn't among the top 10. Suzuki's overall #3 ranking for his 2002 season seems strangely high.

DEFINING AN IBB SITUATION: THE NEURAL NET


I define an IBB situation as a game situation where an IBB can be considered a tactical option for the defense. The problem with this definition is identifying these IBB situations in the play-by-play record. The play-by-play record for MLB documents what happened, not why it happened or what might have happened. Fortunately, there are a number of statistical techniques that can be used to construct a decision procedure based on actual data. I have chosen one of these, a particular kind of neural network, as the method for deciding whether a game situation warrants an IBB or not.

The term "neural net" is unfortunate, implying that some sort of cognitive process is being developed or used by a computing system. In reality, the neural net is a well defined numerical computation that converts the values of the ten game parameters at the time of the PA into a single number that is compared to a decision value. If the value calculated by the neural net is greater than the decision value, the game situation is called an IBB situation. A training procedure using game data establishes the behavior of the neural net.

When presented with a particular PA context the trained neural net generates a number between +1, which is most IBB like, and -1, which is most UBB like. The decision point for declaring a particular PA context UBB or IBB is set midway between the average of the neural net output values for the training set IBB events and the average of the neural net output values for the training set UBB events. Repeating the training using other randomly chosen groups of BBs produces essentially the same results. Splitting the data by league or chronologically also produces equivalent results.
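The placement of the decision point can be sketched in a few lines of Python. This is only an illustration: the output values are invented, not taken from the trained net, and the function names are hypothetical.

```python
from statistics import mean

def decision_point(ibb_outputs, ubb_outputs):
    """Midpoint between the mean net output for IBB events and for UBB events."""
    return (mean(ibb_outputs) + mean(ubb_outputs)) / 2.0

def is_ibb_situation(net_output, threshold):
    """A PA is called an IBB situation when the net output exceeds the threshold."""
    return net_output > threshold

# Hypothetical net outputs in [-1, +1]; real values would come from the trained net.
ibb_outputs = [0.9, 0.7, 0.8, 0.6]
ubb_outputs = [-0.8, -0.9, -0.5, -0.6]

threshold = decision_point(ibb_outputs, ubb_outputs)
```

With these made-up values the threshold falls slightly above zero, so a PA scoring 0.5 would be called an IBB situation and one scoring -0.2 would not.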

Table 1 lists the ten parameters used to characterize a PA in a particular game situation and indicates their relative importance using the results of individual linear regressions for each parameter. The IBB or UBB value is the dependent variable. In these calculations an IBB is represented as a +1 value while the UBB is given a -1. STD is the standard deviation for the regression. R is the correlation coefficient. A negative R indicates anti-correlation, that is, increasing the variable in question reduces the chance for an IBB. As expected, a runner on first decreases the chance for an IBB, as does a higher batting average for the on deck hitter. The square of R indicates the amount of variance "explained" by the variable. The most important indicator for the IBB, the one with the largest R squared, is a runner on second base, confirming conventional opinions.

The batter evaluator used is the current slugging average computed from a two week running average preceding the PA in question. Similarly, the on deck batter evaluator is the current batting average. This choice of hitting evaluators led to the most accurate neural net. Use of the current values for these two parameters rather than season average values provides a very small improvement in the neural net classification accuracy. More surprising is the relatively low importance of the hitting measures to the neural net.

The R^2 column values total more than 1, indicating that the parameters used are not linearly independent. The neural net does not require linear independence of its inputs. Creating a set of 10 neural nets, each with one of the parameters removed, leads to an estimate for the importance of the individual parameters similar to that shown by the linear regressions.
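A single-parameter regression of the kind summarized in Table 1 can be sketched as follows. The eight data points are invented for illustration (x is 1 when a runner is on second, y is +1 for an IBB and -1 for a UBB), and dividing the residual sum of squares by n rather than n-2 is an assumption about how STD is computed.

```python
import math

def simple_regression(x, y):
    """Return (R, R^2, STD) for a least-squares fit of y on a single variable x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)          # correlation coefficient
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std = math.sqrt(residual_ss / n)        # assumed form: divide by n
    return r, r * r, std

# Hypothetical sample: runner on second (1/0) against IBB (+1) or UBB (-1).
runner_on_second = [1, 1, 1, 1, 0, 0, 0, 0]
walk_type = [1, 1, 1, -1, -1, -1, -1, 1]

r, r2, std = simple_regression(runner_on_second, walk_type)
```

A positive R here means the parameter makes an IBB more likely, matching the sign convention used in Table 1.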

Table 2 displays the results of the neural net classification for all BBs in the play-by-play record. The results for the two leagues are essentially identical. IBBs are 9.2% of all BBs. The fraction of UBBs in IBB situations, 9.7%, is slightly higher than this fraction. Perhaps this slightly elevated number of UBBs occurring in IBB situations is evidence for the "pitching around" phenomenon. The fraction of HPs in each category, IBB and UBB, is very close to the total for all BBs.
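The fractions quoted above follow directly from the TOT row of Table 2. The short calculation below reproduces them; only the variable names are introduced here.

```python
# Counts copied from the TOT row of Table 2.
ubb_as_ubb, ubb_as_ibb = 379025, 40546   # UBBs classified as UBB / as IBB situations
ibb_as_ibb, ibb_as_ubb = 41419, 1302     # IBBs classified as IBB / as UBB situations

total_ubb = ubb_as_ubb + ubb_as_ibb
total_ibb = ibb_as_ibb + ibb_as_ubb
total_bb = total_ubb + total_ibb

ubb_accuracy = ubb_as_ubb / total_ubb                    # UBBs correctly classified
ibb_accuracy = ibb_as_ibb / total_ibb                    # IBBs correctly classified
overall_accuracy = (ubb_as_ubb + ibb_as_ibb) / total_bb  # all BBs correctly classified
ibb_share = total_ibb / total_bb                         # IBBs as a fraction of all BBs
ubb_in_ibb_situations = ubb_as_ibb / total_ubb           # UBBs occurring in IBB situations

for name, value in [("UBB accuracy", ubb_accuracy), ("IBB accuracy", ibb_accuracy),
                    ("overall", overall_accuracy), ("IBB share", ibb_share),
                    ("UBB in IBB situations", ubb_in_ibb_situations)]:
    print(f"{name}: {value:.3f}")
```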

Additional details of this neural net are given in an Appendix.

IWFR - INTENTIONAL WALK FRACTION


Table 3 presents several of the statistics used in this presentation as a function of the batting order separately for the two leagues. The practice of placing power hitters in the third, fourth and fifth positions shows up very clearly in the slugging averages. About half of the substantially higher total value of IWFR in the National League is from the large number of IBBs given to the eighth place hitter in the order. The particularly weak hitting in the ninth place that gives rise to the eighth place IBB totals is also apparent.
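As a worked example of the IWFR definition, the calculation below recomputes the IWFR rows of Table 3 from the IBB and IBB SIT rows. The last quantity, the eighth place share of the gap in IBB counts between the leagues, is one possible way to express the "about half" statement and is an assumption about how it was derived.

```python
# IBB and IBB-situation counts by batting order position 1..9, copied from Table 3.
al_ibb = [1278, 822, 2910, 3300, 2697, 2211, 2064, 1643, 394]
al_sit = [14639, 18700, 25687, 28767, 25066, 22536, 23328, 23509, 22101]
nl_ibb = [1472, 621, 3079, 4246, 3031, 2687, 2649, 5519, 660]
nl_sit = [15289, 17041, 24778, 29082, 25660, 23244, 23894, 23839, 19915]

def iwfr(ibbs, situations):
    """IWFR: IBBs given as a fraction of IBB situations."""
    return ibbs / situations

al_by_position = [iwfr(i, s) for i, s in zip(al_ibb, al_sit)]
nl_by_position = [iwfr(i, s) for i, s in zip(nl_ibb, nl_sit)]
al_total = iwfr(sum(al_ibb), sum(al_sit))
nl_total = iwfr(sum(nl_ibb), sum(nl_sit))

# Share of the NL-minus-AL gap in IBB counts that comes from the eighth position.
eighth_share = (nl_ibb[7] - al_ibb[7]) / (sum(nl_ibb) - sum(al_ibb))
```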

Tables 4 and 5 tabulate a number of IBB related quantities. Table 4 lists the league averages while Table 5 gives team averages for the 2002 season. The season IWFR data from Table 4 is plotted in Figure 1. The straight lines are the results of linear regression applied to each league. The downward trend for IWFR indicates a decreasing use of the IBB. This is evident in both leagues, but is more pronounced in the NL.
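The trend lines can be approximated with an ordinary least-squares fit of season IWFR against the season year. The sketch below uses the IWFR columns of Tables 4A and 4B; the fitting routine is a generic one and is not necessarily the procedure used to draw Figure 1.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# The 32 seasons analyzed: 1969 and 1972 through 2002.
seasons = [1969] + list(range(1972, 2003))

# Season IWFR values copied from Tables 4A (AL) and 4B (NL).
al_iwfr = [0.122, 0.130, 0.093, 0.093, 0.092, 0.080, 0.076, 0.072, 0.080, 0.092,
           0.091, 0.078, 0.082, 0.087, 0.085, 0.074, 0.075, 0.089, 0.089, 0.089,
           0.090, 0.096, 0.108, 0.091, 0.080, 0.087, 0.080, 0.059, 0.064, 0.068,
           0.075, 0.073]
nl_iwfr = [0.139, 0.134, 0.152, 0.146, 0.129, 0.116, 0.122, 0.138, 0.129, 0.126,
           0.123, 0.130, 0.134, 0.117, 0.128, 0.129, 0.124, 0.121, 0.142, 0.128,
           0.103, 0.113, 0.105, 0.109, 0.099, 0.105, 0.088, 0.081, 0.083, 0.096,
           0.113, 0.126]

al_slope, al_intercept = linear_fit(seasons, al_iwfr)
nl_slope, nl_intercept = linear_fit(seasons, nl_iwfr)
print(f"AL slope per season: {al_slope:.5f}")
print(f"NL slope per season: {nl_slope:.5f}")
```

A negative slope corresponds to declining IBB usage over the period.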


SCORING IN IBB SITUATIONS


Does the IBB save runs? I have developed a convincing answer to this crucial question by tabulating runs scored in an inning following the first IBB situation in the inning in two categories. The first is for the IBB not being given, the batter being challenged by the pitcher. The second is for the batter being given the IBB. About 55% of IBBs are given in the first IBB situation of an inning. First IBB situations in an inning are 67% of all IBB situations. IBB situations are indicated in 7.9% of all PA.

To make a comparison as meaningful as possible it is necessary to select situations that differ only by the presence or absence of the particular event being studied, the IBB. The neural net identifies all PA where an IBB is an option. I have additionally limited the analysis to the first IBB situation in an inning as a way of minimizing the problems that would occur if the analysis had to deal separately with the many things, including additional IBB situations, that can follow in such innings. With these restrictions the differences in scoring can be attributed to the IBB.

Tables 4 and 5 contain the overview IBB situational data for both leagues for the entire period analyzed. Three columns, RUNS, NUMB and R/N, are given for both categories, IBB not given and IBB given. RUNS is the sum of all runs scored from the plate appearance that is considered the first IBB situation, inclusive, to the end of the inning. NUMB is the number of first-in-inning IBB situations, and R/N is the ratio of the two preceding columns, that is, the average number of runs scored from the first IBB situation to the end of the inning. These headings are also used in Tables 6-9, which summarize individual batter outcomes in IBB situations. Table 4 lists season average values of this data by league. Table 5 provides the same information by team for the 2002 season. For both leagues, there is higher scoring after an IBB than when the batter is challenged. In the TOT rows of Table 4 the difference is larger in the AL (0.963 versus 0.784) than in the NL (0.846 versus 0.736).
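The R/N columns and the extra scoring after an IBB can be recomputed from the TOT rows of Tables 4A and 4B. Pooling the two leagues by simple addition of RUNS and NUMB is an assumption about how an overall figure would be formed.

```python
# (RUNS, NUMB) pairs copied from the TOT rows of Tables 4A and 4B.
al_not_given, al_given = (100506, 128151), (9267, 9625)
nl_not_given, nl_given = (90768, 123318), (10915, 12908)

def runs_per_situation(runs, numb):
    """R/N: average runs from the first IBB situation to the end of the inning."""
    return runs / numb

def pooled(a, b):
    """Add RUNS and NUMB across the two leagues."""
    return (a[0] + b[0], a[1] + b[1])

al_diff = runs_per_situation(*al_given) - runs_per_situation(*al_not_given)
nl_diff = runs_per_situation(*nl_given) - runs_per_situation(*nl_not_given)
both_diff = (runs_per_situation(*pooled(al_given, nl_given))
             - runs_per_situation(*pooled(al_not_given, nl_not_given)))

print(f"AL extra runs after IBB:   {al_diff:.3f}")
print(f"NL extra runs after IBB:   {nl_diff:.3f}")
print(f"Both leagues, pooled:      {both_diff:.3f}")
```

The pooled difference comes out close to the 0.14 runs per inning quoted in the introduction.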

Following the TOTALS lines in Tables 4 and 5, standard deviations (STD) are presented. The STD values for IWFR are the STDs of the IWFR values presented in the table and are calculated from the linear regressions used to create Figure 1. Since the independent variable in the regression is time (season), the STD values are missing in Table 5. The STDs for the R/N columns are not based on the averages given in the columns but on the actual numbers of runs occurring from the defining IBB situation. That these STD values are greater than the averages indicates that multiple run scoring occurs often.

The "Law of Small Numbers" must be respected in any statistical study. The accuracy of averages or any other quantity computed from a series of measurements or events is highly dependent on the number of samples. In general the accuracy of averages increases as the square root of the number of samples: achieving twice the accuracy requires four times the number of samples. Comparing the runs/inning in the two categories stated, IBB given or not given, requires comparing two such averages. A common technique for assessing the significance of differences in these quantities is Student's t-test (Numerical Recipes in C, 2nd Ed, pp 616-617). The t-test returns the probability that a difference at least as large as the one observed would arise if the null hypothesis were true, that is, if the two averages being compared were drawn from the same distribution of events. Probabilities below a threshold in the range 0.01 to 0.05 indicate that the differences are significant. The result of the t-test is given for the comparison of the two R/N columns in Tables 4 and 5 in the T-TEST column. Examining Table 4 suggests that for the number of IBB situations tabulated for league seasons a difference in the averages of 0.1 to 0.2 is significant. For the smaller number of events when comparing team season totals, differences of 0.5 to 0.6 are needed to ensure significance. Overall, the differences between the IBB given and not given categories are significant, although several seasons of data must be used to show this clearly.
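The comparison of two averages can be sketched from summary statistics alone. The code below is an assumption-laden illustration: it uses the unequal-variance (Welch) form of the t statistic and a normal approximation to the t distribution, which is reasonable only for large samples. The Numerical Recipes routine cited above may use a different form, so the values need not match the T-TEST column.

```python
import math

def standard_error(std, n):
    """Standard error of an average: shrinks as the square root of n."""
    return std / math.sqrt(n)

def welch_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """t statistic for the difference of two averages with unequal variances."""
    se = math.sqrt(standard_error(std_a, n_a) ** 2 + standard_error(std_b, n_b) ** 2)
    return (mean_b - mean_a) / se

def two_sided_p(t):
    """Two-sided probability under a normal approximation (large samples only)."""
    return math.erfc(abs(t) / math.sqrt(2.0))

# AL totals from Table 4A: R/N, STD and NUMB for IBB not given, then IBB given.
t_al = welch_t(0.784, 1.159, 128151, 0.963, 1.404, 9625)
p_al = two_sided_p(t_al)

# Four times the samples gives twice the accuracy (half the standard error).
se_ratio = standard_error(1.159, 1000) / standard_error(1.159, 4000)
```

For the 32-season AL totals the probability is far below 0.01, consistent with the 0.0000 shown in the TOT row of Table 4A.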

During the 32 seasons used in this study there have been 5630 player seasons having at least 400 plate appearances. Table 6 clearly shows that the frequency of IBBs, expressed as IWFR, does not correspond to highest SA. This group of hitters, having the 10 highest season values for IWFR of the 5630, shows a higher rate of scoring runs after an IBB than when they are allowed to hit. Table 7 lists all batters with SA >= 0.700, coincidentally 10 of them; thus its total line is the last line in Table 8. Overall, in this hitter selection the IBB reduces runs scored, although there are exceptions. Individual batter season values in Tables 6, 7 and 9 can be expected to show high variability because of the small numbers of IBB related events in a single player season. The t-test for the TOTALS line in Table 6 is 0.43, suggesting that the increase in runs for this group of player seasons is not significant. The somewhat larger magnitude of the decrease in Table 7 has a t-test value of 0.04, indicating some significance can be attached to the difference.

Table 8 tabulates the 5630 player seasons by SA ranges. This data is plotted in Figure 2 and provides the strongest evidence that there is a value of batter SA that seems to justify giving the IBB. The IBB not given curve is roughly linear in SA, reflecting the obvious point that allowing a good batter to hit produces more runs than allowing a weaker hitter to hit. The extra base runner due to the IBB clearly provides a greater chance for scoring for all but the very best hitters. There is a drop off in runs scored following an IBB for batters with SA > 0.60 and for the lowest SA group. The decrease in runs scored without an IBB in the lowest SA group is not unexpected. The best hitters tend to be "protected" by strong hitters behind them in the order. I do not have an explanation for the drop off in runs following an IBB for the best hitters. Since the t-test value for the highest SA range indicates marginal significance, perhaps an explanation is not needed.
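The comparison plotted in Figure 2 can be summarized by subtracting the two R/N columns of Table 8 for each SA range. A positive difference means more runs scored after the IBB than when the batter hits. The variable names are introduced here for illustration.

```python
# (SA range, R/N with IBB not given, R/N with IBB given), copied from Table 8.
rows = [
    ("SA<0.40",      0.732, 0.795),
    ("0.40<SA<0.45", 0.807, 0.957),
    ("0.45<SA<0.50", 0.847, 0.998),
    ("0.50<SA<0.55", 0.865, 1.020),
    ("0.55<SA<0.60", 0.919, 1.037),
    ("0.60<SA<0.65", 0.990, 1.019),
    ("0.65<SA<0.70", 1.033, 0.889),
    ("0.70<SA",      1.029, 0.746),
]

# Extra runs per first IBB situation when the IBB is given.
diffs = [(label, round(given - not_given, 3)) for label, not_given, given in rows]

# SA ranges in which giving the IBB is followed by fewer runs.
ibb_saves_runs = [label for label, d in diffs if d < 0]

for label, d in diffs:
    print(f"{label:>13}  {d:+.3f}")
```

The sign changes only in the two highest SA ranges. The 0.60 to 0.65 range still shows a small positive difference, which its T-TEST value in Table 8 marks as not significant.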

Table 9 displays the IBB given and not given results for Barry Bonds' career. While his career shows a slightly higher scoring following an IBB the t-test applied to his career is 0.47, suggesting that even for a hitter of his caliber there is insufficient data to demonstrate the efficacy of the IBB applied to him. Mark McGwire's career hitting is similarly inconclusive regarding the IBB.

CONCLUSIONS


While it is not possible to know what a manager was thinking in a particular situation, the decision making about offering an IBB can be approximated using suitable statistical techniques. IBB situations can be accurately recognized, as indicated by the 97% accuracy for the offering of IBBs in neural net defined IBB situations. However, managers don't call for the IBB in every situation in which it could be offered. This component of their decision making has not been modeled.

Tables 6 and 7 list the top 10 of the 5630 player seasons having at least 400 plate appearances. Table 6 is ordered by IWFR while Table 7 is ordered by SA. Only Barry Bonds in 2002 appears on both of these lists. In the highest SA group, fewer runs are scored following the first IBB situation when the IBB is given than when the batter is allowed to hit. The lack of consistency and judgment in using the IBB is seen in the high IWFR group, which shows an increase in runs scored following the IBB compared to allowing the batter to hit. In both Tables 5 and 6 the "Law of Small Numbers" is evident. Individual players do not always follow the average results. Notable but small and not statistically significant deviations in the opposite direction include Bonds in 2001 (Table 7) and Suzuki in 2002 (Table 6).

I have shown (the incriminating data is in Table 8 and Figure 2) that the IBB creates runs when the batter receiving it has an SA less than 0.600. Only about 4% of the IBBs tabulated in the period covered in the study were given to batters during seasons in which they had an SA greater than 0.600. Giving the benefit of statistical doubt to the managers, at least 96% of the IBBs were offered in situations that increase the chances for a run being scored. There is some evidence that the use of the IBB is diminishing, but it is still greatly overused.

Barry Bonds might not be pleased with these results but Ichiro Suzuki should be.


Table 1. Game information used in Neural Net
          Parameter        STD      R^2      R
       Runner on First   0.7583   0.1020  -0.3194
      Runner on Second   0.6238   0.3916   0.6258
       Runner on Third   0.7206   0.1891   0.4349
                Inning   0.7632   0.0899   0.2999
                  Outs   0.7643   0.0873   0.2954
      Score Difference   0.7667   0.0812   0.2850
             At Bat SA   0.7967   0.0083   0.0913
            On Deck BA   0.7934   0.0164  -0.1280
    Left-Right Matchup   0.7872   0.0313   0.1771
Batting Order Position   0.7949   0.0127   0.1125

Table 2. Accuracy of BB classification.
      UBB as                 IBB as                  UBB+IBB                HP as
      UBB   IBB fraction     IBB   UBB fraction     crrct incor  Fraction   UBB   IBB Fraction
  AL 206073 21602  0.905     18188   569  0.970     224261 22171  0.910    16311  1600  0.911
  NL 172952 18944  0.901     23231   733  0.969     196183 19677  0.909    13409  1284  0.913
 TOT 379025 40546  0.903     41419  1302  0.970     420444 41848  0.909    29720  2884  0.912

Table 3. IBB and Slugging Average by Batting Order
                     batting order position ->
            TOTAL    1     2     3     4     5     6     7     8     9
AL SA       0.403 0.382 0.392 0.455 0.461 0.433 0.413 0.390 0.360 0.332
AL IBB      17319  1278   822  2910  3300  2697  2211  2064  1643   394
AL IBB SIT 204333 14639 18700 25687 28767 25066 22536 23328 23509 22101
AL IWFR     0.085 0.087 0.044 0.113 0.115 0.108 0.098 0.088 0.070 0.018

NL SA       0.389 0.378 0.382 0.460 0.470 0.431 0.404 0.373 0.338 0.238
NL IBB      23964  1472   621  3079  4246  3031  2687  2649  5519   660
NL IBB SIT 202742 15289 17041 24778 29082 25660 23244 23894 23839 19915
NL IWFR     0.118 0.096 0.036 0.124 0.146 0.118 0.116 0.111 0.232 0.033

Table 4A. IBB Season Summaries for American League
                             Runs Scored per Inning following first IBB situation
                             Total    IBB    IBB not given         IBB given
TEAM YR     BA    SA   IWFR    IBBs    Sit    RUNS   NUMB    R/N    RUNS   NUMB    R/N  T-TEST
  AL 69  0.246 0.369  0.122     666   5459    2404   3372  0.713     327    381  0.858  0.0091
  AL 72  0.239 0.343  0.130     646   4982    1959   3044  0.644     322    382  0.843  0.0001
  AL 73  0.259 0.381  0.093     495   5307    2530   3351  0.755     285    258  1.105  0.0000
  AL 74  0.258 0.372  0.093     520   5577    2460   3459  0.711     274    290  0.945  0.0002
  AL 75  0.258 0.379  0.092     543   5898    2822   3674  0.768     227    285  0.796  0.6717
  AL 76  0.256 0.361  0.080     472   5907    2639   3713  0.711     234    261  0.897  0.0046
  AL 77  0.266 0.405  0.076     541   7074    3391   4436  0.764     265    303  0.875  0.0893
  AL 78  0.261 0.385  0.072     494   6857    3142   4339  0.724     329    285  1.154  0.0000
  AL 79  0.270 0.408  0.080     560   7019    3431   4384  0.783     347    334  1.039  0.0001
  AL 80  0.269 0.399  0.092     646   7016    3278   4308  0.761     331    372  0.890  0.0290
  AL 81  0.256 0.373  0.091     390   4306    1983   2687  0.738     212    232  0.914  0.0160
  AL 82  0.264 0.402  0.078     520   6643    3240   4180  0.775     269    294  0.915  0.0369
  AL 83  0.266 0.401  0.082     566   6888    3435   4292  0.800     364    312  1.167  0.0000
  AL 84  0.264 0.398  0.087     563   6490    3079   4090  0.753     383    331  1.157  0.0000
  AL 85  0.261 0.406  0.085     557   6517    3288   4120  0.798     307    299  1.027  0.0009
  AL 86  0.262 0.408  0.074     486   6558    3277   4214  0.778     242    264  0.917  0.0489
  AL 87  0.265 0.425  0.075     505   6777    3547   4289  0.827     280    285  0.982  0.0320
  AL 88  0.259 0.391  0.089     601   6758    3293   4191  0.786     343    346  0.991  0.0012
  AL 89  0.261 0.384  0.089     594   6693    3261   4244  0.768     310    325  0.954  0.0036
  AL 90  0.259 0.388  0.089     587   6561    3138   4149  0.756     326    340  0.959  0.0010
  AL 91  0.260 0.395  0.090     602   6718    3301   4244  0.778     281    338  0.831  0.3915
  AL 92  0.259 0.385  0.096     626   6551    3103   4107  0.756     293    342  0.857  0.0961
  AL 93  0.267 0.408  0.108     734   6793    3470   4194  0.827     364    418  0.871  0.4714
  AL 94  0.273 0.434  0.091     449   4928    2718   3093  0.879     232    234  0.991  0.1854
  AL 95  0.270 0.427  0.080     492   6134    3189   3847  0.829     241    266  0.906  0.3030
  AL 96  0.277 0.445  0.087     609   7029    3781   4294  0.881     350    328  1.067  0.0101
  AL 97  0.271 0.428  0.080     540   6748    3362   4252  0.791     307    284  1.081  0.0000
  AL 98  0.271 0.432  0.059     420   7176    3768   4589  0.821     201    211  0.953  0.1105
  AL 99  0.275 0.439  0.064     431   6694    3652   4250  0.859     247    231  1.069  0.0117
  AL 00  0.276 0.443  0.068     457   6676    3702   4263  0.868     238    246  0.967  0.2217
  AL 01  0.267 0.428  0.075     519   6957    3461   4293  0.806     267    264  1.011  0.0053
  AL 02  0.264 0.424  0.073     488   6642    3402   4189  0.812     269    284  0.947  0.0580
----------------------------------------------------------------------------------------------
 TOT 02  0.264 0.403  0.085   17319 204333  100506 128151  0.784    9267   9625  0.963  0.0000
 STD                  0.012                           STD  1.159                 1.404

Table 4B. IBB Season Summaries for National League
                              Runs Scored per Inning following first IBB situation
                              Total    IBB    IBB not given         IBB given
TEAM YR     BA    SA   IWFR    IBBs    Sit    RUNS   NUMB    R/N    RUNS   NUMB    R/N  T-TEST
  NL 69  0.250 0.369  0.139     767   5528    2360   3343  0.706     401    430  0.933  0.0000
  NL 72  0.248 0.365  0.134     727   5430    2322   3238  0.717     363    417  0.871  0.0047
  NL 73  0.254 0.375  0.152     861   5663    2472   3331  0.742     454    491  0.925  0.0005
  NL 74  0.255 0.367  0.146     833   5703    2538   3419  0.742     414    457  0.906  0.0024
  NL 75  0.257 0.369  0.129     795   6156    2568   3626  0.708     375    465  0.806  0.0503
  NL 76  0.255 0.361  0.116     687   5937    2528   3586  0.705     378    372  1.016  0.0000
  NL 77  0.262 0.396  0.122     755   6187    2691   3734  0.721     346    412  0.840  0.0270
  NL 78  0.254 0.372  0.138     844   6119    2574   3642  0.707     426    459  0.928  0.0000
  NL 79  0.261 0.385  0.129     806   6259    2642   3691  0.716     397    428  0.928  0.0001
  NL 80  0.259 0.374  0.126     790   6290    2690   3786  0.711     311    410  0.759  0.3613
  NL 81  0.255 0.364  0.123     505   4098    1724   2486  0.693     179    249  0.719  0.6980
  NL 82  0.258 0.373  0.130     799   6133    2584   3705  0.697     349    432  0.808  0.0305
  NL 83  0.255 0.376  0.134     813   6047    2569   3631  0.708     362    450  0.804  0.0565
  NL 84  0.255 0.369  0.117     707   6058    2654   3717  0.714     310    339  0.914  0.0007
  NL 85  0.252 0.374  0.128     780   6089    2761   3669  0.753     343    415  0.827  0.1841
  NL 86  0.253 0.380  0.129     803   6235    2657   3743  0.710     341    429  0.795  0.1011
  NL 87  0.261 0.404  0.124     782   6306    2851   3830  0.744     365    405  0.901  0.0053
  NL 88  0.248 0.363  0.121     766   6328    2695   3823  0.705     361    407  0.887  0.0007
  NL 89  0.246 0.365  0.142     853   5990    2514   3636  0.691     387    437  0.886  0.0002
  NL 90  0.256 0.383  0.128     797   6208    2664   3762  0.708     347    433  0.801  0.0706
  NL 91  0.250 0.373  0.103     626   6094    2720   3749  0.726     276    341  0.809  0.1527
  NL 92  0.252 0.368  0.113     689   6124    2490   3763  0.662     282    375  0.752  0.0787
  NL 93  0.264 0.399  0.105     743   7077    3271   4348  0.752     320    399  0.802  0.3746
  NL 94  0.267 0.415  0.109     559   5128    2516   3128  0.804     237    306  0.775  0.6605
  NL 95  0.263 0.408  0.099     613   6222    2970   3831  0.775     302    351  0.860  0.1679
  NL 96  0.262 0.408  0.105     732   6946    3253   4263  0.763     339    392  0.865  0.0777
  NL 97  0.263 0.410  0.088     629   7151    3398   4494  0.756     236    349  0.676  0.1757
  NL 98  0.262 0.410  0.081     647   7972    3827   5044  0.759     272    328  0.829  0.2515
  NL 99  0.268 0.429  0.083     674   8087    4176   5044  0.828     324    363  0.893  0.3123
  NL 00  0.266 0.432  0.096     753   7861    3875   4908  0.790     346    429  0.807  0.7628
  NL 01  0.261 0.425  0.113     865   7678    3764   4689  0.803     397    449  0.884  0.1502
  NL 02  0.259 0.410  0.126     964   7638    3450   4659  0.741     375    489  0.767  0.5976
----------------------------------------------------------------------------------------------
 TOT 02  0.258 0.389  0.118   23964 202742   90768 123318  0.736   10915  12908  0.846  0.0000
 STD                  0.012                           STD  1.105                 1.308

Table 5. IBB Team Summaries for the 2002 Season: American League
                              Runs Scored per Inning following first IBB situation
                              Total    IBB    IBB not given         IBB given
TEAM YR     BA    SA   IWFR    IBBs    Sit    RUNS   NUMB    R/N    RUNS   NUMB    R/N  T-TEST
 ANA 02  0.282 0.433  0.071      42    591     323    374  0.864      13     19  0.684  0.5294
 BAL 02  0.246 0.403  0.057      25    441     236    290  0.814      14     18  0.778  0.8975
 BOS 02  0.277 0.444  0.075      39    518     291    317  0.918      31     25  1.240  0.2475
 CHA 02  0.268 0.449  0.040      17    426     295    270  1.093      15     11  1.364  0.5736
 CLE 02  0.249 0.412  0.090      35    391     186    260  0.715      30     22  1.364  0.0091
 DET 02  0.248 0.379  0.062      22    354     160    246  0.650       7     13  0.538  0.6675
 KCA 02  0.256 0.398  0.056      27    481     255    295  0.864      15     15  1.000  0.6784
 MIN 02  0.272 0.437  0.053      30    569     268    358  0.749      14     16  0.875  0.6436
 NYA 02  0.275 0.455  0.094      48    508     279    318  0.877      25     29  0.862  0.9494
 OAK 02  0.261 0.432  0.086      37    432     239    272  0.879      12     17  0.706  0.5752
 SEA 02  0.275 0.419  0.117      62    529     250    312  0.801      22     34  0.647  0.4445
 TBA 02  0.253 0.390  0.061      29    474     211    306  0.690      18     18  1.000  0.2039
 TEX 02  0.269 0.455  0.097      45    465     225    283  0.795      35     25  1.400  0.0178
 TOR 02  0.261 0.430  0.065      30    463     184    288  0.639      18     22  0.818  0.3815
----------------------------------------------------------------------------------------------
 TOT 02  0.264 0.424  0.073     488   6642    3402   4189  0.812     269    284  0.947  0.0580
 STD                                                  STD  1.179                 1.412

Table 5 (continued). IBB Team Summaries for the 2002 Season: National League
                              Runs Scored per Inning following first IBB situation
                              Total    IBB    IBB not given         IBB given
TEAM YR     BA    SA   IWFR    IBBs    Sit    RUNS   NUMB    R/N    RUNS   NUMB    R/N  T-TEST
 ARI 02  0.267 0.423  0.107      58    543     283    342  0.827      26     31  0.839  0.9593
 ATL 02  0.260 0.409  0.133      68    512     217    300  0.723      23     31  0.742  0.9236
 CHN 02  0.246 0.413  0.115      52    453     168    287  0.585      30     26  1.154  0.0028
 CIN 02  0.253 0.408  0.138      66    477     207    281  0.737      43     31  1.387  0.0035
 COL 02  0.274 0.423  0.092      40    435     256    284  0.901      18     19  0.947  0.8798
 FLO 02  0.261 0.403  0.131      69    526     232    306  0.758      15     30  0.500  0.1982
 HOU 02  0.262 0.417  0.130      57    440     242    269  0.900      13     25  0.520  0.1437
 LAN 02  0.264 0.409  0.108      50    465     223    299  0.746       9     24  0.375  0.0898
 MIL 02  0.253 0.390  0.092      37    402     165    263  0.627      12     14  0.857  0.3568
 MON 02  0.261 0.418  0.157      85    542     228    302  0.755      31     53  0.585  0.2716
 NYN 02  0.256 0.395  0.112      46    411     182    263  0.692      20     28  0.714  0.9094
 PHI 02  0.259 0.422  0.123      70    567     248    337  0.736      23     27  0.852  0.5830
 PIT 02  0.244 0.381  0.115      44    381     167    243  0.687      27     34  0.794  0.5573
 SDN 02  0.253 0.381  0.095      42    443     185    273  0.678      13     22  0.591  0.6811
 SFN 02  0.267 0.442  0.197     103    522     195    305  0.639      35     57  0.614  0.8456
 SLN 02  0.268 0.425  0.148      77    519     252    305  0.826      37     37  1.000  0.4057
----------------------------------------------------------------------------------------------
 TOT 02  0.259 0.410  0.126     964   7638    3450   4659  0.741     375    489  0.767  0.5976
 STD                                                  STD  1.164                 1.216

Table 6. Player Hitting in IBB Situations: Top 10 by Season IWFR
                                                               IBB not given     IBB given
RANK  PLAYER            TM YR    AB    BA    SA   BB IBB  IWFR RUNS NUMB   R/N   RUNS NUMB   R/N

   1  Barry Bonds      SFN 02   403 0.370 0.799  198  68 0.680   11   22  0.500   19   38  0.500
   2  Willie McCovey   SFN 69   491 0.320 0.656  121  45 0.553   28   30  0.933   23   31  0.742
   3  Ichiro Suzuki    SEA 02   647 0.321 0.425   68  27 0.522    6   15  0.400    9   14  0.643
   4  Vlad. Guerrero   MON 02   614 0.336 0.593   84  32 0.475   28   29  0.966   21   20  1.050
   5  Barry Bonds      SFN 93   539 0.336 0.677  126  43 0.471   29   24  1.208   22   19  1.158
   6  Wade Boggs       BOS 91   546 0.332 0.460   89  25 0.458   11   15  0.733   21   14  1.500
   7  Spike Owen       MON 89   437 0.233 0.332   76  25 0.455    9   24  0.375   12   11  1.091
   8  Tony Gwynn       SDN 87   589 0.370 0.511   82  26 0.453   22   23  0.957   26   19  1.368
   9  Garry Templeton  SDN 85   546 0.282 0.377   41  24 0.444   16   24  0.667    1   17  0.059
  10  Barry Bonds      SFN 96   517 0.308 0.615  151  30 0.439   23   32  0.719   23   24  0.958
                       TOTALS  5329 0.322 0.538 1036 345 0.501  183  238  0.769  177  207  0.855

Table 7. Player Hitting in IBB Situations: All with season SA >= 0.700
                                                               IBB not given     IBB given
RANK  PLAYER            TM YR    AB    BA    SA   BB IBB IWFR  RUNS NUMB   R/N   RUNS NUMB   R/N
   1  Barry Bonds      SFN 01   476 0.328 0.863  177  35 0.367   37   48  0.771   19   24  0.792
   2  Barry Bonds      SFN 02   403 0.370 0.799  198  68 0.680   11   22  0.500   19   38  0.500
   3  Mark McGwire     SLN 98   509 0.299 0.752  162  28 0.333   51   42  1.214    6   13  0.462
   4  Jeff Bagwell     HOU 94   400 0.367 0.750   65  14 0.155   59   47  1.255    8    5  1.600
   5  Sammy Sosa       CHN 01   577 0.328 0.737  116  37 0.393   40   40  1.000   18   18  1.000
   6  Mark McGwire     OAK 96   423 0.312 0.730  116  16 0.254   32   32  1.000    4   10  0.400
   7  Frank Thomas     CHA 94   399 0.353 0.729  109  12 0.226   32   31  1.032    8   11  0.727
   8  Larry Walker     COL 97   568 0.366 0.720   78  14 0.173   57   50  1.140   11    8  1.375
   9  Albert Belle     CLE 94   412 0.357 0.714   58   9 0.143   41   41  1.000    3    5  0.600
  10  Larry Walker     COL 99   438 0.379 0.710   57   8 0.167   35   31  1.129    4    2  2.000
                       TOTALS  4605 0.345 0.750 1136 241 0.299  395  384  1.029  100  134  0.746

Table 8. Players having >= 400 PA: Runs Scored per Inning following first IBB situation, by season SA range
                                                  IBB not given     IBB given
     Range   NUMB   BA    SA      BB   IBB  IWFR  RUNS NUMB   R/N   RUNS NUMB   R/N   T-TEST
     SA<0.40 2246  0.259 0.354 97954  8111 0.095 37496 51250  0.732 3476 4370  0.795  0.0001
0.40<SA<0.45 1457  0.276 0.425 73252  7145 0.108 32137 39819  0.807 3656 3819  0.957  0.0000 
0.45<SA<0.50 1056  0.286 0.472 58997  6245 0.118 27075 31979  0.847 3366 3374  0.998  0.0000 
0.50<SA<0.55  543  0.295 0.522 33817  3894 0.129 15802 18268  0.865 2207 2164  1.020  0.0000
0.55<SA<0.60  219  0.308 0.572 15849  2136 0.159  7309  7952  0.919 1229 1185  1.037  0.0042
0.60<SA<0.65   80  0.317 0.621  6409  1012 0.197  2876  2905  0.990  590  579  1.019  0.6509 
0.65<SA<0.70   19  0.331 0.676  1850   354 0.256   762   738  1.033  184  207  0.889  0.1984 
0.70<SA        10  0.345 0.750  1136   241 0.299   395   384  1.029  100  134  0.746  0.0396

Table 9. Barry Bonds - Career
Runs Scored per Inning following first IBB situation
                                         IBB not given     IBB given
TM  YR    AB   BA    SA    BB IBB IWFR   RUNS NUMB   R/N   RUNS NUMB   R/N
SFN 02   403 0.370 0.799  198  68 0.680   11   22  0.500    19   38  0.500
SFN 01   476 0.328 0.863  177  35 0.367   37   48  0.771    19   24  0.792
SFN 00   480 0.306 0.688  117  22 0.333   32   34  0.941    19   15  1.267
SFN 99   355 0.262 0.617   73   9 0.257   23   23  1.000     9    6  1.500
SFN 98   552 0.303 0.609  130  29 0.302   43   50  0.860    25   19  1.316
SFN 97   532 0.291 0.585  145  34 0.405   35   35  1.000    17   23  0.739
SFN 96   517 0.308 0.615  151  30 0.439   23   32  0.719    23   24  0.958
SFN 95   506 0.294 0.577  120  22 0.353   20   27  0.741     7   10  0.700
SFN 94   391 0.312 0.647   74  18 0.295   39   41  0.951    19   13  1.462
SFN 93   539 0.336 0.677  126  43 0.471   29   24  1.208    22   19  1.158
PIT 92   473 0.311 0.624  127  32 0.344   35   41  0.854    18   17  1.059
PIT 91   510 0.292 0.514  107  25 0.289   54   42  1.286     6   15  0.400
PIT 90   519 0.301 0.565   93  15 0.221   34   36  0.944     7    8  0.875
PIT 89   580 0.248 0.426   93  22 0.322   12   26  0.462     7    6  1.167
PIT 88   538 0.283 0.491   72  14 0.261   18   27  0.667    12   11  1.091
PIT 87   551 0.261 0.492   54   3 0.079   14   26  0.538     0    0  0.000
PIT 86   413 0.223 0.416   65   2 0.069   13   18  0.722     0    0  0.000
--------------------------------------------------------------------------
CAREER  8335 0.295 0.595 1922 423 0.347  472  552  0.855   229  248  0.923





Figure 1. Season IWFR for the AL and NL




Figure 2. Average Runs Scored to End of Inning Following First IBB Situation



Appendix - More on the Neural Net


For those technically inclined, the neural net used in this project is a standard back propagation net with one layer of hidden units. It has ten input units (one for each parameter describing the game situation, with each base given a separate input unit) and eleven hidden units. Each hidden unit computes a weighted sum over the input unit values plus a constant, a computation of the same form and complexity as evaluating a linear regression formula. The result of this sum is passed through an activation function, sigma(x), to generate the hidden unit output. The activation function in this case is sigmoidal, having asymptotic values of +1 and -1 and a slope of 1.0 at x=0. The output unit is a sum of a constant term and the weighted outputs of the hidden units. Subjecting the output unit sum to the same activation function completes the neural net calculation.

This neural net requires about twelve times the computation of evaluating a linear regression equation for the same input parameters, in addition to the activation function evaluations. It contains 133 coefficients (11 x 10 hidden unit weights, 11 hidden unit constants, 11 output unit weights and 1 output unit constant) which, unlike the coefficients of a linear regression formula, do not have an interpretation in terms of the input parameters. There are no requirements for the linear independence of the parameters or the range of values used.
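The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tanh is used as the activation because it has asymptotes of +1 and -1 and a slope of 1.0 at x=0, but the text does not name the exact function; the weights here are random placeholders rather than trained values; and the names make_random_net, forward and count_coefficients are hypothetical.

```python
import math
import random

N_INPUTS = 10   # one input unit per game-situation parameter (from the text)
N_HIDDEN = 11   # number of hidden units (from the text)


def sigma(x):
    # Assumed activation: tanh has asymptotes +1/-1 and slope 1.0 at x = 0.
    return math.tanh(x)


def make_random_net(n_inputs=N_INPUTS, n_hidden=N_HIDDEN, seed=0):
    # Placeholder coefficients; a real net would get these from training.
    rng = random.Random(seed)
    # Each hidden row holds n_inputs weights followed by one constant term.
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    # The output row holds n_hidden weights followed by one constant term.
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, inputs):
    hidden, output = net
    hidden_out = []
    for row in hidden:
        # Weighted sum of the inputs plus a constant, as in a regression formula.
        s = row[-1] + sum(w * x for w, x in zip(row[:-1], inputs))
        hidden_out.append(sigma(s))
    # The output unit repeats the same pattern over the hidden unit outputs.
    total = output[-1] + sum(w * h for w, h in zip(output[:-1], hidden_out))
    return sigma(total)


def count_coefficients(net):
    hidden, output = net
    return sum(len(row) for row in hidden) + len(output)
```

Calling count_coefficients(make_random_net()) returns 133, matching the count given in the text: 11 x (10 + 1) for the hidden units plus 11 + 1 for the output unit.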

A training procedure using the available data determines the value of each coefficient by minimizing the sum of the squared errors for all the events in the training data set.
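A minimal sketch of such a training procedure follows, assuming plain per-event gradient descent, a tanh activation and a fixed learning rate. The text does not give the actual update schedule, learning rate or stopping rule used for the IBB net, so those details and the function names are illustrative assumptions.

```python
import math
import random


def make_net(n_inputs, n_hidden, seed=0):
    # Small random starting coefficients (assumed initialization).
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(hidden, output, x):
    # The last entry of each row is the constant term; zip stops before it.
    h = [math.tanh(row[-1] + sum(w * xi for w, xi in zip(row, x)))
         for row in hidden]
    y = math.tanh(output[-1] + sum(w * hj for w, hj in zip(output, h)))
    return h, y


def sum_squared_error(hidden, output, data):
    # The quantity the training procedure minimizes.
    return sum((forward(hidden, output, x)[1] - t) ** 2 for x, t in data)


def train_epoch(hidden, output, data, rate=0.05):
    # One pass over the data, updating the coefficients in place.
    for x, t in data:
        h, y = forward(hidden, output, x)
        # Error signal at the output; d/ds tanh(s) = 1 - tanh(s)^2.
        # The factor 2 from the squared error is absorbed into the rate.
        delta_out = (y - t) * (1.0 - y * y)
        # Propagate the error back through the not-yet-updated output weights.
        delta_hid = [delta_out * output[j] * (1.0 - h[j] * h[j])
                     for j in range(len(h))]
        for j, hj in enumerate(h):
            output[j] -= rate * delta_out * hj
        output[-1] -= rate * delta_out
        for j, row in enumerate(hidden):
            for i, xi in enumerate(x):
                row[i] -= rate * delta_hid[j] * xi
            row[-1] -= rate * delta_hid[j]
```

Here data is a list of (inputs, target) pairs with targets of +1 or -1 for the two categories. Repeated calls to train_epoch should lower the value returned by sum_squared_error.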

The neural net as a whole can be given a geometric interpretation. Consider a neural net with two input units (parameters). Before the activation function is applied, each hidden unit has the form F(X,Y) = AX + BY + C, where A, B and C are the values determined for that hidden unit by the training procedure. Each hidden unit has its own set of values for A, B and C. F(X,Y)=0 is the equation of a line in a plane. For any pair of values X and Y, F(X,Y) is proportional to the signed distance from the line to the point (X,Y); dividing by the square root of A*A + B*B gives the distance itself. Three such hidden units (equations) define three lines, and three lines form a triangle, the simplest figure enclosing an area. The output unit weights the values of the three hidden units to define the final boundary separating the two categories.
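The geometric picture can be checked with a small Python sketch. The three lines below are hand-picked for illustration, not trained coefficients, and the hard inside/outside test stands in for the smooth blending that the activation functions and output unit perform in the actual net. The names line_value, signed_distance, TRIANGLE and inside are hypothetical.

```python
import math


def line_value(a, b, c, x, y):
    # F(X,Y) = AX + BY + C, the hidden unit sum before the activation function.
    return a * x + b * y + c


def signed_distance(a, b, c, x, y):
    # Dividing F by the length of (A, B) gives the true signed distance.
    return line_value(a, b, c, x, y) / math.hypot(a, b)


# Three lines oriented so that F > 0 points toward the inside of the
# triangle with vertices (0,0), (1,0) and (0,1).
TRIANGLE = [
    (1.0, 0.0, 0.0),    # x > 0
    (0.0, 1.0, 0.0),    # y > 0
    (-1.0, -1.0, 1.0),  # x + y < 1
]


def inside(x, y, lines=TRIANGLE):
    # A point is enclosed when it lies on the positive side of all three lines.
    return all(line_value(a, b, c, x, y) > 0 for a, b, c in lines)
```

For example, inside(0.25, 0.25) is true and inside(1.0, 1.0) is false. For the line 3X + 4Y = 0 and the point (1,0), F is 3 while the signed distance is 3/5.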

Figure 3 provides an example of using a back propagation neural net with three hidden units to create a decision "surface" separating points in a plane. The two categories of points, "A" and "B", are generated using a random number generator and have no significance beyond a simple demonstration of the neural net technique. The "A" points are distributed uniformly, in a statistical sense, over the entire area. The "B" points are concentrated in the center of the area. The curve enclosing the "B" points clearly shows the three lines corresponding to the three hidden units and the blending of them into a single boundary (green) by the output unit. There are 1000 data points in each category, and the neural net correctly classified 1847 of the 2000 points (about 92%).

Moving to three dimensions, each hidden unit describes a plane and gives the distance from a three dimensional point to that plane. The simplest solid object bounded by planes is the four-sided tetrahedron, so four hidden units are appropriate when there are three input units. While I can't visualize higher dimensions, the equations have the same form: with N input units, the simplest enclosing figure needs N+1 bounding surfaces. This is the primary reason for using eleven hidden units with the ten input units of the IBB situation classification neural net. In practice the classification error improves, but only slightly, as the number of hidden units is increased.

Many different kinds of neural nets have been invented, and there is no assurance that the back propagation net is the optimum method for this task. The theory of the particular form of neural net used here is covered in “Fundamentals of Neural Networks”, Laurene Fausett, Prentice Hall, 1994, and in other textbooks.

 
Figure 3. Neural Net with Two Parameters

Copyright 2003, John F. Jarvis