A Survey of Baseball Player Performance Evaluation Measures

John F. Jarvis

Assigning numerical measures to various aspects of a ball player's performance is a convenient and perhaps useful, if not completely defensible, practice. A numerical measure certainly eases the problem of rank ordering players. Hitting measures such as batting average and its cousins are the best known of these measures. The ease and finality of such comparisons guarantee that these measures will always be with us even if what is being ranked is not entirely clear. So, if we are going to do this, we should use the best measures available. My purpose with this paper is to systematically compare a number of well known and a few less well known measures that have been proposed to summarize a player's offensive capabilities. This is not a new activity. Both the Bennett&Flueck paper and the HGB book contain such analyses.

The starting point for any player evaluation measure is the set of counting statistics collected over some period of time, perhaps a season or a career. Examples of counting statistics include the number of stolen bases, the number of hits and the number of assists on defense. Any event that can be adequately described in baseball can be counted, and the resulting number can potentially be used as a component in the calculation of a player evaluation measure.

Intuitively, if a measure does a good job of rating the offense, it should also work well on defense. A secondary purpose of the paper is to show that offensive evaluation measures are equally applicable to the defense, if the same counting statistics are provided.

There are two types of measures in common use. The first is the linear weights method, which is a linear combination of basic counting statistics using empirically determined coefficients or weights. The second, and more common, method is to define a complex and usually nonlinear combination of the basic quantities (various sums, products and ratios) that empirically reproduces individual or team runs.

The data used in the following discussions are the 1348 team season records derived from full season event files for both leagues for the 1954 to 2008 seasons obtained from Retrosheet, Inc. I have written a parser for these data sets and it is used to provide the team summary data used in the following calculations.

A formula where each coefficient appears linearly can have these coefficients determined by the linear regression procedure (Numerical Recipes in C, 2nd Edition, 1992, Chapter 15). As an example, consider the following simple linear weights formula:

RUNS = W0*TotalBases + W1*(Walks+HitByPitch) + W2*(AB-Hits) + W3*Errors

This particular linear weights formulation will be referred to as BWOE which is suggested by its component parts, Bases, Walks plus hit by pitch, Outs and Errors.

The linear regression procedure minimizes the sum of squared differences between the actual runs scored by a team and the value estimated from the formula, the chi-square value, by determining the 4 coefficients, W0 - W3. For obvious reasons this procedure is also called Least Squares Fitting. The regression coefficients, Wi, effectively give the number of runs each single event contributes to the team runs. The following table shows the results of the regression based on these four terms.


       param   weight  average  contrib
 0     BASES   0.3761   2143.4   806.06
 1     BB+HP   0.3124    515.5   161.03
 2   AB-HITS  -0.0940   4006.9  -376.52
 3      ERRS   0.6157    178.2   109.73

Table 1. Four Parameter Linear Weights Formula, "BWOE"
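The least squares fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for this paper: the function names, the "true" weights and the synthetic team seasons are all hypothetical, and the normal equations are solved with a small hand-written Gaussian elimination so that the example needs nothing beyond the standard library. A library least squares routine would normally be used instead.

```python
import random

def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations carry the right-hand side along.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear_weights(rows, runs):
    """Weights minimizing the sum of squared (runs - estimate); no constant term."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, runs)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic team seasons (hypothetical, NOT Retrosheet data). The columns are
# total bases, BB+HP, AB-hits and errors, as in the BWOE formula.
random.seed(0)
TRUE_WEIGHTS = [0.38, 0.31, -0.09, 0.62]  # illustrative values only
rows, runs = [], []
for _ in range(200):
    row = [random.gauss(2143, 150), random.gauss(515, 60),
           random.gauss(4007, 80), random.gauss(178, 25)]
    noise = random.gauss(0, 20)  # scatter not explained by the formula
    rows.append(row)
    runs.append(sum(w * x for w, x in zip(TRUE_WEIGHTS, row)) + noise)

weights = fit_linear_weights(rows, runs)
print([round(w, 3) for w in weights])
```

With noisy data the recovered weights only approximate the weights used to generate the runs. With noise-free data they are reproduced to within floating point error.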

The Standard Deviation, abbreviated STD, is the square root of the quantity Chi Square (the sum of the squares of the differences between the data value and the value calculated from the regression) divided by the number of data points (1348). Smaller values of the STD or Chi Square indicate a better representation of the actual runs scored. The standard deviation for this regression is 22.43 runs. The column "average" is the per season value of the parameter averaged over the entire data set. The column "contrib" is the weight for the parameter times its average and thus represents the number of runs due to that parameter. Table 4, which is a summary of the various runs estimators, also includes this result.
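As a check on these definitions, the short Python sketch below recomputes the STD from two Chi Square values printed in Table 4 and the "contrib" column from the weights and averages printed in Table 1. The function and variable names are mine and are not part of the original analysis. Because the printed weights are rounded to four places, the recomputed contributions should be expected to agree with the table only to a few tenths of a run.

```python
import math

N_TEAM_SEASONS = 1348  # number of team season records in the data set

def std_from_chisq(chisq, n_points=N_TEAM_SEASONS):
    """STD as defined in the text: sqrt(Chi Square / number of data points)."""
    return math.sqrt(chisq / n_points)

# Chi Square values as printed in Table 4.
std_all = std_from_chisq(477592)   # "All" (Table 2)
std_bwoe = std_from_chisq(677649)  # "BWOE" (Table 1)

# (weight, average) pairs as printed in Table 1.
TABLE_1 = {
    "BASES":   (0.3761, 2143.4),
    "BB+HP":   (0.3124, 515.5),
    "AB-HITS": (-0.0940, 4006.9),
    "ERRS":    (0.6157, 178.2),
}

# "contrib" is the weight times the per season average of the parameter.
contrib = {name: w * avg for name, (w, avg) in TABLE_1.items()}

print(round(std_all, 1), round(std_bwoe, 1))
for name, runs in contrib.items():
    print(f"{name:>8} {runs:8.2f}")
```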

To achieve the best results, defined here as the smallest STD (18.8 runs) for the regression, a number of additional counting statistics are used in Table 2. One of the advantages of using the full season event files is the ease of determining nontraditional counting statistics. Three such statistics have been defined and determined. ROH is the count of runners out on a hit, either base runners or the batter trying unsuccessfully for an additional base. RAO is runner advance on an out, which includes the traditional sacrifice. Official scoring only credits a sacrifice when there is intent on the part of the hitter. Advances on an out are much more common than official sacrifices suggest. Any base advance increases the potential for scoring, so all such events must be counted. Similarly RSO, Runs Scored on an Out, generalizes the sacrifice fly. ER_BF and ER_RA are, respectively, the number of times the batter is safe at first on an error and the number of times a runner was able to advance on an error. The OUT category includes only outs that are not counted in other categories (CS, ROH, second out on a GDP, K). The meaning of the other counting statistics should be obvious.


       param   weight  average  contrib
 0      OUTS  -0.0988   3031.1  -299.44
 1      SNGL   0.4827    981.6   473.78
 2      DOUB   0.6347    246.1   156.22
 3      TRIP   1.0233     35.0    35.82
 4      HRUN   1.4413    141.1   203.41
 5     BB+HP   0.3366    515.5   173.50
 6       IBB   0.1522     48.5     7.39
 7        SB   0.0555     96.2     5.33
 8        CS  -0.1628     55.9    -9.10
 9       ROH  -0.3578     26.3    -9.39
10       GDP  -0.4707    149.2   -70.24
11     ER_BF   0.5907     73.1    43.19
12     ER_RA   0.3037    105.1    31.92
13       RAO   0.0100    196.1     1.95
14       RSO   0.6546     81.9    53.62
15         K  -0.1075    908.2   -97.62

Table 2. The 16 Parameter Linear Weights Formula, "ALL"

Table 2 provides considerable insight into the value of these events. It is a reasonably complete list of offensive events. Stolen bases make only a modest contribution to team runs. Each caught stealing event has about three times the influence of a stolen base. Errors of any kind contribute significantly to the offense. Errors allowing the batter to reach first are somewhat more beneficial to the offense than errors that allow a base runner to advance. Fielding outs and strike outs, K, have approximately the same influence on the outcome of a game. BB includes only unintentional bases on balls, as intentional bases on balls are included as a separate term (IBB). The much smaller weight for IBBs compared to BBs suggests that the practice is a useful one for the defense even if the net result is to slightly increase runs scored. The category OUT does not include outs included in other categories: K, CS, RSO and the second out in a GDP. The weight of RSO (Run Scored on an Out) is less than one as the negative cost of the out must be accounted for. A linear weights formula can provide great insight into the importance of particular events as well as providing a means for measuring the offense.
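The comparisons in the preceding paragraph can be reproduced directly from the printed values of Table 2. The Python sketch below is an illustration only; the names TABLE_2 and estimate_runs are hypothetical. It evaluates the 16 parameter formula at the per season averages, which should come out close to the mean runs per team season (the 700.37 intercept of the AVERAGE row in Table 4), and it computes the caught stealing to stolen base ratio quoted above.

```python
# (weight, average) pairs as printed in Table 2.
TABLE_2 = {
    "OUTS":  (-0.0988, 3031.1), "SNGL":  (0.4827, 981.6),
    "DOUB":  (0.6347, 246.1),   "TRIP":  (1.0233, 35.0),
    "HRUN":  (1.4413, 141.1),   "BB+HP": (0.3366, 515.5),
    "IBB":   (0.1522, 48.5),    "SB":    (0.0555, 96.2),
    "CS":    (-0.1628, 55.9),   "ROH":   (-0.3578, 26.3),
    "GDP":   (-0.4707, 149.2),  "ER_BF": (0.5907, 73.1),
    "ER_RA": (0.3037, 105.1),   "RAO":   (0.0100, 196.1),
    "RSO":   (0.6546, 81.9),    "K":     (-0.1075, 908.2),
}

def estimate_runs(counts):
    """Linear weights estimate: sum of weight * count over all events."""
    return sum(TABLE_2[name][0] * n for name, n in counts.items())

# Evaluate the formula for an "average" team season.
average_counts = {name: avg for name, (_, avg) in TABLE_2.items()}
average_runs = estimate_runs(average_counts)

# Cost of a caught stealing relative to the gain from a stolen base.
cs_to_sb = abs(TABLE_2["CS"][0]) / TABLE_2["SB"][0]

print(round(average_runs, 1), round(cs_to_sb, 2))
```

The same function can be applied to the counts for any single team season, provided all sixteen counting statistics are available.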

Table 3 shows the regression results for a group of commonly available statistics. As might be expected, the STD for this list is between the values obtained for Tables 1 and 2.


       param   weight  average  contrib
 0      SNGL   0.4843    981.6   475.38
 1      DOUB   0.7393    246.1   181.96
 2      TRIP   1.1697     35.0    40.95
 3      HRUN   1.4281    141.1   201.54
 4     BB+HP   0.3334    515.5   171.86
 5   AB-HITS  -0.1145   4006.9  -458.97
 6      ERRS   0.4918    178.2    87.65

Table 3. The 7 Parameter Linear Weights Formula, "Basic"

The methodology for this offensive measure survey, following closely the HGB, pp 57-61, and Bennett&Flueck presentations on the same topic, is to determine the constants SLOPE and INTERCEPT in the following linear regression formula:

RUNS = SLOPE*MEASURE + INTERCEPT

The resulting formula is used to evaluate the goodness of the fit by evaluating the Chi-square statistic over the entire team season data set. The direct linear weights measures ("ALL", "Palmer+errors", "BASIC", "BWOE") are also subjected to the linear fitting procedure even though they are, by definition, already linear regression formulas; this results in slopes near 1 and intercepts near 0. (BASIC is defined in Table 3.) If a constant term had been included in the linear weights formulas, the slope would be exactly 1 and the intercept would be 0. The linear fit also allows the calculation of the correlation coefficient, which is useful for the comparisons. The linear fit can be considered a conversion of the measure into runs. The correlation and standard deviation results are intrinsic to the measure. The linear fit makes the relationship explicit.
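A minimal Python sketch of this survey calculation for a single measure is given below. It assumes the ordinary closed form solution for a straight line fit; the function name survey_fit and the five data points are hypothetical and are not taken from the Retrosheet data. It returns the same quantities that appear as columns in the survey tables.

```python
import math

def survey_fit(measure, runs):
    """Fit RUNS = SLOPE*MEASURE + INTERCEPT and summarize the fit."""
    n = len(runs)
    mean_x = sum(measure) / n
    mean_y = sum(runs) / n
    sxx = sum((x - mean_x) ** 2 for x in measure)
    syy = sum((y - mean_y) ** 2 for y in runs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measure, runs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Chi-square: sum of squared differences between actual and fitted runs.
    chisq = sum((y - (slope * x + intercept)) ** 2
                for x, y in zip(measure, runs))
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": sxy * sxy / (sxx * syy),  # squared correlation coefficient
        "chisq": chisq,
        "std": math.sqrt(chisq / n),    # same STD definition as in the text
    }

# Hypothetical measure values and actual runs for five team seasons.
measure = [650.0, 700.0, 720.0, 760.0, 810.0]
runs = [640.0, 715.0, 705.0, 770.0, 800.0]
fit = survey_fit(measure, runs)
print({key: round(value, 4) for key, value in fit.items()})
```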


rank                measure     slope  intrcpt      R^2      chisq      std
   1          All (Table 2)    0.9982     1.31  0.96561     477592     18.8
   2          Palmer+errors    0.9973     1.91  0.96094     542474     20.1
   3        Basic (Table 3)    0.9996     0.26  0.95691     598499     21.1
   4         EqR(Davenport)    0.9779    15.03  0.95373     642628     21.8
   5            XR(Furtado)    0.9773    28.11  0.95123     677268     22.4
   6         BWOE (Table 1)    0.9952     3.46  0.95121     677649     22.4
   7         LWTS(unnormed)    0.9948     5.03  0.95048     687687     22.6
   8              OERA+EFBF    0.8701    11.63  0.95008     693372     22.7
   9           ERP(Johnson)    0.9737    20.11  0.94740     730542     23.3
  10              ERPA(B&F)    0.9581    45.41  0.94676     739477     23.4
  11               RC(1998)    0.9317    54.69  0.94564     754906     23.7
  12           XRR(Furtado)    0.9741    33.58  0.94553     756505     23.7
  13             RC(Tech-1)    0.9300    52.43  0.94547     757357     23.7
  14       Base Runs(Smyth)    0.9714    22.46  0.94518     761305     23.8
  15         ERP-2(Johnson)    0.9607    34.04  0.94393     778724     24.0
  16       LWTS(KenSchmidt)    0.8958   -50.42  0.94316     789390     24.2
  17            OERA(Cover)    0.8902    51.99  0.94261     797002     24.3
  18           OPA3(Pankin)    0.3954     7.09  0.94080     822143     24.7
  19              RC(basic)    0.9504    36.71  0.93798     861287     25.3
  20            LWTS(fixed)    0.9658    22.71  0.93788     862728     25.3
  21             BOP(Codel)    0.2278  -115.48  0.93316     928241     26.2
  22         AB*TA(Boswell)    0.2272   -74.52  0.92393    1056506     28.0
  23            AB*DX(Cook)    1.0171    36.40  0.92198    1083529     28.4
  24        AB*BRA(OBP*SLG)    1.0481    28.84  0.92092    1098347     28.5
  25            BWA(Cramer)    0.8545    13.22  0.91564    1171672     29.5
  26            BASES+BB+HP    0.3238  -176.19  0.90766    1282440     30.8
  27   OBP+(3/4)ISP(Rickey)    0.3984  -230.39  0.89572    1448251     32.8
  28                 AB*PRO    0.2444  -258.66  0.87678    1711283     35.6
  29                  BASES    0.3825  -119.53  0.86989    1806946     36.6
  30             HITS+BB+HP    0.5074  -298.13  0.79057    2908619     46.5
  31                 AB*OBP    0.5858  -343.15  0.76977    3197452     48.7
  32                    ISP    0.5986   257.66  0.74554    3534045     51.2
  33            OPS=SLG+OBP 1862.1326  -649.56  0.73224    3718676     52.5
  34          TBR(O'Reilly) 2625.6358  -479.22  0.71483    3960463     54.2
  35                   HITS    0.6763  -249.02  0.69569    4226371     56.0
  36                    SLG 2517.2123  -295.92  0.68757    4339060     56.7
  37                     BA 5682.6046  -773.36  0.52894    6542241     69.7
  38                AVERAGE    0.0000   700.37  0.00000   13888291    101.5

Table 4. A Survey of Baseball Offense Evaluation Measures.

It is necessary to distinguish between COUNTING stats, the number of times the particular event being counted occurred and RATE stats, a counting stat that has been divided by the number of chances used to get the particular counted value. The best known rate stat is BA, the batting average, defined as hits/atbats. Since a rate stat does not reflect the number of chances used it will not be a good predictor of counting stats. Since team runs scored is the counting stat being estimated in Table 4 the poor showing of the rate stats (BA, SLG, OPS and TBR) is not unexpected. The counting version of BA is total hits (HITS) which is a somewhat better predictor of team runs. Similarly compare SLG with BASES and TBR with BASES+BB+HP. In Table 4 several other rate stats have been multiplied by AB indicated by AB* in the measure column.

The simple AVERAGE of all the runs scored is included because it establishes a maximum for both the Chi-square and the Standard Deviation and thus provides a reference for evaluating the other measures. The column R^2 is the square of the linear correlation coefficient between the measure and observed runs. The conventional interpretation is that the correlation coefficient squared is the fraction of variance (Chi-square) in runs explained by the measure.
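The "fraction of variance explained" interpretation can be checked against the printed table. For a least squares straight line fit, R^2 equals one minus the ratio of the measure's Chi-square to the Chi-square of the AVERAGE row. The short Python sketch below applies this identity to three rows of Table 4; the names are hypothetical, and agreement with the R^2 column should be expected only to within the rounding of the printed values.

```python
CHISQ_AVERAGE = 13888291  # Chi-square of the AVERAGE row in Table 4

# Chi-square values as printed in Table 4 for three measures.
CHISQ = {
    "All (Table 2)": 477592,
    "BWOE (Table 1)": 677649,
    "BA": 6542241,
}

def r_squared(chisq, chisq_average=CHISQ_AVERAGE):
    """Fraction of the Chi-square about the mean removed by the measure."""
    return 1.0 - chisq / chisq_average

r2 = {name: r_squared(value) for name, value in CHISQ.items()}
for name, value in r2.items():
    print(f"{name:>15} {value:.5f}")
```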

The first seven entries in Table 4 are linear weights formulas. ALL and BASIC are defined in the discussions of Tables 2 and 3. "Palmer+errors" is a linear regression that uses the same terms as the Palmer linear weights formula and an additional errors term. "BWOE" is a four parameter linear weights formula that summarizes all hitting with total bases instead of the four individual hit type counts (Table 1). Total bases in combination with at bats, errors and walks plus hit by pitch is a simple and effective measure for estimating team runs from basic hitting statistics. When used as an offensive runs measure for a player, the error term would be dropped. The component of these linear weights formulas leading to their excellent performance is the inclusion of a negative weight for outs made. The negative value of an out can be seen explicitly in Tables 1 - 3.
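As an illustration of dropping the error term, the Python sketch below applies the three remaining BWOE weights from Table 1 to a single batting line. The player statistics are invented for the example and the function name is hypothetical.

```python
# BWOE weights as printed in Table 1; the ERRS term is omitted for a player.
W_BASES = 0.3761
W_BB_HP = 0.3124
W_OUTS = -0.0940  # weight on (AB - hits)

def bwoe_player_runs(total_bases, walks_plus_hbp, at_bats, hits):
    """Offensive runs estimate for one player from three counting stats."""
    return (W_BASES * total_bases
            + W_BB_HP * walks_plus_hbp
            + W_OUTS * (at_bats - hits))

# Hypothetical season: 600 AB, 180 hits, 300 total bases, 70 BB+HP.
player_runs = bwoe_player_runs(total_bases=300, walks_plus_hbp=70,
                               at_bats=600, hits=180)
print(round(player_runs, 1))
```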

The four linear weights formulas that had their weights explicitly determined by regressions clearly provide the best representation of runs scored from the available team season data. Not surprisingly, including a larger selection of counting statistics results in a more accurate formula.

A Linear Weights formula determined from team season data should include a team errors term. This is evident as error contributions to runs scored are estimated to be around 10% of total runs allowed depending on the details of the particular formula being used. The effect of errors is larger than several other terms often used. The use of an explicit error term preserves the accuracy of the other terms in the regression.

OERA is the Cover and Keilers Offensive Earned Run Average. Their method determines the expected future runs for all 24 combinations of outs and base runners in an inning. OERA is 9 times the future runs expectation for no outs and no one on base. A separate document, Implementing the Cover-Keilers Offensive Earned Run Average, provides additional details of this measure including source code implementing its calculation. Expected Future Runs have been determined from the data used in this study and are compared to the OERA predictions. Since OERA was explicitly designed to be a player evaluator, it did not include the effect of defensive team errors. Errors that allow a batter to reach first have an effect very similar to that of singles. Adding the number of such errors (ER_BF) to the number of singles and then doing the calculation gives the entry labeled OERA+EFBF in Table 4. This combination provides a larger overestimate of team runs than the basic OERA but also correlates better with them.

Pete Palmer has introduced a method for evaluating the offensive contribution of a ball player known as linear weights. Three variations of this method are in Table 4, each starting LWTS. A key feature of Palmer's linear weights system is normalization. An average player or team will have a rating of 0 runs. This is accomplished by adjusting the weight of the AB-Hits term so that a league will have 0 runs for the season. A separate weight is required for each league and each season. This calculation is the (normed) variant in Table 4. The (unnormed) variant adjusts the AB-Hits weight for each league season to obtain agreement with total runs for the season. The (fixed) variant uses a single AB-Hits weight obtained by averaging the individual AB-Hits weights from the (unnormed) calculation. The (unnormed) calculation ranks among the better ones and (fixed) is only slightly inferior.
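The adjustment of the AB-Hits weight described above amounts to solving one linear equation per league season. The Python sketch below shows the arithmetic under loudly stated assumptions: the event weights are stand-ins copied from Table 3, NOT Palmer's published values, the league totals are invented, and the function names are hypothetical. Setting the target to 0 gives the normalized weight, setting it to the league's total runs gives the (unnormed) weight, and averaging the (unnormed) weights over league seasons gives the (fixed) weight.

```python
# Stand-in weights for the non-out events (from Table 3, not Palmer's own).
EVENT_WEIGHTS = {"SNGL": 0.4843, "DOUB": 0.7393, "TRIP": 1.1697,
                 "HRUN": 1.4281, "BB+HP": 0.3334}

def positive_runs(totals):
    """Runs credited to the non-out events of a league season."""
    return sum(EVENT_WEIGHTS[name] * totals[name] for name in EVENT_WEIGHTS)

def solve_out_weight(totals, outs, target_runs):
    """Weight on (AB - hits) that makes the league total equal target_runs."""
    return (target_runs - positive_runs(totals)) / outs

def league_rating(totals, outs, out_weight):
    """Linear weights total for the league with the given out weight."""
    return positive_runs(totals) + out_weight * outs

# One invented league season.
totals = {"SNGL": 13700, "DOUB": 3400, "TRIP": 490,
          "HRUN": 1980, "BB+HP": 7200}
outs = 56000          # league AB - hits
league_runs = 9800    # league total runs scored

w_normed = solve_out_weight(totals, outs, target_runs=0)
w_unnormed = solve_out_weight(totals, outs, target_runs=league_runs)

def fixed_out_weight(unnormed_weights):
    """The (fixed) variant: average of the per league season weights."""
    return sum(unnormed_weights) / len(unnormed_weights)

print(round(w_normed, 4), round(w_unnormed, 4))
```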

The other measures in the survey include Paul Johnson's ERP, Estimated Runs Produced, BJBA85 pp 277-281, which is also a linear formula. The Bill James Runs Created formulas Tech-1, RC(Tech-1), and basic, RC(basic), are defined in BJHBA(1987) pp 279 and 281. A newer version of Runs Created from the Bill James ML Handbook for 1998 is given as RC(1998) and is slightly more accurate than the older versions. BOP is Base Out Percentage by Barry Codell and is defined in BRJ79 pp 35-39. BWA is Richard Cramer's Batter Win Average, BRJ77 pp 74-79. TA is Tom Boswell's Total Average and is defined in TB5. Earnshaw Cook gave us "Cook's Scoring Index", abbreviated DX, HGB p 45. BRA is Dick Cramer's Batter Run Average, defined as the product of On Base Percentage and Slugging Average. PRO is production, which is defined as the sum of On Base Percentage, OBP, and Slugging Average. ISP is isolated power and is the difference between slugging average and batting average. Total Bases, total hits and OBP are also shown as hitting evaluation measures. Curiously, batting average and slugging average are not as effective predictors of runs as are total hits and total bases. ERPA is from Bennett&Flueck. EqR is Clay Davenport's Equivalent Runs and is documented in the Baseball Prospectus. Jim Furtado's extended runs formulas are XR and XRR. OPA3 is from Mark Pankin's article in Operations Research. TBR = (BASES+BB+HP)/(AB+BB+HP+SF). While I have tried to find and cite the original references on the measures in Table 4, the best current source for such information is the Glossary of Statistical Terms in TB7.

Since the original posting of this paper, Ken Schmidt and David Smyth have sent me their evaluators which are identified by their names.

Specifically excluded from this analysis are any formulas that include runs batted in (RBI) or runs. Since runs are to be explained or predicted by the measures, using runs or RBIs in a measure is essentially circular reasoning (violates the need for linear independence of the quantities used in the regression).

Evaluating the offensive component of a baseball player appears to be straightforward. The Tech-1 version of Runs Created is an excellent measure and deserves the increasing usage it is receiving. However, a well-crafted linear weights formula can provide the most accurate measure relating team counting statistics to team runs scored. A linear weights measure also allows evaluation of the same formula using individual player statistics without increasing uncertainty by more than the effects caused by the size of the statistical sample.

Application of the Offensive Evaluation Measures to the Defense

At present, the official statistical record for the defense is not kept to the same level of detail as for the offense. However, any quantity that can be defined and counted for offensive evaluation purposes can be defined for the defense. The very great advantage of working with full season events files is the ability to define and determine non traditional statistics. All the team related counting statistics used in the preparation of Tables 1 - 4 also have been determined for the defense. The same measure survey, using exactly the same methodology as described for the offense, was done on the defensive statistics with the results shown in Table 5.


rank                measure     slope  intrcpt      R^2      chisq      std
   1                    All    0.9970     2.14  0.96545     498562     19.2
   2          Palmer+errors    0.9971     2.09  0.96127     558920     20.4
   3                  Basic    0.9999     0.10  0.95556     641320     21.8
   4                   BWOE    0.9963     2.68  0.95187     694622     22.7
   5         EqR(Davenport)    0.9836    10.96  0.95019     718764     23.1
   6            XR(Furtado)    0.9892    19.91  0.94802     750149     23.6
   7              OERA+EFBF    0.8539    23.95  0.94656     771239     23.9
   8         LWTS(unnormed)    1.0166   -10.22  0.94633     774593     24.0
   9               RC(1998)    0.9362    51.46  0.94356     814502     24.6
  10             RC(Tech-1)    0.9348    49.03  0.94309     821229     24.7
  11           ERP(Johnson)    0.9800    15.72  0.94243     830783     24.8
  12           XRR(Furtado)    0.9828    27.63  0.94172     841030     25.0
  13       Base Runs(Smyth)    0.9840    13.37  0.94120     848509     25.1
  14              ERPA(B&F)    0.9682    38.44  0.94118     848864     25.1
  15         ERP-2(Johnson)    0.9723    25.99  0.94035     860789     25.3
  16       LWTS(KenSchmidt)    0.8991   -53.23  0.93758     900856     25.9
  17            OERA(Cover)    0.8840    56.08  0.93757     900886     25.9
  18           OPA3(Pankin)    0.3993     0.30  0.93586     925694     26.2
  19            LWTS(fixed)    0.9856     8.83  0.93284     969191     26.8
  20              RC(basic)    0.9493    37.31  0.93109     994434     27.2
  21             BOP(Codel)    0.2308  -126.17  0.92669    1057887     28.0
  22         AB*TA(Boswell)    0.2330   -94.55  0.92405    1096086     28.5
  23            BWA(Cramer)    0.8436    21.89  0.92181    1128403     28.9
  24            AB*DX(Cook)    1.0191    35.00  0.91408    1239979     30.3
  25        AB*BRA(OBP*SLG)    1.0463    29.86  0.91302    1255302     30.5
  26            BASES+BB+HP    0.3311  -195.94  0.90291    1401084     32.2
  27   OBP+(3/4)ISP(Rickey)    0.4052  -246.19  0.88779    1619331     34.7
  28                 AB*PRO    0.2465  -267.05  0.86999    1876233     37.3
  29                  BASES    0.3905  -136.52  0.85446    2100334     39.5
  30             HITS+BB+HP    0.5099  -303.15  0.81297    2699078     44.7
  31                 AB*OBP    0.5921  -354.32  0.79403    2972460     47.0
  32                    ISP    0.6558   215.38  0.74313    3706919     52.4
  33                   HITS    0.6932  -272.77  0.73146    3875330     53.6
  34            OPS=SLG+OBP 1860.1040  -648.08  0.73116    3879705     53.6
  35          TBR(O'Reilly) 2716.1334  -519.88  0.71842    4063615     54.9
  36                    SLG 2590.4804  -324.94  0.68186    4591124     58.4
  37                     BA 5704.1056  -778.93  0.58088    6048508     67.0
  38                AVERAGE    0.0000   700.37  0.00000   14431303    103.5

Table 5. Standard Evaluation Measures Applied to the Defense

All the discussions following Table 4 apply to Table 5. The standard deviations achieved and the relative rankings of the different measures are essentially the same using either offensive or defensive statistics.

There are two additional counting statistics available for defensive play analysis: put outs and assists. Put outs are redundant in the regressions since each out must be included in only one linear weights term. Assists are new information thus potentially could be expressed as a runs value by the regression. Table 6 shows the results obtained by adding assists to the set of parameters defining the linear regression, ALL.


       param   weight  average  contrib
 0      OUTS  -0.1025   3031.1  -310.55
 1      SNGL   0.4828    981.6   473.91
 2      DOUB   0.6778    246.1   166.82
 3      TRIP   1.1277     35.0    39.48
 4      HRUN   1.3949    141.1   196.86
 5     BB+HP   0.3498    515.5   180.32
 6       IBB   0.1178     48.5     5.71
 7        CS  -0.3077     55.9   -17.20
 8       ROH  -0.6643     26.3   -17.44
 9       GDP  -0.4556    149.2   -67.99
10        SB   0.1057     96.2    10.16
11     ER_BF   0.6350     73.1    46.43
12     ER_RA   0.3019    105.1    31.74
13       RAO  -0.0769    196.1   -15.08
14       RSO   0.5775     81.9    47.30
15         K  -0.1075    908.3   -97.63
16   ASSISTS   0.0161   1703.0    27.46

Table 6. A Linear Weights Determination for Defensive Statistics

Compared to the entry for ALL in Table 5, the Standard Deviation is unchanged. The total runs value of assists is +27.5, whereas in a defensive linear weights measure assists would be expected to lead to reduced runs allowed, that is, to carry a negative weight. Consequently, assists do not appear to be significant at this level of analysis and demonstrably cannot be associated with a runs value in a linear weights formula.

It is possible to combine the offensive and defensive data sets into a single one and perform the same survey calculation. The relative rankings and standard deviations are essentially the same, providing another demonstration that all events have the same meaning to both the offense and defense.

Conclusions

Linear weights is best considered a technique for generating an evaluation measure. Different combinations of parameters and differing sets of data will lead to different values for individual parameters in these formulas. When a judicious set of parameters is used, linear weights formulas appear to provide the most accurate evaluation measures available. In addition, a linear formula can be developed using one set of data, team season results for example, and be applied meaningfully to other tasks such as evaluating individual players.

The four term formula, BWOE, is a particularly good combination of accuracy and simplicity. In spite of its simplicity it is as accurate as the best of the existing measures for predicting team runs for the data available to this study: among the previously published measures only EqR and XR rank ahead of it in Table 4, by narrow margins, and it ranks ahead of both in Table 5. When used for player evaluation only three coefficients and three readily available statistics are needed.

Errors must be accounted for when determining the coefficients in any, but especially a linear weights, evaluation measure. When the measure is applied to an individual player the error term is simply not used.

The relative ranking of offensive runs estimation measures from the basic counting statistics is essentially the same as given by Thorn and Palmer (HGB pp 58-59). The only significant difference between HGB and this paper is their linear weights result. In this investigation their formula yields standard deviations of about 24 runs when applied to single or groups of seasons having the same number of games. Offensive player evaluation is in good shape with several strong techniques vying for general acceptance.

When the same counting statistics are available for the defense as for the offense, the hitting measures do an equally good job of estimating runs allowed. All the methods used to rate offense are applicable to defensive evaluations. Assists do not improve the accuracy of linear weights formulas when they are applied to the defense.

References

"An Evaluation of Major League Baseball Offensive Performance Models", Jay M. Bennett & John A. Flueck, American Statistician, Vol 37, No 1 (Feb 1983), pp 76-82
Baseball Prospectus (Brassey's, 1998)
BJBAyy is the Bill James Baseball Abstract for year yy. (Ballantine Books)
BJHBA is the Bill James Historical Baseball Abstract. (Villard Books, 1986)
BRJyy is the SABR Baseball Research Journal for year yy.
"Evaluating Offensive Performance in Baseball", Mark D. Pankin, Operations Research 26, #4, Jul-Aug 1978, pp 610-619
HGB is the Hidden Game of Baseball by John Thorn and Pete Palmer. (Doubleday, 1984)
"An Offensive Earned-Run Average for Baseball", Thomas M. Cover & Carroll W. Keilers, Operations Research, Vol 25, No 5, Sep. 1977
TB7 is Total Baseball, Seventh Edition. John Thorn, Pete Palmer and Michael, (Total Sports Publishing, 2001)

Revision History

June 13, 1998: Original Posting to the Web

August 6, 1998: 1980 season statistics added to the analysis. A completely new treatment of the Palmer Linear Weights system was used. A paper copy of this version of the document is on file with the SABR Archives.

January 3, 2000: Data from 1980 -1998 seasons was used in the analysis. A few additional evaluators were included and several references were added to the bibliography.

June 2, 2000. Data from the 1979 - 1999 seasons has been used in the analysis.

June 6, 2001. Data from the 1978 - 2000 seasons has been used in the analysis. The linear weights formula BASIC was added to the analysis.

July 4, 2002. Data from the 1969 and 1974-2001 seasons has been used in the analysis.

September 29, 2003. Data from 1963, 1967 and 1968 AL and both leagues in 1969, 1972-2002 has been used in the analysis.

August 7, 2006. All RETROSHEET available seasons from 1957 to 2005 are included in the analysis.

January 28, 2008. All RETROSHEET available seasons from 1956 to 2007 are included in the analysis.

April 4, 2008. An error in the counting of CS in OUTS was corrected. Only the OUTS and CS lines of Tables 4 and 6 are changed. No changes are made to the conclusions.

January 1, 2008. All RETROSHEET available seasons from 1954 to 2008 are included in the analysis.



Copyright © 1998-2009, John F. Jarvis