1) Introduction
This study addresses the concept of projecting a player's career:
estimating how he will do in one or more future seasons based on what
has been done so far in a career. A second possible use of projection
systems is interpolation, estimating what might have been
accomplished when a player misses all or most of a season.
Perhaps the best known system of this kind is Brock2, introduced by
Bill James. I will also describe a system of my own design and the
simplest projection system: assume a player will do the same next
year as he did last year. Projections made by each of these will be
compared to the actually achieved results.
2) Player Evaluation Measures
Since a single number is being used to represent player achievement,
the difficult question of what to use must be faced. Many different
quantities have been proposed for this purpose. Counting stats, as
well as various combinations of them, can be used. The counting stats
used are RUNS, RBIS, hits, and home runs. The combinations include James'
Runs Created (BASIC, see Total Baseball 7th Ed. pp 2498-2499),
(RUNS+RBIS)/2 and two linear weights formulas, BWOE and SDTHWOE. Each
letter in the linear weights formulas represents a term in the
formula. SDTH represents the number of singles, doubles, triples and
home runs. B is total bases, E is errors and O is outs defined as AB
- Hits. The weights for the terms have been determined by linear
regression using team data from 1978-2001. When used as a measure of
a player's contribution, the error term is not included in the linear
weights formulas. Both linear weights formulas estimate season runs
contributed by the player. The SDTHWO measure is used in this study.
Details on the linear weights formulas and the efficacy of some of
the other measures are available in my web published study A Survey
of Baseball Player Performance Evaluation Measures. In this study
SDTHWO is referred to as BASIC.
3) Data Source
The source of the batting data used in this study is the Sean Lahman
version 4.5 database.
Using the Lahman version 4.5 database for the years 1910 to 2001,
batting careers were assembled for players whose careers are complete
within the years 1911 to 2000. Pitcher records are included. A career
is defined as complete if the player did not play in 1910 or 2001.
This is a weak definition of completeness, especially at the
beginning of a career.
This version of the database does not include fielding position
information, which is needed by the Brock2 system.
4) The Career Function
I feel very strongly that any formula containing arbitrary or data
related numeric constants that is used to represent or evaluate
player achievements must derive these constants from the game's
statistical record. Among the formulas included in this category are
the multitudinous linear weights batting evaluation formulas and the
projection formulas discussed in this study. The following discussion
of my career function will describe one method for determining values
for seemingly arbitrary constants.
The form of the career summary function I have developed, CFTN, is
suggested by the typical ML career: a period of development, mid
career at full capabilities, followed by the inevitable decline due
to age. Each individual career is characterized by two constants,
mpeak and shape. Shape is related to the length of the career and
mpeak is a measure of the peak ability of a player. Given a player
age, CFTN returns a value indicating his ability at that age. The
ability is given in the same units as the measure used in the
creation of the function. CFTN is the product of three terms, one for
each of the three general periods of the career, given by the
following computational expressions.
1) cftn_b = cftn_ba + cftn_bb*shape
2) cftn_c = cftn_ca - cftn_cb*shape
3) cftn_t1 = cftn_t1z - cftn_t1s*shape
4) cftn_t2 = cftn_t2z + cftn_t2s*shape
5) a = 1.0/(1.0 + exp(-cftn_a*(age-cftn_t1)))
6) b = 1.0 + cftn_b*(age-20)
7) c = 1.0/(1.0 + exp(cftn_c*(age-cftn_t2)))
8) CFTN = mpeak*a*b*c
The nine constants cftn_a, cftn_ba, cftn_bb, cftn_ca, cftn_cb,
cftn_t1z, cftn_t1s, cftn_t2z and cftn_t2s are global, the same for
all players. Determination of these values is the key to CFTN and
will be described in detail.
Expression 5) determines the behavior of CFTN at the beginning of a
career, 6) the middle and 7) the end. The resulting value, 8), is
the value of the function at a given age. An illustration of this is
given in Figure 1, which has been computed from the parameters given
in Table 1.
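As a minimal sketch, expressions 1) through 8) might be coded in Python as follows. This is an illustration only, not the code used for the study. The function name cftn and the upper-case constant names are illustrative. The values are copied from Table 1 on the assumption that the entries printed there as cftn_c, cftn_t1 and cftn_t2 are the constants cftn_ca, cftn_t1z and cftn_t2z.

```python
import math

# Global constants from Table 1. Assumption: the entries printed there as
# cftn_c, cftn_t1 and cftn_t2 are the constants cftn_ca, cftn_t1z, cftn_t2z.
CFTN_A = 0.864861
CFTN_BA, CFTN_BB = -0.031985, -0.007588
CFTN_CA, CFTN_CB = 0.937970, 0.774984
CFTN_T1Z, CFTN_T1S = 33.893580, 11.006860
CFTN_T2Z, CFTN_T2S = 27.122430, 24.778990


def cftn(age, shape, mpeak=1.0):
    """Evaluate expressions 1) through 8) for one player age."""
    b_slope = CFTN_BA + CFTN_BB * shape                 # 1)
    c_rate = CFTN_CA - CFTN_CB * shape                  # 2)
    t1 = CFTN_T1Z - CFTN_T1S * shape                    # 3)
    t2 = CFTN_T2Z + CFTN_T2S * shape                    # 4)
    a = 1.0 / (1.0 + math.exp(-CFTN_A * (age - t1)))    # 5) development
    b = 1.0 + b_slope * (age - 20)                      # 6) mid career
    c = 1.0 / (1.0 + math.exp(c_rate * (age - t2)))     # 7) decline
    return mpeak * a * b * c                            # 8)


if __name__ == "__main__":
    # Unit curve (mpeak = 1) for one shape value at a few ages.
    for age in range(20, 44, 4):
        print(age, round(cftn(age, shape=1.0), 3))
```

Because mpeak enters only as a multiplier, doubling it doubles the value at every age. The linear term 6) is not bounded below, so with these constants the product is slightly negative at age 45 when shape is 1.5.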
The function CFTN is linear in the parameter mpeak. Thus, for any set
of values of the global constants and a value for the parameter
shape, mpeak can be calculated:
9) mpeak = sum(measure*CFTN) / sum(CFTN*CFTN)
In 9) the sums are over the seasons in a player's career. Measure is
the quantity chosen to represent a player's achievement. For this
calculation CFTN is evaluated with mpeak set to 1. Varying shape
until a minimum in the variance (sum of squared differences between
the actual value and estimated value of the measure) is found for the
particular player determines the two parameters. To minimize
numerical problems in these calculations the shape parameter is
constrained to the range -0.25 to 1.50. The units of CFTN are the
units of the measure used. If the measure specifies season runs then
mpeak will have the same interpretation. For some players and some
sets of the global parameters more than one minimum in the shape
range of -0.25 to 1.50 is possible.
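The two-parameter fit can be sketched as follows. For a fixed shape, expression 9) is the least squares value of mpeak, so only shape needs to be searched. The text does not say how shape is varied. The evenly spaced scan used here is an assumption, chosen because it still finds the lowest variance when more than one minimum exists. All function names are illustrative, and the constants repeat Table 1 with its cftn_c, cftn_t1 and cftn_t2 entries read as cftn_ca, cftn_t1z and cftn_t2z.

```python
import math

# Table 1 values; assumes its cftn_c, cftn_t1, cftn_t2 entries are
# cftn_ca, cftn_t1z, cftn_t2z.
A = 0.864861
BA, BB = -0.031985, -0.007588
CA, CB = 0.937970, 0.774984
T1Z, T1S = 33.893580, 11.006860
T2Z, T2S = 27.122430, 24.778990


def cftn(age, shape, mpeak=1.0):
    """Expressions 1) through 8)."""
    a = 1.0 / (1.0 + math.exp(-A * (age - (T1Z - T1S * shape))))
    b = 1.0 + (BA + BB * shape) * (age - 20)
    c = 1.0 / (1.0 + math.exp((CA - CB * shape) * (age - (T2Z + T2S * shape))))
    return mpeak * a * b * c


def fit_mpeak(seasons, shape):
    """Expression 9) for a fixed shape; seasons is a list of (age, measure).

    Returns (mpeak, variance), where variance is the sum of squared
    differences between the actual and estimated measure.
    """
    curve = [cftn(age, shape) for age, _ in seasons]  # mpeak set to 1
    num = sum(m * f for (_, m), f in zip(seasons, curve))
    den = sum(f * f for f in curve)
    mpeak = num / den
    variance = sum((m - mpeak * f) ** 2 for (_, m), f in zip(seasons, curve))
    return mpeak, variance


def fit_career(seasons, lo=-0.25, hi=1.50, steps=176):
    """Scan shape over [lo, hi] in equal steps; keep the smallest variance."""
    best = None
    for i in range(steps):
        shape = lo + (hi - lo) * i / (steps - 1)
        mpeak, variance = fit_mpeak(seasons, shape)
        if best is None or variance < best[2]:
            best = (shape, mpeak, variance)
    return best  # (shape, mpeak, variance)


if __name__ == "__main__":
    # Hypothetical career generated from the function itself, not real data.
    career = [(age, cftn(age, 1.0, 500.0)) for age in range(22, 38)]
    shape, mpeak, variance = fit_career(career)
    print(round(shape, 2), round(mpeak, 1))
```

Missing seasons are handled by leaving them out of the seasons list, so they contribute nothing to either sum.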
The constant cftn_bb has a negative value which, when combined with
values of shape near the upper limit (1.5), can result in a negative
value for CFTN towards the end of a career. In the plots included in
this report negative values of CFTN have been set to 0.
From the complete player career record a training set of players
having at least 2000 at bats, at least 6 seasons played and no more
than 3 missing seasons was selected. 1230 players satisfied these
criteria. Using this data set the global parameters are determined by
minimizing the total variance, which is the sum of the variances
resulting from a determination of mpeak and shape for each player in
the training set. For this variance computation negative values for
CFTN are not set to zero. This global minimization is accomplished by
a standard numerical procedure given in section 10.5 of Numerical
Recipes, 2nd Edition, Press et al., 1992. Missing seasons within a
career are not used in the computation of variance for the
optimization procedure. Typically, a single global optimization takes
30 minutes on a 500 MHz INTEL Pentium III processor.
To determine the parameters used, an evaluation measure was specified
(SDTHWO) and a number of CFTN global parameter optimizations were
performed. To avoid ending the optimizations in the same local
minimum, slightly different initial values for the CFTN parameters
were used for each starting point. The obvious criterion for choosing
a parameter set is the smallest total variance. Since the goal is to
project a career, another selection criterion is which parameter set
provides the most accurate projections, defined as the highest
correlation. Details on correlations are given in Section 7. There is
a generally increasing accuracy of projection with smaller variance
but the relationship is noisy. After carefully examining several
parameter sets, the one providing the best projections based on eight
seasons was chosen and is given in Table 1. This parameter set
provides a low but not the minimum variance observed and is used with
the SDTHWO evaluation measure for all the CFTN computations given in
this report. The R2 vs Final Variance plot, Figure 11, shows the
results of 258 determinations of CFTN global constants for the first
year projections based on 4, 8 and 12 seasons. The general trend that
lower variance corresponds to higher correlation is evident. There
are also some clearly poor CFTN parameter sets for any final variance
when the evaluation criterion is correlation. The form of CFTN is
only slightly dependent on the player measure used.
Figure 2 displays the resulting function for six values of the shape
parameter using the global parameters given in Table 1. Each curve in
Figure 2 is evaluated with mpeak set to 1. As shape increases the
general effect is towards a lower starting age for the career. Once
shape is above about 0.5 all careers end around age 45. Values of
shape less than 0.2 describe careers that are fairly short, with the
peak age decreasing as shape decreases. The set of shape curves
obtained from the minimum variance parameter set is not qualitatively
different.
While the parameter shape is related to the length of a player's
career, mpeak is a measure of its peak value. However, as can be seen
in Figure 2, mpeak depends on shape. If player mpeak values are to be
compared, a correction needs to be made to mpeak to compensate for
this shape dependency. mpeak' (mpeak prime) is this corrected value:
10) mpeak' = mpeak * max(CFTN(shape,1.0))
where CFTN is evaluated from age 18 to 45.
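Expression 10) amounts to scaling mpeak by the height of the unit curve for the player's shape. A minimal sketch follows. It assumes that CFTN is sampled at whole-year ages from 18 to 45 inclusive, which the text does not state. The names are illustrative, and the constants again read the Table 1 entries cftn_c, cftn_t1 and cftn_t2 as cftn_ca, cftn_t1z and cftn_t2z.

```python
import math

# Table 1 values; assumes its cftn_c, cftn_t1, cftn_t2 entries are
# cftn_ca, cftn_t1z, cftn_t2z.
A = 0.864861
BA, BB = -0.031985, -0.007588
CA, CB = 0.937970, 0.774984
T1Z, T1S = 33.893580, 11.006860
T2Z, T2S = 27.122430, 24.778990


def cftn(age, shape, mpeak=1.0):
    """Expressions 1) through 8)."""
    a = 1.0 / (1.0 + math.exp(-A * (age - (T1Z - T1S * shape))))
    b = 1.0 + (BA + BB * shape) * (age - 20)
    c = 1.0 / (1.0 + math.exp((CA - CB * shape) * (age - (T2Z + T2S * shape))))
    return mpeak * a * b * c


def mpeak_prime(mpeak, shape, ages=range(18, 46)):
    """Expression 10): mpeak times the maximum of CFTN(shape, 1.0).

    Assumption: integer ages 18 through 45 inclusive are sampled.
    """
    return mpeak * max(cftn(age, shape, 1.0) for age in ages)


if __name__ == "__main__":
    # Height of the unit curve for a few shape values.
    for shape in (0.2, 0.6, 1.0, 1.4):
        print(shape, mpeak_prime(1.0, shape))
```

For a positive mpeak the result is simply the largest value the fitted curve reaches over the sampled ages, which is what makes values from players with different shapes comparable.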
Figure 3 shows a plot of mpeak', shape pairs for the 1921 complete
careers having at least 1000 AB and at least 6 seasons in the majors.
The evaluation measure for this plot is the linear weights formula
SDTHWO, which can be interpreted as runs contributed to the team
during a season. Table 2 lists the ten players having the highest
mpeak' values from this data set. While there will always be debate
on the question of which ten players have the highest peak values,
this list suggests that the values have some connection with
reality.
Figure 4 shows career and CFTN data for Rod Carew (shape=1.33,
mpeak'=93.8) using the measure SDTHWO. The solid circles show his
career values and the four solid curves display CFTN values. The
curve labeled "entire" is based on his complete career. The large
variation in season to season output is typical and is the root of
the difficulty in creating an accurate representation of a career.
Projections can be made with CFTN by limiting the data used for a
player to fewer seasons than a complete career: the sums in 9) are
simply evaluated for the first N seasons of a career.
Figure 4 includes CFTN curves for Carew using 4, 8 and 12 seasons in
the shape and mpeak determinations. The way normal season to season
variation in accomplishments affects projections based on a few
seasons is also evident.
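A projection of this kind can be sketched by fitting shape and mpeak to the first N seasons only and then evaluating CFTN at later ages. The code below is a hedged illustration: the function names and the evenly spaced scan over shape are assumptions, and the constants repeat Table 1 with its cftn_c, cftn_t1 and cftn_t2 entries read as cftn_ca, cftn_t1z and cftn_t2z.

```python
import math

# Table 1 values; assumes its cftn_c, cftn_t1, cftn_t2 entries are
# cftn_ca, cftn_t1z, cftn_t2z.
A = 0.864861
BA, BB = -0.031985, -0.007588
CA, CB = 0.937970, 0.774984
T1Z, T1S = 33.893580, 11.006860
T2Z, T2S = 27.122430, 24.778990


def cftn(age, shape, mpeak=1.0):
    """Expressions 1) through 8)."""
    a = 1.0 / (1.0 + math.exp(-A * (age - (T1Z - T1S * shape))))
    b = 1.0 + (BA + BB * shape) * (age - 20)
    c = 1.0 / (1.0 + math.exp((CA - CB * shape) * (age - (T2Z + T2S * shape))))
    return mpeak * a * b * c


def fit_career(seasons, steps=176):
    """Scan shape over -0.25..1.50; mpeak from expression 9) at each step."""
    best = None
    for i in range(steps):
        shape = -0.25 + 1.75 * i / (steps - 1)
        curve = [cftn(age, shape) for age, _ in seasons]
        mpeak = (sum(m * f for (_, m), f in zip(seasons, curve))
                 / sum(f * f for f in curve))
        variance = sum((m - mpeak * f) ** 2
                       for (_, m), f in zip(seasons, curve))
        if best is None or variance < best[0]:
            best = (variance, shape, mpeak)
    return best[1], best[2]


def project(seasons, n_base, ages):
    """Fit to the first n_base (age, measure) seasons, then evaluate CFTN."""
    shape, mpeak = fit_career(seasons[:n_base])
    return [cftn(age, shape, mpeak) for age in ages]


if __name__ == "__main__":
    # Hypothetical noise-free career, so the projection should track it.
    career = [(age, cftn(age, 0.8, 300.0)) for age in range(22, 38)]
    for age, value in zip(range(30, 34), project(career, 8, range(30, 34))):
        print(age, round(value, 1))
```

With real seasons the measure scatters around any smooth curve, so a fit to 4 seasons can land on a quite different shape than a fit to 12.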
Quantitative results on the ability of these systems to predict
careers will be given after a discussion of the Brock2 and Most
Recent Seasons systems.
5) The Brock2 Projection System
Bill James introduced this method in his 1985 Baseball Abstract, pp
301-305. While the James description allows implementation of the
scheme, no details on how well it works were presented. Brock2
requires up to four seasons of data depending on the player's age to
estimate one or more following seasons. Games played, at bats,
singles, doubles, triples, home runs, walks, runs and RBIs are needed
as input. These same quantities are individually projected by the
calculations. To compare with actual data and CFTN I evaluate the
same offensive measure for Brock2 as for CFTN. Up to age 27 Brock2
projects a slightly increasing ability. Thereafter, a smaller ability
is projected until a threshold is reached which effectively
terminates a career. The change in ability from season to season
depends only on player age and does not depend on when a career
starts. Brock2 generates projections using the preceding two seasons,
either actual or projected. If the season immediately preceding the
one being projected is missing, a projection cannot be made. If the
second season preceding the projection is missing, the projection
will be seriously in error. For an example of this effect see the
eight year base Brock2 projection for Ted Williams' career in
Figure 9.
I have used the David Grabiner C language implementation of 1997 as
the starting point for this effort. In addition to writing the
extensions needed to interface the Grabiner code with my evaluation
framework, I packaged it in a C++ class. The Brock2 system was easily
modified to project from more than the minimum specified number of
seasons. The Brock2 algorithm was not changed.
An important part of the Brock2 calculation is the player sustenance
level that primarily affects the career termination part of the
calculations. Determining the initial value for this requires
knowledge of the fielding position played and league runs/game
information. The yearly runs/game values are readily available.
However, player position information is not readily available and
often changes during the course of a career. I have used an initial
position correction (-0.452) to the sustenance level that is the same
for all players. This constant was chosen so that, on a data set
extracted from the Lahman 2.0 database, applying it to all players
gives the same average results as using the position information
included in the 2.0 version. This position independent constant was
used for the projections given in this study, which are based on
Lahman 4.5 database player data.
Figure 5 shows Brock2 projections based on 4, 8 and 12 years for Rod
Carew. The quantity plotted is SDTHWO, the same as in Figure 4. The
Brock2 curves are essentially parallel, showing the same decrease
from their starting points, which depend on the last two actual
career data points. The effect of the career termination calculation
can be seen in the 4 and 8 year based projections in Figure 5.
Brock2 separately projects each of the stats it uses. A thorough test
would separately test the accuracy of each of these quantities.
6) Most Recent Season
The simplest projection scheme that can be devised is to assume a
player will continue to produce exactly as he did during the most
recent season that he completed. This scheme will be referred to by
the acronym MRS. MRS thus is a limited form of autocorrelation. I have
not provided a plot similar to Figures 4 and 5 for MRS. Such a plot
would consist of horizontal lines drawn from the last season of
actual data used to make the projections. While obviously lacking any
concept of how a career progresses, the comparisons with the more
complex Brock2 and CFTN methods will prove useful.
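For comparison with the other two methods, MRS reduces to a few lines. The sketch below uses a hypothetical function name and made-up season values.

```python
def mrs_projection(measures, n_base, n_ahead):
    """Project n_ahead seasons from the first n_base seasons of a career.

    measures holds one value of the evaluation measure per season. Every
    projected season repeats the last season of the base period.
    """
    most_recent = measures[n_base - 1]
    return [most_recent] * n_ahead


if __name__ == "__main__":
    # Hypothetical season runs, not taken from any real player.
    career = [50.0, 60.0, 70.0, 80.0, 65.0, 72.0, 58.0]
    print(mrs_projection(career, n_base=4, n_ahead=3))  # [80.0, 80.0, 80.0]
```

The age of the player never enters the calculation, which is why the projected values form a horizontal line.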
7) Results
The evaluations described in this section use the same player set as
was used for Figure 3: complete player careers with at least 1000 AB
and at least 6 seasons played.
The first comparison to be shown is of the averages of the player
season runs (using the SDTHWO measure) for the three projection
methods and the actual data. Perhaps the simplest criterion for
success in projections is that the projection and actual data give
the same value. The comparison in this case is crude: Figure 6
displays ratios of the data set average for a projection normalized
(divided) by the data set average actually observed. Ratios greater
than 1 indicate an overestimate by the projection method. In the
figure legends, B2 (red) shows Brock2 results, CF (blue) is used for
CFTN and MS (green) for the MRS projections. The number following the
colon and symbol indicates how many seasons were used for the
projections. MRS projections for 8 and 12 years show clearly the
overestimate resulting from the lack of modeling an age related
decline of ability. The MRS projection based on 4 years suggests that
player ability continues to improve a little in the second 4 year
segment of a career. While it is possible that some of this
improvement may be due to marginal players retiring, the selection
criteria for the player data set minimize this. Brock2
systematically underestimates projected careers. CFTN does a
reasonable job of reproducing the projected averages for three or
four years. This is not surprising since it was designed to match
averages.
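The ratio plotted in Figure 6 can be illustrated with a short sketch. It assumes that, for one projection method and one projected season, the projected and actual values are paired player by player before averaging. The function name and the numbers are hypothetical.

```python
def average_ratio(projected, actual):
    """Data set average of the projections divided by the actual average.

    projected and actual are equal-length lists holding one value of the
    measure per player for the same projected season.
    """
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two non-empty lists of equal length")
    mean_projected = sum(projected) / len(projected)
    mean_actual = sum(actual) / len(actual)
    return mean_projected / mean_actual


if __name__ == "__main__":
    # Hypothetical values for three players; a ratio above 1 means the
    # method overestimates on average.
    print(average_ratio([80.0, 80.0, 80.0], [70.0, 60.0, 50.0]))
```

A ratio near 1 says only that the averages agree. Large errors in opposite directions for individual players can cancel, which is why this comparison is called crude.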
A stronger comparison of the projections and actual data results from
comparing each player season projection to what was actually
achieved. The linear correlation coefficient ("R") has been computed
for the data used to create Figure 6. Figure 7 displays R2, the
square of the correlation coefficient R, using the same conventions
as Figure 6. In the legend for Figure 7 the circumflex "^" is used to
suggest correlation; otherwise the interpretation of the legends is
the same. R2 is interpreted as the fraction of the variance in one of
the variables explained by the variance of the other. The projections
based on 4 seasons clearly show the difficulty of making predictions
early in a career. Except for the first year estimate, Brock2 is
clearly superior to CFTN for all the projections. Distressingly, for
the most important next year projections, neither Brock2 nor CFTN
provides as much correlation with what is actually accomplished as
the trivially simple MRS method.
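The R2 values can be illustrated with the standard linear (Pearson) correlation coefficient computed over paired projected and actual player seasons. The sketch below uses hypothetical function names and made-up numbers.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient R of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_squared(projected, actual):
    """Square of R between projected and actual player seasons."""
    return pearson_r(projected, actual) ** 2


if __name__ == "__main__":
    # Hypothetical projected and actual values for three player seasons.
    print(r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))  # 0.25
```

Unlike the ratio of averages, R2 is unchanged if every projection is scaled by a constant factor or shifted by a constant amount, so a method can be biased on average and still correlate well with the actual results.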
Ted Williams' career offers an example of the ability of these
systems to do interpolations. Fortuitously, his WW II absence
occurred after 4 seasons, enabling the 4 year base Brock2 to estimate
what he might have accomplished during it. The entire career CFTN estimate
(shape=1.27, mpeak'=155.7) also agrees with what might have been. The
entire career CFTN is appropriate here as a projection is not being
made.
Brock2 contains thresholds a player must exceed at the beginning of
his career if he is to reach the status of a "regular". Pitchers
generally fail to meet these requirements so Brock2 is of no use in
estimating pitcher batting. CFTN has no such restriction.
Figure 10 summarizes Steve Carlton's NL batting career and indicates
the projections made for it by CFTN (shape=1.36, mpeak'=5.04). It
also demonstrates dangers inherent in projections based on small sets
of data.
The low value for autocorrelation (MRS projection) is indicative of
the great amount of variability in individual careers. This seems to
be intrinsic and it is unlikely that a projection system can, in
principle, do much better than displayed in this study. While it is
obvious that baseball playing careers follow a pattern, the results
of this study seem to indicate that the more carefully age effects
are modelled the poorer the prediction system does. It also suggests
that any player career projections be treated with a considerable
amount of skepticism. This study also emphasizes the importance of
relating any statistical evaluation system to the actual historical
statistical record and its proper understanding.
Back to the J. F. Jarvis baseball page.
The following list provides access to Brock2 and CFTN career summary
and projection plots similar to Figures 4 and 5 for Rod Carew and
Figures 8 and 9 for Ted Williams.
| Player | CFTN shape | CFTN mpeak' |
|        | 1.30       | 129.9       |
|        | 1.36       | 139.5       |
|        | 1.32       | 100.4       |
|        | 1.37       | 105.2       |
|        | 0.88       | 188.9       |
cftn_a   = 0.864861
cftn_ba  = -0.031985    cftn_bb  = -0.007588
cftn_ca  = 0.937970     cftn_cb  = 0.774984
cftn_t1z = 33.893580    cftn_t1s = 11.006860
cftn_t2z = 27.122430    cftn_t2s = 24.778990
Table 1. CFTN global parameters
Player             shape   mpeak'
Ruth, Babe         0.88    188.9
Gehrig, Lou        0.99    177.8
Foxx, Jimmie       1.10    163.0
Williams, Ted      1.27    155.7
Mantle, Mickey     1.13    148.2
Hornsby, Rogers    1.09    142.8
Waner, Paul        1.04    138.0
Boggs, Wade        0.85    135.4
Schmidt, Mike      0.80    135.2
Musial, Stan       1.32    135.1
Table 2. Players with top ten mpeak' values










