How Many World Series Should the Braves Have Won?
John F. Jarvis

Presented at SABR 34, July 15, 2004, Cincinnati, Ohio

“The race is not to the swift or the battle to the strong, ... ,
but time and chance happen to them all.” Ecclesiastes 9:11 NIV

Copyright 2004, John F. Jarvis



Introduction

Cubs and Red Sox fans are renowned for their long-standing postseason frustrations. The current season started with “Is this the Cubs’ year, or another Cubs Year?” by Paul Sullivan of the Chicago Tribune in my local paper, the Augusta Chronicle. Boston’s “curse of the Bambino” has its own web site, one of about 29,000 pages found with a Google search for “baseball curse bambino”. Have the Braves found a better way to build fan frustration with their seemingly endless string of playoff appearances that fall short of a World Series (WS) championship? Drawing on a study of the Braves’ 12 consecutive playoff appearances, 1991-2003, I will provide evidence that their single WS championship is not anomalous, even if it is disappointing. (I am a Braves fan by proximity.)

During these 12 seasons (there was no postseason in the strike-terminated 1994 season), 25 of the current 30 teams have made the playoffs. The Yankees have won 4 WS while the Braves have only a single championship. Naively, since the playoff teams are presumably well matched, each playoff team has an equal chance of winning the WS. For 8 team playoffs this leads to an estimate of 12 x 1/8 = 1.5 WS wins for Atlanta. A single WS championship doesn’t seem too shabby by this standard. Using simulations of the playoffs, I will show that Atlanta’s expectations are a little higher. The blame for Atlanta’s perceived poor performance will be shown to lie with the structure of the playoffs, in particular the short playoff series, not with any team failings.

Methodology

The primary tool I use for assessing MLB postseason performance is simulation using the plate appearance (PA) as the most basic element of the game. The outcome of a PA is chosen randomly, with probabilities dependent on the hitting team’s batting order position and the fielding team’s average pitching. Errors are simulated using fielding team values. Runner advances on hits and outs are chosen probabilistically based on team regular season rates. For each PA, stolen base and caught stealing events, including pickoffs, can occur. The probabilities do not change during a game and there is no attempt to model clutch hitting. There is no modeling of within-game tactics, including substitutions. Regular season results from 1000 simulated seasons are summarized in Table 1 and Table 2, taking the strike-shortened 1995 season, in which the Braves won the WS, as “typical”. The simulator reproduces the regular season results well. Team names are the 3 character Retrosheet team codes. Additional details on the simulator are available in references 1 and 2.
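The random choice of a PA outcome can be illustrated with a minimal Python sketch. The outcome categories and the probabilities below are hypothetical placeholders, not values taken from the simulator, and the function names are invented for this illustration. In the simulator the probabilities would depend on the batting order position and the fielding team's pitching; here a single fixed set stands in for one such combination.

```python
import random

# Hypothetical outcome probabilities for one batting-order slot against one
# pitching staff (illustrative values only; they sum to 1).
PA_OUTCOMES = {
    "out": 0.680,
    "walk": 0.085,
    "single": 0.155,
    "double": 0.045,
    "triple": 0.005,
    "home_run": 0.030,
}


def simulate_pa(rng, outcome_probs=PA_OUTCOMES):
    """Draw one plate-appearance outcome at random from fixed probabilities."""
    outcomes = list(outcome_probs)
    weights = list(outcome_probs.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def outcome_frequencies(n_pa, seed=0):
    """Simulate n_pa plate appearances and return the observed frequencies."""
    rng = random.Random(seed)
    counts = dict.fromkeys(PA_OUTCOMES, 0)
    for _ in range(n_pa):
        counts[simulate_pa(rng)] += 1
    return {name: count / n_pa for name, count in counts.items()}
```

Over many plate appearances the observed frequencies approach the input probabilities, which is the property that lets season totals be compared with the actual record.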

Table 1 (Figure 1) provides an average standings summary and the number of times each team finished at a particular place in the division standings. The extremes, most season wins and losses, from the simulations are also listed. It includes the actual season outcomes for comparison purposes. The standard deviation of the season simulation wins is approximately 6 games for all teams. The largest deviation between the actual and average simulated outcomes is Atlanta, approximately 1.3 standard deviations more wins achieved than simulated. All the deviations are well within the range that is probable.
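A standard deviation of about 6 games is what a simple coin-flip model gives for a 144 game season. The Python sketch below is only a plausibility check under the assumption of independent games with a fixed winning probability; it is not how the simulator produces its results. The function names are invented for this illustration, and the Atlanta figures are read from Table 1.

```python
import math


def binomial_win_std(games, win_prob):
    """Standard deviation of season wins if games are independent trials."""
    return math.sqrt(games * win_prob * (1.0 - win_prob))


def deviation_in_stds(actual_wins, simulated_wins, std):
    """Difference between actual and simulated wins in standard deviations."""
    return (actual_wins - simulated_wins) / std


# 1995 was a 144 game season; an evenly matched team has win_prob = 0.5.
std_even = binomial_win_std(144, 0.5)

# Atlanta's row in Table 1: 90 actual wins, 81.8 simulated wins, std 6.2.
atl_dev = deviation_in_stds(90, 81.8, 6.2)
```

For an evenly matched team the model gives sqrt(144 x 0.5 x 0.5) = 6 games, and Atlanta's 8.2 game excess over the simulated average is about 1.3 of its 6.2 game standard deviation.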

Table 2 summarizes season runs scored (Figure 2), runs allowed (Figure 3) and left on base for the same 1000 season simulations used to create Table 1. Overall, simulated runs differ from the observed runs by less than 1% and there are no team run deviations that are improbably large compared to the simulation standard deviations.

On a 1.3 GHz processor the simulator can perform 15 full season (30 team) simulations a second. All the data used is based on team season averages derived from Retrosheet or similarly formatted full season play-by-play files.

The American League use of offensive/defensive platooning (designated hitter) complicates simulations of any interleague play including the World Series (WS). The advent of regular season interleague play in 1997 provides the data allowing compensation of the team data for this defect. Hitting in interleague games has been adjusted to minimize the difference between the simulated and observed team runs scored and allowed for the interleague games. In interleague games a shaping of the batting order to compensate for the DH induced distortions followed by a 3% increase in AL hitting probabilities was necessary to obtain this equalization.

Playoff Simulations

Playoff simulations start with the as played match-ups with the winners advancing. The number of times each team wins the WS is recorded. The probability of winning the WS is simply the number of WS wins divided by the total number of playoffs simulated. Regular season team data is used for the playoff simulations with the interleague hitting adjustments noted above for the WS.

Table 3 is a summary of 1,000,000 simulations of each of the 12 postseasons included in this analysis. Each season’s WS winning probabilities are listed under the season given as the column title. The actual WS winner is marked in each season’s column. The column “Sim/Wins” is the total probability for each team for all 12 seasons. The team entries (rows) are ordered by this quantity. The postseason expanded to eight teams in 1995. The ‘91 - ‘93 playoffs only included 4 teams. For these three seasons the WS winning probability has been divided by 2 in the “Sim/Wins” column to make them consistent with the 8 team playoff values. “Act/Wins” lists the number of WS actually won. The ratio of actual wins to simulated wins is given in the column “A/S-Wins” with values of the ratio greater than 1.0 indicating overachievement. The greatest overachievers are the Marlins, Angels and Blue Jays.

Figure 4 displays the actual and simulated WS wins. The most notable underachievers are the Braves and the Indians. Atlanta is the only WS winning team with an A/S ratio less than one. Cleveland has the highest simulated wins total without winning a WS. The highest WS winning probability (eight team playoffs) in the 12 seasons analyzed is Cleveland’s 0.355 in 1995. Only once in these 12 seasons has the most probable WS winner actually won the WS: the Yankees (0.276) in 1998. Nicely balancing this, a least likely WS winner has also actually won the WS, again the Yankees (0.076), in 2000.
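The “Sim/Wins” and “A/S-Wins” entries can be checked by hand for any team. The Python sketch below does this for Atlanta, using the season probabilities exactly as tabulated in Table 3; the variable and function names are invented for this illustration.

```python
# Atlanta's simulated WS winning probabilities from Table 3, 1991-2003
# (no 1994 postseason), used exactly as tabulated.
ATL_SEASON_PROBS = [0.092, 0.139, 0.192, 0.149, 0.176, 0.254,
                    0.246, 0.154, 0.113, 0.081, 0.132, 0.208]
ATL_ACTUAL_WINS = 1


def simulated_wins(season_probs):
    """Expected number of WS titles: the sum of the per-season probabilities."""
    return sum(season_probs)


def actual_to_simulated_ratio(actual_wins, season_probs):
    """A/S ratio; values below 1.0 indicate fewer titles than simulated."""
    return actual_wins / simulated_wins(season_probs)


sim = simulated_wins(ATL_SEASON_PROBS)
ratio = actual_to_simulated_ratio(ATL_ACTUAL_WINS, ATL_SEASON_PROBS)
```

The sum rounds to the 1.94 simulated wins listed for Atlanta, and one actual title divided by that total gives the tabulated ratio of 0.52.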

Table 4 summarizes the 1995 playoffs. The numbers given are the single game team against team winning probabilities for the teams in each column. The most extreme match-up shows less than a factor of two difference in the team single game winning probabilities. While the playoff teams are well matched, they are obviously not all equal in ability. Please note that the probabilities are for the team listed in each column.
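As a rough illustration of how single game probabilities propagate through a playoff bracket, the Python sketch below computes exact WS winning probabilities for the 1995 match-ups from the Table 4 values. It is a simplification and not the simulator's method: it assumes a fixed single game probability for each pairing, ignores home field, applies the Table 4 values unchanged to the WS, and uses best-of-5 division series followed by best-of-7 league championship and World Series. The first round pairings (ATL-COL, CIN-LAN, CLE-BOS, SEA-NYA) are the as-played 1995 match-ups. All function and variable names are invented for this illustration.

```python
from math import comb

# Single game win probabilities from Table 4: P_COL_VS_ROW[col][row] is the
# probability that team `col` beats team `row` in one game.
P_COL_VS_ROW = {
    "ATL": {"COL": 0.5725, "CIN": 0.4890, "LAN": 0.5616, "CLE": 0.4208,
            "BOS": 0.5234, "SEA": 0.5284, "NYA": 0.5289},
    "COL": {"CIN": 0.4115, "LAN": 0.4870, "CLE": 0.3545, "BOS": 0.4622,
            "SEA": 0.4696, "NYA": 0.4632},
    "CIN": {"LAN": 0.5734, "CLE": 0.4333, "BOS": 0.5392, "SEA": 0.5467,
            "NYA": 0.5431},
    "LAN": {"CLE": 0.3673, "BOS": 0.4714, "SEA": 0.4759, "NYA": 0.4730},
    "CLE": {"BOS": 0.6084, "SEA": 0.6145, "NYA": 0.6110},
    "BOS": {"SEA": 0.5070, "NYA": 0.5027},
    "SEA": {"NYA": 0.4944},
}


def game_prob(a, b):
    """Probability that team a beats team b in a single game."""
    if b in P_COL_VS_ROW.get(a, {}):
        return P_COL_VS_ROW[a][b]
    return 1.0 - P_COL_VS_ROW[b][a]


def series_prob(p, n):
    """Probability of winning a best-of-n series given single game prob p."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))


def play_round(side_a, side_b, n):
    """Combine two {team: reach_prob} dicts into the winner distribution."""
    winners = {}
    for a, pa in side_a.items():
        for b, pb in side_b.items():
            s = series_prob(game_prob(a, b), n)
            winners[a] = winners.get(a, 0.0) + pa * pb * s
            winners[b] = winners.get(b, 0.0) + pa * pb * (1 - s)
    return winners


def ws_probabilities():
    """Exact WS win probabilities for the 1995 bracket under the assumptions."""
    nl = play_round(play_round({"ATL": 1.0}, {"COL": 1.0}, 5),
                    play_round({"CIN": 1.0}, {"LAN": 1.0}, 5), 7)
    al = play_round(play_round({"CLE": 1.0}, {"BOS": 1.0}, 5),
                    play_round({"SEA": 1.0}, {"NYA": 1.0}, 5), 7)
    return play_round(nl, al, 7)
```

Calling ws_probabilities() returns one probability per playoff team, which can be compared with the 1995 column of Table 3. Exact agreement should not be expected, since the simulations play out every game and include the interleague hitting adjustments.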

The simulation results are consistent with other postseason analyses. In a web article (3) Tom Tippett provides an analysis of the 2002 postseason. He uses a different methodology than I do, and about all we agree on is that Anaheim (his 5th and my 3rd choice) shouldn’t have won the WS.

Vince Gennaro (4) provides support for his claim that the 1995 team is Cleveland’s best ever. This team has the highest probability of winning a WS (0.355), based on simulations, of all the teams appearing in the 12 postseasons in this analysis (Table 3). Their dominance in the 1995 AL is evident in Table 1 and Table 2.

Stuart Shapiro (5) presents a linear regression model for predicting postseason outcomes. He lists the 1992 Atlanta-Toronto WS as one of the more notable upsets. My analysis also suggests Atlanta was the stronger team, but I wouldn’t call this a major upset. His estimates are for series match-ups and thus correspond to Table 4 values extrapolated to a series match-up, not the overall WS winning probabilities given in Table 3.

Discussion

The simulations show how poorly the postseason performs at picking the WS winner, but they don’t indicate the reason for this poor performance. In “October Men” Roger Kahn quotes Sparky Anderson as stating “anything can happen in a short series”. A Google search using the terms “baseball” and “short series” returns almost 1000 references. The short series is also the focus of the previously mentioned Tippett article. In Table 3 the average WS win probability for the 12 actual winners is 0.128 and the average for the 72 non-WS-winning teams is 0.124. The naive winning probability, equal for all playoff teams, is 0.125. The differences are not statistically significant. Comparisons of the simulations with actual postseason results show no agreement at all. The commonly held perception that any team can win a short series is well founded.
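The two averages quoted above can be reproduced from Table 3 with a short Python sketch. The winner probabilities are the Table 3 entries for each season's actual champion. The non-winner average rests on an assumption stated in the code: that the tabulated probabilities total 1.0 in each of the nine 8 team seasons and 0.5 in each of the three halved 4 team seasons. The names used are invented for this illustration.

```python
# Simulated WS winning probability of each actual champion, read from Table 3.
WINNER_PROBS = {
    1991: ("MIN", 0.143), 1992: ("TOR", 0.102), 1993: ("TOR", 0.090),
    1995: ("ATL", 0.149), 1996: ("NYA", 0.078), 1997: ("FLO", 0.092),
    1998: ("NYA", 0.276), 1999: ("NYA", 0.141), 2000: ("NYA", 0.076),
    2001: ("ARI", 0.162), 2002: ("ANA", 0.136), 2003: ("FLO", 0.095),
}


def mean_winner_prob(winner_probs):
    """Average simulated WS probability of the teams that actually won."""
    probs = [p for _team, p in winner_probs.values()]
    return sum(probs) / len(probs)


def mean_non_winner_prob(winner_probs, n_eight_team=9, n_four_team=3):
    """Average simulated WS probability of the teams that did not win.

    Assumption: tabulated probabilities total 1.0 per 8 team season and
    0.5 per (halved) 4 team season.
    """
    total = n_eight_team * 1.0 + n_four_team * 0.5
    winners_total = sum(p for _team, p in winner_probs.values())
    n_non_winners = n_eight_team * 7 + n_four_team * 3
    return (total - winners_total) / n_non_winners


winner_avg = mean_winner_prob(WINNER_PROBS)
non_winner_avg = mean_non_winner_prob(WINNER_PROBS)
```

Rounded to three places these give 0.128 and 0.124, on either side of the naive 0.125.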

Figure 5 is a histogram of the number of teams whose simulated single game winning probability falls in each 0.01 wide bin. The two groups plotted are all the team-team match-ups in the simulations and the actual WS. For both groups the 50% point of the cumulative distribution (half the values are below and half above) is at a single game win probability of 0.53. Again, no surprises, since closely matched teams are expected in the playoffs.

Given the probability of winning a single game, the probability of winning a series of a specified length can easily be calculated (details are given in the appendix).

Figure 6 indicates how the chances of winning a series improve as the length of the series increases. Two probabilities are shown: 0.529, the Braves against the Yankees in 1995, approximating the median single game winning probability for all playoff match-ups, and 0.589, for the Yankees against the Padres in the 1998 WS, the most dominant match-up in the 12 WS included in this study. For the 5 and 7 game series used in the playoffs the series winning probability for the better team does not increase much over its single game probability. Lengthening the playoff series enough to give the better team a significant advantage (perhaps a 0.8 probability of winning a series) is clearly not feasible.
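The dependence on series length can be explored with a short Python sketch of the appendix calculation. The search for the shortest series that reaches a target probability is an addition for illustration; the paper does not state specific threshold lengths, and the function names are invented here.

```python
from math import comb


def series_win_prob(p_game, n_games):
    """Probability of winning a best-of-n_games series (n_games odd)."""
    need = n_games // 2 + 1
    return sum(comb(n_games, k) * p_game**k * (1 - p_game)**(n_games - k)
               for k in range(need, n_games + 1))


def shortest_series_for(p_game, target, max_games=499):
    """Smallest odd series length whose win probability reaches target."""
    for n in range(1, max_games + 1, 2):
        if series_win_prob(p_game, n) >= target:
            return n
    return None  # target not reached within max_games


# The two single game probabilities plotted in Figure 6.
for p in (0.529, 0.589):
    print(p, [round(series_win_prob(p, n), 3) for n in (1, 5, 7)],
          shortest_series_for(p, 0.8))
```

For the median match-up the search runs well past one hundred games before the better team reaches a 0.8 chance of winning the series, which is the sense in which lengthening the series is not feasible.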

Conclusions

The present MLB postseason structure is no better than a lottery at determining the best team. Simulations have been used to demonstrate a lack of correlation between team abilities and actual WS results. A simple probability calculation clearly shows the inability of 5 or 7 game series to unambiguously identify the better team. Creating a definitive postseason structure appears to be unrealistic given the reality of the calendar (and finances). A slightly better, but still far from definitive, approach would be a return to two divisions in each league followed by a four team postseason. Nine or eleven game playoff series would provide a slight gain in playoff efficacy and provide a partial compensation for the revenue lost in the current first round of the playoffs. At the very least, the first round of the playoffs should be increased to 7 games.

Finally, an answer can be given to the question in the title. The Braves should have won two WS titles in their 12 consecutive playoff appearances. However, I don’t think one additional WS championship would assuage much of their fans’ frustration.

References

1. “A Baseball Simulator”, J. F. Jarvis

2. "Chance and Intent: A Baseball Paradox", J. F. Jarvis, CHANCE, Vol 11, No. 3: 12-19, Summer 1998

3. “May the best team win ... at least some of the time”, Tom Tippett,

4. “Cleveland’s Greatest Team ... So Far”, Vince Gennaro, Baseball Research Journal 1996 pp 36-38

5. “Predicting Postseason Results”, Stuart Shapiro, Baseball Research Journal 1998 pp 106-107

Table 1. Summary of 1000 Simulations of the 1995 Regular Season

NL East WFR\Plcd    1    2    3    4    5   wins  std min max act   a-s
    ATL 0.568:    627  268   71   27    7   81.8  6.2  63 101  90   8.2
    NYN 0.535:    272  354  225  104   45   77.0  6.1  55  95  69  -8.0
    FLO 0.497:     50  188  303  269  190   71.0  6.0  51  93  67  -4.0
    PHI 0.485:     36  114  232  327  291   69.8  6.0  47  88  69  -0.8
    MON 0.467:     15   76  169  273  467   67.3  6.0  49  87  66  -1.3
NL Cent WFR\Plcd    1    2    3    4    5   wins  std min max act   a-s
    CIN 0.586:    608  271  117    3    1   84.4  5.9  60 101  85   0.6
    HOU 0.561:    296  451  241   11    1   80.8  5.7  58  99  76  -4.8
    CHN 0.525:     95  272  567   56   10   75.6  6.0  58  93  73  -2.6
    PIT 0.421:      0    5   44  486  465   60.7  6.0  44  81  58  -2.7
    SLN 0.420:      1    1   31  444  523   60.1  5.9  42  80  62   1.9
NL West WFR\Plcd    1    2    3    4        wins  std min max act   a-s
    LAN 0.517:    518  281  157   44        74.5  5.9  55  93  78   3.5
    COL 0.501:    318  385  207   90        72.2  5.8  53  93  77   4.8
    SDN 0.475:    138  253  380  229        68.4  6.0  50  85  70   1.6
    SFN 0.441:     26   81  256  637        63.6  5.7  47  83  67   3.4
AL East WFR\Plcd    1    2    3    4    5   wins  std min max act   a-s
    BOS 0.549:    328  337  321   14    0   79.1  5.6  62  96  86   6.9
    NYA 0.548:    334  331  319   15    1   78.9  5.9  61  95  79   0.1
    BAL 0.545:    337  323  321   18    1   78.4  5.8  62  93  71  -7.4
    TOR 0.428:      1    8   36  785  170   61.6  5.7  46  78  56  -5.6
    DET 0.365:      0    1    3  168  828   52.6  5.8  34  71  60   7.4
AL Cent WFR\Plcd    1    2    3    4    5   wins  std min max act   a-s
    CLE 0.668:    997    3    0    0    0   96.2  5.8  80 114 100   3.8
    CHA 0.524:      3  707  215   72    3   75.4  5.7  57  92  68  -7.4
    KCA 0.475:      0  160  403  391   46   68.4  6.1  43  88  70   1.6
    MIL 0.469:      0  129  366  454   51   67.6  5.9  48  84  65  -2.6
    MIN 0.376:      0    1   16   83  900   54.1  5.9  38  73  56   1.9
AL West WFR\Plcd    1    2    3    4        wins  std min max act   a-s
    CAL 0.564:    636  284   69   11        81.8  6.1  63 104  78  -3.8
    SEA 0.542:    323  493  148   36        78.6  5.8  62  99  79   0.4
    TEX 0.477:     22   97  419  462        68.7  6.0  51  88  74   5.3
    OAK 0.470:     19  126  364  491        67.6  6.2  49  89  67  -0.6


Table 2. Scoring Summary for the 1000 Simulated 1995 Seasons
1995          runs scored              runs allowed             left on base
team  gms    act   sim sm-ac   std    act   sim sm-ac   std    act   sim sm-ac
 ATL  144    645   653     8  35.6    540   557    17  31.3    988   998    10
 CHN  144    693   696     3  35.6    671   661   -10  34.8    956   980    24
 CIN  144    747   738    -9  37.3    623   616    -7  34.2   1001  1016    15
 COL  144    785   783    -2  39.4    783   786     3  40.0   1017  1011    -6
 FLO  143    673   660   -13  34.8    673   662   -11  35.0   1028  1047    19
 HOU  144    747   743    -4  38.1    674   649   -25  34.5   1153  1150    -3
 LAN  144    634   643     9  34.9    609   618     9  34.9   1049  1065    16
 MON  144    621   592   -29  33.1    638   639     1  34.8    970  1001    31
 NYN  144    657   661     4  35.6    618   616    -2  34.9   1041  1029   -12
 PHI  144    615   628    13  34.1    658   650    -8  34.4   1114  1104   -10
 PIT  144    629   628    -1  35.2    736   746    10  38.7   1010  1023    13
 SDN  144    668   636   -32  34.6    672   666    -6  35.6    998  1054    56
 SFN  144    652   669    17  34.8    776   758   -18  39.5   1046  1017   -29
 SLN  143    563   560    -3  32.6    658   663     5  35.6    957   984    27
 BAL  144    704   701    -3  36.2    640   636    -4  35.7   1022  1042    20
 BOS  144    791   778   -13  38.9    698   701     3  38.2   1084  1095    11
 CAL  145    801   785   -16  40.0    697   684   -13  36.7   1056  1082    26
 CHA  144    755   772    17  39.7    758   738   -20  38.7   1155  1139   -16
 CLE  144    840   867    27  40.3    607   603    -4  34.4   1018  1041    23
 DET  144    654   637   -17  36.8    844   848     4  42.1   1017  1015    -2
 KCA  144    629   638     9  35.6    691   677   -14  36.4   1018  1034    16
 MIL  144    740   687   -53  36.6    747   732   -15  38.4   1012  1074    62
 MIN  144    703   689   -14  37.1    889   895     6  44.0   1025  1033     8
 NYA  144    749   747    -2  37.6    688   672   -16  35.4   1126  1140    14
 OAK  144    730   704   -26  37.6    761   750   -11  37.6   1049  1060    11
 SEA  145    796   769   -27  39.3    708   702    -6  37.5   1033  1064    31
 TEX  144    691   676   -15  36.6    720   708   -12  36.8   1031  1064    33
 TOR  144    642   653    11  34.8    777   759   -18  38.3   1079  1075    -4
----
 tot 4032  19554 19393  -161 196.1  19554 19393  -161 196.1  29053 29439   386

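Table 3. Summary of Playoff Simulations

          PO    A/S WSAct   Sim Season ->
Team Lg Apps   Wins  Wins  Wins 1991   1992   1993   1995   1996   1997   1998   1999   2000   2001   2002   2003
ATL  N    12   0.52     1  1.94 0.092  0.139  0.192  0.149* 0.176  0.254  0.246  0.154  0.113  0.081  0.132  0.208
NYA  A     9   3.23     4  1.24   -      -      -    0.080  0.078* 0.205  0.276* 0.141* 0.076* 0.068  0.138  0.177
CLE  A     6      0     0  0.83   -      -      -    0.355  0.217  0.067  0.070  0.089    -    0.034    -      -
OAK  A     5      0     0  0.61   -    0.100    -      -      -      -      -      -    0.124  0.159  0.123  0.104
SEA  A     4      0     0  0.60   -      -      -    0.074    -    0.078    -      -    0.131  0.314    -      -
SFN  N     4      0     0  0.59   -      -      -      -      -    0.036    -      -    0.224    -    0.206  0.125
HOU  N     4      0     0  0.56   -      -      -      -      -    0.127  0.219  0.127    -    0.088    -      -
ARI  N     3   2.17     1  0.46   -      -      -      -      -      -      -    0.196    -    0.162* 0.099    -
BOS  A     4      0     0  0.41   -      -      -    0.073    -      -    0.081  0.092    -      -      -    0.162
SLN  N     4      0     0  0.39   -      -      -      -    0.060    -      -      -    0.128  0.093  0.106    -
PIT  N     2      0     0  0.33 0.168  0.160    -      -      -      -      -      -      -      -      -      -
TOR  A     3    6.9     2  0.29 0.098  0.102* 0.090*   -      -      -      -      -      -      -      -      -
TEX  A     3      0     0  0.28   -      -      -      -    0.186    -    0.028  0.063    -      -      -      -
MIN  A     3    3.7     1  0.27 0.143*   -      -      -      -      -      -      -      -      -    0.059  0.069
NYN  N     2      0     0  0.23   -      -      -      -      -      -      -    0.137  0.090    -      -      -
BAL  A     2      0     0  0.22   -      -      -      -    0.081  0.140    -      -      -      -      -      -
CHA  A     2      0     0  0.22   -      -    0.102    -      -      -      -      -    0.114    -      -      -
SDN  N     2      0     0  0.19   -      -      -      -    0.138    -    0.051    -      -      -      -      -
FLO  N     2  10.53     2  0.19   -      -      -      -      -    0.092*   -      -      -      -      -    0.095*
CIN  N     1      0     0  0.18   -      -      -    0.175    -      -      -      -      -      -      -      -
ANA  A     1   7.14     1  0.14   -      -      -      -      -      -      -      -      -      -    0.136*   -
PHI  N     1      0     0  0.12   -      -    0.117    -      -      -      -      -      -      -      -      -
LAN  N     2      0     0  0.12   -      -      -    0.052  0.064    -      -      -      -      -      -      -
CHN  N     2      0     0  0.09   -      -      -      -      -      -    0.029    -      -      -      -    0.060
COL  N     1      0     0  0.04   -      -      -    0.043    -      -      -      -      -      -      -      -
MON  N     0      0     0     0   -      -      -      -      -      -      -      -      -      -      -      -
MIL  N     0      0     0     0   -      -      -      -      -      -      -      -      -      -      -      -
TBA  A     0      0     0     0   -      -      -      -      -      -      -      -      -      -      -      -
KCA  A     0      0     0     0   -      -      -      -      -      -      -      -      -      -      -      -
DET  A     0      0     0     0   -      -      -      -      -      -      -      -      -      -      -      -

* Actual World Series winner.  - Team did not appear in that season's playoffs.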


Table 4. 1995 Playoff Teams Winning Probabilities.
(Probabilities are for team in each column against team given in row.)

tm/tm  ATL    COL    CIN    LAN    CLE    BOS    SEA
COL    0.5725
CIN    0.4890 0.4115
LAN    0.5616 0.4870 0.5734
CLE    0.4208 0.3545 0.4333 0.3673
BOS    0.5234 0.4622 0.5392 0.4714 0.6084
SEA    0.5284 0.4696 0.5467 0.4759 0.6145 0.5070
NYA    0.5289 0.4632 0.5431 0.4730 0.6110 0.5027 0.4944


Appendix: Calculating the probability of winning a series of length N games.

PGame: Probability of winning a single game
N: Length of the series in games.
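(N/2)+1: Number of games needed to win an N game series (N odd, with N/2 rounded down)
K: Number of games won in the series
Pseries: Probability of winning the series

$$ P_{series} = \sum_{K=(N/2)+1}^{N} P_{Game}^{K} \, (1 - P_{Game})^{N-K} \, \frac{N!}{K! \, (N-K)!} $$
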
The following is an example of this calculation for a 7 game series (N=7). The series win probability is calculated by summing the 4 terms Pk for K from 4 to 7, that is, over all the series where there are at least 4 wins (Pk, the probability of winning exactly K games, is given by the expression being summed). Note that the cumulative probability, the sum of all Pk, is 1.000, as it should be. (N,K) is shorthand for the binomial coefficient, the ratio of factorials at the end of the expression for Pseries. With PGame = 0.59, the N=7 series winning probability is 0.69, which corresponds to the upper curve in Figure 6.



Pg = 0.59   N = 7

K  Pg^K    (1-Pg)^(N-K)  (N,K)  Pk      Cuml
0  1.0000  0.0019            1  0.0019  0.0019
1  0.5900  0.0048            7  0.0196  0.0216
2  0.3481  0.0116           21  0.0847  0.1063
3  0.2054  0.0283           35  0.2031  0.3094

4  0.1212  0.0689           35  0.2923  0.6017
5  0.0715  0.1681           21  0.2524  0.8541
6  0.0422  0.4100            7  0.1211  0.9751
7  0.0249  1.0000            1  0.0249  1.0000

Pseries = 0.6906
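
The calculation above treats the series as if all N games were played, while a real series stops as soon as one team has the required number of wins. The two views give the same answer, because playing out the remaining games cannot change who reached that number first. The Python sketch below checks this for the worked example; the function names are invented for this illustration.

```python
from math import comb


def full_length_prob(p_game, n_games):
    """Appendix form: win at least (n//2)+1 of n games, all n being played."""
    need = n_games // 2 + 1
    return sum(comb(n_games, k) * p_game**k * (1 - p_game)**(n_games - k)
               for k in range(need, n_games + 1))


def stop_when_clinched_prob(p_game, n_games):
    """Real-series form: the series ends once one team has `need` wins."""
    need = n_games // 2 + 1
    q = 1 - p_game
    # The winner takes the final game after losing j of the earlier games.
    return sum(comb(need - 1 + j, j) * p_game**need * q**j
               for j in range(need))


full = full_length_prob(0.59, 7)
early = stop_when_clinched_prob(0.59, 7)
```

Both functions return the same value for Pg = 0.59 and N = 7, matching the Pseries entry in the table.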