How Many World Series Should the Braves Have Won?
John F. Jarvis

Presented at SABR 34, July 15, 2004, Cincinnati, Ohio
“The race is not to the swift or the battle to the strong, ... ,
but time and chance happen to them all.” Ecclesiastes 9:11 NIV
Copyright 2004, John F. Jarvis
Introduction
Cubs and Red Sox fans are renowned for their long-standing postseason
frustrations. The current season started with “Is this the Cubs’ year,
or another Cubs Year?” by Paul Sullivan of the Chicago Tribune in my
local paper, the Augusta Chronicle. Boston’s “curse of the Bambino” has
its own web site, one of about 29000 pages found with a Google search
for “baseball curse bambino”. Have the Braves found a better way to
build fan frustration with their seemingly endless string of playoff
appearances that fall short of a World Series (WS) championship? From a
study of the 12 consecutive playoffs, 1991-2003, in which the Braves
have appeared, I will provide evidence that their single WS
championship is not anomalous, even if it is disappointing. (I am a
Braves fan by proximity.)
During these 12 seasons (there was no postseason in the
strike-terminated 1994 season) 25 of the current 30 teams have made the
playoffs. The Yankees have won 4 WS while the Braves have only a single
championship. Naively, since the playoff teams are presumably well
matched, each playoff team has an equal chance of winning the WS. For
8 team playoffs this leads to an estimate of 12 x 1/8 = 1.5 WS wins for
Atlanta. A single WS championship doesn’t seem too shabby by this
standard. Using simulations of the playoffs, I will show that Atlanta’s
expectations are a little higher. The blame for Atlanta’s perceived
poor performance will be shown to lie with the structure of the
playoffs, in particular the short playoff series, not with any team
failings.
Methodology
The primary tool I use for assessing MLB postseason performance is
simulation using the plate appearance (PA) as the most basic element of
the game. The outcome of a PA is chosen randomly, with probabilities
dependent on the hitting team batting order position and the fielding
team average pitching. Errors are simulated using fielding team values.
Runner advances on hits and outs are chosen probabilistically based on
team regular season rates. For each PA, stolen base and caught stealing
including pick off events can occur. The probabilities do not change
during a game and there is no attempt to model clutch hitting. There is
no modeling of within game tactics including substitutions. Regular
season results from 1000 simulated seasons are summarized in Table 1
and Table 2, taking the strike shortened 1995 season, in which the
Braves won the WS, as “typical”. The simulator reproduces the actual
regular season results closely. Team names are the 3 character
Retrosheet team codes. Additional details on the simulator are
available in references 1 and 2.
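The PA-level approach described above can be illustrated with a minimal
Python sketch. It is not the author's simulator: the outcome categories,
the values in PA_PROBS, and the rule that every runner advances as many
bases as the batter are assumptions made purely for illustration, and
errors, steals and batting order position are left out.

```python
import random

# Hypothetical PA outcome probabilities (illustrative values, sum to 1).
# The simulator described in the text derives these from batting order
# position and the fielding team's pitching; one fixed table stands in here.
PA_PROBS = {"out": 0.68, "walk": 0.09, "single": 0.15,
            "double": 0.045, "triple": 0.005, "homer": 0.03}
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def simulate_half_inning(probs, rng):
    """Return the runs scored in one half inning of randomly drawn PAs."""
    outcomes, weights = list(probs), list(probs.values())
    outs, runs = 0, 0
    runners = []  # current base (1, 2 or 3) of each runner
    while outs < 3:
        result = rng.choices(outcomes, weights=weights)[0]
        if result == "out":
            outs += 1
            continue
        if result == "walk":
            # Forced advances only: find the first empty base, then move
            # up the runners standing on the bases below it.
            first_empty = 1
            while first_empty in runners:
                first_empty += 1
            runners = [b + 1 if b < first_empty else b for b in runners]
            runners.append(1)
        else:
            # Simplifying assumption: runners advance as far as the batter.
            gained = BASES_GAINED[result]
            runners = [b + gained for b in runners] + [gained]
        runs += sum(1 for b in runners if b >= 4)
        runners = [b for b in runners if b < 4]
    return runs


rng = random.Random(2004)
innings = [simulate_half_inning(PA_PROBS, rng) for _ in range(9000)]
print("average runs per 9 innings:", 9 * sum(innings) / len(innings))
```

Because the probabilities are fixed for the whole half inning, the
sketch shares the property noted above: nothing in it models clutch
hitting or tactics.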
Table 1 (Figure 1) provides an average standings summary and the number
of times each team finished at a particular place in the division
standings. The extremes from the simulations, the fewest and most
season wins, are also listed. It includes the actual season outcomes
for comparison purposes.
The standard deviation of the season simulation wins is approximately 6
games for all teams. The largest deviation between the actual and
average simulated outcomes is Atlanta’s: the Braves achieved
approximately 1.3 standard deviations more wins than simulated. All the
deviations are well within the probable range.
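As a rough cross-check on the 6 game figure, assume (my simplification,
not something the simulator does) that a season is a sequence of
independent games with a fixed winning probability p. The standard
deviation of wins is then sqrt(n p (1-p)), which for the 144 game 1995
schedule and an evenly matched team is 6.0. The helper name
season_win_std is hypothetical.

```python
from math import sqrt


def season_win_std(games, p):
    """Binomial standard deviation of wins over a season of independent games."""
    return sqrt(games * p * (1 - p))


std_even = season_win_std(144, 0.5)   # evenly matched team, 1995 schedule
atl_dev = (90 - 81.8) / 6.2           # Table 1, ATL: (act - wins) / std
print(round(std_even, 1), round(atl_dev, 1))   # prints: 6.0 1.3
```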
Table 2 summarizes season runs scored (Figure 2), runs allowed (Figure
3) and left on base for the same 1000 season simulations used to create
Table 1. Overall, simulated runs differ from the observed runs by less
than 1% and there are no team run deviations that are improbably large
compared to the simulation standard deviations.
On a 1.3 GHz processor the simulator can perform 15 full season (30
team) simulations a second. All the data used is based on team season
averages derived from Retrosheet or similarly formatted full season
play-by-play files.
The American League use of offensive/defensive platooning (designated
hitter) complicates simulations of any interleague play including the
World Series (WS). The advent of regular season interleague play in
1997 provides the data needed to compensate for this complication.
Hitting in interleague games has been adjusted to minimize the
difference between the simulated and observed team runs scored and
allowed for the interleague games. Obtaining this equalization required
a reshaping of the batting order to compensate for the DH induced
distortions, followed by a 3% increase in AL hitting probabilities.
Playoff Simulations
Playoff simulations start with the as played match-ups with the winners
advancing. The number of times each team wins the WS is recorded. The
probability of winning the WS is simply the number of WS wins divided
by the total number of playoffs simulated. Regular season team data is
used for the playoff simulations with the interleague hitting
adjustments noted above for the WS.
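Table 3 is a summary of 1,000,000 simulations of each of the 12
postseasons included in this analysis. Each season column lists the
simulated WS winning probability of each playoff team, with the season
as the column title. The actual WS winner is marked in the table. The
column “Sim/Wins” is the total probability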
for each team for all 12 seasons. The team entries (rows) are ordered
by this quantity. The postseason expanded to eight teams in 1995. The
‘91 - ‘93 playoffs only included 4 teams. For these three seasons the
WS winning probability has been divided by 2 in the “Sim/Wins” column
to make them consistent with the 8 team playoff values. “Act/Wins”
lists the number of WS actually won. The ratio of actual wins to
simulated wins is given in the column “A/S-Wins” with values of the
ratio greater than 1.0 indicating overachievement. The greatest
overachievers are the Angels, Marlins and Blue Jays. Figure 4 displays
the actual and simulated WS wins results. The most notable
underachievers are the Braves and the Indians. Atlanta is the only WS
winning team with an A/S ratio less than one. Cleveland has the highest
simulated wins total without winning a WS. The highest WS winning
probability (eight team playoffs) in the 12 seasons analyzed is
Cleveland’s 0.355 in 1995. In only one of these 12 seasons has the most
probable WS winner actually won the WS: the Yankees (0.276) in 1998.
Nicely balancing this, a least likely WS winner has also actually won
the WS, again the Yankees (0.076) in 2000.
Table 4 summarizes the 1995 playoffs. The numbers given are the single
game winning probabilities for the team in each column against the team
in each row. The most extreme match-up shows less than a factor of two
difference in the team single game winning probabilities. While the
playoff teams are well matched, they are obviously not all equal in
ability.
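To show how single game probabilities of this kind translate into WS
winning probabilities, here is a hedged Python sketch that works through
the 1995 bracket exactly instead of by PA-level simulation. It is a
shortcut of mine, not the author's method: every game is treated as an
independent trial with the Table 4 probability, and home field is
ignored. The first round pairings (ATL-COL, CIN-LAN, CLE-BOS, SEA-NYA,
best of 5, followed by best of 7 series) are taken from the 1995 playoff
record, not from this text. The result can be compared with the 1995
probabilities in Table 3, such as Cleveland's 0.355.

```python
from math import comb

# Table 4: probability that the first team beats the second in one game.
P = {
    ("ATL", "COL"): 0.5725,
    ("ATL", "CIN"): 0.4890, ("COL", "CIN"): 0.4115,
    ("ATL", "LAN"): 0.5616, ("COL", "LAN"): 0.4870, ("CIN", "LAN"): 0.5734,
    ("ATL", "CLE"): 0.4208, ("COL", "CLE"): 0.3545, ("CIN", "CLE"): 0.4333,
    ("LAN", "CLE"): 0.3673,
    ("ATL", "BOS"): 0.5234, ("COL", "BOS"): 0.4622, ("CIN", "BOS"): 0.5392,
    ("LAN", "BOS"): 0.4714, ("CLE", "BOS"): 0.6084,
    ("ATL", "SEA"): 0.5284, ("COL", "SEA"): 0.4696, ("CIN", "SEA"): 0.5467,
    ("LAN", "SEA"): 0.4759, ("CLE", "SEA"): 0.6145, ("BOS", "SEA"): 0.5070,
    ("ATL", "NYA"): 0.5289, ("COL", "NYA"): 0.4632, ("CIN", "NYA"): 0.5431,
    ("LAN", "NYA"): 0.4730, ("CLE", "NYA"): 0.6110, ("BOS", "NYA"): 0.5027,
    ("SEA", "NYA"): 0.4944,
}


def game_prob(a, b):
    """Probability that team a beats team b in a single game."""
    return P[(a, b)] if (a, b) in P else 1.0 - P[(b, a)]


def series_prob(p, n):
    """Probability of winning a best of n series (n odd) given game probability p."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))


def play_round(side_a, side_b, n):
    """Combine two {team: probability of arriving} maps into the winner's map."""
    out = {}
    for a, pa in side_a.items():
        for b, pb in side_b.items():
            w = series_prob(game_prob(a, b), n)
            out[a] = out.get(a, 0.0) + pa * pb * w
            out[b] = out.get(b, 0.0) + pa * pb * (1.0 - w)
    return out


nl = play_round(play_round({"ATL": 1.0}, {"COL": 1.0}, 5),
                play_round({"CIN": 1.0}, {"LAN": 1.0}, 5), 7)
al = play_round(play_round({"CLE": 1.0}, {"BOS": 1.0}, 5),
                play_round({"SEA": 1.0}, {"NYA": 1.0}, 5), 7)
ws = play_round(nl, al, 7)
for team, prob in sorted(ws.items(), key=lambda item: -item[1]):
    print(team, round(prob, 3))
```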
The simulation results are consistent with other postseason analyses.
In a web article (3) Tom Tippett provides an analysis of the 2002
postseason. He uses a different methodology than I do and about all we
agree on is that Anaheim (his 5th and my 3rd choice) shouldn’t have won
the WS.
Vince Gennaro (4) provides support for his claim that the 1995 team is
Cleveland’s best ever. Based on simulations, this team has the highest
probability (0.355) of winning a WS of all the teams appearing in the
12 postseasons in this analysis (Table 3). Their dominance in the 1995
AL is evident in Table 1 and Table 2.
Stuart Shapiro (5) presents a linear regression model for predicting
postseason outcomes. He lists the 1992 Atlanta-Toronto WS as one of the
more notable upsets. My analysis also suggests Atlanta as the stronger
team but I wouldn’t call this a major upset. His estimates are for
series match-ups and thus correspond to Table 4 values extrapolated to
a series match-up, not the overall WS winning probabilities given in
Table 3.
Discussion
The simulations tell how bad the postseason is in picking the WS winner
but they don't indicate the reason for this poor performance. In
“October Men” Roger Kahn quotes Sparky Anderson as stating “anything
can happen in a short series”. A Google search using the terms
“baseball” and “short series” returns almost 1000 references. The short
series is also the focus of the previously mentioned Tippett article.
In Table 3 the average WS win probability for the 12 actual winners is
0.128 and the average for the 72 non WS winning teams is 0.124. The
naive winning probability, equal for all playoff teams, is 0.125. The
differences are not statistically significant. Comparisons of
simulations with actual results in postseason play show no agreement at
all. The commonly held perception that any team can win a short series
is well founded.
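The two averages can be rechecked from Table 3 with a few lines of
Python. The winners listed are taken from the WS record, and matching
each of them to a Table 3 entry is my reading of the table, so treat the
dictionary as an assumption. The 1991-1993 values are the halved ones
used in the table, which is why each of those seasons contributes 0.5 to
the total where an 8 team season contributes 1.0.

```python
# Simulated WS winning probability of each actual winner, read from Table 3.
winners = {
    1991: ("MIN", 0.143), 1992: ("TOR", 0.102), 1993: ("TOR", 0.090),
    1995: ("ATL", 0.149), 1996: ("NYA", 0.078), 1997: ("FLO", 0.092),
    1998: ("NYA", 0.276), 1999: ("NYA", 0.141), 2000: ("NYA", 0.076),
    2001: ("ARI", 0.162), 2002: ("ANA", 0.136), 2003: ("FLO", 0.095),
}
winner_total = sum(prob for _, prob in winners.values())

# Nine 8 team seasons sum to 1.0 each; three halved 4 team seasons to 0.5.
all_total = 9 * 1.0 + 3 * 0.5
n_non_winners = 9 * 7 + 3 * 3          # the 72 teams that did not win

avg_winner = winner_total / len(winners)
avg_non_winner = (all_total - winner_total) / n_non_winners
print(round(avg_winner, 3), round(avg_non_winner, 3))   # prints: 0.128 0.124
```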
Figure 5 is a histogram of the simulated single game winning
probabilities, counted in bins of width 0.01. The two groups plotted
are all the team-team match-ups in the simulations and the actual WS
match-ups. For both groups the 50% point of the cumulative distribution
(half the values are below and half above) is a single game win
probability of 0.53. Again, no surprises, since closely matched teams
are expected in the playoffs.
Given the probability of winning a single game, the probability of
winning a series of a specified length can easily be calculated
(details are given in the appendix). Figure 6 indicates how the chance
of winning a series improves as the length of the series increases. Two
single game probabilities are shown: 0.529, Yankees vs. Braves in 1995,
approximating the median single game winning probability for all
playoff match-ups, and 0.589, for the Yankees against the Padres in the
1998 WS, the most dominant match-up in the 12 WS included in this
study. For the 5 and 7 game series used in the playoffs the series
winning probability for the better team does not increase much.
Lengthening the playoff series enough to give the better team a
significant advantage (perhaps a 0.8 probability of winning a series)
is clearly not feasible.
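The feasibility point can be explored with a short Python sketch. It
assumes independent games with a fixed single game probability, and uses
the fact that winning a best of N series is equivalent to winning more
than half of N games if all N were played. The function names and the
max_n cutoff are hypothetical choices for this illustration.

```python
from math import comb


def series_win_prob(p, n):
    """Probability of winning a best of n series (n odd) given game probability p."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))


def games_needed(p, target, max_n=999):
    """Shortest odd series length whose win probability reaches target, or None."""
    for n in range(1, max_n + 1, 2):
        if series_win_prob(p, n) >= target:
            return n
    return None


for p in (0.529, 0.589):
    by_length = [round(series_win_prob(p, n), 3) for n in (1, 5, 7, 9, 11)]
    print(p, by_length, "length for 0.8:", games_needed(p, 0.8))
```

Running it for the two probabilities plotted in Figure 6 shows how
slowly the better team's advantage grows with series length.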
Conclusions
The present MLB postseason structure is no better than a lottery at
determining the best team. Simulations have been used to demonstrate a
lack of correlation between team abilities and actual WS results. A
simple probability calculation clearly shows the inability of 5 or 7
game series to unambiguously identify the better team. Creating a
definitive postseason structure appears to be unrealistic given the
reality of the calendar (and finances). A slightly better, but still
far from definitive, approach would be a return to two divisions in
each league followed by a four team postseason. Nine or eleven game
playoff series would provide a slight gain in playoff efficacy and
provide a partial compensation for the revenue lost by dropping the
current first round of the playoffs. At the very least, the first round
of the playoffs should be increased to 7 games.
Finally, an answer can be given to the question in the title. The
Braves should have won two WS titles in their 12 consecutive playoff
appearances. However, I don’t think one additional WS championship
would assuage much of their fans’ frustration.
References
1. “A Baseball Simulator”, J. F. Jarvis
2. “Chance and Intent: A Baseball Paradox”, J. F. Jarvis, CHANCE, Vol
11, No. 3: 12-19, Summer 1998
3. “May the best team win ... at least some of the time”, Tom Tippett
4. “Cleveland’s Greatest Team ... So Far”, Vince Gennaro, Baseball
Research Journal 1996 pp 36-38
5. “Predicting Postseason Results”, Stuart Shapiro, Baseball Research
Journal 1998 pp 106-107






Table 1. Summary of 1000 Simulations of the 1995 Regular Season
NL East WFR\Plcd 1 2 3 4 5 wins std min max act a-s
ATL 0.568: 627 268 71 27 7 81.8 6.2 63 101 90 8.2
NYN 0.535: 272 354 225 104 45 77.0 6.1 55 95 69 -8.0
FLO 0.497: 50 188 303 269 190 71.0 6.0 51 93 67 -4.0
PHI 0.485: 36 114 232 327 291 69.8 6.0 47 88 69 -0.8
MON 0.467: 15 76 169 273 467 67.3 6.0 49 87 66 -1.3
NL Cent WFR\Plcd 1 2 3 4 5 wins std min max act a-s
CIN 0.586: 608 271 117 3 1 84.4 5.9 60 101 85 0.6
HOU 0.561: 296 451 241 11 1 80.8 5.7 58 99 76 -4.8
CHN 0.525: 95 272 567 56 10 75.6 6.0 58 93 73 -2.6
PIT 0.421: 0 5 44 486 465 60.7 6.0 44 81 58 -2.7
SLN 0.420: 1 1 31 444 523 60.1 5.9 42 80 62 1.9
NL West WFR\Plcd 1 2 3 4 wins std min max act a-s
LAN 0.517: 518 281 157 44 74.5 5.9 55 93 78 3.5
COL 0.501: 318 385 207 90 72.2 5.8 53 93 77 4.8
SDN 0.475: 138 253 380 229 68.4 6.0 50 85 70 1.6
SFN 0.441: 26 81 256 637 63.6 5.7 47 83 67 3.4
AL East WFR\Plcd 1 2 3 4 5 wins std min max act a-s
BOS 0.549: 328 337 321 14 0 79.1 5.6 62 96 86 6.9
NYA 0.548: 334 331 319 15 1 78.9 5.9 61 95 79 0.1
BAL 0.545: 337 323 321 18 1 78.4 5.8 62 93 71 -7.4
TOR 0.428: 1 8 36 785 170 61.6 5.7 46 78 56 -5.6
DET 0.365: 0 1 3 168 828 52.6 5.8 34 71 60 7.4
AL Cent WFR\Plcd 1 2 3 4 5 wins std min max act a-s
CLE 0.668: 997 3 0 0 0 96.2 5.8 80 114 100 3.8
CHA 0.524: 3 707 215 72 3 75.4 5.7 57 92 68 -7.4
KCA 0.475: 0 160 403 391 46 68.4 6.1 43 88 70 1.6
MIL 0.469: 0 129 366 454 51 67.6 5.9 48 84 65 -2.6
MIN 0.376: 0 1 16 83 900 54.1 5.9 38 73 56 1.9
AL West WFR\Plcd 1 2 3 4 wins std min max act a-s
CAL 0.564: 636 284 69 11 81.8 6.1 63 104 78 -3.8
SEA 0.542: 323 493 148 36 78.6 5.8 62 99 79 0.4
TEX 0.477: 22 97 419 462 68.7 6.0 51 88 74 5.3
OAK 0.470: 19 126 364 491 67.6 6.2 49 89 67 -0.6
Table 2. Scoring Summary for the 1000 Simulated 1995 Seasons
1995 runs scored runs allowed left on base
team gms act sim sm-ac std act sim sm-ac std act sim sm-ac
ATL 144 645 653 8 35.6 540 557 17 31.3 988 998 10
CHN 144 693 696 3 35.6 671 661 -10 34.8 956 980 24
CIN 144 747 738 -9 37.3 623 616 -7 34.2 1001 1016 15
COL 144 785 783 -2 39.4 783 786 3 40.0 1017 1011 -6
FLO 143 673 660 -13 34.8 673 662 -11 35.0 1028 1047 19
HOU 144 747 743 -4 38.1 674 649 -25 34.5 1153 1150 -3
LAN 144 634 643 9 34.9 609 618 9 34.9 1049 1065 16
MON 144 621 592 -29 33.1 638 639 1 34.8 970 1001 31
NYN 144 657 661 4 35.6 618 616 -2 34.9 1041 1029 -12
PHI 144 615 628 13 34.1 658 650 -8 34.4 1114 1104 -10
PIT 144 629 628 -1 35.2 736 746 10 38.7 1010 1023 13
SDN 144 668 636 -32 34.6 672 666 -6 35.6 998 1054 56
SFN 144 652 669 17 34.8 776 758 -18 39.5 1046 1017 -29
SLN 143 563 560 -3 32.6 658 663 5 35.6 957 984 27
BAL 144 704 701 -3 36.2 640 636 -4 35.7 1022 1042 20
BOS 144 791 778 -13 38.9 698 701 3 38.2 1084 1095 11
CAL 145 801 785 -16 40.0 697 684 -13 36.7 1056 1082 26
CHA 144 755 772 17 39.7 758 738 -20 38.7 1155 1139 -16
CLE 144 840 867 27 40.3 607 603 -4 34.4 1018 1041 23
DET 144 654 637 -17 36.8 844 848 4 42.1 1017 1015 -2
KCA 144 629 638 9 35.6 691 677 -14 36.4 1018 1034 16
MIL 144 740 687 -53 36.6 747 732 -15 38.4 1012 1074 62
MIN 144 703 689 -14 37.1 889 895 6 44.0 1025 1033 8
NYA 144 749 747 -2 37.6 688 672 -16 35.4 1126 1140 14
OAK 144 730 704 -26 37.6 761 750 -11 37.6 1049 1060 11
SEA 145 796 769 -27 39.3 708 702 -6 37.5 1033 1064 31
TEX 144 691 676 -15 36.6 720 708 -12 36.8 1031 1064 33
TOR 144 642 653 11 34.8 777 759 -18 38.3 1079 1075 -4
----
tot 4032 19554 19393 -161 196.1 19554 19393 -161 196.1 29053 29439 386
Table 3. Summary of Playoff Simulations
          PO   A/S  Act   Sim  Season ->
Team Lg Apps  Wins Wins  Wins  1991    1992    1993    1995    1996    1997    1998    1999    2000    2001    2002    2003
ATL  N    12  0.52    1  1.94  0.092   0.139   0.192   0.149*  0.176   0.254   0.246   0.154   0.113   0.081   0.132   0.208
NYA  A     9  3.23    4  1.24    -       -       -     0.080   0.078*  0.205   0.276*  0.141*  0.076*  0.068   0.138   0.177
CLE  A     6     0    0  0.83    -       -       -     0.355   0.217   0.067   0.070   0.089     -     0.034     -       -
OAK  A     5     0    0  0.61    -     0.100     -       -       -       -       -       -     0.124   0.159   0.123   0.104
SEA  A     4     0    0  0.60    -       -       -     0.074     -     0.078     -       -     0.131   0.314     -       -
SFN  N     4     0    0  0.59    -       -       -       -       -     0.036     -       -     0.224     -     0.206   0.125
HOU  N     4     0    0  0.56    -       -       -       -       -     0.127   0.219   0.127     -     0.088     -       -
ARI  N     3  2.17    1  0.46    -       -       -       -       -       -       -     0.196     -     0.162*  0.099     -
BOS  A     4     0    0  0.41    -       -       -     0.073     -       -     0.081   0.092     -       -       -     0.162
SLN  N     4     0    0  0.39    -       -       -       -     0.060     -       -       -     0.128   0.093   0.106     -
PIT  N     2     0    0  0.33  0.168   0.160     -       -       -       -       -       -       -       -       -       -
TOR  A     3   6.9    2  0.29  0.098   0.102*  0.090*    -       -       -       -       -       -       -       -       -
TEX  A     3     0    0  0.28    -       -       -       -     0.186     -     0.028   0.063     -       -       -       -
MIN  A     3   3.7    1  0.27  0.143*    -       -       -       -       -       -       -       -       -     0.059   0.069
NYN  N     2     0    0  0.23    -       -       -       -       -       -       -     0.137   0.090     -       -       -
BAL  A     2     0    0  0.22    -       -       -       -     0.081   0.140     -       -       -       -       -       -
CHA  A     2     0    0  0.22    -       -     0.102     -       -       -       -       -     0.114     -       -       -
SDN  N     2     0    0  0.19    -       -       -       -     0.138     -     0.051     -       -       -       -       -
FLO  N     2 10.53    2  0.19    -       -       -       -       -     0.092*    -       -       -       -       -     0.095*
CIN  N     1     0    0  0.18    -       -       -     0.175     -       -       -       -       -       -       -       -
ANA  A     1  7.14    1  0.14    -       -       -       -       -       -       -       -       -       -     0.136*    -
PHI  N     1     0    0  0.12    -       -     0.117     -       -       -       -       -       -       -       -       -
LAN  N     2     0    0  0.12    -       -       -     0.052   0.064     -       -       -       -       -       -       -
CHN  N     2     0    0  0.09    -       -       -       -       -       -     0.029     -       -       -       -     0.060
COL  N     1     0    0  0.04    -       -       -     0.043     -       -       -       -       -       -       -       -
MON  N     0     0    0     0    -       -       -       -       -       -       -       -       -       -       -       -
MIL  N     0     0    0     0    -       -       -       -       -       -       -       -       -       -       -       -
TBA  A     0     0    0     0    -       -       -       -       -       -       -       -       -       -       -       -
KCA  A     0     0    0     0    -       -       -       -       -       -       -       -       -       -       -       -
DET  A     0     0    0     0    -       -       -       -       -       -       -       -       -       -       -       -
* Actual World Series winner. - No playoff appearance that season.
Table 4. 1995 Playoff Teams Winning Probabilities.
(Probabilities are for team in each column against team given in row.)
tm/tm  ATL     COL     CIN     LAN     CLE     BOS     SEA
COL    0.5725
CIN    0.4890  0.4115
LAN    0.5616  0.4870  0.5734
CLE    0.4208  0.3545  0.4333  0.3673
BOS    0.5234  0.4622  0.5392  0.4714  0.6084
SEA    0.5284  0.4696  0.5467  0.4759  0.6145  0.5070
NYA    0.5289  0.4632  0.5431  0.4730  0.6110  0.5027  0.4944

Appendix
Following is an example of the series winning probability calculation
for a 7 game series (N=7). The series win probability Pseries is
calculated by summing the 4 terms Pk for K from 4 to 7 (Pk is the
probability of winning exactly K games, given by the expression being
summed), that is, over all the series outcomes with at least 4 wins.
Note that the cumulative probability, the sum of all Pk, is 1.000 as it
should be. (N,K) is shorthand for the binomial coefficient, the ratio
of factorials N!/(K!(N-K)!). Here Pgame=0.59, so the N=7 series winning
probability of 0.69 corresponds to the upper curve in Figure 6.
Pg = 0.59   N = 7
K  Pg^K    (1-Pg)^(N-K)  (N,K)  Pk      Cuml
0  1.0000  0.0019          1    0.0019  0.0019
1  0.5900  0.0048          7    0.0196  0.0216
2  0.3481  0.0116         21    0.0847  0.1063
3  0.2054  0.0283         35    0.2031  0.3094
4  0.1212  0.0689         35    0.2923  0.6017
5  0.0715  0.1681         21    0.2524  0.8541
6  0.0422  0.4100          7    0.1211  0.9751
7  0.0249  1.0000          1    0.0249  1.0000
Pseries = 0.6906
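Written out, the calculation tabulated above is the following sum. This
is a reconstruction from the column headings of the table, in its own
notation (Pg, N, K), and it assumes all N games are counted whether or
not the series has already been decided.

```latex
P_{series} = \sum_{K=(N+1)/2}^{N} P_K ,
\qquad
P_K = \binom{N}{K}\, P_g^{K}\, (1-P_g)^{N-K} ,
\qquad
\binom{N}{K} = \frac{N!}{K!\,(N-K)!}
```

For N = 7 and Pg = 0.59 the sum runs over the rows K = 4, 5, 6 and 7 of
the table, giving a series winning probability of about 0.69.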