Author’s note: This is the fourth in a series of pieces that will offer a mix of facts, unknowns, and speculation on one of the Hobby’s most iconic brands. This installment examines the relative scarcity of the various cards.
Unlike Topps Project 2020 or Topps Now, where print runs are published directly on the Topps website, older sets generally come with little to no information as to quantities produced. Yes, there are exceptions, such as the 1914 Cracker Jack set…
…where the card’s reverse tells us, correct or not, that “Our first issue is 10,000,000 pictures.” In most cases, however, we simply make educated guesses or leave the topic alone entirely.
In this article, I’ll share my own educated guess at the 1933 Goudey set, but perhaps more importantly I’ll “show my work” and by doing so offer a framework that collectors might find applicable to several other sets.
Population report – Fact or fiction?
Despite the limitations and distortions inherent in such data, I’ll begin with the PSA Population Report for 1933 Goudey, pulled on November 10, 2020. If you’re not familiar with such reports, they show the number of times the grading company has assigned a particular grade to each card in the set.
We can see (or not see, if you’re reading this on your phone) from the report, for example, that PSA has assigned a grade of 8 to Dazzy Vance twelve times. We can similarly see (with a little math) that PSA has graded Dazzy Vance cards 441 times in all, including half grades and qualifiers. Move down the list to Hugh Critz and we find his card graded 323 times.
We avoid the conclusion that Hugh Critz’s card (POP 323) is more scarce than Dazzy Vance (POP 441) since it is common knowledge that Hall of Famers are more likely to be submitted for grading than common players. In reality, both Critz and Vance belonged to the same printing sheet and therefore were very likely produced in identical quantities.
The question, then, is whether we can conclude anything at all from population reports, given their inclination to distort reality.
Order amid chaos
This graph shows the PSA population for each card, 1-240, in the 1933 Goudey set. Good chance you can pick out the Ruth and Gehrig cards, not to mention Napoleon Lajoie.
Here is a look at the same data, this time sorted first by sheet number and next by population. For example, the first 24 cards graphed correspond to cards 1-5, 25-35, and 45-52 (i.e., Sheet 1), and the most frequently graded subject from the sheet, Jimmie Foxx, is shown first. For lack of a better spot, I put the Lajoie card, printed in 1934, at the very end of the graph.
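For readers who want to reproduce this ordering from their own export of the report, here is a minimal Python sketch. It assumes each card is a record with hypothetical `sheet` and `pop` fields, and the populations shown are placeholders rather than actual PSA counts.

```python
def order_for_graph(cards):
    """Sort cards by sheet number, then by population from high to low."""
    return sorted(cards, key=lambda card: (card["sheet"], -card["pop"]))


# Placeholder records: the field names and populations are made up for
# illustration and are NOT taken from the PSA report.
cards = [
    {"name": "Generic player A", "sheet": 2, "pop": 300},
    {"name": "Star player B", "sheet": 1, "pop": 900},
    {"name": "Generic player C", "sheet": 1, "pop": 310},
    {"name": "Star player D", "sheet": 2, "pop": 700},
]

ordered_names = [card["name"] for card in order_for_graph(cards)]
print(ordered_names)
```

A card printed outside the regular sheets, such as the Lajoie, could be given a sheet number larger than any real sheet so that it lands at the end.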
There are now three discernible patterns of interest.
- Each sheet in the set includes several players whose cards are graded disproportionately often. You’d be correct to imagine Hall of Famers and stars here, along with Benny Bengough and Moe Berg.
- Each sheet in the set includes a large number of cards (“generic players”) graded much less frequently: the Hugh Critzes, Ed Morgans, and Leo Mangums of the set.
- Within a sheet, the population for generic players is relatively uniform.
Here is a closer look at Sheet 4, where the tall bars correspond to seven Hall of Famers and the short bars correspond to far less sought after players.
A final property of interest, true for all sheets and not just this one, is that the star player populations cover a very wide range (744 – 476 = 268) while the generic player populations cover a much narrower one (334 – 283 = 51). Another measure of the same thing is that the standard deviation is 96 for the first group and 18 for the second group.
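The two measures of spread can be sketched in a few lines of Python. Only the endpoints (744 and 476, 334 and 283) come from the figures above; every other count is a placeholder, so the standard deviations this produces will not match 96 and 18. The sketch also assumes a population standard deviation (`statistics.pstdev`), since the choice between population and sample is not stated here.

```python
from statistics import pstdev


def spread(pops):
    """Return (range, population standard deviation) for a list of counts."""
    return max(pops) - min(pops), pstdev(pops)


# Placeholder counts, NOT actual PSA figures. Only the largest and smallest
# value in each list are taken from the article.
stars = [744, 690, 610, 560, 530, 500, 476]
generics = [334, 330, 325, 320, 318, 315, 312, 310, 308,
            305, 300, 298, 295, 292, 290, 286, 283]

star_range, star_sd = spread(stars)
generic_range, generic_sd = spread(generics)
print(star_range, generic_range)  # 268 51
```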
Estimating relative scarcity
We know, therefore, that while a graph like this one might be interesting it may not be telling us anything real about relative scarcity. Perhaps all it’s really showing us is which sheets have the best players.
The standard deviations corresponding to each bar (130, 100, 238, 116, 162, 293, 190, 109, 126, and 109) sound further alarms for treating our data as clean or uniform.
To arrive at data that are meaningful and useful we need to eliminate the undue impact of star players. There are many ways this can be accomplished. The one I’ve chosen is to restrict my data set to the bottom eight players per sheet. In the case of Sheet 4, that would mean the players in the orange rows below.
Examining the sheet’s entire roster, you might wonder why I limited myself to the bottom eight when even the bottom 17 would seem to have worked. The main reason is that some of the other sheets in the set have far more stars than this one. Also, 8 from 24 gave me a nice simple fraction, the bottom third, that I wouldn’t have if I’d used the bottom 9 or 10 players.
At any rate, here is what happens when we restrict our interest to the “bottom eight” on each sheet. I’ll also mention that the standard deviations are now 13, 12, 18, 5, 6, 12, 10, 12, 11, and 12, which tells us the data have very little variability within a given sheet.
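The “bottom eight” restriction is simple to express in code. The sketch below is one possible implementation, not necessarily how the numbers above were produced: the function name, the dictionary layout, and the demo populations are all assumptions, and the standard deviation is again taken as a population figure.

```python
from statistics import mean, pstdev


def bottom_k_summary(pops_by_sheet, k=8):
    """For each sheet, average the k lowest populations and report their spread."""
    summary = {}
    for sheet, pops in pops_by_sheet.items():
        lowest = sorted(pops)[:k]
        summary[sheet] = {"mean": mean(lowest), "stdev": pstdev(lowest)}
    return summary


# Placeholder populations (NOT real PSA counts), shortened to six cards per
# sheet. A real sheet would have 24 entries and k would stay at 8.
demo = {
    1: [900, 650, 310, 305, 300, 295],
    2: [700, 320, 312, 308, 304, 300],
}
result = bottom_k_summary(demo, k=4)
print(result[1]["mean"], result[2]["mean"])
```

Changing `k` lets the same function test other cutoffs without touching the rest of the analysis.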
Mostly just for fun, here are the two preceding graphs plotted together.
At first glance, perhaps the data aren’t all that different after all. However, there are at least a few instances where the shift from the full data set to the bottom third is instructive:
- Sheets 1 and 2 – Our original graph suggested Sheet 1 cards were more plentiful than Sheet 2 cards. However, our new data suggest cards from each sheet are equally plentiful.
- Sheet 6 – Our original graph suggested Sheet 6 cards were quite common. Our new data suggest Sheet 6 cards are among the more scarce in the set.
You really trust this?
If all we had were these graphs, then it would be reasonable to worry that random variation was the biggest factor behind the differences from one bar to the next. However, the very small standard deviations associated with each data set convince me that the differences here are real. That said, I’d be either crazy or lazy (and have been accused of both!) not to corroborate my results against other sources.
After the PSA population report, the next largest source of 1933 Goudey data is rival grader SGC’s population report. Across the Goudey set, SGC has graded between 25 and 30% as many cards as PSA, more than enough to be of interest. The graph below shows “bottom eight” numbers from SGC alongside the PSA numbers.
Multiplying the PSA data by 0.27 (or any number in the vicinity) puts the numbers on roughly the same scale and facilitates at-a-glance comparison.
As you can see there is very little difference between the PSA and SGC data. This is further evidence to me that the numbers are genuinely meaningful.
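For anyone wanting to put a number on “very little difference,” here is a hedged sketch of the rescaling step plus a per-sheet gap calculation. The 0.27 factor is the one mentioned above; the function names and the demo averages are hypothetical, and the relative gap is my own choice of comparison rather than anything used to build the graph.

```python
def rescale(psa_values, factor=0.27):
    """Scale PSA sheet averages so they sit on roughly the SGC scale."""
    return {sheet: value * factor for sheet, value in psa_values.items()}


def relative_gap(scaled_psa, sgc_values):
    """Fractional difference between scaled PSA and SGC values, per sheet."""
    return {
        sheet: abs(scaled_psa[sheet] - sgc_values[sheet]) / sgc_values[sheet]
        for sheet in scaled_psa
    }


# Placeholder sheet averages, NOT real population data.
psa_demo = {1: 300.0, 2: 200.0}
sgc_demo = {1: 80.0, 2: 55.0}

gaps = relative_gap(rescale(psa_demo), sgc_demo)
for sheet, gap in gaps.items():
    print(f"Sheet {sheet}: {gap:.1%} apart")
```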
Other ways to remove the effects of stars
I mentioned earlier that I landed on a “bottom eight” approach to be sure I didn’t accidentally include any star players from some of the more loaded sheets. Still, it’s worth looking to see how robust the patterns in the data are against other methods.
Since the PSA and SGC data were quite consistent I’ll use this new graph that adds the two as my new baseline for comparisons against other methods.
Here are the sheet averages when restricted to the bottom twelve cards per sheet. There is virtually no change to the data.
Another approach that eliminates the impact of stars is to take the median of each sheet. An advantage is that this approach also avoids any outliers at the bottom of each data set. As long as every sheet has fewer than 12 star cards, the median of its 24 populations falls among the generic players, so it may reflect the true card populations better than anything I’ve used thus far. Here I will revert to using PSA data only since the PSA and SGC counts are too different in scale to produce a meaningful median. (It would often end up being the average of the least graded PSA card and the most graded SGC card.)
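A short sketch shows why the median shrugs off star cards on a 24-card sheet. The counts are placeholders rather than real PSA data, and the split of 7 stars and 17 generic players is just one example of a sheet with fewer than 12 stars.

```python
from statistics import median


def sheet_medians(pops_by_sheet):
    """Return the median population for each sheet."""
    return {sheet: median(pops) for sheet, pops in pops_by_sheet.items()}


# Placeholder sheet of 24 cards (NOT real PSA counts): 17 generics, 7 stars.
generics = [283 + 3 * i for i in range(17)]  # 283, 286, ..., 331
stars = [480, 520, 560, 600, 650, 700, 744]

# Inflating every star's count leaves the median untouched, because the
# 12th and 13th values in sorted order are both generic cards.
original = generics + stars
inflated = generics + [10 * s for s in stars]

medians = sheet_medians({"original": original, "inflated": inflated})
print(medians)
```

The mean of the same two lists would differ enormously, which is the distortion the median avoids.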
Again, the relative ordering of the bars remains nearly identical. A careful look will show that Sheets 5 and 6 have flip-flopped, but the differences are small enough to regard the two sheets as virtually tied under either measure.
While population reports can be misleading on the whole, I believe they can offer reliable data on the relative scarcity of cards in the set provided disproportionately graded cards can be removed from the analysis in a systematic way.
Where a set is issued in multiple releases or series, the “bottom third” approach offers a methodology that does not require any card-by-card judgments, though a global judgment that the set has enough generic players to support the approach would still be required. As has been seen, the bottom third approach could likely be replaced by a bottom half or median without unduly affecting the results.
Examples of sets that should be amenable to the same approaches used here include the various Topps flagship sets from 1952-73, though care would need to be taken where a particular series is already known to be particularly tough. For example, the final series of 1967 Topps is so famously difficult that it’s easy to imagine even its common players being disproportionately graded.
While I’ve (mostly) opted for objective data over speculation in this series of articles, I’ll nonetheless close with the reasons this analysis is most interesting to me personally.
As important as the 1933 Goudey set is to the history of the Hobby, it is surrounded by unknowns. It is my hope that various high-effort-low-yield attempts to learn more about the set will ultimately fit together into a coherent and more complete narrative than what we have today.
While population information may be of interest to some collectors on its own–perhaps some of you will head to eBay and start buying up Sheet 9 cards as a result of this article!–I believe it also offers hints at other topics of interest such as the set’s chronology. For example, a conjecture of mine is that the first two sheets of the set comprised a single 48-card release. Such a conjecture is strengthened by the two sheets having nearly identical population data. Meanwhile, the idea that Sheets 3 and 4 formed a similar paired release appears unsupported by population data.
I’ll end with a mini-mystery unrelated to the “big picture” of the set but instead confined to a single card. In reviewing population numbers for literally 240 different cards there was one card that stood out. Maybe you can spot it among the PSA populations for Sheet 2.
In addition to having at least “minor star” status, Jimmy Dykes also has the only known significant variation in the set. (I’m ignoring proof cards, print defects, and copyright cards here.) In case you’re not familiar, Goudey corrected his age from 26 to 36 at the start of the third bio paragraph.
As tends to happen when an error is corrected, the original error version and the corrected version each acquire relative rarity within the set. As such, I would expect both versions of the card to be disproportionately graded, and I would certainly expect to see a lot more Dykes cards graded than George Blaeholders!
My own conclusion is that the Dykes card is genuinely rarer than the rest of the cards on Sheet 2. Given that the cards were printed together, my personal theory is that Goudey didn’t simply swap in the new Dykes for the old one at some point but instead pulled Dykes entirely during some portion of the interim.
Of course an alternate theory is simply that Dykes no longer gets the Hobby love he once did and that the card’s variations are largely off the radar. Either way, I hope the example illustrates yet another potential use of the population data to tell a larger story about the set.
The next article in this series examines the player selection of the 1933 Goudey set.
7 thoughts on “Overanalyzing Goudey, part four”
Love the statistical thinking in this post. While I’ve dug into other aspects of their business environment, this beats any angle I could take on the hard numbers.
Have you thought about likelihood of specific card “survivability” to the modern hobby? It feels like Ruth/Gehrig/etc would also be submitted more often because they were saved as something precious more often. While I’m not sure how to split hairs by specific reason for higher submission rates, I see it as a contributing factor.
YES! I suspect there are two notable trends, but I haven’t done the heavy number crunching in support–just some spot checks. First, I believe the top shelf superstars survived into the present at higher rates than commons. Perhaps not a surprise. Second, where a star had multiple cards in the set, the overall survival rates appear negatively impacted, as if many collectors felt little need to have multiples of the same player.
Great point about the multiple cards for a player. I bet those extras for, say, Carl Hubbell or Babe Ruth, proved prime trade bait between Giants and Yankees fans. Good thing there’s just the one significant variation for Dykes in the set. If 1933 Goudey were more like 1981 Fleer, this project would be madness to track.