I’m a longtime Hearthstone player and a data scientist in training. When I was asked to come up with a project for my Data Science bootcamp, I naturally gravitated towards the world of video games and, in particular, Hearthstone. One of the goals of a data scientist is to turn numerical data into actionable decisions, and some of the hardest decisions I’ve had to make as a gamer are in the Hearthstone Arena.
Hearthstone presents a particularly interesting case because of the “casual” nature of the game. There is no API. There are no built-in profiles to track progress. Ranking is provided by icons and stars until legendary rank. All the available data for Hearthstone performance has been collected and recorded by dedicated individuals who want to learn from their past decisions and the decisions of others. The recommendation spreadsheets for arena cards are curated by professional Hearthstone players or groups of players.
In this post, I wanted to cover a portion of my own take on some of the data I got from the webmaster of Arenamastery.com. For those of you who aren’t familiar, Arena Mastery is a site where users can record their progress in the Hearthstone Arena. At the very least, users can record their wins, losses, and rewards from the Arena. Diligent players can even report their deck selection process. After contacting the webmaster, I was granted access to de-personalized Arena data collected on the site from the release of the game up until November 11, 2014 (a few months after the Naxxramas expansion was released). The data included over 75,000 complete 30-card decks (about 50,000 from Vanilla and 25,000 from Naxx).
Anyone who plays Hearthstone in the Arena is familiar with the choices one needs to make during deck selection. In this first part of my Hearthstone Data Science series, I wanted to cover something the user has no control over: card rarity. The Arena presents an interesting case where users have no choice over whether or not they will be offered a legendary, epic, or rare card, only which card they will select. While some uncommon cards are very powerful, others are not always useful for the player.
For all of the analysis here, I used the R programming language, which was designed for statistics and data manipulation. Python is also a good choice, but I’m not as familiar with it. I’ll look at how the mean win rate changes with the number of uncommon cards in a deck. Since the data contain information on each card, this is a relatively straightforward process.
How often do rare, epic, and legendary cards appear when forming an Arena deck?
With the deck data in hand, I took the overall appearance of each card rarity as a proportion of total cards across all 75,000 decks. Here’s what I found overall:
As a rule, the 1st, 10th, 20th, and 30th draft rounds are guaranteed to offer uncommon cards (rare or better). So what is the breakdown for the rounds that are not one of these special rounds?
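The two tallies above can be sketched in a few lines. The original analysis was done in R; this is only an illustrative Python version, and the data layout (each deck as a list of 30 rarity labels in draft order) and the names `rarity_proportions` and `toy_decks` are assumptions made for the example, not the actual Arena Mastery format.

```python
from collections import Counter

# Assumed layout: each deck is a list of 30 rarity labels in draft order.
GUARANTEED_ROUNDS = frozenset({1, 10, 20, 30})  # rounds that always offer rare or better


def rarity_proportions(decks, skip_rounds=frozenset()):
    """Share of each rarity among all picks, optionally ignoring some rounds."""
    counts = Counter()
    for deck in decks:
        for round_number, rarity in enumerate(deck, start=1):
            if round_number not in skip_rounds:
                counts[rarity] += 1
    total = sum(counts.values())
    return {rarity: n / total for rarity, n in counts.items()}


# Two made-up decks, only to show the call shape (not real data).
toy_decks = [
    ["rare"] + ["common"] * 8 + ["epic"] + ["common"] * 9
    + ["rare"] + ["common"] * 9 + ["rare"],
    ["rare"] + ["common"] * 8 + ["rare"] + ["common"] * 4 + ["rare"]
    + ["common"] * 4 + ["legendary"] + ["common"] * 9 + ["rare"],
]

# Proportions over every pick, then over the non-guaranteed rounds only.
overall = rarity_proportions(toy_decks)
other_rounds = rarity_proportions(toy_decks, skip_rounds=GUARANTEED_ROUNDS)
```

Passing the guaranteed rounds as `skip_rounds` keeps one function for both questions, so the two breakdowns are always computed the same way.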
By now, this information is probably not very new. Most folks recording their own progress would probably see similar results. However, it was a good warm-up to start looking at this large set of data.
How do uncommon cards affect win rate?
Each class has a different average win rate, making it difficult to compare changes across classes. I was inspired by Reddit user randomechoes’ website on how to tackle the issue of comparing performance among decks with varying numbers of uncommon cards. I looked at the mean win rate for decks grouped by the number of uncommon cards (rare, epic, and legendary) in those decks. For example, all the decks with 0 Legendaries were in one group, all decks with 1 Legendary in another, etc. I averaged the win rate of all such decks and reported the result for each group, as long as there were 50 or more decks that met the criteria.
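The grouping step can be sketched as follows. Again, the analysis itself was done in R; this Python version is only illustrative. The deck layout (a dict with hypothetical `rarities` and `win_rate` fields) and the function name `mean_win_rate_by_count` are assumptions for the example, and how the win rate of a single deck is defined is left to whatever the data provides.

```python
from collections import defaultdict
from statistics import mean

MIN_DECKS = 50  # groups smaller than this are not reported, as described above


def mean_win_rate_by_count(decks, rarity, min_decks=MIN_DECKS):
    """Mean win rate of decks, grouped by how many cards of `rarity` they hold."""
    groups = defaultdict(list)
    for deck in decks:
        n_cards = deck["rarities"].count(rarity)
        groups[n_cards].append(deck["win_rate"])
    # Keep only groups with enough decks to be worth reporting.
    return {
        n_cards: mean(rates)
        for n_cards, rates in sorted(groups.items())
        if len(rates) >= min_decks
    }


# Made-up decks to show the call shape; the threshold is lowered for the toy data.
toy_decks = [
    {"rarities": ["common"] * 30, "win_rate": 0.50},
    {"rarities": ["common"] * 30, "win_rate": 0.40},
    {"rarities": ["legendary"] + ["common"] * 29, "win_rate": 0.60},
]
by_legendaries = mean_win_rate_by_count(toy_decks, "legendary", min_decks=2)
```

In the toy call, the single deck with one legendary is dropped because its group falls below the threshold, which is the same filtering the 50-deck rule applies to the real data.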
Averages, however, can be misleading for two reasons: 1) each class has its own baseline win rate, so they need to be normalized, and 2) different sample sizes hold different statistical weight. To address (1), I subtracted each class’ overall mean win rate from the mean win rate grouped by uncommon cards. To deal with (2), I used error bars (standard error), which take the standard deviation of the wins for decks in that group and divide by the square root of the number of decks being averaged. The narrower the distribution or the larger the sample size, the smaller the error bars. If the error bars of two groups overlap, we lose some power to say the groups are statistically different.
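As a minimal sketch of these two corrections (in Python rather than the R used for the analysis), the function below returns the normalized mean and its standard error for one group. The name `normalized_mean_and_se` and the toy numbers are made up for illustration. It uses the sample standard deviation (dividing by n - 1), which is what R’s `sd()` computes; the post does not say which variant was used, so treat that as an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_mean_and_se(group_rates, class_mean):
    """Return (group mean minus class mean, standard error of the group mean)."""
    # (1) Normalize: express the group's mean relative to the class baseline.
    difference = mean(group_rates) - class_mean
    # (2) Standard error: sample standard deviation over sqrt(number of decks).
    standard_error = stdev(group_rates) / sqrt(len(group_rates))
    return difference, standard_error


# Made-up win rates for one group of decks and a made-up class baseline.
toy_rates = [0.4, 0.5, 0.6, 0.5]
difference, standard_error = normalized_mean_and_se(toy_rates, class_mean=0.45)
```

A positive `difference` means the group did better than the class average, and the error bar drawn around it spans `difference - standard_error` to `difference + standard_error`.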
So enough with all that, let’s take a look at the change in win rate as it depends on uncommon cards in Vanilla Hearthstone only:
Luckily for the player, there don’t seem to be any glaring trends that say your deck rarity determines your results. There are a few small exceptions, however. For druid decks, it appears that more of each kind of uncommon card yields an improvement in win rate. For paladins and mages, however, it appears as though more epic cards yield a decrease in win rate.
There are two possible explanations for this: 1) players are systematically choosing the stronger or weaker uncommon cards or, perhaps more likely, 2) the uncommon cards to which players have access do not benefit them as much as some of the common cards.
What about Naxxramas?
Granted, my copy of the data is old. By now, Goblins vs. Gnomes, the latest expansion, is already in full swing. But, from three months of Naxxramas data, a similar analysis shows some changes:
In this case, rare cards seem to help Hunters and Rogues, whereas having more than 6 rare cards tends to hurt Druids. For Druids, legendary and epic cards still tend to help.
Good news! It appears as though there are no game-breaking imbalances in place where lucking out and getting 8 rare card draws means an automatic win for your class. The arena does present an interesting test case for card rarity, though: players must choose from the pool of rare, epic, or legendary cards they are shown. Sometimes, a common card would do better than a rare card or a rare card would do better than an epic, and so on.
Finally, I want to make a plug for data tracking sites like Arena Mastery: it’s fun to learn from the data we are generating on a daily basis. I’m not a pro Hearthstone player. My hope is that I’ve started a few discussions with this blog post and perhaps prompted people to open up a few spreadsheets and dabble in data.
If you liked this post, have any questions/comments, or ideas of things you’d like to see studied, feel free to post below!