Uncovering Steam’s Hidden Genres Through Machine Learning
What if I told you that, beyond Steam’s 447 tags, is a plethora of hidden themes, categories, and genres that nobody’s ever seen before? This is a post about using natural language processing to find and label “latent clusters”—that is, to automatically group games into previously undiscovered categories and give them meaningful names.
Imagine walking into the world's largest library, but instead of neat shelves and helpful librarians, you're faced with a massive vortex of books reaching out into infinity. This is the challenge Steam faces with its vast catalog of 91,470 titles.
Valve does a good job of organizing this by crowdsourcing tag applications, but that doesn’t catch everything, especially if (say) a genre is new or emerging. Turns out, we can apply machine learning techniques both to further organize this vast library and also discover new ways of grouping games that curators haven't thought of yet.
The State of Steam Tags
You got your blockbuster action titles, your Roguelike Deckbuilders, and even your Hidden Object Cat games. It’s great—your modern day equivalent of the Dewey Decimal System for games.
But try finding all the classic Sierra Online/LucasArts style point-and-click adventures, and you'll hit some limitations. The "point & click” tag includes things like Inscryption (primarily a roguelike deckbuilder), while the “adventure” tag brings up Forza 4 (a racing game with adventure elements). If you ask for both tags, together, you still get titles like A Tower Full of Cats (a hidden object game).
Fortunately, using Steam’s Web API, you can mix and match some hard rules to narrow things down—for instance, with We ❤ Every Game, we’re able to index Sierra Online style games on Steam by listing all highly-rated titles that have both “Point & Click” and “Adventure” in their top 5 tags. This approach allows us to surface games that more closely match the classic point-and-click adventure style, filtering out the noise of broader tag applications.
We can apply similar techniques to other niche genres. Puzzle game designer extraordinaire and possible Londoner Alan Hazelden suggested we try logic puzzle games. To do this, we set up a query that a) requires "logic" and "puzzle" tags in a game's top 6 tags, and b) excludes titles with Automation, Strategy, Hidden Object, or Puzzle Platformer in their top 3 tags. You can see the results here—some of those might not be strict logic puzzles, but overall, not a bad curation of something Steam doesn’t natively surface.
Beyond Traditional Categories
However, not everything is possible to categorize in this way. Nicolae Berbece of Those Awesome Guys coined the term “knowledge gate games” for titles like The Case of the Golden Idol. These titles focus purely on uncovering mysteries in their game worlds, rather than (e.g.) progressing through levels or building up character attributes.
This concept overlaps somewhat with the "detective" tag, but that tag also includes games like Oh Deer (where you are, indeed, detectiving fake deer, but gameplay is more about blending in and pooping than about piecing together a mystery).
So, how do we surface games like Obra Dinn for Nick?
Got Infinite Time?
You could sit down and go through all Steam’s 91,510 titles (for the eagle-eyed, this number has increased by 50 since the time I quoted it in the first paragraph) and categorize them manually. This is pretty time-consuming for a single person or even a small group, which is why Valve opened tagging up to all players. But with well over ten thousand titles appearing on the platform each year, and new genres emerging all the time, manual categorization alone is still missing stuff, like those knowledge gate games!
What’s the solution, then? If phrases like "unsupervised soft polythetic clusters" get you excited, you can already guess what I’m going for here. If not, that's okay, because it’s just a fancy way to say this:
Categorizing Games and Discovering New Genres through Magic Automation
To categorize a loose bag of things, you need meta-information about them. In this experiment, I decided to focus on four pieces of info about every game:
The title - To identify the game we’re analyzing.
Marketing copy - Developer- and publisher-supplied short and long descriptions.
Tags and genres - To give us a basic categorization framework.
Top player reviews - Crucial because players often discuss themes and gameplay mechanics that the above marketing copy overlooks.
But how to then process it all?
You might have heard of BERT (Bidirectional Encoder Representations from Transformers), a powerful machine learning model for natural language processing. It’s a tool capable of understanding context, making it a good place to start thinking about the problem. However, while it’s powerful, it can only process small chunks of text (on the order of a few thousand characters) at a time.
That was a problem here, since a comprehensive description of a game might easily cover 20,000 characters or more. To overcome this hurdle, I began searching for a more suitable tool—either a BERT descendant or a related model capable of handling larger volumes of text.
Riding the AirTrain
I happened on a Hacker News post by the folks at Airtrain: semantic clusters and embeddings for 500k Hacker News comments. In layperson’s terms, they grabbed half a million user comments from the site, tossed them into a hopper, and let their computer categorize them. You can peruse the results here in an interactive dashboard. This seemed perfect for our game categorization challenge!
As an initial test, I gathered the aforementioned 4 pieces of metadata for 8,600 highly-rated Steam titles (games with 80%+ player ratings and at least 50+ reviews). Then I stuffed that into JSON Lines format:
{"data":{"tags":["Life Sim","Indie","Multiplayer","Sandbox","Open World","Moddable","RPG","Level Editor","Massively Multiplayer","Simulation","Action","Casual","Adventure","Shooter","Singleplayer","First-Person","Violent","Action-Adventure","Nudity","Retro"],"about":"Everything can be modded in this city sandbox with jobs, NPCs, apartments, garages, gangs, and more. Broke Protocol is an open-world action game with a strong focus on RP and custom content. You define your own goals and identity in a persistent and reactive world.","genres":["Action",…
(…and so forth!)
While this was a good start, the sheer volume of text was still just a tad too large. If you’ve ever pored through Steam reviews, you’ve probably noticed two things:
Some are high-quality but tend to ramble (below, left).
Other reviews are highly-rated by other players, but don’t contain any useful information (below, right).
Both of these take up our precious character/token limit!
Reading through them all and summarizing them before stuffing them into Airtrain would work, but—again—that’s a huge undertaking. Fortunately, here in 2024, we have some very capable large language models.
My current favorite for doing this relatively cheaply on large datasets is Claude 3 Haiku. So, I dumped each game’s pile of reviews into the Haiku hopper, had it extract themes, gameplay mechanics, verbs, and so forth, and ended up with a fantastic, dense package of information to send to Airtrain. I sent that over and waited a half hour.
Success: Clusters Ahoy!
While Airtrain is still developing a feature to make the interactive aspects of dashboards publicly shareable (stay tuned for another post!), here are a few of the classifications their system discovered:
Cluster #1: Communist Struggles
It identified a group of games themed around communism, creating a distinct category for them:
While Steam does have a “capitalism” tag, they don’t have a “communism” one (I’d argue that’s for the best—you don’t need tags for every little thing), but Airtrain correctly identified and named this cluster, even from this small dataset. I imagine that if we had included Papers, Please, in our 8,600 games, it’d appear here as well.
Cluster #2: Dark Hand-Drawn Adventures
The system also identified a trend in hand-illustrated horror games, grouping them into their own category:
It also came up with Fishing Creature Collecting, Anime Magic Girls, Backrooms Horror Games, Microscopic Animal Saviors, Toilet-Themed Games, and a bunch of others:
Gladiator Management RPGs and Cyberpunk Strategy Games - Totally something you could surface with tags, but not all combinations of tags yield anything valuable or sensible, so it’s handy to not have to try all 199,362 pairs of tags (or 88,716,090 triplets) by hand.
Purrfect Apawcalypse Games - I like that it got creative with the name.
Comedic Athens RPGs - I was unaware that there were enough of these to constitute a genre, but I learn something every day.
Also a category called "Deep Dark Gachimuchi Games," which is all about oily wrestlers, and if you’re not familiar with that, you should probably not google that while you are at work.
Computers + Humans = Win
The results are awesome, but it’s still good to have human eyes on it. We have some racing games like "City Zoomer" and "Race The Sun" categorized as skateboarding games, grouping them with actual skateboarding titles like "Tony Hawk's Pro Skater" and "SkateBIRD." While these games all involve fast-paced movement through urban environments, their core gameplay mechanics are quite different.
The system also created some broad categories, such as "Unique Puzzlers," which included games as diverse as "Q.U.B.E" and "Wordle." While both are puzzle games, their gameplay mechanics and overall experiences are vastly different. This highlights the ongoing challenge of balancing specificity and inclusivity in categorization.
Better Data In = Better Genres Out
This really was just my first stab at this, and I am certain that improving the data I toss in (e.g., review summarizations especially) will yield better output.
But even then, perfect automated categorization for every game isn't necessary for this approach to be useful. My overarching hypothesis is that humans can use modern machine learning tools like exosuits—augmenting our abilities rather than replacing them entirely.
For instance, if I wanted to curate a collection of "Relaxing City Builders," the system was able to group "Polyville Canyon" (which has the Relaxing tag) together with "Dream Town Island" (which doesn't), which means that it picked up something a straight tag search would have missed. This is humans and computers working together, hand-in-gripper!
Next Steps
The implications of this tech seem promising. For players, it could mean being able to find niche games that perfectly suit their tastes (or, for journalists and streamers, their audience’s tastes).
“I want roguelike deckbuilders that evoke the essence of the French Art Deco movement circa 1919.”
- At least one player, probably.
All this is promising that I’ll explore it further in coming months. And I hope it’ll lead to some new ways to think about and categorize games, not only because it could enhance discovery for players, but also provide interesting insights for developers and publishers about emerging trends and untapped niches in the gaming market.
Stay tuned for more!
Other Stuff to Read
That’s the end of the post! If you’re curious about other topics, like generative AI on Steam, I did a post on The Surprising Number of Steam Games That Use GenAI. (Spoiler: one thousand.)
And if you haven’t checked it out, take a gander at We ❤ Every Game, where we’re creating the world’s most comprehensive video game recommendation engine.
Check out what we’re doing with our automated video content—we have a YouTube show that updates thrice weekly.
Finally, if you’re in the games industry, or just want to hear more of this stuff, connect with me on Twitter/X!