Date: 2024-03-21

In case you've been living under a rock for the past 15 years or so, Minecraft is one of the original open world sandbox video games. It doesn't have a predetermined story; there’s no beginning, no end. Players are free to do whatever they want in the game, and we see all sorts of behaviors: engaging in creative game mode, survival game modes, playing with their friends, engaging in battles with mobs, being super creative, you name it. It's a really, really big ecosystem.

Minecraft allows players to engage across a number of platforms, be it a mobile phone or a tablet, console or PC. We also have titles available on XCloud. So from a data standpoint, you can probably imagine that when we're operating with 22 different endpoints, normalizing the data that's coming from all of them is a pretty serious task.
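To make the normalization problem concrete, here is a minimal sketch in Python. It is purely illustrative and not the studio's actual pipeline: the platform names, the raw field names (`uid`, `evt`, `ts_ms`, `playerId`, and so on), and the `NormalizedEvent` schema are all assumptions invented for this example. The idea is that each endpoint reports the same kind of event with different field names and timestamp formats, and a per-platform mapping translates them into one shared shape.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class NormalizedEvent:
    """Hypothetical common schema shared by every platform."""
    player_id: str
    platform: str
    event_name: str
    occurred_at: datetime  # always stored as timezone-aware UTC


# Hypothetical mapping from the common field names to each platform's raw
# field names. A real system would have one entry per endpoint.
FIELD_MAPS = {
    "mobile": {"player_id": "uid", "event_name": "evt", "timestamp": "ts_ms"},
    "console": {"player_id": "playerId", "event_name": "eventName", "timestamp": "timestamp"},
}


def to_utc(value) -> datetime:
    """Accept epoch milliseconds or an ISO 8601 string with a UTC offset."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
    return datetime.fromisoformat(value).astimezone(timezone.utc)


def normalize(platform: str, raw: dict) -> NormalizedEvent:
    """Translate one raw platform event into the common schema."""
    fields = FIELD_MAPS[platform]  # raises KeyError for an unknown platform
    return NormalizedEvent(
        player_id=str(raw[fields["player_id"]]),
        platform=platform,
        event_name=str(raw[fields["event_name"]]).strip().lower(),
        occurred_at=to_utc(raw[fields["timestamp"]]),
    )


mobile_event = normalize(
    "mobile", {"uid": "p1", "evt": "Block_Placed", "ts_ms": 1711022400000}
)
console_event = normalize(
    "console",
    {"playerId": "p1", "eventName": "block_placed", "timestamp": "2024-03-21T12:00:00+00:00"},
)
```

In this sketch the two raw events differ in field names, casing, and timestamp format, yet they normalize to the same player, event name, and UTC time. Keeping the per-platform differences in a mapping table rather than in branching code is one possible design choice; it means adding an endpoint only requires a new table entry.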

Minecraft is also a very global game. We see engagement from all corners of the world. At one point we saw activity happening in Antarctica, and we were wondering: what could possibly be going on in Antarctica? Is there a research station over there that's just happily playing Minecraft?

And last but not least, Minecraft is the best-selling game of all time, with over 300,000,000 copies sold. This milestone was announced last year, and we're super proud to be able to cater to such a large audience. But you can imagine the task we have when it comes to representing at least 300,000,000 players at scale, using data.

Data: The Intersection of Art and Science

I want to talk a little bit about the intersection of art and science, and how Minecraft is really both. It's a product that’s heavy on art; my analytics and data science teams work very closely with people from creative backgrounds who make decisions on a day-to-day basis about very abstract questions. But there's also a science aspect to making decisions in video games: really trying to understand which players need updates from the game team, for example, or things like connectivity, which can hugely impact the quality of the play experience. That's the science part, and we try hard to bring in people who can live in that intersection.

I'm going to rewind a bit and tell you the story of the journey that we’ve been on over the past four or five years, trying to level up our data ecosystem to be able to represent our players at scale.

Back in 2019, we were capturing data from our various endpoints, and it was just too much data for our systems to be able to process. I’ve watched this happen to many teams over the years: in video game analytics, we sometimes have a tendency to capture too much data, to the point where our analysts, data scientists and engineers don't know what to do with it. We were also living in a world where the data was very fragmented. There are multiple game teams at the studio, each of which can make its own decisions about the services that support its game, and there can also be multiple client implementations of telemetry. Other teams produce data too: commerce, business, marketing and others all produce data that is not always consumed.