Fast-forward a little over 10 years and the Steam Store is huge, ubiquitous as the home of PC gaming and distribution. Whilst physical copies still just about feel at home on consoles, the PC market has long since moved digital.

In case you are not familiar, Steam is a digital store for purchasing, downloading and playing video games. It hosts a variety of community features, allows pushing game updates to users automatically, and gathers news stories relevant to each title. It's a bit like Google's Play Store or Apple's App Store for phones.

A large part of Steam's success as a platform is due to its use of frequent sales, convenience as a unified digital game library, and the aforementioned shift to digital over physical. Whilst other platforms are emerging and gaining traction, there is likely no better resource for examining gaming over the last decade. With that in mind, if we can construct a dataset from Steam's data, we will have access to a wealth of information about nearly 30,000 games released since 2003, when Steam first launched.

The motivation for this project is to download, process and analyse a dataset of Steam apps (games) from the Steam store, and gain insights into what makes a game more successful in terms of sales, play-time and ratings. We will imagine that we have been approached by a company hoping to develop and release a new title, using the findings we provide them to inform decisions about how best to manage their budget and hopefully increase the success of their next release.

The first step will be tackling data collection - the actual retrieval of data from Steam's servers and databases. In the future we'll look at cleaning the data, transforming it into a more useful state, then on to data exploration and analysis. Finally we'll summarise our findings in a non-technical report which would be sent to the fictional company in question.

At the end of the data collection and cleaning stages, we'd like to end up with a table or database like this:

name

We can then interrogate the data, and investigate whether particular attributes tend to result in more successful games. Metrics like ownership and ratings should help define the success of a title.

There are a number of ways to get this information. Obviously we could search the web (and especially Kaggle) for existing datasets; however, to avoid letting someone else get away with all that hard work (and mainly for the purposes of learning), we'll be acquiring all the data ourselves from scratch.

Often when generating data the best place to start is to check for APIs. Fortunately Valve (the company behind Steam) make one available. An API such as this allows anyone to interface with data on a website in a controlled way, usually providing a host of useful features to the end-user. Typically an API is a great way for developers to allow access to databases and information on a server. Unfortunately the official documentation doesn't include all access points, but others have documented these for us. This documentation of the StorefrontAPI will be particularly useful.

We'll be able to get good information about the details of each game from the Steam API; however, we're still missing information about popularity and sales. Luckily we can easily get this data from another website, SteamSpy. SteamSpy is a Steam stats-gathering service and crucially has data easily available through its own API (documentation here). It provides a number of useful metrics including an estimation for total owners of each game.

We'll be retrieving data from both APIs and combining them to form our dataset.
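
The plan of taking game details from the Steam API and popularity figures from SteamSpy, then combining the two, can be sketched roughly as below. This is only an illustration and not the code used in this project: the function name `combine_by_appid`, the shared `appid` key and every field and value in the sample rows are assumptions made up for the example, standing in for whatever the two APIs actually return once parsed.

```python
def combine_by_appid(steam_rows, steamspy_rows):
    """Join two lists of per-game dicts on a shared 'appid' key.

    Assumption: both sources expose the same numeric app ID under
    the key 'appid'. The real responses may name it differently.
    """
    # Index the SteamSpy rows by app ID so each lookup is a dict access.
    spy_by_id = {row["appid"]: row for row in steamspy_rows}

    combined = []
    for row in steam_rows:
        extra = spy_by_id.get(row["appid"])
        if extra is None:
            # Keep only games that appear in both sources.
            continue
        merged = dict(extra)
        # Store details take precedence where keys overlap.
        merged.update(row)
        combined.append(merged)
    return combined


# Made-up sample rows standing in for parsed API responses.
steam_rows = [
    {"appid": 1, "name": "Example Game A", "genre": "Action"},
    {"appid": 2, "name": "Example Game B", "genre": "Strategy"},
]
steamspy_rows = [
    {"appid": 1, "owners": "placeholder-range", "positive": 10, "negative": 2},
]

dataset = combine_by_appid(steam_rows, steamspy_rows)
```

Dropping games that are missing from either source is just one possible choice; keeping them with empty fields would be equally reasonable, depending on how the later cleaning stage wants to treat gaps.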