Rare Has Been Developing Video Games for More Than Three Decades
Rare has been developing video games for more than three decades. From Jetpac on the ZX Spectrum to Donkey Kong Country on the Super Nintendo to Viva Piñata on Xbox 360, they are known for creating games that are played and loved by millions. Their latest title, Sea of Thieves, available on both Xbox One and PC via Xbox Play Anywhere, uses Azure in a variety of interesting and unique ways. In this document, we will explore the Azure services and cloud architecture choices made by Rare to bring Sea of Thieves to life.

Sea of Thieves is a “Shared World Adventure Game”. Players take on the role of a pirate, with other live players (friends or otherwise) as part of their crew. The shared world is filled with other live players also sailing around in their ships with their crews. Players can go out and tackle quests or play however they wish in this sandboxed environment, and will eventually come across other players doing the same. When another crew is discovered, a player can choose to attack, team up to accomplish a task, or just float on by.

Creating and maintaining this seamless world with thousands of players is quite a challenge. Below is a simple architecture diagram showing some of the pieces that drive the backend services of Sea of Thieves.

PlayFab Multiplayer Servers

PlayFab Multiplayer Servers (formerly Thunderhead) allows developers to host a dynamically scaling pool of custom game servers using Azure. This service can host something as simple as a standalone executable, or something more complicated like a complete container image. These servers are located in Azure data centers around the world and can be easily configured to run in the geographies that matter for any game.

Rare uses this PlayFab service to host the persistent game worlds in Sea of Thieves. As players join the game and choose their crew, they are placed on a server with other players.
Once the maximum number of players is reached on a server, the next set of players is assigned to the next available server in the pool, and so on. As players finish their play sessions, they drain off these servers in a fragmented fashion, potentially leaving very sparsely populated servers. This is a negative for two reasons: 1) players may be alone in their version of the world, and 2) the developer is paying for many more servers than they need.

Rare solved this problem for Sea of Thieves by creating a custom game manager which can seamlessly move players and their state off one server and onto another, allowing the player to join a world packed with other players and maintaining an engaging experience. The empty server can then be decommissioned, saving money.

To ensure this server migration is completely seamless, Sea of Thieves uses a deterministic simulation to synchronize all worlds on all servers. All servers are instantiated using the same “seed” value, so each simulated world on each server will be in an identical state to all others, including time of day, location and size of waves, and locations and quantity of objects. Moving a player to a different server therefore won’t be a jarring experience: if a boat is at the top of a wave on “Server 1”, then when it is moved to the identical x/y/z coordinates on “Server 2”, that wave will be in the same place.

After release, Rare determined that there were key times of the day when play was most active, and times when play was least active. Using PlayFab Multiplayer Servers, they have been able to scale available servers based on demand for the game.

Azure’s geographically distributed data centers are another important part of what makes Sea of Thieves work so well. Shortly after launch, Rare discovered that they had a large community of players in South America.
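
The seed-based determinism described earlier in this section can be illustrated with a minimal Python sketch. It is not Rare's implementation: the `OceanSimulation` class, the sum-of-sines wave model, the parameter ranges, and the seed value are all assumptions made for illustration, and the sketch further assumes that every server agrees on a shared world clock.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Wave:
    amplitude: float   # metres
    wavelength: float  # metres
    speed: float       # metres per second
    direction: float   # radians


class OceanSimulation:
    """Hypothetical stand-in for the world simulation running on one server."""

    def __init__(self, seed: int, wave_count: int = 4) -> None:
        # A private generator seeded explicitly, so the result does not
        # depend on any global random state.
        rng = random.Random(seed)
        self.waves = [
            Wave(
                amplitude=rng.uniform(0.2, 1.5),
                wavelength=rng.uniform(10.0, 60.0),
                speed=rng.uniform(1.0, 6.0),
                direction=rng.uniform(0.0, 2.0 * math.pi),
            )
            for _ in range(wave_count)
        ]

    def wave_height(self, x: float, z: float, world_time: float) -> float:
        """Sea surface height at (x, z) for a given shared world time."""
        height = 0.0
        for w in self.waves:
            along = x * math.cos(w.direction) + z * math.sin(w.direction)
            phase = 2.0 * math.pi * (along - w.speed * world_time) / w.wavelength
            height += w.amplitude * math.sin(phase)
        return height


SHARED_SEED = 1234  # assumed value; every server would start from the same seed
server_1 = OceanSimulation(SHARED_SEED)
server_2 = OceanSimulation(SHARED_SEED)

# A ship moved from server 1 to the same coordinates on server 2
# finds the sea in exactly the same state.
position = (120.0, -45.5)
world_time = 3600.0
same_sea = (
    server_1.wave_height(*position, world_time)
    == server_2.wave_height(*position, world_time)
)
```

In this sketch `wave_height` depends only on the seed, the position, and the world time, so a migration only has to carry the player's own state; the sea itself never needs to be copied between servers.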
Because PlayFab Multiplayer Servers are available around the world, Rare was able to quickly stand up game servers in Brazil, giving players in South America a much better experience with improved performance and reduced lag.

Service Fabric

In addition to PlayFab Multiplayer Servers, Sea of Thieves makes use of Service Fabric. Service Fabric lets developers easily orchestrate scalable microservices and containers. Rare declares what the service infrastructure should look like, including the number of instances of each microservice that should be provisioned, and Service Fabric ensures that this declared setup is maintained through errors or other events.

While the multiplayer servers described earlier maintain the state of the game world for the current gaming session, the microservices running on Service Fabric maintain the long-term state of player progression, perform matchmaking, handle commerce functionality, and carry out a host of other tasks. In the diagram above, the Client Bridge, Server Bridge, Server Management and Player Progression services live in the same Service Fabric cluster and are partitioned into 3 scale sets: Client Bridge and Server Bridge each have their own, and Player Progression and Server Management share a set.

Service Fabric also easily handles scaling. If more or fewer instances of a service are needed during peak and low times, Service Fabric automatically scales the services across available nodes, so that Rare meets player demand without paying for unused capacity.

Game Updates and Configuration

With Sea of Thieves, both the client and server code will need to be updated throughout the lifetime of the title. Client deployments are handled through the Xbox and Windows Store processes; however, the server-side code can be updated far more often with little to no impact on the players in the game. With Service Fabric, microservices can be deployed and updated in place, minimizing player interruption.
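
The Service Fabric behaviour described in the previous section (a declared number of instances per microservice, maintained through failures) can be pictured as a reconciliation step. The Python sketch below is only an illustration of that idea and does not use the Service Fabric API; the `reconcile` function and the instance counts are hypothetical, and only the service names come from the article.

```python
from collections import Counter

# Declared setup: service name -> instance count.
# The counts are invented for illustration.
DESIRED = {
    "ClientBridge": 3,
    "ServerBridge": 3,
    "PlayerProgression": 2,
    "ServerManagement": 2,
}


def reconcile(desired: dict, running: list) -> list:
    """Return the (action, service) steps that restore the declared setup."""
    have = Counter(running)
    actions = []
    for service, want in desired.items():
        diff = want - have[service]
        if diff > 0:
            actions.extend([("start", service)] * diff)
        elif diff < 0:
            actions.extend([("stop", service)] * -diff)
    return actions


# Suppose a node failure took out one ClientBridge and one
# PlayerProgression instance.
running_now = (
    ["ClientBridge"] * 2
    + ["ServerBridge"] * 3
    + ["PlayerProgression"] * 1
    + ["ServerManagement"] * 2
)
repairs = reconcile(DESIRED, running_now)
```

Scaling fits the same model in this sketch: raising or lowering a count in the declared setup makes the next reconciliation pass start or stop instances to match.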
Rare’s CI/CD pipeline can go from “code checked in” to “deployed and running on Service Fabric” in under 20 minutes. For the servers running the game simulation on PlayFab Multiplayer Servers, the host binaries can be redeployed so that all newly provisioned servers run the new binaries; players whose sessions are already in progress are not affected and remain on servers running the old binaries until their session is complete.

The server side of Sea of Thieves uses a system of “configuration flags” to quickly enable and disable features without having to redeploy the client or server. This has come in handy quite a few times. For example, Rare discovered that during one new quest, there was a game-breaking bug which occurred when a player handed another player a banana underwater. They were able to quickly mitigate this issue by flipping a configuration entry on the server, which disabled the feature until they were able to patch the problem properly in the next release.

Development and Testing

During the development and testing phase, it was very easy for Rare to stand up fully configured server setups using Azure Resource Manager (ARM) templates. ARM templates allow a developer to fully define the configuration of all Azure services using a JSON document. Rare could use these templates to stand up brand-new development environments per developer, or to set up a test environment during the various stages of their internal, invited, and open beta periods.

Telemetry and Analytics

The Sea of Thieves client running on Xbox or PC sends telemetry data to one of the many microservices on Service Fabric for processing. This data contains crash and error information, details around frame rate and performance, and other bits of information that can help Rare fix bugs and fine-tune the game for optimal play. This data is received by RabbitMQ instances running on simple Azure Virtual Machines. RabbitMQ is a third-party, open-source message broker.
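
The hand-off described above, in which clients send telemetry and a broker holds it until backend services pick it up, can be sketched in a few lines of Python. This is an assumption-laden illustration rather than Rare's code: an in-process `queue.Queue` stands in for RabbitMQ, and the `TelemetryEvent` fields and the `publish` and `drain` helpers are hypothetical names.

```python
import json
import queue
from dataclasses import asdict, dataclass


@dataclass
class TelemetryEvent:
    # Field names are illustrative; the article only names the kinds of data sent.
    kind: str       # e.g. "crash", "error", "performance"
    platform: str   # "xbox" or "pc"
    payload: dict


# In-process stand-in for a broker queue (RabbitMQ in the article).
broker = queue.Queue()


def publish(event: TelemetryEvent) -> None:
    """Client side: serialize the event and hand it to the broker."""
    broker.put(json.dumps(asdict(event)).encode("utf-8"))


def drain() -> list:
    """Backend side: pull every queued message and decode it for processing."""
    events = []
    while True:
        try:
            raw = broker.get_nowait()
        except queue.Empty:
            return events
        events.append(json.loads(raw))


publish(TelemetryEvent("performance", "pc", {"fps": 58.5}))
publish(TelemetryEvent("crash", "xbox", {"message": "example crash report"}))
received = drain()
```

The point of the broker in this sketch is decoupling: the client returns as soon as the message is queued, and the backend consumes at its own pace.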
The telemetry messages are then routed through to the backend services, which exist in a Service Fabric cluster of their own, as shown in the diagram above.

To view this data in meaningful ways, Rare uses Azure Data Explorer, which provides a real-time view of the data for immediate ad-hoc reporting and dashboarding. Azure Data Explorer is a fast and scalable data exploration tool for log and telemetry data. Using its custom query language, users can create ad-hoc queries that run against huge data stores and return data quickly in tabular formats as well as visual formats like bar and line graphs.

The data is also consumed by an Azure Data Factory pipeline, which pushes it into Azure SQL Data Warehouse, a relational database with massively parallel processing, for cold storage. From here, Rare can aggregate the data from the lifetime of the game and do long-term reporting using standard SQL queries and other reporting tools.

Conclusion

Without the cloud, a game like Sea of Thieves would be practically impossible to build, run, and scale. With the ability to rapidly create, deploy, and scale services throughout the development and production lifecycle of the project, there is no need to purchase and maintain servers, estimate server sizes and quantities, upgrade hardware to handle spikes in usage, or see that hardware sit idle during periods of low usage. Azure handles these situations through the configuration of services like PlayFab Multiplayer Servers and Service Fabric. Azure also allows Rare to acquire, query, and make sense of millions of pieces of telemetry data, which they can use to update their game to improve performance, fix bugs, and add new features to keep players engaged.

For more information on the items in this article, please see the following:

Azure PlayFab docs
Azure Service Fabric docs
Azure Data Factory docs
Azure Data Explorer docs
Azure Virtual Machines docs

This document is for informational purposes only.