8/23/2020 Sky Case Study | Google Cloud

Sky: Scaling for success with Sky Q diagnostics

Using managed services on Google Cloud Platform, Sky replaces its on-premises data platform in record time to meet the increased need of its next-generation Sky Q box, in millions of homes.

About Sky

Based in London, Sky is one of Europe’s leading media and telecommunications companies, operating in the UK, Ireland, Germany, Austria, Italy, and Spain.

Industries: Media & Entertainment
Location: United Kingdom

Products: Cloud Pub/Sub (https://cloud.google.com/pubsub/), Operations (https://cloud.google.com/products/operations), Cloud Dataflow (https://cloud.google.com/dataflow/), BigQuery (https://cloud.google.com/bigquery/), Cloud Storage (https://cloud.google.com/storage/), Cloud IoT Core (https://cloud.google.com/iot-core/)

About Datatonic

Google Cloud Premier Partner Datatonic delivers big data and machine learning solutions for telecommunication, media, retail, and finance clients.

Google Cloud Results

Zero diagnostic data loss from millions of Sky Q boxes
Replaces out-of-capacity on-premises platform in only six weeks
Ability to capture diagnostic data from all Sky Q boxes to match demand with no additional DevOps
Establishes a hub for all diagnostic data used to enhance the Sky Q customer experience

https://cloud.google.com/customers/sky-uk/

Sky (https://www.sky.com/) is one of Europe’s leading media and communications companies, providing Sky TV, streaming, mobile TV, broadband, talk, and line rental services to millions of customers in seven countries. Delivering customer service at such scale is a major challenge, so to help ensure the best possible user experience, Sky collects diagnostic data from its millions of TV boxes, ready for analysis, insight, and action to help ensure service uptime and delivery.

For many years, that meant gathering data on an old Hadoop cluster, as Oliver Tweedie, Director of Data Engineering at Sky, explains: “Sky’s Hadoop cluster was built in 2013 to the specifications of its time, but things moved on fast, both in terms of diagnostics data volumes and in what companies want to do with data. With the introduction of the new Sky Q boxes, we started to see bottlenecks in the diagnostic data collection setup.”

Those bottlenecks had serious consequences, leading to processing backlogs of up to 50 percent of daily data, which compromised feedback and limited the usability of the entire dataset. Sky looked for a solution that could not only handle the data volumes but also collect additional usage information to create a rich dataset for analysis.

“By collecting diagnostic data, we can create an essential feedback loop from inside the home,” says Oliver. “The data will sit right at the heart of Sky's future strategy. It will help ensure that our products are intuitive and easy to use and that we can keep seamlessly connecting customers with the content and services they know and love.”

Cost-effective scaling with zero DevOps

On-premises infrastructure can create problems for businesses looking to put big data at the heart of company strategy. Providing for peaks may mean maintaining idle servers, while failing to meet those peaks can cause data loss that compromises entire datasets. At the same time, increasingly diverse data processing possibilities add to a DevOps burden that may make solutions difficult to scale.

For Sky, an increased flow of diagnostic data from its TV boxes ran up against the limitations of its existing, on-premises Hadoop cluster. “As we rolled out Sky Q, we had more traffic and more diagnostic metrics from the TV boxes,” says Oliver. “The on-premises infrastructure was struggling to meet increased demand. We were chasing our tails to fix bottlenecks in the network stack as they emerged. Up to 50 percent of the data on any given day was held up waiting to be processed. Because the boxes would report back in bursts, we would get spikes in flows of data. That made the biggest problem one of infrastructure scale.”
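To make the effect of bursts concrete, the following is a minimal sketch of a queue with fixed processing capacity. The traffic figures and capacities are invented for illustration and are not Sky's numbers; the point is only that a capacity above the average arrival rate can still leave a backlog when arrivals come in spikes.

```python
def end_of_day_backlog(arrivals, capacity_per_slot):
    """Return how many events are still queued after the last time slot."""
    backlog = 0
    for arrived in arrivals:
        backlog += arrived
        # Process as much of the queue as this slot's capacity allows.
        backlog -= min(backlog, capacity_per_slot)
    return backlog


# Hypothetical traffic: quiet slots punctuated by two reporting bursts.
arrivals = [10, 10, 200, 10, 10, 200, 10, 10]
total = sum(arrivals)            # 460 events
average = total / len(arrivals)  # 57.5 events per slot

# Fixed capacity slightly above the average still falls behind the bursts.
fixed = end_of_day_backlog(arrivals, capacity_per_slot=60)

# Capacity sized for the largest burst (what autoscaling approximates).
elastic = end_of_day_backlog(arrivals, capacity_per_slot=max(arrivals))

print(f"fixed capacity backlog: {fixed} ({fixed / total:.0%} of the day's data)")
print(f"elastic capacity backlog: {elastic}")
```

With a fixed on-premises cluster, closing that gap means provisioning for the largest burst and leaving servers idle the rest of the time, which is the trade-off described above.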

As data reliability became a critical issue, Sky began mirroring diagnostic data with another cloud provider. “Without diagnostic data, we cannot assess service quality,” explains Oliver. “Sky Q is a sophisticated system that relies on several features working well, so if some of our boxes are invisible to us, it means we are blind to the customers’ experience.” However, mirroring the data proved an expensive workaround rather than a true solution. Sky looked to create a new, cloud-based architecture that could scale, with the potential to handle not only diagnostics from millions of Sky Q boxes, but data from all Sky products to work towards improving the entire Sky experience.

To do that, Sky worked with Google Cloud Premier Partner Datatonic (https://datatonic.com/) to create a solution on Google Cloud Platform (https://cloud.google.com/) (GCP). By landing diagnostic data from set-top boxes directly in Cloud Pub/Sub (https://cloud.google.com/pubsub/docs/), Sky eliminates data loss caused by bottlenecks in server capacity. Data is then parsed through Cloud Dataflow (https://cloud.google.com/dataflow/) to Cloud Storage (https://cloud.google.com/storage/) and BigQuery (https://cloud.google.com/bigquery/), monitored on its way by Stackdriver (https://cloud.google.com/stackdriver/), which triggers email and Slack alerts should issues occur.
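The shape of that flow can be sketched in plain Python. This is an illustration only, not Sky's or Datatonic's code and not the Google Cloud client libraries: the event fields (`box_id`, `metric`, `value`) are hypothetical, and in-memory lists stand in for Cloud Storage and BigQuery. It shows the three roles the paragraph describes: archive every raw message, load parsed events into a queryable table, and count failures so that monitoring has something to alert on.

```python
import json


def parse_event(message: bytes):
    """Parse one raw message; return a dict, or None if it is malformed."""
    try:
        event = json.loads(message.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    # "box_id" is a hypothetical required field, assumed for this sketch.
    if not isinstance(event, dict) or "box_id" not in event:
        return None
    return event


def run_pipeline(messages):
    raw_archive = []     # stands in for Cloud Storage (raw events)
    warehouse_rows = []  # stands in for a BigQuery table (parsed events)
    malformed = 0        # a counter a monitoring alert could watch
    for message in messages:
        raw_archive.append(message)  # keep every message, even bad ones
        event = parse_event(message)
        if event is None:
            malformed += 1
            continue
        warehouse_rows.append(
            {
                "box_id": event["box_id"],
                "metric": event.get("metric"),
                "value": event.get("value"),
            }
        )
    return raw_archive, warehouse_rows, malformed


messages = [
    b'{"box_id": "a1", "metric": "cpu", "value": 0.5}',
    b"not json",
    b'{"metric": "missing box id"}',
]
raw_archive, warehouse_rows, malformed = run_pipeline(messages)
print(len(raw_archive), len(warehouse_rows), malformed)
```

Archiving the raw message before parsing means a parsing bug never costs data: the events can be replayed from the archive once the parser is fixed.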

“Sky publishes between 200 and 300 million events per day, and up to 600 million at peak, with that number doubling next year,” says Louis Decuypere, Founder at Datatonic. “Cloud Pub/Sub can handle that volume straight out of the box with no need for manual interference. You just publish your event and Pub/Sub makes sure it gets delivered. It's massively scalable and globally distributed, so Sky could launch the system in another country with very minimal additional setup required.”
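As a rough sense of scale, the daily figures quoted above can be converted to average events per second. This is back-of-the-envelope arithmetic only; because the boxes report in bursts, instantaneous rates will be well above these daily averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(events_per_day):
    """Average events per second if traffic were spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


typical_low = average_rate(200_000_000)
typical_high = average_rate(300_000_000)
peak_day = average_rate(600_000_000)
peak_day_doubled = average_rate(2 * 600_000_000)

print(f"typical day: {typical_low:,.0f} to {typical_high:,.0f} events/s")
print(f"peak day: {peak_day:,.0f} events/s")
print(f"peak day after doubling: {peak_day_doubled:,.0f} events/s")
```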

Because the entire pipeline is built with Google managed services, the solution scales automatically to match the peaks caused when set-top boxes report in bursts. In addition, should Sky choose to move away from batch processing in the future, the solution’s combination of Cloud Dataflow and Apache Beam (https://beam.apache.org/) will help ensure an almost seamless transition. “With changes to a couple of lines of code, we could use the same framework to switch from batch to real-time processing,” says Louis. “At this level, that's unique to this technology.”
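The idea behind that switch is that the transformation logic is written once and only the data source changes. The sketch below illustrates the principle in plain Python rather than the Apache Beam API; the `code` field and the `enrich` rule are hypothetical. The same `pipeline` function runs over a bounded list (batch) and over an endless generator (streaming).

```python
import itertools


def enrich(event):
    """Shared transform: flag events whose (hypothetical) code is an error."""
    return {**event, "is_error": event["code"] >= 500}


def pipeline(source):
    """Lazily apply the shared transform to any iterable source."""
    return (enrich(event) for event in source)


# Batch: a bounded source that is read to the end.
batch_source = [{"code": 200}, {"code": 503}]
batch_result = list(pipeline(batch_source))


# Streaming: an unbounded source; only the source definition changes.
def streaming_source():
    for n in itertools.count():
        yield {"code": 500 if n % 2 else 200}


# Take the first three results from a stream that never ends.
stream_head = list(itertools.islice(pipeline(streaming_source()), 3))

print(batch_result)
print(stream_head)
```

Because `pipeline` never asks how long its input is, nothing in the transform has to change when the source becomes unbounded, which is the property the quote attributes to the Dataflow and Beam combination.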

BigQuery lies at the heart of the solution. As well as supplying dashboards in Tableau, BigQuery acts as a hub for all of the collected data, making it available on a self-serve basis for a range of teams and tools. Both raw and enriched events are also stored in the near-infinite capacity of Cloud Storage, made especially cost-effective with the Coldline and Nearline (https://cloud.google.com/storage/docs/storage-classes) archival storage classes.

“This project went into production with less than three months’ development time, due to the serverless architecture and NoOps design,” says Louis. “Since the launch, there's been no data loss and no noteworthy incidents, and we continue to scale out to more set-top boxes and more countries without friction or rework.”

Forming a strategic relationship

Sky switched cloud provider when it moved from the data mirroring measure to the GCP solution. “With Google, we could see the opportunity for a true strategic relationship,” says Oliver. “We established a framework agreement and a Platinum support package. On our own, it wasn’t cost effective to build and maintain our own infrastructure. Technology was changing too quickly. Google Cloud has allowed us to capitalize on that technological change, and not worry about the scale of the infrastructure. When we had issues with the way data was ingested into Pub/Sub, we were put through to the Pub/Sub support team who came up with a resolution really quickly.” For a company of Sky’s scale, the proven performance of Google offered additional reassurance.

“Google uses a version of this architecture to process the diagnostic data that comes back from Android mobile phones,” says Oliver. “We’re using a technology that had been battle tested with that number of devices and on a global scale. It's mature technology that is now commoditized and opened up to the rest of the world.”

Collecting the data to drive decisions

It took Sky and Datatonic just six weeks to develop, test, and go live with the new solution. Since then, Sky reports all diagnostic data has been successfully collected from Sky Q boxes. Sky now plans to collect all of its diagnostic reporting on the new solution, either by expanding its current pipeline or by bringing Cloud IoT Core (https://cloud.google.com/iot-core/) into the architecture. By combining set-top box diagnostic and viewing data with streamed and batched information from reference feeds and other resources, Sky will create a data warehouse on BigQuery as a one-stop shop for all queries.

“This fully functioning Google Cloud Platform solution will act as a blueprint for future Sky projects,” says Oliver. “We can capture all diagnostic data in GCP and use it to inform our future strategy. Sky management sees this as the beginning of a new era in data management, analytics, and data science.”
