605 TV: Developing Consumer Data Products at Petabyte Scale

Media

D&A Strategy

Modern Data Platform

Axis was instrumental in helping us execute on our vision. The standards and frameworks they piloted created a bridge across our teams and helped unify our processes.

Jiafu Xu, Director of Product Management, 605 TV

Meeting the Challenge

How does a data product company democratize data product development and unify its data strategy in a cloud-first world?

It's hard enough to design a data platform for a company that serves a few thousand customers from over a dozen data sources. But how do you architect a solution that effectively manages 30 million data producers that generate hundreds of millions of data points each day? Data products are hard to build, maintain, and scale.

When you're using Big Data as an input to your product, the challenges can seem insurmountable.

But it doesn't have to be that way.

This case study highlights how Axis Group partnered with 605 TV to design and construct a scalable, sustainable data platform capable of serving the needs of a range of 605 stakeholders as they work with data at scale.

605 is a next-generation measurement and attribution company that delivers the fastest, most robust insights in the industry. By tapping into cross-platform data from set-top boxes (STBs), smart TVs (ACR), and DVRs, 605 captures granular data from over 34 million households, helping its clients analyze viewership behavior at second-by-second granularity.

The volume and velocity of 605's data are breathtaking: more than 2.5 billion ad exposure and behavioral events arrive every day, feeding a data estate of more than 6 petabytes.

This adds up to more data in a single day than many companies generate in a year.

Ultimately, the data helps 605's clients make informed decisions and quantify how effectively they are driving real business outcomes across the media lifecycle.

Once your data reaches petabyte scale, every aspect of your data operations and architecture becomes substantially more challenging to manage, calling for specialized architectures, processes, tooling, and expertise. Data platforms of this scale require a complex web of components to handle ingestion, storage, and processing. And through it all, modern systems still require sound, traditional data management practices to maintain appropriate performance, security, and compliance.

605 had a vision to design a data platform and data lifecycle that adapted as fast as their product team and let them stay flexible in an incredibly fast-moving market.

That's when they called Axis Group.

Our Solution

With a cloud-first architecture and a curated reference data management model, Axis helped 605 democratize the development of their complex data products.

Working hand in hand, 605 and Axis Group designed a way to bring development teams together under a unified data platform strategy, with universal standards and scalable frameworks that could be readily understood and adopted.

605 and Axis agreed that 605's cloud-based data ecosystem needed to be enhanced and then supported by standards, practices, and frameworks that every team could readily use to get the data they needed.

605 relied on Databricks as a flexible infrastructure for running its data pipelines, since it provides a rich set of features and integrations and enables development teams to deliver product updates quickly while maintaining high-quality output.

Axis started from the ground up by developing a custom framework to help the team better standardize, modularize, and extend their code. This laid a solid foundation to build upon, letting 605 develop business logic while Axis focused on the plumbing. Axis then developed and facilitated a plan to migrate 605's core ETL code into the framework and created a modern testing methodology designed to accelerate development and improve quality.
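To make the idea of a standardized, modular, testable ETL framework concrete, here is a minimal sketch in Python. It is not the framework Axis built for 605; the step registry, the step names (drop_short_views, tag_source), and the record fields are all hypothetical and chosen only for illustration. The pattern it shows is that each transformation is a small named function, so business logic stays separate from the code that wires steps together.

```python
from typing import Any, Callable, Dict, List

# Hypothetical record and step types, for illustration only.
Record = Dict[str, Any]
Step = Callable[[List[Record]], List[Record]]

# Central registry: every team registers steps the same way.
_STEPS: Dict[str, Step] = {}


def step(name: str) -> Callable[[Step], Step]:
    """Register a transformation under a unique name."""
    def decorator(func: Step) -> Step:
        if name in _STEPS:
            raise ValueError(f"step already registered: {name}")
        _STEPS[name] = func
        return func
    return decorator


@step("drop_short_views")
def drop_short_views(records: List[Record]) -> List[Record]:
    # Business logic: keep only viewing events of two seconds or more.
    return [r for r in records if r["seconds"] >= 2]


@step("tag_source")
def tag_source(records: List[Record]) -> List[Record]:
    # Business logic: make sure every record carries a source label.
    return [{**r, "source": r.get("source", "unknown")} for r in records]


def run_pipeline(step_names: List[str], records: List[Record]) -> List[Record]:
    """Plumbing: apply the named steps in order."""
    for name in step_names:
        records = _STEPS[name](records)
    return records
```

Because each step is a plain function over in-memory records, it can be unit tested on a handful of rows without a cluster, which is one common way a testing methodology speeds up development.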

The next step was to rationalize and consolidate the framework so it could be managed centrally. Here, the teams opted for AWS-managed service offerings, since they automate away much of the overhead normally associated with such systems. On the road to production, Axis also worked alongside 605’s agile development team to develop coding best practices and standards, including developer workflows, sharable libraries, modern CI/CD practices, and Infrastructure-as-Code methods.

One other area of great interest to 605 was how to manage its reference data—dimensional information needed to enrich their facts. While reference data management (RDM) is often treated as an afterthought, at petabyte scale it becomes its own unique engineering challenge beyond the capabilities of traditional RDM tools: how to maintain accuracy and consistency while still processing timely changes across billions of records.

RDM needs are magnified in the media viewership space, where a change to a single attribute for a TV station can cascade into modifications of millions of viewership records. 605 needed a robust system to track slowly changing dimensions and the post-commit lineage of any modifications made to their reference data. To meet these requirements, Axis developed a custom RDM solution, using Python, Airflow, and MariaDB, that lets even junior analysts update reference data and examine its impact at scale, all using everyday tools like MS Excel.
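For readers unfamiliar with slowly changing dimensions, the sketch below shows one common way to track them, often called Type 2: instead of overwriting an attribute, the current row is closed with an end date and a new row is appended. This is a minimal, in-memory illustration only. It does not describe how 605's RDM solution is implemented, and the schema (station_id, call_sign, valid_from, valid_to) is an assumption made for the example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class StationVersion:
    """One version of a station's reference record (hypothetical schema)."""
    station_id: str
    call_sign: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(history: List[StationVersion], station_id: str,
                 new_call_sign: str, effective: date) -> List[StationVersion]:
    """Close the station's current version and append a new one.

    Only handles changes to a station that already has a current version;
    an unchanged call sign leaves the history as it was.
    """
    updated = []
    changed = False
    for row in history:
        is_current = row.station_id == station_id and row.valid_to is None
        if is_current and row.call_sign != new_call_sign:
            updated.append(replace(row, valid_to=effective))
            changed = True
        else:
            updated.append(row)
    if changed:
        updated.append(StationVersion(station_id, new_call_sign, effective))
    return updated


def as_of(history: List[StationVersion], station_id: str,
          day: date) -> Optional[StationVersion]:
    """Return the version of a station that was valid on a given day."""
    for row in history:
        if row.station_id != station_id:
            continue
        if row.valid_from <= day and (row.valid_to is None or day < row.valid_to):
            return row
    return None
```

Keeping every version makes it possible to join a viewership record to the station attributes that were valid when the viewing happened, and to see which records a given change would affect.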

Together, the changes to architecture, data models and tooling—along with Axis's mentorship and enablement—helped democratize 605’s development of data products and created an entirely new way to do business. 605’s data teams are now unified under one set of data and a single set of standards—"one true framework to rule them all"—helping the company accelerate its roadmap for the future.