Running a programmatic exchange requires PubMatic to process around 800 billion bids a day and 80 million bids a second – which generates 100 terabytes of compressed data per day.
And those volumes have only risen as header bidding multiplies the number of daily bid requests.
Analyzing that data in real time, instead of a day later, requires massive investments in infrastructure, time and overhead.
PubMatic reduced all three of those by working with data processing engine Wallaroo Labs.
PubMatic now uses just 20% of the infrastructure and half the operational overhead it would need with another data processing system, Apache Spark Streaming. It also went to market in half the time, and realized these results shortly after it started working with Wallaroo Labs six months ago, in Q2.
“Ad tech pushes the boundaries of big data and its scale requirements. On top of that, the margin structure requires you to do it very efficiently. It’s super important if you want to run a solid business,” said PubMatic founder and Chief Growth Officer Amar Goel.
PubMatic can now look at timeout rates, DSP response times and latency when connecting to a publisher all in real time, and make throttling decisions.
“A lot of these issues are transient. You want to look at operational metrics in real time, not looking at what timeouts were yesterday,” Goel said.
The stakes are high: “If you’re timing out with a DSP, you could be hurting their systems, hanging up our own systems – and meanwhile revenue is going down for publishers and buyers can’t achieve their goals,” Goel said.
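To make the idea concrete, here is a minimal Python sketch of how a real-time timeout check for a single DSP might look. It is illustrative only and is not PubMatic's or Wallaroo Labs' implementation: the `TimeoutMonitor` class, the 60-second window and the 20% throttle threshold are all assumptions chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TimeoutMonitor:
    """Sliding-window timeout rate for one DSP (hypothetical helper)."""

    window_seconds: float = 60.0      # assumed look-back window
    throttle_threshold: float = 0.2   # assumed: throttle above 20% timeouts
    _events: deque = field(default_factory=deque)  # (timestamp, timed_out)
    _timeouts: int = 0

    def record(self, timestamp: float, timed_out: bool) -> None:
        """Add one bid response and drop anything older than the window."""
        self._events.append((timestamp, timed_out))
        if timed_out:
            self._timeouts += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Remove events that have aged out, keeping the timeout count in sync.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, was_timeout = self._events.popleft()
            if was_timeout:
                self._timeouts -= 1

    def timeout_rate(self) -> float:
        """Fraction of responses in the current window that timed out."""
        return self._timeouts / len(self._events) if self._events else 0.0

    def should_throttle(self) -> bool:
        """True when the recent timeout rate exceeds the threshold."""
        return self.timeout_rate() > self.throttle_threshold


# Usage: half of the last ten responses timed out, so the DSP is throttled.
monitor = TimeoutMonitor(window_seconds=60, throttle_threshold=0.2)
for second in range(10):
    monitor.record(float(second), timed_out=(second % 2 == 0))
print(monitor.timeout_rate(), monitor.should_throttle())
```

Because old events fall out of the window, a transient spike stops affecting the decision once it passes, which is the difference between watching operational metrics live and reading yesterday's report.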
The efficiency of the new data stream makes it easier for PubMatic to pipe that data into its other systems. While PubMatic already offered real-time reporting to publishers and buyers, the new internal data processing technology does so more efficiently and at a lower cost, so PubMatic will eventually rewire its publisher and buyer systems based on these internal monitoring tools.
Wallaroo Labs was formed to help companies analyze high-volume, high-speed transactions in real time. Its founder and CEO, Vid Jain, spent a decade at a large bank working on solving data challenges around algorithmic trading.
When he left, he wanted to see if he could solve a similar problem for companies in other verticals – like programmatic exchanges – that could use this data to improve business outcomes.
“If you’re Google, Twitter, Goldman Sachs, NASDAQ, you can afford to develop and maintain the capability we offer. Our goal is to democratize this for the other 95% of the world,” Jain said.
Wallaroo Labs finds out what problems its customers want to solve, and then customizes its tech for each client. Clients pay a managed services fee to maintain it.
Companies that don’t currently have these real-time data processing capabilities either wait until the following day to process their data or use only a fraction of it.
“Most businesses are using less than 10% of their data,” Jain said. “But when you can understand insights from your data as that data is being generated, you can act on it quickly – which opens up new opportunities to drive your business.”
After raising a $3 million seed round in summer 2018, Wallaroo Labs has just started going to market, focusing first on digital marketing and internet of things companies.
PubMatic is one of its first three programmatic customers. “They were willing to take a chance on a small startup,” Jain said. “And now we’ve stress-tested what we built with a company that’s one of the tip-top in terms of volume.”