'Scary' clickstream data boosts News Corp readership, ad sales

After a few false starts, News Corp's efforts to deepen its engagement with customers are finally paying off – but it has required the accumulation of what data services manager James Hartwright calls "scary" amounts of clickstream data.
Written by David Braue, Contributor

After several false starts, News Corp is finally keeping readers and advertisers more engaged as it uses advanced analytics tools to comb through masses of clickstream data and tailor content delivery around customer interests.

Having managed the growth from a half-terabyte Oracle database to a 40TB DB2 database as the company's online efforts intensified over the years, data services manager James Hartwright told attendees at the IBM Solutions Connect conference in Melbourne that the current run rate – clickstream data is accumulating at around 100 GB per day – had become “scary”.

Declining newspaper readership pushed News Corp to build an analytics-driven model for better online reader engagement. Image: CC BY-SA 3.0 C'est lancé

With readers accessing News Corp content both online and through mobile apps, the volume of clickstream data that the company was collecting “is quite amazing,” Hartwright said. “The news.com.au site has around 20 million devices interacting with us every month, equating to maybe 10 million or 5 million people depending on how many devices they have.”

The company recognised some time ago that better analysis of that data could be used to improve customer retention, reading time and other key metrics across the company's 150 brands – but had previously run into technical challenges as it tried to find the right way to do so.

It took a few tries before the company found the right level of data that its users were comfortable providing. When it initially began requiring registrations to access its content, users often provided false information or insulting comments; over time the company realised all it really needed to start with was a name, suburb and email address.

Building on that information, over the past six or seven months the company has used a range of analytical tools to develop a more meaningful understanding of the way readers consume its content. This included building a streamlined data flow that feeds online and offline data through a series of processes to ensure data quality, link it with other behaviour, protect customer privacy, and then run analytics to identify key trends.

“Once we do that we can enrich our [content offering]”, Hartwright said. “We're getting a fuller view of what that person is, and how we can tailor what we provide back to him. But we have to make sure we don't fall into that salacious trap of putting in new products and asking too much.”

The end result is a single source of truth that has not only helped the company move away from its siloed technology legacy, but has given it unprecedented visibility of its customers and their media-consumption preferences.

“Once we get those consistent information flows in place, we can start turning off all those old systems,” Hartwright said. “We've reduced the amount of information moving around the organisation – especially when that's personal information that we need to keep a good, tight control on.”

Better customer knowledge, in turn, has helped drive strategic marketing and content-building efforts in and between the company's 150 media brands.

“There's less gut feel and no more of this 'I think' or 'it seems',” Hartwright said, noting that the maturity of today's online technologies had finally allowed the company to get the level of control, interaction and rapid development that had been lacking in previous customer-retention efforts.

“We've gone through four iterations of this trying to build enterprise data warehouses,” he said. “We'd consider how to do the design and the content, then three years later we'd say 'we're ready to take the content' – but in those three years the business had moved on.”

Rapid maturation of Web-based content management and analytics tools, however, had changed this.

“The industry has matured and the price of software and hardware dropped,” he explained. “The software has open APIs that have been very helpful: you get data interchange and add-ons, and can push data out and in much more coherently than we could have done three or four years ago.”

This has translated into increased readership, with daily newsletters seeing a 10 percent increase in clicks and 10 percent more articles read per session. Users viewing The Australian's content through its iPad app are spending twice as long reading content, and reading more articles per session.

This success, in turn, is driving “strong” advertising growth as the company's various business units identify new opportunities for targeted marketing.

It hasn't all been smooth sailing, however: the journey to build that single source of truth necessitated a degree of compromise from business units that had long been possessive of their own customer data – using it to secure ads, often at the expense of other News Corp titles that were competing for the same readers in the same physical regions.

The shift towards high-volume analytics therefore required a significant change in culture and process, with the establishment of common consumer targeting methodologies and procedures for ensuring consistent policies around privacy and other issues.

“Getting that common privacy set working across the whole information flow means that stumbling across the privacy issue isn't going to be an issue for us,” Hartwright said.

“Before, we had external tracking through behaviour. Now we have internal tracking. We have lots of data; we just needed to pull it together – and we needed to put that privacy piece around that before we push it out to other areas of the organisation.”
