Big data can get the truth from Google

As so often happens, regulators are having a hard time keeping up with a market driven by transformational technology — search. There's one way to even the balance.
Written by Rupert Goodwins, Contributor

Google is in trouble, or so some say. With European and US regulators looking at whether the company promotes its own services unfairly in search results, and the threat of Android being classified as predatory, the chances of huge fines or detailed regulatory control are being talked up.

But take care over who's doing the talking. Google's competitors, who have either comprehensively failed to compete or are watching their market share erode, are more than happy to lead the assault from behind. Take press reports with plenty of salt, and check the credentials of any 'grassroots' organisations that claim to be working for a fairer market.

Which raises the question — how can we, and the regulators, actually decide whether Google is being unfair and abusing its power?

Google has certainly defined the market and changed the rules, and so far consumers haven't done badly out of it at all. It's also done some questionable things, ranging from snaffling the wrong sort of Wi-Fi information (a story that's never really made sense, except perhaps to illustrate a certain febrile chaos within Google's management structure) to copping a half-a-billion-dollar fine for advertising Canadian pharmaceuticals in the US, which perhaps illustrates a certain febrile chaos in the US drug regulation system.

Other things, like not properly managing Safari's do-not-track system and being flaky with Google Buzz's privacy settings, are more worrying, although they aren't prima facie evidence for the sort of systematic abuse that others claim. 

Hard to find evidence

But then, such evidence is going to be hard to find. Google says that its own services appear high up in its search rankings because they're extremely popular, rather than vice versa. Correlation is not causation.

In previous cases of proven monopoly abuse in IT, such as IBM with its control of the mainframe market in the 1970s and Microsoft's 1990s stranglehold on the PC desktop, the amount of data to consider and the complexity of the market were much less — and the judicial system still took so long to reach its conclusions that they were practically moot on arrival. That's before you factor in the way changes in political administration can destabilise the whole process: if Romney had won the November election, the FTC's ongoing Google case would have been parked. 

For regulation to work, it must be fast, accurate, unbiased and independent. Curiously, these are exactly the attributes Google claims for itself and its search system — and the key to working out whether it's being honest. For unlike questions such as "Is bundling a web browser with your monopoly OS perverting the market?" or "Is giving away a free, open mobile OS perverting the market?", questions of accurate search results are open to statistical analysis. 

Enter big data

If the world's regulators ran their own search engine with reliable analytics attached, then the actual behaviour of commercial products would be measurable and demonstrable. Bias would create signals that could be detected with big-data tools — and once in place, the same tools could find all sorts of other important trends.
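To see what such a signal might look like, here is a minimal, purely illustrative sketch. The domains, queries and threshold are all invented for the example; a real audit would compare thousands of queries between the audited engine and an independent reference index, and would need proper statistical testing rather than a single share comparison.

```python
# Illustrative sketch of self-preferencing detection in search rankings.
# All result lists and domain names below are synthetic placeholders.

def own_property_share(results, own_domains, top_k=10):
    """Fraction of the top-k results that belong to the engine's own domains."""
    top = results[:top_k]
    return sum(1 for url in top if any(d in url for d in own_domains)) / len(top)

# Hypothetical top-10 results for the same query from two engines.
audited = ["maps.example.com", "news.example.com", "shop.example.com",
           "rival-maps.net", "example.com/finance", "indie-news.org",
           "example.com/video", "rival-shop.biz", "example.com/books",
           "rival-video.tv"]
reference = ["rival-maps.net", "indie-news.org", "maps.example.com",
             "rival-shop.biz", "rival-video.tv", "news.example.com",
             "open-books.org", "price-compare.io", "shop.example.com",
             "weather-hub.net"]

own = ["example.com"]  # the audited engine's own properties

audited_share = own_property_share(audited, own)      # 0.6
reference_share = own_property_share(reference, own)  # 0.3

# A gap that persists across many queries is the kind of "signal"
# a regulator's analytics could flag for closer investigation.
print(f"audited: {audited_share:.1f}, reference: {reference_share:.1f}")
```

One query proves nothing — popularity alone could explain a single gap, as Google argues. The point is that aggregated over a large, transparent sample of queries, a systematic skew becomes measurable rather than a matter of competing press releases.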

If big data works for big companies, how much more important is it to government? Now, obviously the Federal Trade Commission, let alone Ofcom, can't be Google. But they don't have to be — without the need to serve the results to the teeming billions and run the entire commercial infrastructure to support advertising, the concept can be realised for less than the cost of one abortive high-tech military project. 

It's even possible, not to say probable, that such phantom Googles exist already within one or more three-letter government agencies; as with many Cold War top-secret projects, the cleansed data could be useful for civilian purposes.

But that would defeat one of the primary benefits of an independent search system: it could and should be transparent and vibrantly public. And secretive government agencies aren't necessarily free of their own sources of bias.

Search as a public service

An open, non-commercial search engine makes a lot of sense. Think of it like public-service broadcasting — or, if that sticks in the craw for ideological reasons, the big data equivalent of GPS, a state-run utility that provides such benefits for everyone that its existence is beyond political quibbling.

The public search engine would not only act as an essential check on the probity of the commercial organisations on which our entire global economy increasingly depends; it would also keep government and administrators data literate.

This would foster the essential skills without which it will soon be impossible to reach policy or implementation decisions, while opening up new ways to visualise the raw information that powers our lives — and threatens to disrupt them.

Nobody doubts the ability of technology to outrun the regulators. It should be used to help them learn how to keep up.
