Blinkx eyes up video search market

Q&A: Can a start-up corner the video search market? Suranga Chandratillake, Britain's answer to Google's Page and Brin, says Blinkx is doing a good job of applying enterprise tech to the consumer space

When Blinkx first appeared in 2004, claiming it had developed a better way of searching for information, many in the technology industry were excited. But less than two years on, the company has shifted its focus, moving away from text-based search and into video search.

This week the company announced its first revenue-generating deal, a tie-up with TV news network ITN. This agreement positions Blinkx as the firm that media companies will turn to when they want their video content online. But Blinkx is playing in a marketplace dominated by the might of Google and Yahoo — two giants who won't welcome a newcomer who steals valuable market share.

Having cut his teeth at UK search pioneer Autonomy, Suranga Chandratillake counts as a dot-com veteran even though he hasn't yet turned 30. ZDNet UK sat down with him to hear about Blinkx's progress, plans for the future, and why it relies on Linux.

Q: What encouraged you to start Blinkx?
A: We looked at the market, and wondered why consumer search companies were doing so well, when their technology was often a couple of years behind that of enterprise search companies like Autonomy and Verity. The trigger for Blinkx was when we decided to see whether technology honed in the enterprise space could be moved to the consumer space.

When you first announced your plans in 2004, you said that Blinkx would let users search their hard drive, their emails, Internet news sites, blogs and the wider Web. Why has your focus shifted onto video search?
Desktop search is a pretty boring business to be in. It's becoming commoditised, and it will be a non-industry within the next 12 months. That functionality is getting built into the operating system now. It's very hard to persuade users to download extra tools when the operating system does the job so well.

And how does your system work?
We use Web crawlers to find video files of all format types, and we've identified "hot areas" on the Web that put a lot of video content online, such as CNN and the BBC. Our software then breaks the audio stream into phonemes — the building blocks of words — and we then use probability theory to translate that list of phonemes into a transcript of the video.
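As a rough illustration of the last step in that answer, the sketch below turns a list of phonemes into the most probable sequence of words. It is a toy built on assumptions, not Blinkx's or Autonomy's method: the lexicon, the phoneme symbols, the word probabilities and the `decode` function are all hypothetical, and a real recogniser would also have to weigh acoustic uncertainty and word context.

```python
import math

# Hypothetical pronunciation lexicon: word -> phoneme sequence (toy symbols).
LEXICON = {
    "white": ("W", "AY", "T"),
    "why": ("W", "AY"),
    "smoke": ("S", "M", "OW", "K"),
    "tea": ("T", "IY"),
}

# Hypothetical word probabilities, invented for illustration only.
WORD_PROB = {"white": 0.4, "why": 0.3, "smoke": 0.2, "tea": 0.1}


def decode(phonemes):
    """Split a phoneme list into the most probable word sequence.

    Dynamic programming over end positions: best[i] holds the highest
    log-probability of any word sequence covering phonemes[:i], plus a
    back-pointer (start index, word) used to recover that sequence.
    Returns None if no combination of lexicon words covers the input.
    """
    n = len(phonemes)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for word, pron in LEXICON.items():
            start = end - len(pron)
            if start < 0 or best[start][0] == -math.inf:
                continue  # word does not fit, or prefix is not decodable
            if tuple(phonemes[start:end]) != pron:
                continue  # phonemes do not match this pronunciation
            score = best[start][0] + math.log(WORD_PROB[word])
            if score > best[end][0]:
                best[end] = (score, (start, word))
    if best[n][0] == -math.inf:
        return None
    # Walk the back-pointers from the end to rebuild the transcript.
    words = []
    i = n
    while i > 0:
        start, word = best[i][1]
        words.append(word)
        i = start
    return words[::-1]


transcript = decode(["W", "AY", "T", "S", "M", "OW", "K"])
print(transcript)
```

With this toy lexicon the seven phonemes can only be covered by "white" followed by "smoke"; the shorter word "why" also matches the opening sounds, but it leaves a stray "T" that no word explains, so that path is dropped.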

But isn't that very, very difficult? Do you actually manage to translate every video stream correctly?
It's not perfect. It depends how good the signal is, how much noise is in the background, and whether more than one person is speaking. We achieve 70-95 percent accuracy: 70 percent for podcasts, up to 95 percent for a BBC newsreader. Our actual accuracy is higher, though, because there are very few searches for the kind of words we get wrong. For example, "of" and "on" are easily confused, but they're not typical search terms.
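The point about "actual" accuracy can be made concrete with a small calculation. The sketch below scores a transcript twice: once over every word, and once ignoring short function words that people rarely search for. The sentences, the ignore list and the `word_accuracy` function are hypothetical, and it assumes the two transcripts are already aligned word for word, which real speech output usually is not.

```python
def word_accuracy(reference, hypothesis, ignore=frozenset()):
    """Fraction of reference words reproduced correctly in the hypothesis.

    Assumes both lists are the same length and aligned position by
    position. Words in `ignore` are left out of the score entirely.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("transcripts must be aligned and of equal length")
    pairs = [(r, h) for r, h in zip(reference, hypothesis) if r not in ignore]
    if not pairs:
        raise ValueError("no words left to score")
    correct = sum(1 for r, h in pairs if r == h)
    return correct / len(pairs)


# Hypothetical example: the recogniser swaps "of" and "on".
reference = "the pope spoke of peace on sunday".split()
hypothesis = "the pope spoke on peace of sunday".split()

# Hypothetical list of words assumed to be rare as search terms.
rarely_searched = frozenset({"the", "of", "on"})

raw = word_accuracy(reference, hypothesis)
searchable = word_accuracy(reference, hypothesis, ignore=rarely_searched)
print(f"all words: {raw:.0%}, likely search terms: {searchable:.0%}")
```

Here two of the seven words are wrong, so the raw score is 5/7, yet every word someone might plausibly search for ("pope", "spoke", "peace", "sunday") is transcribed correctly.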

So what are people searching for?
On Monday, we saw a lot of people searching for the lady who had the first face transplant. That was the first time she had appeared in public since the operation, so people must have read about the event and wanted to see it themselves.

When the new pope was being elected, we had a lot of people searching for "white smoke" or "black smoke" — looking for signs of a decision.

We see less interest in politics — unless there's been a single big debate.

Processing all these video clips must need a lot of computational power. What are you running on?
We have a huge array of IBM boxes. The typical machine is dual-CPU with 2GB of RAM — we've got hundreds of them.

And what software do you use?
All the search software is proprietary — either our own or licensed, such as the Autonomy technology we use. For the operating system, we use SuSE Linux.

Why did you pick an open source operating system?
Because it's cheap and seems very reliable. Also because it gives us the ability to explicitly control what each bit of hardware does, and that's very valuable. We know which parts of the process are software-intensive, and which are hard-drive-intensive. Linux lets us make a profile for each box, based on what its job is.

Blinkx creates a transcript of each video and podcast you process. Does that mean you could allow people to search within those files?
That's a very annoying area. We know what people said and at what time they said it. The problem is, there's no way to point to a video file on someone else's server and start it playing at, say, two minutes in.
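Knowing "what people said and at what time they said it" amounts to a transcript in which every word carries a timestamp. The sketch below shows how such a transcript could be searched for the moment a phrase is spoken. The data, the tuple layout and the `find_phrase` function are hypothetical, not Blinkx's format, and the sketch covers only the lookup: it does nothing about the playback limitation described above.

```python
def find_phrase(timed_words, phrase):
    """Return the start times (in seconds) at which `phrase` is spoken.

    `timed_words` is a list of (seconds, word) tuples in spoken order.
    Matching is case-insensitive and requires the words to be adjacent.
    """
    target = phrase.lower().split()
    if not target:
        return []
    words = [word.lower() for _, word in timed_words]
    hits = []
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            hits.append(timed_words[i][0])
    return hits


# Hypothetical timed transcript of a short news clip.
clip = [
    (118.2, "we"),
    (118.5, "can"),
    (118.9, "see"),
    (119.4, "white"),
    (119.9, "smoke"),
    (121.0, "rising"),
]

offsets = find_phrase(clip, "white smoke")
print(offsets)
```

In this made-up clip the phrase starts 119.4 seconds in, just under two minutes from the start, which is the offset a player would need to jump to.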

You're competing with Google. Aren't you worried that you'll get crushed?
That's the classic question. In the 1980s, if you were a software company they asked how you'd compete with Microsoft. Now, it's Google.

But, in terms of video search, Google is the worst of the major players, in my view. Yahoo is much better. And if you look at Google Talk, or Gmail, or Google Earth, they're either using technology which Google acquired, or they're "me too" products. Fundamentally, that goes to show that technical innovation is not trivial. Saying that, Google's core search and its text ads are both exceptional.

Blinkx is often mentioned as a possible takeover target. Are you looking to be acquired, or are you looking not to be?
We're very focused on building a real company. I went through the first boom, when companies were "built to flip", to be sold on. Those companies focused on the wrong things.

There were reports recently that Rupert Murdoch might buy Blinkx. Are those rumours a distraction, or is it a nice thing to happen?
It doesn't make a difference.

So what's the focus for the next year?
The deal with ITN that we're announcing this week is our first source of revenue. We'll be pursuing similar relationships with other content partners. We'll also be looking at distribution. Now we have the content, we need to raise our profile.