Summarizer: First Take

Online text summarising tools promise to help you digest large amounts of content quickly and efficiently. We test a beta service, with unconvincing results.

Information overload is a fact of everyday life for many of us: there's more information available, on a wider range of devices, more of the time, than ever. How do you find the time to digest it?

Various tools can help: you can try text-to-speech utilities, for example. But if you believe the key nuggets can be extracted without reading entire texts, automatic summarising tools are also available.

One example is Summarizer from Connexor. According to the company, the service uses 8 million lines of code to produce "an accurate and meaningful summary in seconds" and is used "across the world by companies ranging from Microsoft to Motorola". Summarizer is currently available as a free public beta.

Sounds like it's worth a shot.

The first and most obvious test had to be of the Summarizer press release, which you'll find here if you want to replicate my efforts.

Using Summarizer is easy. Paste what you want to summarise into the text box, choose whether you want a Twitter-length 140 characters or longer 250- or 500-character versions, and then click the big 'Summarize now!' button.

Summarizer, currently in beta, is simple to use.

The original press release was distilled down to just its opening sentence on the 250-character option, while the 500-character option gave me the first two sentences.
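For comparison, output like this is what a trivial 'lead sentences' baseline would produce: keep opening sentences, in order, for as long as they fit the character limit. The sketch below is an illustration only and makes no claim about how Summarizer works internally; the function name `lead_summary` and the naive regex sentence splitter are assumptions made for the example.

```python
import re

def lead_summary(text, max_chars):
    """Return the opening sentences that fit within max_chars.

    Hypothetical baseline for illustration; returns None when even the
    first sentence is longer than the limit.
    """
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chosen = []
    for sentence in sentences:
        candidate = " ".join(chosen + [sentence])
        if len(candidate) > max_chars:
            break  # the next sentence would exceed the budget
        chosen.append(sentence)
    return " ".join(chosen) if chosen else None
```

With a tight limit this baseline returns only the first sentence, and with a looser one it returns the first two, which is why a summary made purely of opening sentences is hard to tell apart from no summarising at all.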

This is hardly an inspiring opening gambit, as it's probably what most journalists would read before deciding whether to continue or move on to some new reading matter. However, the 140-character option fared even worse, telling me: "The text cannot be summarized. Too short text?"

Did it mean I was aiming for too short a summary, or that the source text was too short? Time to try something longer.

I moved on to an article from the UK's Guardian newspaper about the Team GB triathletes Alastair and Jonny Brownlee. It is 1,800 words long (10,000 characters if you count the spaces).

This time the 500-character summary picked out a few sentences from different parts of the article. They hang together, but don't really get across the gist of the article:

500 characters, attempting to summarise 1,800 words.
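Pulling sentences from different parts of a text is the hallmark of extractive summarising. As a rough sketch of that general idea, and not of Connexor's actual method, the example below scores each sentence by the average frequency of its words across the whole text, then keeps the top-scoring sentences that fit the character limit, restoring their original order. The function name `extractive_summary`, the small stopword list and the scoring rule are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
             "that", "for", "on", "with", "as", "was", "but"}

def _content_words(text):
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def extractive_summary(text, max_chars):
    """Pick high-scoring sentences that fit in max_chars, in document order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(_content_words(text))

    def score(sentence):
        tokens = _content_words(sentence)
        # Average word frequency, so long sentences are not favoured outright.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen, used = [], 0
    for i in ranked:
        cost = len(sentences[i]) + (1 if chosen else 0)  # +1 for the joining space
        if used + cost <= max_chars:
            chosen.append(i)
            used += cost
    return " ".join(sentences[i] for i in sorted(chosen))
```

A scheme like this can produce sentences that read well together yet miss the argument of the piece, because word frequency says nothing about which sentences carry the writer's main point.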

The 250-character summary had very little relevance to the original article:

250 characters: little relevance to the original.

This time I was able to produce a 140-character summary — but really, when the system is working from an 1,800-word original, you can't expect miracles:

140 characters: a challenge too far for an automatic summariser?
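To put the three length options in perspective, it helps to work out how much of the roughly 10,000-character original each one is allowed to keep. The short calculation below uses only the figures quoted in this article; the word estimates assume the article's own average of 10,000 characters per 1,800 words, and the helper name `compression` is made up for the example.

```python
SOURCE_CHARS = 10_000  # approximate article length, spaces included
SOURCE_WORDS = 1_800   # word count quoted for the same article

def compression(summary_chars, source_chars=SOURCE_CHARS):
    """Fraction of the source length the summary is allowed to keep."""
    return summary_chars / source_chars

for limit in (500, 250, 140):
    share = compression(limit)
    # Rough word budget, assuming the article's average characters per word.
    approx_words = round(limit * SOURCE_WORDS / SOURCE_CHARS)
    print(f"{limit} characters: {share:.1%} of the original, "
          f"roughly {approx_words} words")
```

The three options keep 5%, 2.5% and 1.4% of the original respectively, or about 90, 45 and 25 words.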

On the basis of this admittedly limited test, it seems that you simply can't take a large number of logically argued words, push them through a 'black box' and get a useful degree of meaning, context and content from a much smaller number of words.

To have a chance at that, I suspect the human brain has to intervene in order to extract nuance, place information in context, and, quite probably, use words the original writer didn't in order to capture the text's meaning.

Still, why not try Summarizer for yourself and see what you think?