Cloud apps, big data and the wisdom of swarms

Summary: Siri's approach to voice recognition has lessons for SaaS vendors who are debating how to mine their stores of big data for value.


It was only after my wife acquired an iPhone 4S this week that I fully understood the importance of big data for SaaS vendors. Did you need to train Siri to recognise your voice, I wondered? What I found from a brief Internet search was a revelation. I'm old enough to remember the PC-based voice recognition systems of the late 1990s from the likes of IBM and Dragon Systems. Those systems had to be trained over a period of a week or more to recognise the sound of the user's voice. Siri doesn't do that. Instead, it matches the voice it hears to a library of voice patterns and uses the closest match to interpret what you say.

What's happening behind the scenes is that Siri has analyzed tens of thousands of voices and identified patterns that run across all of those voices. As a cloud-sourced app, it can continue to refine and hone that central library of voice patterns based on what it encounters in the field. This is where the cloud approach really wins. The old, PC-based systems could learn a person's voice, but they couldn't use that learning to improve their ability to learn the next user's. A cloud app like Siri can continuously evolve its core capability with every new user.
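To make the contrast concrete, here is a minimal sketch in Python of a shared pattern library that picks the closest match and keeps refining itself with every sample it sees. It is an illustration of the general idea only, not a description of how Siri actually works: the `PatternLibrary` class, its method names, and the use of short numeric feature vectors with Euclidean distance are all assumptions made for the example.

```python
import math


class PatternLibrary:
    """Toy shared library of labelled patterns pooled across all users.

    Hypothetical illustration: real speech systems are far more complex.
    """

    def __init__(self):
        # label -> (centroid vector, number of samples averaged into it)
        self._patterns = {}

    def learn(self, label, features):
        """Fold one observed sample into the running mean for its label."""
        if label not in self._patterns:
            self._patterns[label] = (list(features), 1)
            return
        centroid, count = self._patterns[label]
        new_count = count + 1
        # Incremental mean: move the centroid a 1/n step toward the sample.
        updated = [c + (f - c) / new_count for c, f in zip(centroid, features)]
        self._patterns[label] = (updated, new_count)

    def closest(self, features):
        """Return the label whose centroid is nearest (Euclidean distance)."""
        if not self._patterns:
            raise ValueError("library is empty")
        return min(
            self._patterns,
            key=lambda label: math.dist(self._patterns[label][0], features),
        )

    def centroid(self, label):
        """Return a copy of the current centroid for a label."""
        return list(self._patterns[label][0])


# Samples from different users all feed the same central library.
library = PatternLibrary()
library.learn("yes", [1.0, 0.0])   # first user
library.learn("no", [0.0, 1.0])    # first user
library.learn("yes", [0.8, 0.2])   # a second user refines the same pattern
match = library.closest([0.9, 0.3])  # a third, unseen user
```

The point of the sketch is that the third user benefits from what the first two contributed, which is exactly what a PC-bound system trained on a single voice could not offer.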

So instead of having to perfectly predict every anticipated type of voice (and inevitably fail on the unexpected edge cases), the cloud-based app can simply react to what it finds. As my wife started using the phone on a family car journey, Siri dealt effortlessly with the mixed voices of my wife and two excited children. If you'd set out to develop a voice-recognition system, would you have thought to include that use case in the spec? Siri doesn't have to make that judgement — in fact, Siri doesn't judge at all, it just works with what it finds. That, along with the broad base of data that it gets to work with, is what makes it so powerful.

This is what makes big data so important for SaaS vendors. It's not simply the ability to analyse huge pools of data. What really matters is the broad base of that data, gathered from a large mix of users within which patterns of behavior can be analysed and then applied elsewhere. Think of it as swarm data — lots of individual, autonomous behaviors that collectively add up to reusable patterns.

A few weeks ago, I wrote about cloud collaboration vendor Huddle's new file synchronization capability. This uses analysis of prior behavior across its user base to decide which shared files to download to a user's local device, and then continues learning from behavior patterns among the user and their colleagues to make its predictive downloading more and more accurate. Like Siri, the historic analysis of its existing broad base of user behavior gives it a head start in delivering accurate results from the get-go.
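As a rough illustration of that kind of predictive downloading, the Python sketch below ranks shared files by how often the user and their colleagues have opened them, then picks the top few to sync. This is not Huddle's algorithm, which the article does not describe in detail; the `files_to_sync` function, the event format, and the weighting are all hypothetical choices made for the example.

```python
from collections import Counter


def files_to_sync(open_events, user, colleagues, limit=3, own_weight=2.0):
    """Rank shared files for `user` by past opens (own and colleagues').

    open_events: iterable of (person, filename) pairs (hypothetical format).
    own_weight:  assumed extra weight for the user's own activity.
    """
    scores = Counter()
    for who, filename in open_events:
        if who == user:
            scores[filename] += own_weight
        elif who in colleagues:
            scores[filename] += 1.0
        # Opens by anyone else are ignored in this simple sketch.
    # Sort by score (highest first), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [filename for filename, _ in ranked[:limit]]


events = [
    ("ann", "plan.doc"),
    ("bob", "plan.doc"),
    ("bob", "budget.xls"),
    ("cat", "budget.xls"),
    ("dan", "notes.txt"),
    ("ann", "logo.png"),
]
picks = files_to_sync(events, user="ann", colleagues={"bob", "cat"}, limit=2)
```

Every new open event shifts the scores, so the ranking keeps adapting as behavior changes, while the history already collected gives a new user sensible picks from day one.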

SaaS vendors are in a unique position because of the collective behavioral data they're able to amass. For the past year, email provider Mailchimp has employed a data scientist on what it calls its Email Genome Project, looking for patterns in the millions of emails and campaigns its customers generate. It's been instrumental in finding and shutting down malicious email accounts, as well as generating benchmark stats that customers can use to evaluate their performance. These are useful advances, but what I'd really like to see next is to have those benchmarks brought into the app to evaluate mailings while they're being created.

I know that many SaaS vendors think of big data as a mine of potentially useful information but are uncertain where they are most likely to unlock significant value. To my mind, it's the behavioral data that holds the most promise because, in analyzing how their swarms of users behave, they can discover new ways to automate common patterns of behavior. Those reusable patterns that can shortcut a learning process and deliver faster results are going to be like gold dust for those who are first to surface them.

Topics: Cloud, Apple, Apps, Data Centers, Emerging Tech, Networking, Telcos

Phil Wainewright

About Phil Wainewright

Since 1998, Phil Wainewright has been a thought leader in cloud computing as a blogger, analyst and consultant.

  • Very Interesting -- but...

    take it one step forward: segment.

    why wouldn't siri use demographic information about the user (which it supposedly, and likely, already has) to limit the 10,000s of available samples to just a few 1,000s? you could say why does it matter? more is better - no? actually, no.

    just like with big data, being able to contain the data and analysis variables yields faster, and better, results --- in other words, i am more likely to sound like a 45 year old male from argentina speaking english than an 18 year old female from nigeria speaking english, and segmenting lets you rule out the non-possible scenarios. swarm intelligence is far more interesting to study (if you look at ants or bees or termites -- the ones that pay attention to worker bees, for example, are other worker bees with a similar job, that is, the ones that can benefit the most from seeing where they're going and what they are bringing back). in other words, swarm intelligence is incredibly more powerful when segmenting.

    now, extrapolate that to big data and think what happens if you can segment (non-revenue segmentation, mostly non-demographic also -- depending on the data you have) the data by likely impacted population? or product? or function? or ---?

    thanks for a v interesting posting
  • LinkSource Technologies has acquired Unleashed Networks

    LinkSource Technologies has acquired Unleashed Networks to offer a broader range of IT and desktop support services and products for clients. LinkSource works with businesses of all sizes requiring reliable tech support, cloud services, mobility & data solutions. View the full release at LS Technologies
  • Saas ERP

    SaaS vendors need to manage their API infrastructure in its entirety. SaaS vendors are either lagging behind, and should add APIs as a top priority to an already far-too-long development roadmap, or are already in the API game and can benefit further from getting it managed more efficiently. Here is one company: SaaS ERP