How social media and big data will unleash what we know

Today, social media generates more information in a short period of time than existed in the entire world a few generations ago. Making sense of it and understanding what it means for your business will require entirely new technologies and techniques, including the emerging field of big data.

Social networks and enterprise social software have long been driven by two things: the connections between the people that use them and the information they share.

Just as Facebook uses the insights gleaned through its analytics on how people behave to enable personalization and better user experiences, the same phenomenon has been happening on the Enterprise 2.0 side, most recently exemplified by last month's acquisition of Proximal Labs by Jive Software.

While gleaning insight and contextualizing interaction in social environments is nothing new, the challenge in doing so has been pushing the boundaries of available technology for some years now. Organizations across the social business spectrum -- consumer and workplace alike -- are only now beginning to understand the vast intelligence that can be derived by looking at the millions of conversations taking place, mostly out in the open, between those engaging in social media. While there are certainly interesting privacy, legal, and regulatory issues with doing so, even with internal social networks, they're not likely to delay the growing adoption of such capabilities, given their potential value.

In the shorter term, the ability to analyze and mine data at scale within social networks is enabling a range of intriguing and useful applications that can plug into social media networks and make use of the knowledge inside them. Doing it well, however, has proven to be far from trivial. Examples include making analytic sense of content types that are very large and opaque, such as high-definition video, or sensing the connections between thousands of unstructured natural language messages. Each of these requires technologies that can handle the enormous scale, complexity, speed, and computation requirements in a way that remains cost-effective as data volumes grow exponentially. From this you can begin to see the challenge of traditional approaches to large data, which tend to break down fairly soon under large geometric growth.

The Intersection of Social Media and Big Data

Perhaps more to the point, and where the discussion of big data comes in, the key to social media interactions between people is that they leave knowledge behind for others to find and reuse. This can be the original content that started the conversation or the subsequent comments, discussion, ratings, rankings, retweets, etc. These conversations remain on the network afterwards, usually for a digital eternity, and form an invaluable history and knowledge repository of society, culture, and business that can be discovered, brought back to life, shared, and learned from, ad infinitum. Of course, some of this isn't inherently valuable by itself (much has been made of the signal-to-noise ratio of social media). Also, finding what one is looking for in the vast sea of a million or a billion human conversations is a difficult task. Thus, separating the wheat from the chaff is where social media and big data, along with the analytics it makes possible, go hand in hand.


But the deeper issue is not just finding the valuable nuggets in a galactic sea of social media. Rather, it's knowing what you are able to know, or even what it's possible to know, as social media becomes a dominant form of communication (probably now the dominant form). Last year at Defrag, I spoke of this as the difference between being able to find a specific needle in a haystack versus having the ability to discern the shape of the haystack itself. Social analytics has arisen precisely to help us make sense, in the large, of the endless flow of our activity streams and social news feeds.

Social Media: A Big Data Inflection Point

Though social networks may soon contain the visible sum of humanity's communication and interaction, the challenges of deriving what is increasingly called social business intelligence are two-fold. First, big data sets itself apart from previous approaches because it applies new ways of thinking about the capture, storage, and processing of truly vast amounts of data, precisely the kind that emanates from today's social media ecosystems. This includes the supporting technology, often starting with emerging tech such as data mining grids or MapReduce infrastructures (see my exploration of one example, Hadoop), as well as software architecture that is often surprisingly non-deterministic and non-linear in design. For a quick example, see the discussion of LinkedIn's challenges and counterintuitive solutions to data scale in social networks. In practice, this means that there is a distinct generational and technical divide between how most organizations are dealing with data today and the very different things they'll need to do in the future.
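To make the MapReduce idea a little more concrete, here is a minimal single-process sketch in Python of its three phases (map, shuffle, reduce), applied to counting hashtags in a handful of messages. The sample messages and the function names (`map_message`, `shuffle`, `reduce_counts`, `count_hashtags`) are illustrative assumptions, not part of Hadoop or any real API. An actual deployment would spread these phases across many machines; this only shows the shape of the computation.

```python
from collections import defaultdict

# Hypothetical sample data standing in for a stream of social media posts.
MESSAGES = [
    "Loving the new phone #launch #mobile",
    "Queue at the #launch is huge",
    "Battery life is great #mobile",
]


def map_message(message):
    """Map phase: emit a (hashtag, 1) pair for every hashtag in one message."""
    return [(word.lower(), 1) for word in message.split() if word.startswith("#")]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def count_hashtags(messages):
    """Run the three phases in sequence on an in-memory list of messages."""
    pairs = [pair for message in messages for pair in map_message(message)]
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    print(count_hashtags(MESSAGES))
```

Because each message is mapped independently and each key is reduced independently, both phases can in principle be run in parallel, which is what lets this style of processing keep up with data at social media scale.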


The second issue goes back to the old adage that "you manage to what you measure." In the big data world of social media, this means that you analyze only what you know to analyze. However, one of the things I hear frequently from business users of social media is that they'd like to "spot trends", to "know what's going to happen before it actually happens", to "get ahead of the conversation and see where it's going." Sentiment analysis, knowledge mining, aggregating conversations into trends: these are all possible when you know what you're looking for (and you have technology that's capable). It also helps -- and it's no mean feat -- if the tools you use are smart enough to tell you what there is to know that you don't know to ask for.
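As a rough illustration of a tool surfacing something you did not know to ask for, the Python sketch below compares two time windows of messages and reports terms that have become noticeably more common, without any predefined query. Everything in it is an assumption made for illustration: the function names (`term_counts`, `rising_terms`), the thresholds, and the simple add-one smoothing. Real social analytics products use far more sophisticated methods.

```python
from collections import Counter


def term_counts(messages):
    """Count how many messages each lowercase term appears in."""
    counts = Counter()
    for message in messages:
        # A set ensures a term is counted once per message.
        counts.update(set(message.lower().split()))
    return counts


def rising_terms(earlier, later, min_later=2, min_ratio=2.0):
    """Return (term, ratio) pairs for terms that grew between two windows.

    The ratio is later_count / (earlier_count + 1); the +1 is add-one
    smoothing so that brand-new terms do not cause division by zero.
    """
    before = term_counts(earlier)
    after = term_counts(later)
    rising = []
    for term, count in after.items():
        if count < min_later:
            continue  # too rare in the later window to call a trend
        ratio = count / (before[term] + 1)
        if ratio >= min_ratio:
            rising.append((term, ratio))
    # Strongest growth first, ties broken alphabetically.
    return sorted(rising, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    last_week = ["the service is fine", "the app is fine today"]
    this_week = ["the outage is bad", "outage again today", "is the outage fixed"]
    print(rising_terms(last_week, this_week))
```

Nobody asked about "outage" here; it surfaces only because its frequency changed, which is the distinction between searching for a known needle and noticing a shift in the shape of the haystack.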

Right now, big data remains a specialized niche of technologies and techniques that make fundamentally new assumptions when it comes to tackling and understanding vast amounts of data. While consumer social networks have long used big data techniques in the custom platforms they've built for themselves, enterprise social networks are just now starting down the road to catching up. Big data is now moving into the realm of mainstream IT. With this development -- as the world continues to become more and more social -- competitive advantage will come to those who understand what's happening better than their peers and can directly connect it to their business outcomes and other useful pursuits.

Where social media and big data are headed

For now, however, here's the short list of what the growing intersection of social media and big data will likely result in:

  • Big data-enabled applications that are plugged into consumer and enterprise social networks. As third-party software becomes increasingly embedded into social networks, these analytic applications can give you the latest social media "weather report" and allow you to mine data, run queries, view top trends, and display insights of various kinds. This is feasible only because big data approaches can produce results quickly enough in the face of the full sum of social media knowledge.
  • The blurring of external and internal big data. The interconnectedness of everything is only continuing as the Internet drives the IT conversation to a greater and greater extent, including a growing blend of the consumer world and the enterprise. Tomorrow's big data solutions, particularly when it comes to business intelligence, will include a fully integrated view of this entire social media landscape.
  • Privacy, governmental, and regulatory concerns will grow. Much of this will look like Big Brother vacuuming up every scrap of people's behavior and knowledge and using it in ways that were never intended. This is in fact already a significant issue in the social media space, but the increasing mechanization and maturity of big data, social media, and analytics will only make it more pronounced. Those engaging in crawling, analyzing, and sharing data garnered from social media, whether it is from consumers or employees, will have to think carefully about the implications. Firms should carefully play the role of steward as issues from intellectual property to ownership of data are resolved in the industry.
  • Analytics that finds you. Knowing what to ask for is an essential skill when it comes to extracting value from today's social media landscape. But the confluence of social media and big data will allow vital new intelligence to find you before you know you need it. The field of predictive analytics, for example, requires very smart software combined with the ability to run enormous numbers of speculative queries in a timely fashion. Big data will make that possible, at least until we reach the era of Really Big Data.
  • Cloud big data analytics emerges. The sheer data and compute volumes, combined with the highly decentralized nature of social media, make this an ideal fit for a cloud computing approach. Look for products that combine big data, analytics, and social media for on-demand, cost-effective solutions. Companies like IBM and HP are likely to lead the way here.

For another good indicator that big data represents one of the next big things in IT, I urge you to read Mark Gomes' overview at Seeking Alpha of recent M&A activity in the big data space. In the meantime, the strategic roadmaps of those engaging in social media for business should now include big data and social business intelligence as key components for reaping full ROI.
