
Twitter exposes historical data in partnership with Datasift

Twitter has entered into an agreement with Datasift which will fundamentally change the way you tweet. This is a significant move by Twitter which could have far reaching implications for us all.
Written by Eileen Brown, Contributor

Update: See below

Twitter has entered into an agreement with Datasift which will fundamentally change the way that you think about your tweets.  Now companies will be able to mine historical data on Twitter to discover long term trends.

DataSift, the social data platform company, has today launched Historics, a cloud-computing platform that enables entrepreneurs and enterprises to extract business insights from Twitter’s public Tweets dating back to January 2010.

DataSift is one of only two companies in the world with a license to make Twitter data commercially available for non-display purposes. DataSift provides brands and companies with granular insight into how they are perceived by social customers.

"No-one's ever done this before," said Tim Barker, DataSift's marketing manager.

The data gold mine

This is a significant move by Twitter, and it could have far-reaching implications for us all.

Twitter has been trying to monetise its business since it became popular in 2007. But its only asset is its data. Twitter owns the data -- and it heavily restricts access to the gold mine through its API.

There are lots of analysis tools and data mining applications that have access to the current stream of data.  These tools can give you real time current trends and sentiment analysis.

But this announcement is different.  Now Datasift can access old data.

Through DataSift’s platform, companies can filter and interpret vast volumes of social data to help create insights for their business. For example:

  • Social media monitoring companies can analyse trends in customer conversations and brand mentions;
  • Business Intelligence companies can correlate point-of-sale data with social data to identify trends in product sales and market sentiment;
  • Marketers can evaluate conversations around marketing campaigns and adjust key messages and offers based on feedback;
  • Financial organisations can analyse popular sentiment, trends and indicators relating to businesses and economic events;
  • News and research organisations can surface new trends around historic and popular culture events.

Slice of the pie

There is a lot of noise on Twitter.  A LOT of noise.  Perhaps over 99% of the tweets you read are irrelevant.  Finding useful data in this noise is like panning for gold.  You are trying to look for that tiny nugget of useful and valuable information in this river of data.

“DataSift is focused on simplifying how businesses can extract insights from the petabytes of social ‘Big Data’ being created,” said DataSift founder & CTO Nick Halstead. “With Historics, we are now democratising the Big Data industry to enable entrepreneurs and enterprises to easily create socially-intelligent applications. No data-scientists required, no Hadoop expertise needed.”

Most of the tweets we send are ephemeral.  These tweets are only important to you and the people directly involved in your conversation.  Once the conversation dies, then the ephemera disappear -- the relic of your day to day life.  We move on.  We forget what we have tweeted.  Tweets disappear.

Now someone is going to make money from your trivia. Your historical, disposable information can now be accessed by brands that want to discover more about you, expose new trends, and discover historical baselines.

As my good friend Jon Honeyball asks, if someone is making money from my trivia, should I also be entitled to a slice of the financial pie?

Trivial conversations

Picking through the detritus of my Twitter stream might bring insight to a tiny number of people. By giving access to my historical minutiae, Twitter seems to be walking a fine line between market information and intrusion.

My old data will now be crawled, indexed and pattern matched for anyone who chooses to pay for the information.  Will I be happy that this is going to happen, or will I search for a tool to delete my Twitter history?  Am I happy that the trivial data I transmit is a potential revenue stream for someone else?

And if I find such a tool, will Twitter still keep a record of the data in the deleted tweet for advanced data mining by third parties and Twitter partners?  Can my history ever be deleted -- or is it just too valuable to let go?

Brand advantage

“The sheer volume of data being produced by Twitter represents a huge challenge for companies trying to extract insights from past events,” said DataSift CEO Rob Bailey. “Historics solves this problem, providing businesses with a platform to intelligently filter and extract meaning from two years of Twitter data.”

Brands could really benefit from this extra information. Access to your data is proving popular. More than 1,000 companies have already joined the wait list for Historics.

Globally, 1.3 billion people are on social networks.  Brands need to be able to listen in to these trends and tap into the public sentiment.  With more than 250 million tweets per day, that is a lot of data sorting.

Data privacy

How much data will be exposed to these data mining tools?  If I used to protect my Twitter stream, but now it is open, can the data tool access my protected Tweets?  If I now protect my stream, are my historical tweets fair game -- for a price?

Can I be certain that private accounts and deleted tweets will not be indexed? If I'm concerned about privacy, how worried should I be about companies that want access to my old data?

I'm sure that for the majority of us, access to our Twitter history is not a problem. After all, we have committed our tweets to the public forum, and we are happy to continue to broadcast. But for some, this news might cause concern.

Perhaps Twitter should implement an 'opt out' button so our ephemeral comments remain just that...

Update:

DataSift have emailed me about my concerns over privacy:

How much data will be exposed to these data mining tools?  If I used to protect my Twitter stream, but now it is open, can the data tool access my protected Tweets?  If I now protect my stream, are my historical tweets fair game — for a price?

a. Any tweets that are made while an account is protected are not accessible.
b. If you later protect your account, those that were public stay public. But any future Tweets will be kept private.

Can I be certain that private accounts and deleted tweets will not be indexed? If I’m concerned about privacy, how worried should I be about companies that want access to my old data?

a. Private Tweets are not delivered to DataSift - the Twitter Firehose is only public tweets.
b. Any Tweets you delete are also deleted from DataSift's storage.
