In our industry, we're all pretty opinionated.
This is a good thing in that it creates debate, but not such a good thing when it comes to actually deciding things. It would be good if we had some definitive way of knowing that iOS was better than Android, or that this year will be the "year of Linux on the desktop", and so on.
On these very pages over the past week or so my ZDNet colleagues have been debating whether the Chromebook is aor .
And it's not just on these pages -- in peer publications around the web the Chromebook debate is raging. Plus, it's all kicking off on Twitter.
The question is -- can we apply some science to understanding whether the Chromebook is a good idea?
In theory, we can use something called "sentiment analysis" to look at text and understand whether an article is generally positive, generally neutral, or generally negative. If we look at all the articles on a topic, we can say whether the corpus in its entirety is generally positive, generally neutral, or generally negative.
You can also do the same thing with social networking artifacts, such as tweets. In theory, if we had a sentiment analysis engine we could feed all this content in and come up with a definitive "Yes, Chromebooks are the awesomest!" or "No, Chromebooks are epic fail!"
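To make the "corpus in its entirety" idea concrete, here is a minimal sketch of rolling per-item labels up into one overall verdict. It assumes each article or tweet has already been labelled "positive", "neutral", or "negative" by some engine; the function name `summarise` and the rule of calling a tie "neutral" are illustrative choices, not anything a particular service prescribes.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def summarise(labels):
    """Return (share per label, overall verdict) for a list of per-item labels."""
    counts = Counter(labels)
    # Only the three known labels count towards the total.
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no labelled items to summarise")
    shares = {label: counts[label] / total for label in LABELS}
    # The verdict is the most common label; a tie is reported as "neutral"
    # (an assumption made for this sketch).
    top = max(shares.values())
    winners = [label for label in LABELS if shares[label] == top]
    verdict = winners[0] if len(winners) == 1 else "neutral"
    return shares, verdict


# Illustrative labels only, not data from the experiment.
shares, verdict = summarise(["positive", "positive", "negative", "neutral"])
print(shares, verdict)
```

A plain majority vote like this throws away how confident the engine was about each item, which is one reason the overall answer can feel less "definitive" than it sounds.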
Does it work? Having done it, I'm not sure. Let's go through things together and have a look...
Back when we all used to argue about whether Windows 8 was a good idea or not, I wanted to use sentiment analysis to get a "definitive" answer. At the time, I decided this was way too difficult to bother with.
What I'd forgotten was that we software developers can now get away with being incredibly lazy, and avoid having to understand how anything works by simply wiring together web services.
I found a sentiment analysis service from Datumbox that looked like it would do what I needed. They offer a number of APIs, and they have two for sentiment analysis -- one for long-form content, and one specifically for tweets. (Tweets are short, and hence sentiment analysis needs to be differently tuned.) All I had to do was get the content and throw it at the service for analysis.
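As a rough idea of what "throwing content at the service" looks like, the sketch below builds an HTTP POST for one piece of text and picks an endpoint depending on whether it is an article or a tweet. The URLs and the `api_key` and `text` parameter names are placeholders invented for this example; they are not Datumbox's actual API, so check the service's documentation before wiring anything up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoints: placeholders, not the real service's URLs.
ENDPOINTS = {
    "article": "https://sentiment.example.com/v1/document",
    "tweet": "https://sentiment.example.com/v1/tweet",
}


def build_request(text, kind, api_key):
    """Build (but do not send) a POST request for one piece of content."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown content kind: {kind!r}")
    # Form-encode the body; the parameter names are assumptions.
    body = urlencode({"api_key": api_key, "text": text}).encode("utf-8")
    # Supplying data makes urllib issue a POST rather than a GET.
    return Request(ENDPOINTS[kind], data=body)


req = build_request("Yes I got my Chromebook!", "tweet", "MY-KEY")
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)`, followed by parsing whatever response format the real service returns.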
First, I wrote an engine that would go out to Google News and find articles on Chromebooks from ZDNet and a number of peer publications.
I then wrote another engine that would go out and get tweets that mentioned "Chromebook". This ended up being pretty interesting. For one thing, I got a lot of non-English tweets that would throw off the sentiment analysis, so I used a language service called Detect Language to chuck away any non-English tweets. Also, loads of the tweets ended up linking to articles rather than being straightforwardly opinionated comment from the masses. Most of these articles tended to be spam (giveaways, etc.). Thus I also chucked away any tweets that contained links.
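The two filtering rules described above can be sketched roughly as follows. The `detect_language` argument is a hypothetical stand-in for a call to a language detection service and is assumed to return a language code such as "en"; the link pattern is a simple guess at what counts as a link, not the rule the original engine used.

```python
import re

# Anything that looks like a URL, including bare t.co short links.
LINK_PATTERN = re.compile(r"https?://\S+|\bt\.co/\S+", re.IGNORECASE)


def keep_tweet(text, detect_language):
    """Keep a tweet only if it has no links and is detected as English."""
    # Check for links first so the language service is only called when needed.
    if LINK_PATTERN.search(text):
        return False
    return detect_language(text) == "en"


def filter_tweets(tweets, detect_language):
    """Return the tweets that survive both filters, in their original order."""
    return [t for t in tweets if keep_tweet(t, detect_language)]


def fake_detector(text):
    # Toy detector for demonstration only; a real one would call a service.
    return "es" if "tengo" in text.lower() else "en"


sample = [
    "Tweetin from my Chromebook.",
    "Win a Chromebook http://example.com/giveaway",
    "Ya tengo mi Chromebook",
]
print(filter_tweets(sample, fake_detector))
```

Passing the detector in as a function keeps the filter testable without network access, at the cost of hiding any rate limits or errors the real service might produce.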
Taking a relatively small sample of 101 tweets that were just English language comments without links, we get examples like this:
Flagged as positive:
- "Yes I got my Chromebook!"
- "Got a #chromebook 11 for my birthday - best present ever."
- "You guys, the Chromebook is everything."
Flagged as negative:
- "Anyone got a #chromebook HP 14 'Haswell'? I want one #technology #google #laptop"
- "my chromebook is always dying or dead "
- "Successfully running ubuntu on my chromebook after weeks of command-lining the thing phew #Geek"
Flagged as neutral:
- "Tweetin from my Chromebook."
- "Anyone own a Chromebook?"
- "Anyone have opinions on iPad vs Chromebook for portable #writing? Any writers use either? #nanowrimo"
Does that work? I'm not sure -- the above is just to give a flavour, but even having spent hours looking at the sample set manually, I can't necessarily make sense of the data. It probably doesn't do much worse than I would do if I was manually sorting them. Anyway, of the 101 sample tweets, this is what I got:
This suggests that half the people out there talking on Twitter about Chromebooks aren't that enamoured. But I'll come back to that.
If we look at the ZDNet content and run a sentiment analysis on that we get some interesting results.
It sort of works.
James Kendrick recently wrote "". I wrote " " a while back, and Steven J. Vaughan-Nichols wrote " ". These were all correctly flagged as positive.
Ed Bott's article "" was correctly flagged as negative.
David Gewirtz's article "" was flagged as negative, although I'd say it should have been neutral.
An oddity was that David's "" was negative (when it's positive), and Larry Seltzer's " " was flagged as positive (when it's negative).
Taken together, we get this:
So at ZDNet, we seem to be generally more positive about Chromebooks.
What about if we expand that out to look at our peers? We get this:
Much more positive. It's worth pointing out, though, that most of our peers' content on Chromebooks consists of reviews of actual devices. ZDNet's coverage has tended to be more analytical of the whole Chromebook proposition. Chromebooks tend to review well because they are cheap and good at what they do.
The question is, have we learnt anything here, either about sentiment analysis or about Chromebooks?
Sentiment analysis is very interesting when we look at big data. It helps to be able to zero in on a set of people within a wide audience who are either flag bearers for your brand, or who are struggling.
What it gives you, though, is very fluffy, especially if, like me, you're the sort of software engineer who likes very black and white results. It seems to get things mostly right.
And to the question, can we use sentiment analysis to tell us whether Chromebooks are a good idea or not?
In the rarefied position of writing about technology, it appears that we bloggers and writers are more positive about Chromebooks than actual users because we're either a) thinking about them from our specialist perspective, or b) reviewing them.
But, with the context that I have from having fiddled with the tweets for hours getting the engine to work properly, most of those people tweeting tended to be students who have been given them to work on. So I, with a more intimate knowledge of the working set, have a better idea as to what the data means.
I'm not sure I'm any clearer as to whether Chromebooks are a good idea or not having done this. How about you?
What do you think? Post a comment, or talk to me on Twitter: @mbrit.