Revolution Analytics' Chief Community Officer, David Smith, came by to talk about big data, data science, and the important role the human element plays in successful applications of big data. I thought that was a given, but learned that some organizations are so wrapped up in the tools they're using that they don't bring in the appropriate experts.
Many, Smith pointed out, focus on the tools (such as Hadoop, the R language, MongoDB, MariaDB, Splunk, and a few others) rather than really understanding what data is available, what data is needed, and what role expertise in computer science and applications, modeling, statistics, analytics, and math plays. Organizations can easily find themselves having done quite a bit of work to analyze machine and operational data and still not have a clear understanding of what's going on. Even worse, Smith said, they can believe that they understand what the data says when they have really only proved the old maxim "garbage in, garbage out."
Smith says organizations need to realize that data science is built upon, but is different from, business analytics. Business analysts would take the results provided by a single business system, such as a customer relationship management system, and make predictions about how a product will sell or what's important to customers. Data scientists would also collect data from other systems, such as retail point-of-sale data, weather data, or other information, and then ask many different questions.
Smith pointed out that staff who focus on a single source of data often come to incorrect conclusions. Subject matter experts in information processing, mathematics, business, manufacturing, or other areas relevant to the organization's line of business are needed to determine what questions to ask and where to find data that sheds light on those questions, and only then to conduct analysis to learn the story the numbers are telling.
Is this a painful study of the obvious? If your company is deploying big data solutions, does your team include a data scientist? Please comment and add your thoughts to the discussion.