30 big data project takeaways

Summary: Here's a look at the big data lessons learned in the field from a bevy of technology execs.

TOPICS: Big Data, TechLines

Technology executives are hopping on the big data bandwagon at a rapid clip as they run Hadoop pilots, eye internal information streams and struggle to find talent.

Here's a look at 30 big data takeaways over the last two weeks via a conference at Temple University as well as ZDNet's TechLines roundtable discussion last week.

 

  1. Where do you start a big data project? Skunk works projects were a popular route, and those groups later grew to dozens of employees and petabytes of data. Other options included starting with an underserved business unit. Some companies had business leaders as sponsors.
  2. Leaders will have to take a few chances on big data projects. Translation: Trust your people, spend some money and take the leap.
  3. Use cases for big data abound. Among the possibilities:
    • Network optimization. 
    • Fraud detection.
    • Seeing what the customer experiences. 
    • Healthcare simulations.
    • Consumer-focused marketing efforts, which require more social networking analysis and predictive capabilities. Consumer data is inherently unstructured.
    • Travel and expense management to make intelligent decisions about costs. For instance, a company could notice it is sending too many people to one conference with aggregated data across 200,000 employees.
    • Marketing support and tracking of attrition rates in a subscriber-based business.
    • Closer ties between partners and suppliers via collaborative data and insight sharing. 
    • Christine Twiford, Manager, Network Technology Solutions at T-Mobile, said analytics gave the wireless provider confidence that it could offer an unlimited data plan without crushing the network.
  4. Analytics and business intelligence are bridging into big data applications. Historical data from years back has been usable, said Michael Cavaretta, Technical Leader, Predictive Analytics & Data Mining at Ford. In the future, Cavaretta said Ford will focus on data from the vehicle, but the real win may be the stream of information through the manufacturing process.
  5. The big data Petri dish will be the healthcare industry. "There's a lot of incentive out there to use big data to improve healthcare," said Katrina Montinola, Vice President of Engineering at Archimedes.
  6. Facebook is another big data Petri dish. Facebook could use big data techniques to make more money---while treading carefully on privacy. Facebook is also a huge data set by definition: one billion users are sharing gobs of data. Facebook data could "provide an X-ray view" of what's going on in a customer's head. Companies could optimize that data to improve experience. Montinola said that Facebook would provide an ideal population for clinical trials. Skytland said Facebook could be "an amazing platform for collective action."
  7. "Big data is the oil of the information age," said Nicholas Skytland, Program Manager, Open Government Initiative.
  8. Shared analytics services are commonly used as a way to harness big data and blend in predictive techniques.
  9. Storage will be an ongoing big data issue because data scientists are pack rats---even hoarders---but there's a budget limit. T-Mobile can only keep 10 days of its clickstream data, said Twiford, who noted the company is trying to process more information in flight. Storage limitations will result in sampling.
  10. As for data sampling, data scientists will ultimately make the call on what information is hoarded and what's sampled.
  11. Data scientists will be in high demand and serve as investigators who test hypotheses. Data scientists will be paired with business domain experts. What's unclear is how many of these data wonks you need. In many respects, we'll all be data scientists to some degree---or at least data literate. Twiford said there's a talent challenge. There's also a challenge in recruiting big data talent, and companies should look beyond Silicon Valley.
  12. Big data talent is tough to find. One company appointed internal people with business knowledge and supplemented them with a partner that had statistics and analytics wonks (consultants) available. The long-term talent strategy for this company is to recruit heavily from universities to build an analytic employee pool. Talent has to be able to use data.
  13. Visualization tools and crowdsourcing may alleviate the big data talent crunch, said Skytland. Perhaps "citizen scientists" will bridge the gap, said Skytland. Visualization tools can bring big data to the masses.
  14. Universities and retraining will also bridge the big data talent gap.
  15. Too much time is being spent preparing big data and not enough actually analyzing it. Discovery and decision-making are being short-changed for preparation. Data preparation should be automated.
  16. When pitching big data to business leaders you need to start with this question: What business questions need to be answered?
  17. Most corporate big data projects are in their infancy. As a result, many are looking to combine data warehouse information with other data to be prescriptive. One company was looking to build a data warehouse on steroids.
  18. Partner with companies that can provide visualization tools via APIs. Of course, you have to liberate your data and open it up first, said Skytland.
  19. NASA is planning missions that will collect 24 terabytes of data a day. "We want to make sense of that data and actually navigate it," said Skytland.
  20. There are thousands of silos in corporate America and sharing data is one of the biggest challenges. Big data could be a way to bridge those corporate silos.
  21. Big data applications are rolling out first for business-to-consumer questions because they tie together experience, sales and analytics. Social media and multiple channels also mean that companies need to look for patterns in streaming data, said James Kobielus, IBM's big data evangelist.
  22. Hadoop clusters are surfacing everywhere in corporate America. If 2012 was the year of enterprise Hadoop pilots, 2013 will see a ramp in usage.
  23. NASA initially created its own big data systems, but is using more commercial applications, including Amazon Web Services and cloud infrastructure.
  24. Big data isn't new, but now has reached critical mass as people digitize their lives. "People are walking sensors," said Skytland.
  25. Social media is hyped in big data applications, but the diary of consumers' lives is great market intelligence. Chief marketing officers are pushing social media and big data projects. Cavaretta said Ford is using social data because it goes beyond what consumers provide in surveys and "represents what they are thinking."
  26. IT practitioners said that they wanted the largest data sets possible. The idea is that companies wouldn't have to rely on samples. However, there's a business challenge in determining what information is worth keeping and what should head to the archive or be tossed.
  27. Making archived data usable for big data projects is going to be a running challenge.
  28. Governments' ability to provide datasets can create entire industries. Under this theory, governments will essentially act as data providers as one of their primary functions.
  29. Twiford said that T-Mobile is using big data techniques to learn more about the preferences of no-contract customers, who don't offer as much profile information as contract ones.
  30. Data analytics as a service and data visualization as a service will become commonplace. Third party vendors will move toward big data as a service to make it consumable for the masses. The tech vendors likely to go this route are the big market share leaders today (IBM, SAP, Oracle, Salesforce.com).


Talkback

2 comments
  • Re:

    Larry, very informative article. We are seeing an increase in businesses seeking specialized skills to help address challenges that arose with the era of big data. The HPCC Systems platform from LexisNexis helps to fill this gap by allowing data analysts themselves to own the complete data lifecycle. Designed by data scientists, ECL is a declarative programming language used to express data algorithms across the entire HPCC platform. Their built-in analytics libraries for Machine Learning and BI integration provide a complete integrated solution from data ingestion and data processing to data delivery. More at http://hpccsystems.com
    H-M
  • Intent Data

    Great overview Larry and some really poignant takeaways there. 2 great points for me in there, hopefully covered by us on our site in more detail:

    15. Too much time is being spent preparing big data and not enough actually analyzing it. Discovery and decision-making is being short-changed for preparation. Data preparation should be automated.

    http://intentdata.com/the-lifeblood-of-your-marketing-engine/

    24. Big data isn't new, but now has reached critical mass as people digitize their lives. "People are walking sensors," said Skytland.

    http://intentdata.com/what-is-your-personal-data-worth/
    IntentData