The evolution of search over traditional BI

Today's business intelligence tools are great at tracking traditional numbers but fail to quench business's thirst for more information, says Sid Probstein of Attivio.
Written by Sid Probstein, Attivio, Contributor
Commentary--A recent Gartner report predicts that IT’s involvement in business intelligence (BI) will diminish over time as business users adopt new technologies to quench their thirst for information. There are two reasons for this shift: depth of insight and ease of use. The next generation of information access solutions will reach across the entire enterprise, bring together all of its information and make it universally accessible. As this transition begins, legacy solutions can retain value by pursuing an integration strategy: bringing their mature reporting, visualization and dashboard capabilities to bear on unstructured and semi-structured content.

The status quo
Today’s mainstay BI tools are extremely good at tracking raw transactional numbers like sales figures and profit margins. What they fail to adequately address are the root causes, or drivers, of trends in those numbers. Put simply, they can typically tell what happened – but not explain why (unless it is evident in some other numeric data), let alone alert the business as a change emerges. Modern business executives will gain an advantage by moving beyond simply knowing what happened to preemptively understanding the forces acting on their business. What is causing a delay of payment from our largest customer? Why are sales in the southwest region down? How is user sentiment impacting our newest product?

Answering these types of questions with the average BI tool is challenging: at best, it takes a great deal of time to gain even one additional level of insight. The cost of these investigations is often high. Large numbers of IT staff must collaborate to extract, transform and load the data into a warehouse, update data dictionaries and then reconfigure the layers of OLAP, summarization, reporting and dashboarding. Despite these efforts and a slew of recent corporate acquisitions, many questions remain beyond the reach of such systems.

To provide greater value, BI tools must evolve in two ways. First, they must enable users to answer deeper, sometimes “fuzzier” questions about the enterprise. Second, they must make it possible for general business users to easily obtain information.

Answering deeper business questions
Deeper questions require analysis that goes beyond the numbers, and in the enterprise that means drawing on more content. So the first challenge for BI is to gain access to more data – including unstructured content like emails, documents and PDFs – breaking down digital silos throughout the enterprise and integrating their content. This is difficult because mapping unstructured data into structured storage is laborious. The usual ETL tools do not apply; data warehouses and marts are designed to consume pre-integrated, de-normalized, structured data. Overcoming these challenges requires integration with modern information access systems that can infer structure and relationships in unstructured content and use them to link it back, at query time, to structured data.

For example: an analyst staring at payment histories may never discover the reason behind an obvious anomaly, because the analyst can’t seamlessly navigate to the relevant contracts and see variations in contract terms highlighted in such a way that the root cause becomes obvious. (More on that later…)

Capabilities like entity extraction and fuzzy search, long staples of unstructured search engines, have put these types of discovery activities entirely within reach.

One of the most valuable repositories in the typical enterprise is electronic mail. How much of the knowledge in your company is contained in these files? Email is semi-structured, so it offers a variety of unique insights. Your CRM system can tell you that your sales team in the southwest region isn’t on track to achieve their goal this quarter, but can it tell you why? A combination of entity extraction, automated sentiment analysis and social network analysis might just turn up the problem account, internal resource or impossible customer requirement. The structured data leads the way, identifying the transactional problem, complemented by the unstructured data, which fills in the underlying cause.
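To make the idea concrete, here is a minimal Python sketch of structured data leading the way and unstructured data filling in the cause. Everything in it is an assumption for illustration: the account names, the CRM figures, the emails and the word lists are invented, the "entity extraction" is a simple dictionary lookup, and the "sentiment analysis" is a toy word count rather than anything a real product would ship.

```python
from collections import defaultdict

# Hypothetical structured data: CRM bookings per account (all values invented).
CRM = [
    {"account": "Acme Corp", "region": "southwest", "quota": 100, "booked": 40},
    {"account": "Globex", "region": "southwest", "quota": 80, "booked": 78},
    {"account": "Initech", "region": "northeast", "quota": 90, "booked": 95},
]

# Hypothetical semi-structured data: emails with a sender field and free text.
EMAILS = [
    {"sender": "rep1@example.com",
     "body": "Acme Corp is unhappy: the delay on custom reporting is a blocker."},
    {"sender": "rep2@example.com",
     "body": "Globex renewal looks great, they are happy with the rollout."},
    {"sender": "rep1@example.com",
     "body": "Another complaint from Acme Corp about the impossible deadline."},
]

# Toy sentiment lexicon (an assumption, not a real sentiment model).
NEGATIVE = {"unhappy", "delay", "blocker", "complaint", "impossible"}
POSITIVE = {"great", "happy"}


def extract_accounts(text, known_accounts):
    """Dictionary-based entity extraction: known account names found in text."""
    lowered = text.lower()
    return [name for name in known_accounts if name.lower() in lowered]


def sentiment(text):
    """Toy score: distinct positive words minus distinct negative words."""
    words = {w.strip(".,:;!?").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)


def explain_shortfall(region):
    """List accounts behind quota in a region, most negative email tone first."""
    known = [row["account"] for row in CRM]
    scores = defaultdict(int)
    for mail in EMAILS:
        for account in extract_accounts(mail["body"], known):
            scores[account] += sentiment(mail["body"])
    # The structured data identifies the transactional problem...
    behind = [r for r in CRM if r["region"] == region and r["booked"] < r["quota"]]
    # ...and the unstructured data ranks the likely causes.
    rows = [(r["account"], r["quota"] - r["booked"], scores[r["account"]]) for r in behind]
    return sorted(rows, key=lambda row: row[2])


for account, gap, tone in explain_shortfall("southwest"):
    print(account, "gap:", gap, "email tone:", tone)
```

The point of the sketch is the join, not the scoring: the account name extracted from free text is what links each email back to a CRM row at query time.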

In e-Commerce, social networks, and anywhere user content is generated, breaking the structured/unstructured silo can create real competitive advantage. Correlating what users say (in text) with what they do (in transactions) provides deep insight that can improve average order value and average revenue per user (ARPU), and dramatically reduce customer defections, commonly referred to as churn.

Providing widespread access to business intelligence
Until recently, most users had to query systems using one clearly defined question, plus an understanding of how the data model in use could possibly answer it. In today’s ultra-competitive business climate, where innovation is critical, the question is not always clear. Flexible query capabilities allow users to ask open-ended questions without understanding the data model, and then drill into a mass of results using dimensions recommended by the system. Instead of trying to hunt down disparate pieces of information, or relying on highly trained IT experts, the average business manager or analyst is presented with potential answers through exploration- and discovery-oriented interfaces that help drill into results by focusing on a particular facet of the data.

This same approach can be used in the enterprise to help navigate through highly disparate sets of data. Going back to the previous example of anomalous payment histories – let’s imagine that an analyst has been assigned to investigate. In the legacy BI world, they observe the anomaly and come up with theories as to what might drive it. They call the sales and operations people who are involved with the account. They may review the contracts. They may even see the root cause, but it will take time, and support from other staff will be required.

In the new world of flexible querying and integrated data silos, the analyst can put the name of the account into the search box; this brings back a huge amount of data, but there is a facet on “type”. The analyst clicks on “payment histories”; this surfaces the anomaly. Then they click on “contracts”. The contracts are displayed in a timeline with differences highlighted. The analyst adds the payment histories to the timeline, and remarkably, the anomalies surface after new contracts. By examining the differences, the analyst realizes that it is a change in payment terms that is causing the different payment timings. And they have done this very rapidly, without understanding the data model or engaging costly support staff.
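The walk-through above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular product: the in-memory `INDEX`, its field names, the dates and the helper functions (`search`, `facet_counts`, `drill`, `timeline`, `terms_in_force`) are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical unified index: contracts, payments and emails side by side.
INDEX = [
    {"type": "contract", "account": "Acme Corp", "date": date(2008, 1, 1),
     "payment_terms_days": 30},
    {"type": "payment", "account": "Acme Corp", "date": date(2008, 2, 1),
     "days_to_pay": 29},
    {"type": "payment", "account": "Acme Corp", "date": date(2008, 3, 1),
     "days_to_pay": 31},
    {"type": "email", "account": "Acme Corp", "date": date(2008, 3, 15),
     "body": "Renewal call notes"},
    {"type": "contract", "account": "Acme Corp", "date": date(2008, 4, 1),
     "payment_terms_days": 60},
    {"type": "payment", "account": "Acme Corp", "date": date(2008, 5, 1),
     "days_to_pay": 58},
    {"type": "payment", "account": "Globex", "date": date(2008, 2, 1),
     "days_to_pay": 30},
]


def search(query):
    """Search-box step: case-insensitive match on the account name."""
    q = query.lower()
    return [r for r in INDEX if q in r["account"].lower()]


def facet_counts(results, field="type"):
    """Count results per facet value so the interface can suggest a drill-down."""
    return Counter(r[field] for r in results)


def drill(results, **facets):
    """Keep only the results that match every selected facet value."""
    return [r for r in results if all(r.get(k) == v for k, v in facets.items())]


def timeline(*result_sets):
    """Merge several result sets into one list ordered by date."""
    merged = [r for rs in result_sets for r in rs]
    return sorted(merged, key=lambda r: r["date"])


def terms_in_force(events):
    """Pair each payment with the contract terms in force when it was made."""
    current_terms = None
    rows = []
    for event in events:
        if event["type"] == "contract":
            current_terms = event["payment_terms_days"]
        elif event["type"] == "payment":
            rows.append((event["date"], event["days_to_pay"], current_terms))
    return rows


hits = search("acme")                      # account name in the search box
print(facet_counts(hits))                  # the facet on "type"
payments = drill(hits, type="payment")     # click "payment histories"
contracts = drill(hits, type="contract")   # click "contracts"
for row in terms_in_force(timeline(contracts, payments)):
    print(row)                             # payment timing next to contract terms
```

Placing each payment next to the terms in force at the time is what makes the change in payment timing line up with the new contract, which is the discovery the analyst makes in the scenario.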

Of course, the average unstructured keyword search engine is not yet capable of this sort of interaction – but more modern technologies are emerging that can automatically identify patterns and correlate disparate sets of data.

Conclusion
Today, BI is accessible to only a few, and it takes a legion of analysts and IT resources to make it work. This is the “BI pyramid”. The next generation of information access technologies will bring together structured and unstructured data and make querying with the precision of SQL and the “fuzziness” of search a reality. Their ability to help the user analyze and navigate disparate results will make information more accessible, inverting the pyramid to bring BI to the masses and creating a legion of knowledge workers who consume information as they need it, supported by a few IT staff and commodity IT resources.

Biography
Sid Probstein is CTO of Attivio.
