Israeli data discovery firm Octopai is today announcing a new platform, Data Lineage XD, to take on and automate the discovery and documentation of data transformations and flows within customer data estates. The "XD" stands for cross-dimensional, and that name is not mere marketing gimmickry. The Octopai system can map out what the company calls cross-system lineage, inner-system lineage, and end-to-end column lineage.
Axes of lineage
Executives from Octopai briefed ZDNet on Data Lineage XD and demoed facets of the platform. They explained that cross-system lineage maps the flow of data across systems, from initial ingest and extract, transform and load (ETL) through to reporting and analytics. It provides color-coded network diagrams that illustrate flows and dependencies, in the form of interactive visualizations. The figure below provides an example.
Inner-system lineage maps the transformation of data columns within an ETL process, report, or database object. To achieve this, Octopai has connectors not just to data sources and destinations but to individual ETL platforms, business intelligence (BI) back-ends and self-service BI visualization tools. Rather than treating everything as a database, Octopai's connectors have a contextual understanding of how to read the metadata and the code within SQL scripts and stored procedures. They can also read the "code" (detailed transformations and dependencies) within ETL and BI assets.
End-to-end column lineage traces individual data columns from one system to the next. It's especially germane to regulatory compliance, impact analysis and root cause analysis. Because the connectors Octopai offers are so component-specific, the column lineage Octopai can produce is very granular and detailed.
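To make the impact analysis and root cause analysis use cases concrete, here is a minimal Python sketch that treats column lineage as a directed graph and walks it in both directions. It is an illustration only, not Octopai's implementation or API: the system, table and column names, and the helper functions `build_graph` and `reachable`, are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical column-level lineage edges: (source column, target column).
# Every system, table, and column name here is invented for illustration.
EDGES = [
    ("crm.customers.email", "etl.stage_customers.email"),
    ("crm.customers.region", "etl.stage_customers.region"),
    ("etl.stage_customers.email", "dwh.dim_customer.email_hash"),
    ("etl.stage_customers.region", "dwh.dim_customer.region"),
    ("dwh.dim_customer.region", "bi.sales_report.region"),
    ("erp.orders.amount", "dwh.fact_sales.amount"),
    ("dwh.fact_sales.amount", "bi.sales_report.revenue"),
]


def build_graph(edges):
    """Index the edges in both directions for downstream and upstream walks."""
    downstream = defaultdict(set)
    upstream = defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return downstream, upstream


def reachable(start, neighbors):
    """Breadth-first search: every column reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


downstream, upstream = build_graph(EDGES)

# Impact analysis: what would be affected if this source column changed?
impact = reachable("crm.customers.region", downstream)

# Root cause analysis: which columns feed this report field?
root_causes = reachable("bi.sales_report.revenue", upstream)

print(sorted(impact))
print(sorted(root_causes))
```

In this toy data set, a change to `crm.customers.region` reaches the staging table, the warehouse dimension and the report, while the report's `revenue` field traces back to `dwh.fact_sales.amount` and `erp.orders.amount`. A real lineage tool would also record the transformation applied along each edge, which this sketch leaves out.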
Support is provided for enterprise and cloud data warehouse platforms like Teradata and Snowflake; ETL platforms like IBM DataStage and Informatica; enterprise BI platforms like IBM Cognos and SAP BusinessObjects; self-service BI tools like Tableau and Qlik; and five different Oracle products, from database and data warehouse to ETL and BI.
For adherents of the Microsoft stack, Octopai supports SQL Server (relational), SQL Server Analysis Services multidimensional and tabular modes (BI), SQL Server Integration Services (ETL), SQL Server Reporting Services and Power BI. On the Microsoft cloud side, Octopai supports Azure SQL Database, Azure Synapse Analytics (SQL pools), Azure Analysis Services and Azure Data Factory.
Applications of this technology fall into a few buckets. First off, the platform can help teams document what they've built and troubleshoot bugs in data pipelines more quickly. But the platform can also be leveraged by consulting firms, systems integrators and in-house development organizations that are charged with supporting, enhancing or migrating systems they themselves did not build. This ability to analyze, visualize and help teams learn how systems are built can also prove useful in merger and acquisition scenarios, where one IT org must take responsibility for the assets of another.
Data Lineage XD supports several more platforms than those listed above, with more slated for release in the future. And in addition to the core data lineage facilities, Data Lineage XD provides a business glossary that is fed by its data discovery capabilities and assisted by the platform's detailed understanding of specific systems.
Empathy begets utility
Octopai, the company, was founded by a group of BI pros who clearly were involved in project implementations and understand the pain points involved in building or taking responsibility for such systems. For organizations in the trenches with complex enterprise BI/analytics systems, and with a need to get them under control, the Octopai platform is worth a look.