In the spirit of the last couple of years, we review developments in what we have identified as the key technology drivers for the 2020s in the world of databases, data management and AI. We are looking back at 2021, trying to identify patterns that will shape 2022.
In principle, we try to approach AI holistically: to take into account positives and negatives, from the shiny to the mundane, and from hardware to software. Hardware has been an ongoing story within the broader story of AI for the last few years, and we feel it's a good place to start our tour.
For the last couple of years, we have been keeping an eye on the growing list of "AI chips" vendors, i.e. companies that have set out to develop new hardware architectures from the ground up, aimed specifically at AI workloads. All of them are looking to get a piece of a seemingly ever-growing pie: as AI keeps expanding, said workloads keep growing, and servicing them as fast and as economically as possible is an obvious goal.
And what about the upstarts? SambaNova claims to now be "the world's best-funded AI startup" after a whopping $676M in Series D funding, surpassing $5B in valuation. SambaNova's philosophy is to offer "AI as a service", now including GPT language models, and it looks like 2021 was by and large a go-to-market year for them.
According to the Linux Foundation's State of the Edge report, digital health care, manufacturing, and retail businesses are particularly likely to expand their use of edge computing by 2028. No wonder that AI hardware, frameworks and applications aimed at the edge are proliferating too.
The other thing that's likely to continue to grow, both in sheer size and in number, is large language models (LLMs). Some people think that LLMs can internalize basic forms of language, whether it's biology, chemistry, or human language, and we're about to see unusual applications of LLMs grow. Others, not so much. Either way, LLMs are proliferating.
Recently, EleutherAI, a collective of independent AI researchers, open-sourced their 6-billion-parameter GPT-J model. In addition, if you are interested in languages beyond English, we now have a large European language model by Aleph Alpha that is fluent in English, German, French, Spanish, and Italian. Wu Dao is a Chinese LLM which is also the largest LLM, with 1.75 trillion parameters, and HyperCLOVA is a Korean LLM with 204 billion parameters. Plus, there are always other, slightly older or smaller open source LLMs such as GPT-2 or BERT and its many variations.
Beyond LLMs, both DeepMind and Google have hinted at revolutionary architectures for AI models, with Perceiver and Pathways, respectively. Pathways has been criticized for being rather vague. However, we would venture to speculate that it could be based on Perceiver. But since we're in future tech territory, it would be an omission not to mention DeepMind's Neural Algorithmic Reasoning, a research direction promising to marry classic computer science algorithms with deep learning.
On the other hand, that's not necessarily a bad thing, for two reasons. First, there is a major uptake of the technology in the mainstream. By 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making, Gartner predicts. Reporting on use cases from the likes of BMW, IKEA, Siemens Energy, Wells Fargo, and UBS is no longer news, and that's a good thing. Yes, there are challenges associated with building and maintaining knowledge graphs, but these challenges are, for the most part, well-understood.
As we have noted, knowledge graphs are practically a 20-year-old technology whose time in the limelight seems to have come. The ways to build knowledge graphs are well known, as are the challenges that lie therein. It's no coincidence that some of the most in-demand skills and areas for development in knowledge graphs are around using Natural Language Processing and visual interfaces to build and maintain knowledge graphs, as well as ways to expand from single-user to multi-user scenarios.
And to tie this conversation to the broader picture of AI where it belongs, common challenges seem to be around operationalization and building the right expertise in teams, as those skills are in very high demand. Another important touchpoint is the hybrid AI direction, which is about infusing knowledge into machine learning. Leaders such as Intel's Gadi Singer, LinkedIn's Mike Dillinger and Hybrid Intelligence Centre's Frank van Harmelen all point towards the importance of knowledge organization in the form of knowledge graphs for the future of AI.
There is also another important touchpoint between the broader picture in AI and knowledge graphs: data meshes and data fabrics. You'd be excused for mixing up those two, along with the plethora of data-related terms flying around these days. Simplistically, let's just say that a data fabric is meant to serve as the technical substrate for the data mesh notion of decentralized data management in organizations. That is actually a very good match for knowledge graph technology, and a few vendors in that space have identified that and positioned themselves accordingly. Even Informatica seems to have noticed.
We don't expect the world's honeymoon with graphs and graph databases to last forever, and after the hype, disillusionment will inevitably follow at some point. But we are confident that this technology is foundational and will find its place despite hiccups.