John Montgomery, Microsoft corporate vice president of AI Platform, told ZDNET, "Increasingly, companies are looking for tools to customize their AI builds and fit their organization's needs and budget. Our goal is to make it easy for any organization to innovate with AI for things like building chatbots using organizational data and transforming search and document intelligence to get valuable insights and make better business decisions."
Let's dive in.
Azure AI Vector Search preview
As we've come to know the capabilities of large language models (LLMs) like ChatGPT, one of the most amazing (and controversial) features is their ability to find and describe data in response to a query or prompt. Analyzing enormous datasets requires very powerful search capabilities.
Most folks are familiar with the traditional keyword search, where a word or phrase is used as a lookup key for resulting data. But AI engineers have been working with a different kind of search, called vector search.
Vector search not only handles text, but can also search other media types, such as images, video, and audio. This chart gives a quick overview of how keyword search differs from vector search:
- Representation: Vector search creates numerical representations of data and searches based on similarity; keyword search matches text strings word for word.
- Data types: Vector search can handle text, images, audio, and video; keyword search works on text.
- Matching: Vector search compares the vector representation of the query and the content; keyword search compares the query string to the content.
- Understanding: Vector search can capture the "meaning" of content, providing a level of context-based understanding; keyword search matches on strings, so a search on "chair" means roughly the same thing to the engine whether you're looking for a desk chair or a committee chair.
- Models: Vector search can use both pre-trained and custom models, depending on project requirements; keyword search relies on word/phrase matches, which may not be as flexible.
Rather than using keywords, vector search creates a numerical representation of the data (called vector embeddings).
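The similarity idea behind vector search can be sketched with toy vectors. The four-dimensional "embeddings" below are made up for illustration (real embedding models produce hundreds or thousands of dimensions, and Azure's service computes this at scale), but the cosine-similarity comparison works the same way:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes; 1.0 means
    # the two vectors point in exactly the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" -- purely illustrative values.
embeddings = {
    "desk chair":      [0.9, 0.1, 0.0, 0.2],
    "office seating":  [0.8, 0.2, 0.1, 0.3],
    "committee chair": [0.1, 0.9, 0.3, 0.0],
}

query = embeddings["desk chair"]
scores = {text: cosine_similarity(query, vec)
          for text, vec in embeddings.items() if text != "desk chair"}
best = max(scores, key=scores.get)
print(best)  # "office seating" scores higher than "committee chair"
```

Because the comparison happens in vector space rather than on strings, "desk chair" lands closer to "office seating" than to "committee chair", which is exactly the context sensitivity keyword search lacks.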
Vector search is part of Azure's Cognitive Search offering, which has a number of search mechanisms. Search applications can call on a hybrid set of these search options to perform large data searches, providing a deep bench of search capabilities.
While LLMs offer tremendous knowledge analysis (and a little randomly generated hallucinatory filler), they can't work with your organization's private information out of the box.
This lack of capability is driven by three key factors:
The data in public LLMs (ChatGPT in particular) has temporal bounds. In other words, ChatGPT doesn't know about anything that happened after its 2021 training cutoff.
Public LLMs definitely don't know about your corporate information, whether it's the contents of user manuals and repair manuals, or financial and strategy planning documents.
It's fortunate that there's no way to insert deep corporate information into a public LLM's dataset, because you definitely don't want all that proprietary information out there in public.
But there's enormous potential in combining the capabilities of generative AI chatbots with deep corporate data and documents. Imagine interacting with your proprietary user manuals, or being able to ask questions based on meeting notes going back years. The discovery potential, as well as the cross-departmental knowledge integration potential, is enormous.
That's where Document Generative AI, which is a collaboration between Azure AI Document Intelligence (formerly Azure Form Recognizer) and Azure OpenAI Service, comes in. This capability, also just announced at Inspire 2023, provides support for multiple document types, OCR (with AI-driven error correction), and information extraction.
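The general pattern here, often called retrieval-augmented generation, is to pull the most relevant chunks of your own documents and hand them to the model as context. The sketch below is a deliberately minimal illustration of that idea: the word-overlap scorer is a stand-in (a real pipeline would use document intelligence for extraction and an embedding model for similarity), and the sample chunks are invented:

```python
def tokenize(text):
    # Lowercase and strip punctuation so "router?" matches "router,".
    return set("".join(c if c.isalnum() else " " for c in text.lower()).split())

def score(query, passage):
    # Stand-in similarity: fraction of query words found in the passage.
    # A real pipeline would compare embedding vectors instead.
    q_words = tokenize(query)
    return len(q_words & tokenize(passage)) / len(q_words)

def retrieve(query, passages, top_k=2):
    # Rank document chunks by similarity to the query, keep the best.
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:top_k]

# Hypothetical chunks extracted from corporate documents.
manual_chunks = [
    "To reset the router, hold the recessed button for ten seconds.",
    "The warranty covers manufacturing defects for two years.",
    "Quarterly planning notes: budget review scheduled for March.",
]

query = "How do I reset the router?"
context = retrieve(query, manual_chunks)
prompt = ("Answer using only this context:\n" + "\n".join(context)
          + "\nQ: " + query)
print(prompt)
```

The assembled prompt, context plus question, is what ultimately gets sent to the generative model, which keeps the answers grounded in your documents rather than in the model's public training data.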
On top of that, the offering inherits the security Microsoft builds into its Azure cloud services, so you can upload your proprietary data and be assured that it will remain locked down and under your control.
Beyond that, the real key to the Azure-based solution is scalability. Individual documents can range from small to enormous, and the document library itself can be enterprise-scale as well. Microsoft today announced that you can try out the service via this GitHub repo.
Whisper Model preview
The third Azure AI announcement that caught our attention is the service's new Whisper Model, which handles audio transcription at enterprise scale. This, too, is a collaboration with OpenAI, with OpenAI's model running on top of Azure.
The Whisper Model can understand 57 languages. It supports what Microsoft calls "enhanced readability," which essentially means the AI can understand what's being said in context and can create transcriptions with more colloquial wording, as appropriate.
A big feature here is scalability. Whisper Model on Azure can scale to transcribe hundreds or even thousands of recordings. It can, for example, transcribe thousands of customer service conversations and index them using some of the tools discussed above.
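Scaling transcription across thousands of recordings is essentially a fan-out job. The sketch below shows that batching shape only; `transcribe()` is a placeholder that returns a dummy transcript (a real deployment would call the Whisper endpoint through the Azure OpenAI Service there), and the file names are invented:

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe(audio_file):
    # Placeholder for a real transcription call to the Whisper model.
    # Returning a dummy string keeps the batching logic runnable here.
    return f"[transcript of {audio_file}]"

def transcribe_batch(audio_files, max_workers=8):
    # Fan the files out across worker threads; pool.map preserves
    # input order, so transcripts line up with their source files.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transcribe, audio_files))

# Hypothetical backlog of customer service call recordings.
calls = [f"support_call_{n:04d}.wav" for n in range(1000)]
transcripts = transcribe_batch(calls)
print(len(transcripts))  # 1000
```

Thread-based fan-out suits this workload because each transcription request spends most of its time waiting on the service, not on local CPU.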
The service also benefits from customization and Azure integration. As we've seen, Azure AI has a wide range of offerings, and the new Whisper Model plugs into them, allowing users to build highly customized applications.

The real differentiating factor with the Azure/OpenAI offering is how tightly it integrates with other Azure solutions, including Azure's security capabilities, making it a natural component to combine with the other parts of a custom application.
Whisper Model was announced at Inspire 2023. Once it is available in preview, users will be able to apply for access to the Azure OpenAI Service, which will then open the door for Whisper testing.
More AI from Inspire 2023
Microsoft's Montgomery said, "From our supercomputing infrastructure and trusted cloud capabilities to cutting-edge AI models and robust responsible AI tooling, we're focused on continuing to make Azure the best place for training, deploying and scaling AI models -- both frontier and open."
In this article, we spotlighted three AI technologies announced at Inspire 2023. Microsoft has clearly put a lot of time and investment into providing AI tools for its Azure clients. Keep reading ZDNET for our ongoing coverage of Inspire 2023, as well as other AI initiatives Microsoft is announcing.