In this article, we're going to take a quick look at a ChatGPT-style AI chatbot from a company improbably called Hugging Face. I couldn't look at the product seriously until I got past the name, so I asked its chatbot HuggingChat, "Why is your company called Hugging Face?"
The name "Hugging Face" comes from our goal to build AI models that can interact meaningfully and positively with humans on a personal level, similar to how one might hug someone they care about.
As you can see from the HuggingChat interface, the company is definitely going for that ChatGPT vibe:
But there are some big differences, the most notable being that HuggingChat is open source. The code is right here on GitHub.
While HuggingChat will appeal mostly to developers and AI tinkerers, anyone can play with the tool online. One thing I should note: almost all discussion of the online HuggingChat interface says it collects no information and does not require a login. That may have been true of the very earliest release, but it's no longer the case:
Using the web version of HuggingChat
I had to set up an account with Hugging Face to gain access to the UI. And since I did not disable sending chat data to the developers for them to review and examine, I'd like to apologize to those developers now for the profanity. I just wanted to see what it would do. What!? I was curious.
One more clarification is necessary. Throughout this article, when I'm talking about the web version of HuggingChat, I'm referring to the version that Hugging Face is providing at hf.co/chat. There are likely hundreds of other web-based versions of HuggingChat, by hundreds of other developers. That's the nature of open source. But for the next few moments, we're talking about the factory store version -- the one Hugging Face is making available to show off the tech.
The web version of HuggingChat uses a large language model (LLM) called OpenAssistant, which is also available on GitHub. As it turns out, there are a whole bunch of LLMs available on GitHub -- and yeah, we'll come back to that thought in a while, too.
As with ChatGPT and its ilk, you ask HuggingChat questions, and it answers. Unlike ChatGPT, there's a little button right above the query field that says "Search web", which means HuggingChat can reach out to the live web to do its research. For example, with "Search web" turned off, when I asked it, "Why did the submersible Titan sink?" I got back the reply:
I apologize, but without additional context, it's difficult for me to determine which specific submersible Titan you're referring to.
But when I turned on "Search web" and asked the same question, I got back:
Without more information, it's hard for me to say definitively why it supposedly imploded and submerged during operations related to the RMS Titanic wreckage recovery effort off North America's east coast (42°N 49°W / #TitanRover1).
The sinking of the submersible was a very recent news story as I wrote this article, so this response shows that the pre-built LLM has no data on the event. By turning on web search, however, you can feed current information into the chatbot's results.
Overall, you're probably not going to use HuggingChat instead of ChatGPT or Bard or Bing. Answers aren't all that complete. They sometimes end in the middle of a sentence, and "continue" doesn't bring them back on track. Most answers aren't formatted, and are presented as one big paragraph. Responsiveness is fairly slow in comparison to ChatGPT. Sometimes, the answers don't have all that much to do with the question.
To be fair, the web version of HuggingChat is a very early release and is mostly intended as a feature demo for the software itself. So, you're probably not going to use it to cheat on your homework assignments. Instead, you're more likely to use it to tinker with your own AI projects.
And that's what we turn to next.
Building your own chatbot
If you're super-geeky, you can build your own chatbot using HuggingChat and a few other tools. To be clear, HuggingChat itself is simply the user interface portion of an overall chatbot. It doesn't contain the AI or the data. It just passes your question to the underlying language model and displays the answers.
You'll also need a language model. As discussed above, OpenAssistant is one such model, and it can be trained on a variety of datasets.
This is where HuggingChat proves to be particularly interesting for AI geeks. You can use a variety of open-source LLMs, trained on a variety of open-source datasets, to power your own chatbot.
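To make that idea concrete, here's a minimal Python sketch of what the language-model side might look like, using Hugging Face's transformers library. The specific model name and the prompt template are assumptions based on the OpenAssistant checkpoints published on the Hugging Face Hub -- not anything HuggingChat itself prescribes -- so treat this as a starting point, not a recipe:

```python
# Sketch of loading an open LLM locally with the transformers library.
# The model name is one of the OpenAssistant checkpoints on the
# Hugging Face Hub; any compatible causal language model would work.

def format_prompt(question):
    # OpenAssistant models are trained on prompter/assistant markers,
    # so questions need to be wrapped in this template.
    return f"<|prompter|>{question}<|endoftext|><|assistant|>"

def make_chatbot(model_name="OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"):
    # Imported here so the prompt helper above works without the library.
    from transformers import pipeline
    # "text-generation" is the standard pipeline task for causal LMs.
    return pipeline("text-generation", model=model_name)

def ask(chatbot, question, max_new_tokens=200):
    result = chatbot(format_prompt(question), max_new_tokens=max_new_tokens)
    return result[0]["generated_text"]

if __name__ == "__main__":
    bot = make_chatbot()  # downloads the model weights on first run
    print(ask(bot, "What is an open-source language model?"))
```

Swapping in a different open-source model is as simple as changing the model name -- which is exactly the flexibility that makes this interesting for AI geeks.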
Taken to extremes, one possible application for this technology could be building a chatbot for use inside a corporation, trained on the company's proprietary data, all of which is hosted within the enterprise firewall -- never to be available to the internet at all.
In fact, by combining HuggingChat, an inference server, and an LLM, you can run your own chatbot on your own hardware, completely isolated from the internet.
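As a rough illustration of the self-hosted setup, here's how a front end might query a local inference server. The URL, port, and JSON field names below are assumptions modeled on Hugging Face's text-generation-inference server's /generate endpoint, so check your own server's documentation before relying on them:

```python
# Sketch: querying a locally hosted text-generation-inference server.
# Assumes the server is already running on localhost:8080; the request
# shape follows text-generation-inference's /generate JSON API.
import json
from urllib import request

SERVER_URL = "http://localhost:8080/generate"  # assumed local endpoint

def build_payload(question, max_new_tokens=200):
    # The /generate endpoint takes "inputs" plus a "parameters" object.
    return {
        "inputs": f"<|prompter|>{question}<|endoftext|><|assistant|>",
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def ask_local(question):
    data = json.dumps(build_payload(question)).encode("utf-8")
    req = request.Request(
        SERVER_URL, data=data,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]

if __name__ == "__main__":
    # Requires a running inference server; nothing leaves your network.
    print(ask_local("Summarize our internal travel policy."))
```

Because the request never leaves your own network, the proprietary data stays behind the firewall from end to end.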
This YouTube video gives a basic tutorial on how to do that in a container. The video also shows how to split the inference portion of the job from the UI portion, hosting one locally and the other in the cloud. Basically, once you have the source information to play with, you can build pretty much whatever you can imagine -- within the limits of what your gear can run.
Yeah, it does require a hefty server with some serious GPU horsepower. But that's a small price to pay for your very own chatbot.
Seriously, though, this development is a big thing. ChatGPT is a powerful tool, trained on who knows what data, running who knows what algorithms, and producing answers based on who knows what source information.
If you want something that you control, you can use HuggingChat to build a chatbot where you have visibility into every aspect of its functioning. You can make that chatbot available online and provide full transparency to its users. Or you can build a locked-down, special-purpose chatbot hidden behind a firewall and accessible only to your own employees.
With HuggingChat, the choice is up to you -- and that's cool.
So, do you expect to build your own chatbot? If you could, would you train it on any specific data? Since you now have the freedom to make the chatbot in your own image, what would you do with that power? Let us know in the comments below.