With communal lunches catered by an on-site chef, bright and open workspaces, and an impressive supply of chilled La Croix, the Pittsburgh headquarters of Duolingo looks more Silicon Valley than Southwest Pennsylvania. But for the AI-powered language learning platform, founded by renowned Carnegie Mellon University computer scientist and CAPTCHA creator Luis von Ahn, the Google-esque vibe mirrors its progressive outlook on technology and industry transformation.
As one of the up-and-coming stars of Pittsburgh's burgeoning technology scene, Duolingo has grown into the largest language learning platform and one of the most downloaded education apps worldwide since its launch in 2011. Behind its success is the usability of its core app, which breaks down language education into bite-sized, interactive lessons, as well as its embrace of machine learning and AI, which drives everything from the company's product development and user retention efforts to its recruitment priorities and revenue strategies.
"AI plays a big role in our ability to scale high-quality education to everyone who needs it," said von Ahn, Duolingo's chief executive. "We started this company with a mission to make education free and accessible to everyone in the world. At the time, there was no good way to learn a language for free."
Duolingo's AI journey ramped up in 2012 with the hire of Burr Settles, currently the company's research director, who used his background in machine learning and computational linguistics to figure out which data models worked best in language learning. He eventually spearheaded the company's use of the spaced repetition model, which in the context of language education posits that people learn better through short study periods spread out over time than through cramming.
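To make the idea concrete, here is a minimal sketch of a spaced repetition scheduler in Python. It is not Duolingo's model: the exponential forgetting curve, the `half_life_days` parameter, and the double-on-success, halve-on-miss update rule are all illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class WordMemory:
    """Hypothetical per-word state; half_life_days is an illustrative parameter."""
    half_life_days: float = 1.0


def recall_probability(memory: WordMemory, days_since_practice: float) -> float:
    """Assumed forgetting curve: predicted recall halves every half_life_days."""
    return 2.0 ** (-days_since_practice / memory.half_life_days)


def update(memory: WordMemory, correct: bool) -> WordMemory:
    """Assumed rule: double the half-life after a correct answer, halve it after a miss."""
    factor = 2.0 if correct else 0.5
    # Floor the half-life at a quarter of a day so it never collapses to zero.
    return WordMemory(half_life_days=max(0.25, memory.half_life_days * factor))


def days_until_review(memory: WordMemory, target_recall: float = 0.5) -> float:
    """Days until predicted recall decays to target_recall."""
    return -memory.half_life_days * math.log2(target_recall)
```

Under these assumptions, each successful review pushes the next one further out, so practice sessions spread over time instead of bunching together.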
That schedule of how often one should practice a language was a core part of the launch of Duolingo and its game-like curriculum, Settles said. Another early AI project dealt with course level placement, figuring out where to put new language learners so that they felt challenged enough to keep coming back.
"When you first start a Duolingo course you have to take a placement test," Settles said. "So at first we lost a lot of intermediate and advanced learners because it was too simple for them. With AI we created a computer adaptive placement test that is very accurate."
Duolingo's computer adaptive placement test is a responsive exam that automatically adjusts question difficulty based on the test taker's responses, with correct answers leading to harder questions and incorrect answers leading to easier ones. One of the big advantages of this style of testing is that it can assess someone's ability quickly from a small number of questions.
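The adjust-up, adjust-down logic can be sketched in a few lines of Python. This is a simplified staircase procedure, not Duolingo's implementation; the 0 to 100 difficulty scale, the halving step size, and the `answers_correctly` callback are assumptions made for illustration. Production adaptive tests typically rely on statistical models of item difficulty rather than a fixed step rule.

```python
from typing import Callable


def adaptive_placement(
    answers_correctly: Callable[[float], bool],
    num_questions: int = 8,
    low: float = 0.0,
    high: float = 100.0,
) -> float:
    """Estimate ability on a hypothetical difficulty scale with a halving staircase."""
    difficulty = (low + high) / 2  # start in the middle of the scale
    step = (high - low) / 4
    for _ in range(num_questions):
        if answers_correctly(difficulty):
            # Correct answer: the next question is harder.
            difficulty = min(high, difficulty + step)
        else:
            # Incorrect answer: the next question is easier.
            difficulty = max(low, difficulty - step)
        step /= 2  # narrow the search after every response
    return difficulty


# Simulated learner (an assumption): answers correctly up to difficulty 70.
estimate = adaptive_placement(lambda d: d <= 70.0)
```

With this idealized learner, eight questions are enough for the estimate to land within one point of the true level, which illustrates why adaptive tests need so few items.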
"While we were developing [the placement test], the idea came up that we could also enter the language assessment market," said Settles. "Traditionally these assessments are very cumbersome and are barriers to university for those in the developing world. We decided we wanted to build a computer adaptive online language proficiency test."
Settles is referring to the Test of English as a Foreign Language (TOEFL) exam, a standardized test to measure the English language ability of non-native speakers applying for admission to English-speaking universities. The TOEFL process can take six to eight weeks, from the time a test is taken in a required testing center to when the results reach a desired university.
The entire TOEFL industry, with an estimated worth of around $5 billion, is occupied by just a handful of companies. Naturally, Duolingo saw an opportunity for disruption.
"The last four years or so, 90 percent of our AI development has been in that proficiency test," Settles said. "We use AI to verify the person's identity, generate the test items, score them and administer the test adaptively. That way we can administer the test, from start to finish, in 45 minutes. And it can be done anywhere in the world with internet access."
Over 300 US universities now accept the Duolingo language proficiency test, including Yale, New York University, UCLA and Duke.
Settles said Duolingo is now expanding the use of AI in its core language learning app. The key areas of focus include content development, personalization and cognitive modeling. On the content development side, Duolingo is using AI to develop course curriculum, and deep learning to extrapolate English coursework into other languages and to improve understanding of semantics and word frequency.
"The deep learning models we use are, for the most part, things we had to invent ourselves," said Settles. "The problems we encounter are not standard and are specific to the data we have, so we don't always have a lot to lean on."
"In terms of the computing resources, we deploy on AWS for cloud infrastructure," he added. "But for machine learning, the actual models and AI architecture, we are inventing them."
As for its business model and long-term plans, Duolingo said the bulk of its revenue comes from advertising and from subscriptions sold to people who pay to remove the ads. This year subscriptions took over as Duolingo's top revenue source, and the company is planning to roll out new AI-powered features to make the subscription option more compelling.
Duolingo said it's on track to do $40 million in revenue this year, roughly breaking even. Hiring has ramped up across the company, and the AI team this year expanded to nine people. The company's endgame is to go public, with a possible IPO in the next three years.
"It always seemed unfair to me that the people who most needed to learn a language for access to better jobs and schools were also the ones who could least afford it," said von Ahn. "It's truly rewarding to see the progress we've made in the last six years, teaching languages to over 300 million people, keeping all of our learning content free, and still building a sustainable business with ads and subscriptions."
"I would like to see Duolingo become a public company in the next couple of years," he added. "We're on track to make that happen, but of course it depends on a variety of factors."