How Trulia tackled machine learning challenges to build an in-house AI platform

Deep Varma, Trulia's VP of Engineering, explains how the real estate site has successfully confronted problems like weak data sets and the nationwide shortage of ML engineers.

Artificial intelligence will soon enough be a ubiquitous part of our digital life, powering consumer and business technology alike. At the moment, however, enterprises are navigating largely uncharted territory as they try to infuse their products with AI, coming across a series of challenges for which easy solutions don't yet exist.


Trulia, the residential real estate site, started the journey early: More than four years ago, it began building its own AI platform with the goal of creating a more personalized, predictive experience for its users. So far, Trulia says the effort has paid off with a double-digit increase in consumer engagement.

Deep Varma, Trulia's VP of engineering, spoke with ZDNet about the platform Trulia has built, why it was built in-house, and how it's approached some common AI challenges. Here's a lightly edited version of what he had to say:

An AI platform based on three pillars

Trulia decided to invest in AI to further its core mission, Varma said: to help consumers "discover a place they'd love to live."

To advance that mission, Trulia started building an AI platform founded on what it calls the "personalization piece" -- what it knows about its users' unique preferences. It knows, for instance, when a user is searching for a three-bedroom house with a pool in a quiet neighborhood.

On top of that foundation, Varma said, Trulia got to work building three machine learning-based pillars: computer vision tools, a "recommender" system and a consumer prediction engagement model.

Investing big in computer vision

Computer vision algorithms have obvious utility for a site like Trulia. "We have trained those computer systems where they can look into images and can say, 'I'm looking at an image of a home, this is the front yard, the bedroom, or this is the bathroom,'" Varma said.

However, "for us, just stopping there was not the solution," Varma said. Given its specific needs as a residential real estate site, the company invested more in building its object detection system. "We can look into a kitchen and say this is a kitchen -- a kitchen with white granite. Or we can look into the living room and say it has hardwood floors."

Trulia's computer vision technology also looks for what it calls the most "attractive" photos. "We know with our own experiences of home buying, when we start the search... and start looking at the photos, if those are not engaging, you will go to the next photo, or the next listing," Varma said.

A photo's "attractiveness" is calculated with three variables: whether it is an appropriate image, the quality of the image and the relevancy. An image of a backyard with a pool would be dubbed "appropriate" if, for instance, a user was searching for homes with pools. However, it would not be a "relevant" image if it were simply the image used on a real estate agent's business card, for example.
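As a rough illustration of how three such signals might be combined into a single ranking, here is a minimal Python sketch. It is not Trulia's method: the `Photo` fields, the 0-to-1 score ranges and the rule of multiplying quality by relevance behind an appropriateness gate are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    name: str
    appropriate: bool  # hypothetical yes/no appropriateness check
    quality: float     # hypothetical image-quality score, 0.0 to 1.0
    relevance: float   # hypothetical relevance score, 0.0 to 1.0


def attractiveness(photo):
    """Assumed scoring rule: gate on appropriateness, then quality * relevance."""
    if not photo.appropriate:
        return 0.0
    return photo.quality * photo.relevance


def rank_photos(photos):
    """Return photos ordered from most to least attractive."""
    return sorted(photos, key=attractiveness, reverse=True)


photos = [
    Photo("agent_card.jpg", appropriate=True, quality=0.9, relevance=0.0),
    Photo("pool.jpg", appropriate=True, quality=0.8, relevance=0.9),
    Photo("hallway.jpg", appropriate=True, quality=0.6, relevance=0.4),
]

for photo in rank_photos(photos):
    print(photo.name, round(attractiveness(photo), 2))
```

In this toy data the sharp but irrelevant business-card image falls to the bottom of the ranking, while the pool photo leads the listing.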

"By investing in our computer vision and having those product experiences of visual browsing as well as the most attractive photos, we saw a double-digit increase in inquiry for our listings," Varma said.

Offering recommendations and predicting consumer engagement

Along with these enhanced visual products, Trulia has invested in a "recommender" system that aims to introduce users to properties that may appeal to them, even if they fall outside of their specific, stated preferences. That system is based, in part, on a "collaborative filtering" technique.

"If you're looking into a neighborhood and some consumers are also looking into the same neighborhood, when those consumers move into a different neighborhood or a different property, we use this collaborative filtering technique to recommend to consumers... a broader view," Varma explained.
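The idea Varma describes can be sketched in a few lines of Python. This is a simplified user-based collaborative filter over made-up viewing data, not Trulia's actual system: the `views` dictionary, the user names and the overlap-count weighting are assumptions chosen for illustration.

```python
from collections import Counter


def recommend(views, user, top_n=3):
    """Suggest neighborhoods viewed by users whose history overlaps with `user`."""
    seen = views[user]
    scores = Counter()
    for other, other_seen in views.items():
        if other == user:
            continue
        overlap = len(seen & other_seen)
        if overlap == 0:
            continue  # no shared neighborhoods, so no signal from this user
        # Weight each neighborhood the user has not seen by the size of the overlap.
        for neighborhood in other_seen - seen:
            scores[neighborhood] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical viewing history: user -> set of neighborhoods viewed.
views = {
    "ana": {"Mission", "Noe Valley"},
    "ben": {"Mission", "Noe Valley", "Bernal Heights"},
    "cal": {"Mission", "Potrero Hill"},
    "dee": {"Sunset"},
}

print(recommend(views, "ana"))
```

Here "ana" is pointed first toward the neighborhood explored by the user whose history most resembles hers. A production recommender would typically use far richer signals and a learned model rather than raw overlap counts.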

To round out its AI-powered tools, Trulia built a consumer prediction engagement model. "When we send you content, we look at what content you engage with and what you are not engaged with," Varma said, "and we only send you what you want rather than overwhelming you in this journey and sending you emails and push notifications that are not relevant."

The challenges of building and deploying ML models

The three pillars of Trulia's AI platform have all been built in-house with dedicated teams of applied scientists and machine learning engineers.

For computer vision, the company has trained models with open source frameworks like Caffe while exploring support for TensorFlow. On top of that, it has invested in its own servers with GPUs. Its teams write some machine learning models in languages like Python, while Scala and Java are mostly used on the serving side.

Trulia learned early on the importance of quality data when training models, Varma said. Years ago, the Trulia team learned this lesson when its computer vision models were seemingly able to correctly label images of windows. However, they were also misidentifying some images of mirrors as windows, when the mirror was reflecting a window.

"That's how a small data set can basically destroy machine learning models," Varma said. To address this challenge, Trulia focused in the early stages on finding the relevancy of images. Then it jumped to "appropriateness" and image quality, Varma explained.

Meanwhile, early on, Trulia would train its models every six weeks and deploy them every fourth week. "We could see accuracy going down," Varma said. Now, Trulia uses models that are trained and deployed in real time to keep up accuracy.

Cultivating in-house talent

"The reality is across the US, finding strong applied sciences and ML engineers is getting tougher and tougher," Varma said.

To find talent, Trulia has gone directly to schools, in some cases partnering with universities on AI research. Meanwhile, the company invests in retaining talent as well as hiring it. The majority of applied scientists at Trulia have been there more than four years, Varma said.

Trulia also thought about the way its AI teams are structured, separating the machine learning branch from the applied sciences branch. The company created a branch of applied scientists -- "the researchers and the explorers," Varma said -- to build models, while the ML engineers focus on deploying to production systems.

What's next for Trulia

Trulia is now actively exploring running models directly on mobile devices.

"From an operational point of view, you reduce overhead" when you don't need to go back to a server "to tell you this is an image of a kitchen," Varma said.
