Can big data help people find true love? Online dating website eHarmony thinks so.
Famed for its proprietary matching system that pairs people up based on compatibility, the US-headquartered company has nearly 33 million users across 150 countries looking to find themselves a significant other.
When they sign up to the website, users are required to fill out a questionnaire of 200 questions covering everything from characteristics to values.
The information is then run through an eHarmony-patented algorithm, which finds matches for users. It's not the most romantic way to find a spouse, perhaps, but eHarmony claims that between 2008 and 2009, an average of 542 people married every day in the US after meeting through the website.
That's a better success rate than your humble journalist performing drunken mating dances at a club with sticky floors.
"We are different than anybody else; we have this huge computational challenge over traditional photo-browsing dating sites," eHarmony managing director for Australia, Canada, and Japan Jason Chuck told ZDNet. "Just that matching system alone puts a lot of pressure on the whole system."
Though he couldn't put an exact figure on how much data eHarmony holds, Chuck estimated that eHarmony has "terabytes of data" in its large internal data warehouse based in the US. The company has a centralised operation, so all the information gets fed back to a central US hub, which is then backed up in different locations.
"We use Hadoop to help us back up parts of our data, and it's a good way for us to be able to distribute the processing in a large number of machines," Chuck said. "We layer different bits of software on top; for example, we use MicroStrategy as one of our software packages for a lot of our business intelligence."
As well as the questionnaire, eHarmony also mines "a ton of data" on its users, mainly from their activities on the website. User behaviour is an integral part of how eHarmony works, as it helps predict the success of users on the site.
An example of this is predicting communication levels for different people using eHarmony. There are a number of factors involved, including how many times a user logs on to the website, how many photos they post, and the number of words they use to describe themselves on their eHarmony profiles.
"Form that data, you can tell who is more introverted, who is likely to be an initiator, and we can also see if we give people matches at certain times of the day, they would be more likely to make communication with their matches," Chuck said. "It kind of snowballs from there. We use a number of tools on top of that, as well."
The matching process is more or less automated, with eHarmony able to generate matches for users in a matter of minutes, though in some cases it may take longer. It all depends on the individual, but eHarmony is very aware that people would be anxious to receive their matches as soon as possible.
Because the whole matching process is free, it is important for eHarmony to encourage users to subscribe to its service in order to make contact with their matches. It's hard to make a dime when you have no matches for an individual from the get-go.
"You want people to get excited, and you want to capitalise on that excitement," Chuck said. Considering eHarmony was valued at between US$700 million and US$900 million in 2011 by GreenCrest Capital, it's safe to say that the company has no problems with doing just that.
eHarmony is very much a data-based business, with nearly half of all employees being either technology engineers or dealing with data on a regular basis. It has a full-time data analytics team, as well as a division dedicated to matching users.
"The matching team works on algorithms that deal with the data, and they try to find ways to optimise best match potential," Chuck said.
eHarmony's algorithm varies from country to country, and the matching team regularly tweaks it based on user behaviour and new relationship research from educational institutions.
That is what causes huge computational and mathematical challenges for the company. In Australia alone, there are more than 1.5 million registered users on the site.
"You are essentially comparing 1.5 people to one another across hundreds of different variables, and trying to understand who are the best matches, and then send that list out on a daily basis," Chuck said. "There's the automated version of the process, then we have the matching team on the back end developing hypotheses and analysing data to see which variables should be tweaked."
Big data also shapes eHarmony's marketing efforts, telling the company when is a good time to send out promotions to individual users.
eHarmony is looking at ways to be smarter with processing its big data, and is considering doing more computational work in the cloud. However, no solid plan has been set to date.
"There are a lot of little things we are constantly testing out to try and understand how we can promote success in matching up our users," Chuck said.
"Data is something that permeates throughout the entire culture of eHarmony, from the very top down."