GEDmatch highlights security concerns of DNA comparison websites

DNA matching can produce interesting data on family trees, but may also expose us to serious risk.

DNA testing is no longer simply a tool in the medical field -- in recent years, DNA profiling has become a product offered by private companies and third-party services. 

These tests, often conducted with a home swab and posted away for analysis, can reveal family matches and possible connections, as well as clues to our ethnic heritage. 

As records pile up in the databases of companies including Ancestry.com and MyHeritage, third-party websites -- such as GEDmatch -- can also be used to compare DNA sequences submitted by other people. 

It is indisputably interesting to learn more about our genetic traits and family trees, but as noted by academics from the University of Washington, there may be a trade-off when it comes to our privacy and security.

GEDmatch is the focus of new research into the security risks of DNA profiling. The paper, published by University of Washington academics and accepted for presentation in February at the Network and Distributed System Security Symposium, explains how a small number of comparisons made through the platform can be used to "extract someone's sensitive genetic markers," as well as to construct fake profiles that impersonate relatives.

"People think of genetic data as being personal -- and it is. It's literally part of their physical identity," said lead author Peter Ney from the UW Paul G. Allen School of Computer Science & Engineering. "This makes the privacy of genetic data particularly important. You can change your credit card number but you can't change your DNA."

The researchers created an account on GEDmatch and uploaded a set of genetic profiles built from anonymous genetic data. The platform then assigned each of these profiles an ID.

When one-to-one comparisons are made, GEDmatch creates graphics to show how two samples match or differ, including a bar for each of the 22 non-sex chromosomes. It is this bar that the researchers homed in on, creating four "extraction profiles" to try to deduce the target profile's DNA by making repeated comparisons.
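To make the idea of deducing a profile through repeated comparisons concrete, here is a minimal sketch in Python. It is a toy model, not the researchers' method: it assumes each marker is a single letter (A, C, G or T) and that a comparison reveals, marker by marker, whether two profiles match. The `compare` and `extract` functions are hypothetical stand-ins, and real genotype data and GEDmatch's actual graphics are more complex than this.

```python
import random

BASES = "ACGT"

def compare(profile_a, profile_b):
    """Hypothetical one-to-one comparison: one match/mismatch flag per marker."""
    return [a == b for a, b in zip(profile_a, profile_b)]

def extract(comparison_oracle, n_markers):
    """Recover a hidden profile using four probe profiles, one per base.

    comparison_oracle(probe) must return the per-marker match flags
    between the probe and the hidden target profile.
    """
    recovered = [None] * n_markers
    for base in BASES:
        probe = base * n_markers          # e.g. "AAAA...": an all-A probe
        for i, matched in enumerate(comparison_oracle(probe)):
            if matched:
                recovered[i] = base       # a match reveals the target's base
    return "".join(recovered)

# Toy target that the "attacker" never reads directly (assumed data).
random.seed(0)
target = "".join(random.choice(BASES) for _ in range(50))

recovered = extract(lambda probe: compare(probe, target), len(target))
print(recovered == target)
```

In this simplified setting, four comparisons are enough to read out every marker, which shows why a per-marker match display can leak far more than a single similarity score would.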

"Genetic information correlates to medical conditions and potentially other deeply personal traits," added co-author Luis Ceze. "Even in the age of oversharing information, this is most likely the kind of information one doesn't want to share for legal, medical and mental health reasons. But as more genetic information goes digital, the risks increase."

Millions of us have already submitted our DNA for tests, and the risks are likely to grow as more individuals jump on the trend. Using another GEDmatch graphic together with 20 experimental profiles, the researchers showed that larger samples could be exploited to target a single record: on average, 92 percent of a test profile's unique sequences were harvested, with roughly 98 percent accuracy.

False relationships, too, are a possibility. The researchers created a fake child containing 50 percent of its DNA from one of their experimental profiles. After launching a comparison, GEDmatch came back with an estimated parent-child relationship.

In theory, attackers could use the same approach to fabricate any family relationship they want by changing the fraction of DNA shared with a target profile.
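The link between shared DNA fraction and estimated relationship can be illustrated with a short Python sketch. It is a simplification under stated assumptions: each marker is a pair of letters, the forged profile copies one letter per marker from the target and fills the rest with a placeholder `N`, and the relationship is guessed from the nearest textbook expected fraction (about 50 percent for parent-child, 25 percent for a grandparent or half-sibling, 12.5 percent for a first cousin). The function names are hypothetical, and real matching services rely on more detailed measures than this single number.

```python
import random
from collections import Counter

# Textbook expected shared fractions (approximate, used here as assumptions).
EXPECTED = {
    0.5: "parent-child",
    0.25: "grandparent or half-sibling",
    0.125: "first cousin",
}

def forge_relative(target, shared_fraction):
    """Build a fake profile sharing roughly `shared_fraction` of the target's DNA.

    One allele is copied for the first k markers; everything else is
    filled with "N", a placeholder that never matches the target.
    """
    k = round(2 * shared_fraction * len(target))
    return [(pair[0], "N") if i < k else ("N", "N")
            for i, pair in enumerate(target)]

def shared_fraction(profile_a, profile_b):
    """Fraction of alleles the two profiles have in common, marker by marker."""
    shared = sum(sum((Counter(a) & Counter(b)).values())
                 for a, b in zip(profile_a, profile_b))
    return shared / (2 * len(profile_a))

def estimate_relationship(fraction):
    """Pick the relationship whose expected fraction is closest."""
    return EXPECTED[min(EXPECTED, key=lambda f: abs(f - fraction))]

random.seed(1)
target = [(random.choice("ACGT"), random.choice("ACGT")) for _ in range(1000)]

for wanted in (0.5, 0.25, 0.125):
    fake = forge_relative(target, wanted)
    measured = shared_fraction(target, fake)
    print(wanted, measured, estimate_relationship(measured))
```

The point of the sketch is that the estimate depends only on how much of the uploaded profile overlaps with the target, so whoever controls the upload controls the apparent relationship.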

"If GEDmatch users have concerns about the privacy of their genetic data, they have the option to delete it from the site," Ney said. "The choice to share data is a personal decision, and users should be aware that there may be some risk whenever they share data."

The academics reached out to GEDmatch prior to publication and said that the platform is "working to resolve these issues."

The research was funded in part by the University of Washington Tech Policy Lab, with the help of a grant from the Defense Advanced Research Projects Agency (DARPA) Molecular Informatics Program.

GEDmatch told ZDNet:

"As a result of the Washington and another study, we have made several changes and are working on others.  We appreciate the concerns these studies have brought to light." 
