Writing in Technology Review, Mark Williams explains that the problem with the Transportation Security Administration's watchlists - which famously produce reams of false positives - is that they are based on a very old algorithm called Soundex.
Latanya Sweeney, director of the Data Privacy Laboratory at Carnegie Mellon University's School of Computer Science, in Pittsburgh, explains:
"Soundex is an old patent that's been used for a long time, whenever they have two databases where they're trying to match up records." Indeed, Soundex dates back to a time when Hollerith punch cards were the newest thing in computing technology.
As Williams explains, it is a crude filter, indeed.
Soundex works by taking the first letter of a name, dropping all vowels, assigning a number to each of the next three consonants (with similar-sounding consonants like s and c getting the same numbers), then dropping any remaining consonants. Thereby, the algorithm reduces all names to a letter followed by three numbers.
Consequently, Soundex assigns to the name Laden the code L350, as it does Lydon, Lawton, and Leedham. This is, in other words, an algorithm so deficient for identification purposes that it confuses al Qaeda's Osama bin Laden and the Sex Pistols' Johnny (Lydon) Rotten.
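To see how those collisions come about, here is a minimal sketch of the commonly documented American Soundex variant. It goes slightly beyond the summary above: besides vowels it also skips h, w, and y, it collapses adjacent consonants that share a code, and it pads short results with zeros (which is where the trailing 0 in L350 comes from). The names CODES and soundex are illustrative, and nothing here is claimed to be the exact variant the watchlists run on.

```python
# Consonant groups and their Soundex digits (American Soundex).
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return a four-character code: first letter plus three digits."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    # The first letter's own code counts when suppressing duplicates.
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate duplicates
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            result += code
        prev = code  # an uncoded letter resets the duplicate check
        if len(result) == 4:
            break
    return result.ljust(4, "0")  # pad short codes with zeros


for name in ["Laden", "Lydon", "Lawton", "Leedham"]:
    print(name, soundex(name))  # each prints the code L350
```

Four unrelated surnames collapse to one key because only the first letter and a handful of coarse consonant classes survive, which is exactly why a name-only match built on this scheme throws off so many false positives.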
Unlike the Dept. of Homeland Security - which has been prevented by privacy advocates and Congress from collecting data other than names - private industry uses far more identifying data and has much better technology to eliminate false matches. Indeed, what they know about you is pretty scary.
Evan Hendricks, editor-publisher of the Washington-based Privacy Times, says, "Though most Americans don't know about ChoicePoint, it's a company that knows a lot about hundreds of millions of Americans."
ChoicePoint doesn't only possess copious records on U.S. citizens (its subsidiary, VitalChek Network, provides the technology to process and sell birth, death, marriage, and divorce records in every U.S. state). It has also acquired data on some 300 million citizens of Mexico, Brazil, Colombia, Argentina, Nicaragua, Guatemala, Honduras, El Salvador, and Costa Rica--a fact that emerged in 2003, after the company disclosed that it had bought data (reportedly including even passport numbers and unlisted phone numbers) on Mexico's entire roll of 65 million registered voters.
Other products and services provided by ChoicePoint include the DNA identification of the victims of the World Trade Center attacks on September 11 via its subsidiary, the Bode Technology Group (sold off by ChoicePoint in March), and SmartSearch, which performs "wildcard searches" that can construct a comprehensive personal profile in minutes, starting with only a first name or partial address. More controversially, ChoicePoint subsidiary Database Technologies (also known as DBT Online) was contracted to assemble a list of voters barred from voting by the state of Florida and was responsible for an alleged 57,700 people--primarily African-American and Hispanic Democrats--being incorrectly listed as felons during the U.S. elections of 2000. Other ChoicePoint divisions deliver all types of credential verification, employment-background screenings, drug testing, criminal records, motor-vehicle records, mortgage-asset research, tenant screening, database software, medical information, and services for the life- and health-insurance fields.
Williams wonders whether it would be wise for the government to outsource the watchlists to companies like ChoicePoint, since they would surely produce far fewer false positives. Answer: a bad idea.
At the Big Brother Award ceremonies held annually by U.K.-based Privacy International, ChoicePoint has twice been a winner: in 2001, as "Greatest Corporate Invader" for "massive selling of records, accurate and inaccurate to cops, direct marketers and election officials," and again in 2005, when it took the "Lifetime Menace Award" for its continuing efforts to build dossiers on individuals.
It's extraordinary, of course, that private corporations should have had the means to accumulate--and trade--more personal data on Americans than the U.S. government possesses.