Big data and the TSA: A match made in purgatory

Just how accurate are the results of data mining using big-data tools?
Written by David Chernicoff, Contributor

It's hard to avoid the stories of malfeasance and buffoonery that are published about the Transportation Security Administration (TSA) on almost a daily basis. A little simple logic and a touch of common sense could cut down on the number of stories that make the news, but the problems that arise are usually the result of a simplistic approach to the issue of providing a secure travel experience with minimal impact on the traveling public.

The TSA continually claims that it is improving the training and customer relationship skills of its officers, actions desperately needed if only to prevent another fiasco like the incident surrounding the late General Joe Foss and his Congressional Medal of Honor. But regular fliers are still often exposed to what seems like a slavish by-the-book attitude that leaves little room for the realities of day-to-day life. I recently witnessed a frustrated passenger trying to explain to a screener that a decorative belt was attached to her outfit and could not be removed. Perhaps that was the wrong choice of outfit to wear on a flight, but the screener couldn't get past the fact that the "remove your belt" direction was not being followed.

Under consideration as another tool to determine who can be allowed on an airplane is the use of big data: in this case, commercially collected consumer information and the data generated by commercial data brokers. As the story on the nextgov.com website points out, this is data of unknown accuracy and, from the user's point of view, unknown provenance. And this is a huge potential problem. The massive amounts of data generated by and about the average consumer mean that there are bound to be significant inaccuracies when trying to extrapolate the behavior of a single person. Data that is suitable for delivering targeted ads to a user on a website is unlikely to be sufficient to determine whether that user should be allowed to board an aircraft.

It also makes the old computing phrase "garbage in, garbage out" even more appropriate. In this case, erroneous data can easily contaminate otherwise accurate sources and result in actions being taken that have no basis beyond data entered in error or applied to the wrong John Smith. Data brokers cannot guarantee 100 percent accuracy of their data, and when that data is going to be used to limit the rights of people, any significant error is unacceptable.

Add to this the fact that getting erroneous data removed from your permanent school record is incredibly difficult. Just ask anyone who has had to jump through the hoops involved in correcting an error on their credit report, an area where there is already significant regulation and involvement from the federal government.

When big-data tools are used to extrapolate consumer shopping habits, the results can easily be off by a large percentage. In fact, the extrapolations can be wrong more often than right and still be of value to retail marketing. But that level of accuracy doesn't even come close to what is needed before big data can become the tool of choice for Big Brother.
