Information fingerprinting

February 28, 2006, 8:16pm PST | Length: 00:03:39
Data protection solutions have typically filtered content by matching patterns and keywords. Raj Dhingra of PortAuthority Technologies introduces a new method called 'information fingerprinting.' It uses filters to actually learn the context of data.

Transcript

Hi, I'm Raj Dhingra, Vice President of Product Management and Marketing at PortAuthority Technologies, and today we're going to talk about information fingerprinting, a technique for being able to accurately identify and stop information from leaking. In our previous video we talked about why content filtering is not enough. Let's talk about some other techniques that can be used to protect your good stuff, your confidential information, whether that's sitting in a customer data database or is actually a document sitting on a file server or document management system.

Let's take the example of customer data. You've got name, social security number, zip plus four, account number. I'm going to go into the database and I'm going to extract out ten records that contain name, social security number, date of birth and zip. Let's take a look at the different kinds of filters that can help stop this information from leaking.

The first class is global filters, which rely primarily on file type. So a global filter can stop an encrypted file from leaking. You can say stop encrypted files. On the other hand, when it looks at an Excel file, it will not stop it from leaking because Excel files might be allowed in your company policy.
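As a rough illustration of a file-type rule like the one described above, here is a minimal sketch. The type labels, the `BLOCKED_TYPES` set and the `global_filter` function are hypothetical names chosen for this example and do not come from any particular product.

```python
# Hypothetical policy: block by file type only, never by content.
BLOCKED_TYPES = {"encrypted"}


def global_filter(file_type: str) -> bool:
    """Return True if a file of this type should be stopped."""
    return file_type.lower() in BLOCKED_TYPES


# An encrypted file is stopped; an Excel file passes because the
# (assumed) company policy allows that type, whatever it contains.
blocked_encrypted = global_filter("encrypted")
blocked_excel = global_filter("xlsx")
```

The Excel file goes through even if it holds customer records, because this class of filter never looks inside the file.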

Let's take the next class of filters, which is tokens. In the case of tokens, tokens are using keywords, patterns and expressions. So we take this example of my Excel spreadsheet that contains the name, social security number, date of birth and zip plus four. When tokens are being used to identify social security numbers, while they will correctly identify the SSNs, they will also pick up the zip plus four as SSNs. As a result, you have false positives. So we have a limitation where this information is now going to leak out, creating an insecure environment.
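The false positive described above can be reproduced with a small sketch. The pattern below is an assumed, deliberately loose nine-digit rule with optional hyphens; it is only an illustration of how a pattern-based token can misfire, not the expression any specific product uses, and the sample values are made up.

```python
import re

# Assumed token rule: nine digits, with optional hyphens in the SSN positions.
SSN_TOKEN = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")

row = "SSN 123-45-6789, ZIP 94105-1234"
matches = SSN_TOKEN.findall(row)

# Both values are flagged: the zip plus four also fits the pattern,
# because "941", "05", "-", "1234" satisfies the same expression.
print(matches)  # ['123-45-6789', '94105-1234']
```

The pattern has no way to tell which of the two matches is a real social security number, since it only sees the shape of the digits.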

The third class of filters is contextual filters. And that is very different from either global or token-based filters. In the contextual filter case, we're now starting to learn the actual data and the context of the data. That is what information fingerprinting is about, and information fingerprinting will actually go through, for example, your entire database that might contain 100,000 records, or it might contain one million records. And it will learn the specifics of the name, the actual social security numbers, the zip plus four, the account numbers.

So as a result, when the fingerprinting is complete, this information about the actual customers is now stored securely in a fingerprint database. As a result, when we start to use filtering techniques for information that might be leaking, let's say this Excel document is now attached to an email, the information fingerprinting based identification will now very accurately and precisely identify that this social security number maps into a particular customer's name and address. As a result, we've got very precise identification of sensitive information occurring when we start to use information fingerprinting. In this particular case, when it sees zip plus four, it's not going to identify that as sensitive information.
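The talk does not describe how the fingerprints are actually computed, so the following is only a sketch of the general idea under stated assumptions: each real SSN in the database is normalized to its digits and hashed with SHA-256, and outgoing text is checked against those hashes. The record layout and the names `fingerprint`, `build_fingerprint_db` and `find_leaks` are hypothetical.

```python
import hashlib
import re

# Loose candidate pattern: anything shaped like nine digits is only a
# candidate; the fingerprint lookup decides whether it is sensitive.
CANDIDATE = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")


def fingerprint(value: str) -> str:
    """Hash the digits of a value so the raw data is not stored.

    Assumption: a plain SHA-256 keeps the sketch short; a real system
    would likely use a keyed hash, since SSNs are easy to enumerate.
    """
    digits = re.sub(r"\D", "", value)
    return hashlib.sha256(digits.encode("utf-8")).hexdigest()


def build_fingerprint_db(records):
    """Map each SSN fingerprint to the id of the customer record."""
    return {fingerprint(r["ssn"]): r["id"] for r in records}


def find_leaks(text, fp_db):
    """Return (matched text, record id) for candidates found in the db."""
    hits = []
    for match in CANDIDATE.findall(text):
        record_id = fp_db.get(fingerprint(match))
        if record_id is not None:
            hits.append((match, record_id))
    return hits


# Made-up sample records standing in for the customer database.
records = [
    {"id": 1, "ssn": "123-45-6789"},
    {"id": 2, "ssn": "987-65-4321"},
]
fp_db = build_fingerprint_db(records)

leaks = find_leaks("Jane Doe, 123-45-6789, 94105-1234", fp_db)
print(leaks)  # [('123-45-6789', 1)]
```

Here the zip plus four is still a candidate by shape, but it is not reported because its fingerprint is not in the database. Because values are reduced to digits before hashing, the same SSN written without hyphens would still map to record 1.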

So to summarize, contextual information does a few things. First, it learns your data; it precisely builds the fingerprints down to the very granular level. It can then accurately and reliably identify what the sensitive information is. It is also resilient to being manipulated from a data perspective, so it really provides very accurate and very reliable identification. As a result, now you can stop sensitive information from leaking from the organization.
