GDPR: What the data companies are offering

The EU's General Data Protection Regulation goes into effect today. Here's what some of the specialty and Hadoop platform vendors are offering to help their customers be compliant and stay that way.

Today is GDPR day, the day the European Union's (EU's) General Data Protection Regulation goes into effect. Companies with customers in the EU are bound by the regulation, and there are very stiff fines for non-compliance.

GDPR has become mainstream, as evidenced by my 13-year-old son who, as I was writing this post, asked me why everyone's updating their privacy policies. You've probably been getting numerous privacy policy update emails in your inbox, too. If so, you know that while the GDPR is about data, it applies to all companies, not just vendors in the data and analytics space. In fact, those vendors have the same liabilities as their customers and are still busy figuring out the finer GDPR details themselves.

But that doesn't stop our fearless analytics and data management vendors from offering products and features now to help their customers comply. This post is an attempt at rounding up several of these GDPR helpers.

Read all about it
A few companies have issued press releases around GDPR recently, so let's start with them. But keep in mind these announcements are more GDPR-timed reminders than they are breaking news, as the features aren't all new.

In March, StreamSets announced its Data Protector product, which can discover, secure and govern personally identifiable information (PII) as it arrives from a batch or streaming data source or moves between compute platforms.
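To make "discovering PII as it arrives" a little more concrete, here is a minimal Python sketch of pattern-based detection over a stream of records. It is an illustration only, not StreamSets' implementation: the `PII_PATTERNS` table, the `find_pii` and `scan_stream` functions and the two regular expressions are all hypothetical, and commercial products rely on far richer detection than a pair of regexes.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical, deliberately simple patterns (assumptions for illustration).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def find_pii(record: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (field, pii_type) pairs for each field value that matches a pattern."""
    hits = []
    for field, value in record.items():
        for pii_type, pattern in PII_PATTERNS.items():
            if pattern.search(str(value)):
                hits.append((field, pii_type))
    return hits


def scan_stream(
    records: Iterable[Dict[str, str]],
) -> Iterator[Tuple[Dict[str, str], List[Tuple[str, str]]]]:
    """Lazily pair each incoming record with the PII findings for that record."""
    for record in records:
        yield record, find_pii(record)
```

Because `scan_stream` is a generator, it handles a finite batch and an unbounded stream the same way, flagging each record as it passes through.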

Dataguise is a company that has been dedicated to data protection since its founding. While the company got started offering data masking technology for relational databases, it has since branched out to machine learning-based active data protection. Back in November, the company announced data detection and security support for GDPR Article 17 (the Right to Erasure) and Article 15 (Right of Access by the Data Subject). And in a release earlier this month, Dataguise highlighted six of its features as GDPR compliance-relevant: sensitive data detection; policy-based data governance; controls over sensitive data; automated sensitive data auditing; real-time GDPR compliance violation alerts; and pre-planning and documenting data in and out of compliance.

On Tuesday, Arcadia Data issued a press release explaining that its Arcadia Enterprise includes several GDPR compliance-assisting features, based on its integration with three Apache Software Foundation open source projects: Spot (for anomaly detection) as well as Sentry and Ranger (both for role-based access control).

And just yesterday, Attunity announced its Gold Client for Data Protection, which provides data masking and PII detection for SAP environments. The announcement came not just on the eve of GDPR day but in the run-up to SAP's annual Sapphire conference, being held June 5-7 in Orlando, Florida. The product also offers configurable erasure-policy rules, SAP change log protection and a user-defined "undo" period (the period after which data masking becomes irreversible).
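The idea of an "undo" period can be sketched in a few lines of Python. This is an assumption-laden illustration of the general concept, not Attunity's implementation: the `MaskingVault` class and its `mask` and `undo` methods are hypothetical names, and the original values are held in memory purely for demonstration.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple


class MaskingVault:
    """Hypothetical sketch: masked values can be restored only within an undo period."""

    def __init__(self, undo_period: timedelta) -> None:
        self.undo_period = undo_period
        # key -> (original value, time at which it was masked)
        self._originals: Dict[str, Tuple[str, datetime]] = {}

    def mask(self, key: str, value: str, now: datetime) -> str:
        """Remember the original for a possible undo and return a masked stand-in."""
        self._originals[key] = (value, now)
        return "*" * len(value)

    def undo(self, key: str, now: datetime) -> Optional[str]:
        """Return the original if still inside the undo period; otherwise purge it."""
        entry = self._originals.get(key)
        if entry is None:
            return None
        value, masked_at = entry
        if now - masked_at > self.undo_period:
            # The window has closed: discard the original so masking is irreversible.
            del self._originals[key]
            return None
        return value
```

Passing `now` explicitly, rather than reading the system clock inside the class, keeps the sketch easy to test. A real system would also need to purge expired originals proactively instead of waiting for an undo request.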

Elephants in the room
What about the major Big Data platform vendors (once categorized as Hadoop distribution vendors)? Cloudera's big bet is on its Navigator product, which is focused on data cataloging and data lineage, and on Apache Sentry (discussed above), for which Cloudera is the prime backer.

Hortonworks has placed its chips on Apache Atlas, a data governance/data catalog-focused project, and the aforementioned Apache Ranger; it is the key commercial backer for both projects. Apache Metron, a security solution for streaming data that the company also backs, is likewise offered as part of Hortonworks' overall GDPR solution.

MapR says its Converged Data Platform (MCDP) offers PII auditing and governance (with no performance degradation); data lineage for personal data; real-time alerts for breach notifications; and file- and volume-level security with Access Control Expressions (ACE).

Data management and governance vendors a go-go
You may find what I've covered so far is helpful. But lots of other vendors are in the data protection game:

  • Alation, which focuses on data catalog technology, explains that such technology must be the very underpinning of GDPR compliance.
  • Collibra says likewise about its data governance-focused products.
  • Unifi focuses on data preparation and data catalog functionality but also offers written and technical GDPR audits (including AI-based PII detection).
  • Waterline Data offers sensitive data discovery and GDPR compliance status reporting.
  • Podium Data offers what it calls Intelligent Data Identification, enabling real-time detection, reporting and alerting of PII and duplicate data.
  • BigID offers PII identification and protection; data flow mapping; and other governance features.
  • Informatica offers sensitive data detection via its Secure@Source product. Informatica Data Masking and Informatica Data Archive (which has data purging features) come in handy, too.
  • Spirion offers data protection and classification for sensitive data.
  • Privacera does too, and says its sensitive data discovery is AI-based.
  • Io-Tahoe offers machine learning-based discovery of sensitive data and redundant data, and integrates both facilities into its data relationship and data flow discovery. (Disclosure: I work with Io-Tahoe and therefore will comment no further.)
  • Infogix, GroundLabs, Ataccama and ZL Technologies have offerings in the GDPR space, too. To be very candid, I have yet to learn the details, but I intend to.

Hopefully you're getting the picture: GDPR is creating a bumper crop of products, features and -- more than likely -- a lot of closed deals for these vendors. That's how regulation works: compliance is mandatory, and compliance is hard, but the right technology can help companies follow the rules in an automated fashion, and still conduct business.

Happy GDPR day, everyone. I wish your inboxes some peace and quiet over the holiday weekend observed in the US, UK and elsewhere.