**This post originally appeared on TechCrunch**
Data ethics is a subject our industry has largely ignored, avoided and failed to acknowledge as important. This neglect is almost certainly caused by fear — fear that examining the question would expose us as doing wrong; fear that ethics might stifle innovation; fear that the ethical questions are insoluble and intractable.
Empathy is important; these fears should not be dismissed. But after some work with big data ethics analyst Kord Davis last year, I came to realize there’s also a good chance these fears are unfounded and overblown. Doing good needn’t contradict doing well. In most cases, data ethics is achievable, as long as standards and procedures are thoughtfully explored, established and agreed upon.
Perhaps more importantly, adherence to those standards and procedures can be made feasible with good technology. In other words, ethical data use can be productionalized and, in large part, automated, through the use of good tooling. Seen this way, data ethics is really a specialized area of data governance and stewardship.
**Some ethics specifics**
We have to be careful not to oversimplify this, of course. Clearly, data ethics is not a problem that can be solved just by “throwing technology at it.” As an industry, and as individual organizations, we need to discuss what should and shouldn’t be done with data. To what level is profiling of individual customers or constituents acceptable, versus doing everything at an aggregated level? Which individual information is okay to store in plain text, which only in encrypted form, and which not at all?
Where is it okay to cross-reference a customer? For example, if an individual subscribes to both a cable service and a mobile phone/data plan under the same company’s umbrella, is it fair game to correlate their location history with their channel-watching history and share that information with advertisers? If doing that at an individual level is too invasive, what about aggregating to the block, neighborhood or ZIP code level?
For a great number of such questions (even if not those specific ones), many companies haven’t formulated their policy. And it’s possible that neglecting these questions is as much a cause of fear as it is a reaction to it.
If companies could examine these questions rigorously, then devise policies they’re comfortable with and proud to share, that would almost certainly enhance customer relationships and trust. That’s likely to be a revenue-positive outcome. For a while, it would be a competitive advantage, and, soon enough, it would become a competitive requirement. That would be good for the analytics world, overall.
But the burden here shouldn’t all be on the customer. Those of us who build analytics software have a role here, as well — a big one, in fact. We should be building tools that make data governance and audit easy, and we should be communicating to our customers why it’s so important.
Our tools should also have policy-driven data ethics checks — including alerts that pop up dynamically as audit information is being recorded, warning users of potential ethics transgressions and asking for confirmation to proceed, or blocking the action entirely.
These wouldn’t be “lecturing” features, based on generic rules, either. Instead, the checks would be driven by internal organization policy — enforcing rules that a customer would want to follow. Such functionality would make compliance easy, rather than overwhelming. This should be about assisting, not reprimanding.
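As a minimal sketch of what such a policy-driven check might look like: the rule definitions, field names, and warn/block actions below are all hypothetical placeholders for whatever an organization’s own policy would specify.

```python
from dataclasses import dataclass

# Hypothetical actions a rule can take: WARN asks the user to confirm,
# BLOCK stops the operation entirely.
WARN, BLOCK = "warn", "block"

@dataclass
class PolicyRule:
    """One internally defined data-ethics rule."""
    description: str
    fields: set   # data fields whose combined use triggers the rule
    action: str   # WARN or BLOCK

def check_query(requested_fields, rules):
    """Return every rule whose trigger fields are all present in the query."""
    requested = set(requested_fields)
    return [r for r in rules if r.fields <= requested]

# Example internal policy: individual-level correlation of location and
# viewing history is blocked outright; ZIP-level correlation only warns.
rules = [
    PolicyRule("No individual-level location x viewing correlation",
               {"customer_id", "location_history", "viewing_history"}, BLOCK),
    PolicyRule("ZIP-level correlation requires confirmation",
               {"zip_code", "location_history", "viewing_history"}, WARN),
]

hits = check_query(["customer_id", "location_history", "viewing_history"], rules)
for rule in hits:
    print(f"{rule.action.upper()}: {rule.description}")
# → BLOCK: No individual-level location x viewing correlation
```

Because the rules are data rather than hard-coded logic, each customer’s own policy drives the checks — exactly the “assisting, not reprimanding” posture described above.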
Before such functionality can work, more mechanical governance features must be in place. Data lineage, role-based access controls, encryption of data at rest, audit logging, version control and integration with external governance systems together make up the essential foundation of an automated data ethics environment.
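To illustrate how two of those foundations — role-based access control and audit logging — might fit together, here is a hedged sketch; the role names, permissions, and JSON record shape are illustrative assumptions, not any particular product’s API.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("audit")

# Hypothetical role-to-permission mapping an organization might define.
ROLE_PERMISSIONS = {
    "analyst": {"read_aggregated"},
    "data_steward": {"read_aggregated", "read_individual"},
}

def access_data(user, role, permission_needed):
    """Gate an access by role, and write an audit record either way."""
    allowed = permission_needed in ROLE_PERMISSIONS.get(role, set())
    audit_log.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "permission": permission_needed,
        "allowed": allowed,
    }))
    return allowed

access_data("alice", "analyst", "read_individual")     # denied, but still logged
access_data("bob", "data_steward", "read_individual")  # allowed and logged
```

The key design point is that denials are logged as faithfully as grants: an ethics audit needs the attempts, not just the successes.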
Customers should take advantage of these capabilities. Those of us who build them should help foster greater awareness of data ethics, and make it easier for customers to implement ethics policies and assure their own compliance.