
Datameer Blog

How-to: Find Data Relationships With Column Dependencies in Datameer

By Hans-Henning Gabriel on March 21, 2014

**In this weekly blog series, engineering and support staff at Datameer share their favorite features in Datameer.**

As a data scientist at Datameer, I might be biased, but my favorite feature is the Column Dependencies algorithm, which is available with our Smart Analytics module. The algorithm helps you quickly identify the strength of the relationship between any two columns in your data sets. It can help you confirm a suspicion, such as “Does a person’s weight correlate with having a certain disease?”, or discover relationships you might not even have considered, such as “Does a person’s home state correlate with a certain disease?”

It really is as simple as clicking the Column Dependencies button and dropping the columns you want to analyze into the drop zone in the dialog box. You instantly get a heat map indicating the strength of each relationship.

Column Dependency Dialog Box in Datameer

The algorithm running behind the scenes works on any kind of data because it calculates mutual information, which doesn’t care whether your data is numerical or categorical string data, for example.
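To make the idea of mutual information concrete, here is a minimal sketch in plain Python. It is not Datameer's implementation: the function name `mutual_information` and the toy columns are assumptions for illustration, and the estimate simply counts how often values occur together. Continuous numeric columns would normally need to be binned first, a step this sketch leaves out.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate the mutual information (in bits) between two columns.

    Values are treated as discrete labels, so strings and numbers
    are handled the same way.
    """
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0

    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # counts of each x value
    py = Counter(ys)              # counts of each y value

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy columns, not real data
home_state = ["CA", "CA", "NY", "NY"]
disease = ["yes", "yes", "no", "no"]
gender = ["f", "m", "f", "m"]

print(mutual_information(home_state, disease))  # columns move together
print(mutual_information(home_state, gender))   # columns are independent
```

In this toy data, `home_state` fully determines `disease`, so the score is high, while `home_state` and `gender` are independent, so the score is zero. A higher value means that knowing one column tells you more about the other.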

When I’m happy with the columns I’ve selected, I can simply click “create sheet,” and a new column appears in my dataset showing the numerical value I just saw in the column dependency visualization. Then I can sort the sheet to order my data by which columns have the strongest relationship.
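Outside Datameer, the same "score every pair, then sort" workflow can be sketched in a few lines of Python. Everything here is an assumption for illustration: the helper `rank_column_pairs`, the compact mutual-information estimator it relies on, and the toy columns. It is meant to show the shape of the result sheet, not how Datameer builds it.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Compact count-based mutual information estimate, in bits."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def rank_column_pairs(columns):
    """Return (column_a, column_b, score) rows, strongest relationship first."""
    rows = [
        (a, b, mutual_information(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical toy data, not real results
columns = {
    "home_state": ["CA", "CA", "NY", "NY"],
    "disease": ["yes", "yes", "no", "no"],
    "gender": ["f", "m", "f", "m"],
}

ranked = rank_column_pairs(columns)
for a, b, score in ranked:
    print(f"{a:<12}{b:<12}{score:.3f}")
```

Each row plays the role of one line in the result sheet: two column names and their dependency score, sorted so the strongest relationship comes first.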

Column Dependency Results in Datameer



Hans-Henning Gabriel

Henning is a results-oriented professional services consultant with a strong engineering and data science background. Specializing in generating business value from data, he's an expert problem solver with proven success in delivering solutions to analytical challenges.
