Once a month or so at Datameer, the engineers get to take a day to hit ‘pause’ on our current development projects and instead work on something that we’ve personally envisioned for the product. Adobe calls these “JDI” days (Just Do It), Atlassian calls them “ShipIt” Days, and at Datameer, we call ours GeekOut.
For this month’s GeekOut, I chose to develop what will hopefully become a new visualization feature in a future release of Datameer: a variant of the Hive plot. Hive plots are graphical tools for perceptually uniform visualization of network data, showing the connections between graph nodes.
Take, for example, the Apache Hadoop user mailing list. To create a Hive plot visualization of the list’s email communications, we first create a workbook that extracts two things from the email list data: the creator of each email thread, and a list (or a JSON array, to be precise) of all the people who replied to that thread. This is done with a few of our pre-built point-and-click analytic functions (GROUPBY, GROUP_JSON_ARRAY, etc.).
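For readers who think in code, here is a rough sketch of what that grouping step amounts to. It is plain Python, not Datameer's actual functions, and the sample addresses and the `group_repliers` helper are made up for illustration. I'm also assuming every reply is kept in arrival order with no de-duplication, which may not match what GROUP_JSON_ARRAY does.

```python
import json
from collections import defaultdict

# Hypothetical sample records: (thread creator, person who replied).
replies = [
    ("alice@example.org", "bob@example.org"),
    ("alice@example.org", "carol@example.org"),
    ("bob@example.org", "alice@example.org"),
]


def group_repliers(records):
    """Group repliers by thread creator and serialize each group as a JSON array."""
    grouped = defaultdict(list)
    for creator, replier in records:
        grouped[creator].append(replier)
    # One row per creator, with the repliers stored as a JSON array string.
    return {creator: json.dumps(repliers) for creator, repliers in grouped.items()}


rows = group_repliers(replies)
for creator, repliers_json in rows.items():
    print(creator, repliers_json)
```

Each resulting row pairs one thread creator with the JSON array of people who answered, which is the shape of data the widget is fed in the next step.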
Once we’ve done that, we would move over to the visualization module, choose the Hive widget, and then attach the data we worked with in the workbook.
So how do you interpret this visualization? The nodes along the upper axis represent the email addresses of people who started email threads and did not participate in any other conversations. The left axis contains nodes of users who participated in email conversations without starting a thread of their own. The two axes in the lower-right quadrant map people who both started new email threads and participated in threads created by other mailing list participants. Nodes on these two axes are duplicated so that between-node connections are easier to follow (i.e. no curve starts and ends on the same axis).
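The axis rules above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the widget's code: the axis names, the `assign_axes` and `build_links` helpers, the sample data, and the choice to draw each link from thread creator to replier are all hypothetical. Replies to your own thread are ignored here.

```python
def assign_axes(threads):
    """Map each axis name to its nodes. threads: creator -> list of repliers."""
    starters = set(threads)
    repliers = {r for creator, rs in threads.items() for r in rs if r != creator}
    both = starters & repliers
    return {
        "starters_only": sorted(starters - repliers),
        "repliers_only": sorted(repliers - starters),
        # People in both roles appear twice, once on each of the paired axes.
        "both_as_starter": sorted(both),
        "both_as_replier": sorted(both),
    }


def build_links(threads, axes):
    """Build (source, target) links so that no link stays on a single axis."""
    starters_only = set(axes["starters_only"])
    repliers_only = set(axes["repliers_only"])
    links = []
    for creator, rs in threads.items():
        source_axis = "starters_only" if creator in starters_only else "both_as_starter"
        for replier in rs:
            if replier == creator:
                continue  # skip replies to one's own thread
            target_axis = "repliers_only" if replier in repliers_only else "both_as_replier"
            links.append(((source_axis, creator), (target_axis, replier)))
    return links


# Hypothetical sample data: thread creator -> people who replied.
threads = {"alice": ["bob", "carol"], "bob": ["alice"], "dave": ["alice"]}
axes = assign_axes(threads)
links = build_links(threads, axes)
```

Because sources only ever sit on a "starter" axis and targets only on a "replier" axis, a link between two people who play both roles runs between the two duplicated axes rather than looping back onto one.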
There you have it, a brand new widget that you might just see in a future version of Datameer. This Datameer Hive plot visualization is powered by the D3.js library and is an adaptation of Mike Bostock’s Hive plot script. To see it in all its interactive glory, check out this demo video I created. Enjoy!