Today we’re featuring Howard Dresner, Dresner Advisory Services’ Chief Research Officer. He’ll provide us with his perspective on business intelligence, big data analytics and the future of both.
This is part of our podcast series on big data thought leaders.
Transcript, lightly edited for clarity:
Andrew Brust: Increasingly, the worlds of BI and big data analytics are colliding. Business intelligence, of course, helps you find answers to questions you know. Some say big data analytics helps you find the questions you don’t know to ask. Today we’re fortunate to speak with Howard Dresner, also known as the Father of Business Intelligence, about BI, big data analytics and the future of both. I’m Andrew Brust, and this is The Big Data Perspective.
Howard, thank you so much for joining us.
Howard Dresner: Thanks, Andrew.
How Will Big Data Continue to Shape Industries Such as IoT and Artificial Intelligence?
Andrew Brust: We’ll go ahead and get started. In October, the Dresner blog mentioned big data’s use cases for IoT, artificial intelligence and machine learning. My first question is, do you still believe that big data and BI have a role to play here? How do you think big data will continue to shape various industries that are using the technology?
Howard Dresner: There are several questions packed in there. First let’s define big data simply as just vast amounts of very low-level data, perhaps collected from sensors or other sources. I do believe that when you’re dealing with vast amounts of information that are collected from a variety of sensors, such as from your smartphone, car, smart meters, etc., trying to make sense of that as humans using traditional tools is extraordinarily difficult. Maybe impossible, really, because you’re dealing with such vast arrays of data and the noise-to-signal ratio is extremely high. You have to use technology to distill that down to something that makes sense to us mere mortals.
Do I think big data plays a role there? Most certainly, once again, if you’re dealing with technologies that can ingest in a reasonable time-frame these tremendously large sets of data and allow for their analytic processing, perhaps in a real-time fashion in terms of streaming types of analyses focused on operational analytics.
How to Find Relevant Business Questions From Big Data?
Andrew Brust: Whether you know it or not, you’ve segued into the second question, maybe even said a good portion of the answer. It’s not just about the volume of data — it’s also about the number of sources of that high-volume data and the fact that so many of them are coming from outside of the organization. What’s the poor business person to do? There’s quite an overwhelming potential for the analytics they can do. I guess for folks like us who are focused on analytics that’s exciting, but there really can be I suppose a burden of choice.
I’m wondering if you have general advice on how folks who really just need to get on with their business can make some intelligent decisions about which data sources to use and just how not to get lost in the whole process merely of sourcing it so that they can get beyond that and start getting some insights.
Howard Dresner: That’s a fairly thorny issue. The good news is that there is more data than ever before. You’re right, most of it is on the outside, on the other side of the firewall. It allows you, if you can avoid the landmines that are out there, to create a much better perspective on what’s going on in the real world than you could’ve before, by combining internal sources of data with a variety of external sources: syndicated data, government sources, whatever the case may be. If you can use those to paint a picture of the market, of your customers, of the competitive landscape that is more accurate than you could otherwise paint, then that’s a win.
That said, it’s extraordinarily complex. You’re also dealing with users who may not be as sophisticated as BBAs or data analysts that are engaging in this. The downside to that is they may be engaging in analyses that are erroneous and may lead them to false conclusions about the marketplace, which then might be invested in and may wreak havoc.
A lot of organizations are starting to bring in Chief Data Officers to start to achieve some policies around the use of data within the organization. The volumes and the complexity associated with data probably are not going to go away. We need to have some policies within the organization that allow us to certify various data sources, not all of them of course, but to be able to say that, “All right, these are the blessed data sources and here’s how you use them.” Of course, that goes hand in hand with education, which I have to say is something that organizations have not invested enough in.
As a matter of fact, we’re having a conference in July, and one of the sessions I’ve been toying with having is what I call “fear of data.” Fear of data is, how do you get people to overcome their fear of data, and not just their fear of data but their ignorance of data? That has nothing to do with technology. It has to do with making an investment in people and curriculums that can help educate the rank-and-file users within your organization around the data that’s available to them, and how they might use that. Most organizations simply just haven’t done that. You need a fluency around data within organizations. If you can do that, if you can raise the overall, we’ll call it “data IQ” of the organization, then that reduces some of the risk around these other sources of data that are brought in and combined with our internal sources.
Andrew Brust: That’s interesting, because I was also going to pick your brain on what cultural and behavioral changes you thought might be necessary to achieve success around analytics. You’ve kind of addressed that. Maybe if I can interpret and extrapolate a little bit from what you’ve said, it sounds like that kind of data literacy is really where companies need to go. That is, if they really want to pursue what’s now getting called digitalization and if they really want to become “data-driven.” It sounds like what you’re saying is there’s actually an issue of trepidation out there.
Howard Dresner: You’re absolutely right. You hit the nail on the head. Trepidation is exactly the right word. Please continue.
Self-Service in the BI and Big Data Worlds
Andrew Brust: Good. So far my paraphrasing is good. That’s excellent. It seems like sometimes that can lead to maybe erroneous analyses. Maybe even more often it preempts analyses, because people don’t want to risk making those mistakes. It sounds like that’s very sage advice, that maybe the right thing to do is address that trepidation head on. You also mentioned this notion of having blessed or certified datasets, which makes a lot of sense and certainly would help folks with the issue of being overwhelmed, with being able to get data from so many places.
Help me, though. I want to make sure we’re not predicting a zero-sum game here. If there’s one big difference maybe between big data and BI, it’s that with big data things have seemed to be more self-service. You haven’t had to agree upon the perfect star schema in the data warehouse or the OLAP cube. That’s given a certain amount of agility and autonomy to people. If we’re going to get to the place where datasets have to be blessed before we use them … I have a feeling there’s a good answer to this, but tell us why that’s not the same as being forced back to the schema of reference that a lot of people thought made things slow and sticky.
Howard Dresner: I try not to get religious about particular technologies or particular roles, but I see with big data the advent of the data scientist which, once again, centralizes that power within a single individual or a single group within the organization and is actually at odds with self-service. I’m a big fan of self-service. I coined another term back in the day, 1993, “information democracy,” which is very much aligned with self-service. It’s about, how do we get the right insights into the hands of the individuals that really need them in a timely fashion? That’s really very much what self-service is all about. I’m a huge advocate of that. I do agree with you that we don’t want to create obstacles to people having access to data and access to insight, but at the same time there needs to be some order. We don’t want to have data chaos. I think that does happen in a lot of organizations.
Going back to your comment about culture, my second book, Profiles in Performance, was all about what I called performance-directed culture and how to achieve one of those cultures. Realistically, it has to start at the very top of the organization where they see fact-based decision making as critical to the organization’s success and their personal success. I mean, the CEO has to believe this. Everybody talks about sponsorship and applying dollars to things. What about real ownership on the part of the C-levels in the organization? It’s amazing how they, and perhaps only they, can literally transform the culture in relatively short order. The culture does matter a lot. People have to value data and fact-based analysis and if the culture embraces that, if everybody at the top of the organization uses it, it’s amazing how quickly people will align to that reality.
I think culture is a big part of it. You need to have technologies, but I try not to get hung up on the various technologies. I’m much more concerned about culture, organization and then the processes. It may not be an individual certifying data sources, but perhaps collectively we need to agree on, “All right, these are the ones that we know we can trust. They’re from trusted external sources. They’re from trusted internal sources.” Then, certainly give people the latitude to do more, to go beyond that, without corrupting those internal sources. I may want to combine it with some social media types of analyses or external syndicated sources, and I shouldn’t be prevented from doing that as long as I don’t change the original source data, the certified source data internally.
I think you want the best of both worlds. You do want to have trusted sources, but you also want to have the freedom to explore and to combine and to discover new insights using external data.
Andrew Brust: This is audio, so you can’t see me and the audience can’t see me, but I’m nodding a lot. It sounds like what we’re saying here is, “No, we don’t want to go back to the days where everything is so limited and prescribed that by the time a business person is consuming the data it’s already highly, highly schematized.” What we are saying is that there is really a necessary vetting process, and if for no other reason so that when people do their analyses they can cite their sources and hopefully be citing specific sources that have been vetted and have been certified, approved, endorsed, whatever verb makes the most sense there. I think that’s a good middle ground. What we’re saying is we do want to provide latitude. We want to do it within the context, though, of some trust.
Howard Dresner: I would add one more thing, Andrew. I would say that this is a business initiative. Certainly we need the technical and IT resources to support what we’re doing, but the business needs to take the lead on this. We shouldn’t abdicate that responsibility as business people. Business people need to be aware of the data, they need to take advantage of the data, and they need to be supported by the IT function if there is an IT function. We shouldn’t allow IT to own this problem. This is a problem that we need to jointly own. The most successful implementations are where there is complete alignment between the IT resources and line of business.
Best Practices in BI and Big Data Analytics Understanding
Andrew Brust: If that education you were talking about and extinguishing the trepidation that I had mentioned, if that’s part of all that, then it sounds like people will be on a good path. Let me ask you a flip side of this, though. Maybe the question’s a little loaded, but do you think it’s possible we’re getting to the point where some of the tools for doing analytics, whether it’s big data or BI for that matter, might actually hide poor business understanding? Are we perhaps at the point where the tools are easy enough that they allow people to put together analyses that look authoritative but in fact may be, I don’t know, based perhaps on less than rigorous analysis on the back end?
Howard Dresner: You’re referring to Excel?
Andrew Brust: I’ll let you pick your poison there.
Howard Dresner: This isn’t a new problem, that’s my point. You put some data into a spreadsheet, you generate some charts, and all of a sudden it gives them credibility. It could be invented numbers. It’s hard to know that. Do I think there’s a risk? Yeah. Once again, it does come down to education. That’s part of my concern, by the way, with some of the machine learning tools that are out there. If you’re going to embrace machine learning, and I think there are some great use cases out there, you have to have somebody that’s responsible for auditing those things. We can’t simply say, “Because we don’t have the competence level and the knowledge level to do the analyses the proper way, we’re just going to hand everything over to the black box.” I think that’s foolishness. I think people need to understand what’s really driving the business all the way down to the underlying data. It goes back to that education issue.
It’s great that the tools are so usable, but you have to be able to look at an analysis that’s generated by any tool and see a number that just looks wrong and be able to say, “Hold on.” I do it every day. We analyze a lot of data. You look at something, and if you know your business and you know your data, you’ll look at something and say, “That’s an anomaly. I need to drill down now and look at the actual row set or the actual value and try to figure out what’s going on here. Should I exclude that? Is that just an outlier? Was that noise? Or is there something else here that needs to be explored?” I think we need to make sure that we don’t give up that ability, and we need to educate people, once again, to understand how to use data and understand data so when they see something that looks odd, they know it, as opposed to simply accepting it.
I think that’s really what you’re getting to. Are the tools so sophisticated and so visual now and maybe so automated that we don’t do that anymore, we simply don’t challenge? We should always be challenging what we see in the data.
Howard Dresner’s Big Data Predictions for 2017
Andrew Brust: Good. I wasn’t thinking we’d be going for best practices in this discussion, but I think we’ve extracted some: the education, the combating the trepidation, the endorsement of datasets and being able to cite those and the kind of audit and challenge that you’re talking about. I think what you were also mentioning, even if not in so many words, is helping people develop an instinct for data that makes sense and analyses that make sense, and helping them in a visceral way to be able to recognize anomalies. That’s hard, but, like with many things, if you outline that as the goal, then you’re probably a good way towards achieving it versus not even identifying it at all.
We’re headed towards the end of our chat here, which means that I have the easy job of asking you a hard question around predictions and what you see coming down in the analytics scene this year. What are, I don’t know, a top two or three things that you see coming up in the near and more distant futures?
Howard Dresner: I think I would stay to a fairly near-term horizon. I tried to give up prognostication. I do think that increasingly the users will weigh in, will continue to have strong influence, whether it’s on the data preparation side or on the actual analytics themselves, supported by IT. There was a time when this was all IT-driven, which, by the way, is not how things started. Back in the day it was very user-driven. Then it became IT-driven, and now the pendulum is swinging back again. I think we’ll continue to see that happening. We’ll see more and more user-driven types of technologies, whether they’re up in the cloud or whether they’re on premises. I do think that’s a real trend.
Cloud, by the way, is another one. We’ve been tracking cloud for six years now, and it’s continuing to grow. It’s almost linear every year. I think that we’ll see more and more cloud deployments, certainly within smaller and mid-sized organizations to achieve better performance, to achieve better cost savings, but also in large organizations. We’re seeing it certainly within departments. I think cloud is a real trend.
I’ve been talking about collaborative capabilities for a long time, and we’re starting to see more of that. We look at that in combination with user governance. I think we’re going to see more of that capability over time, where we’re trying to leverage groups of people and specific resources and skillsets within our organization that may be scarce to bring them into an electronic conversation around business intelligence and the analytics. I think that’s quite important.
I think things like IoT. It’s interesting, because IoT doesn’t register as a really high priority within our community, within our research community, but it’s inevitable, because the data is being collected and it has been for quite some time. There’s a gold mine out there to be able to connect all the dots and analyze all this data. I think you’re going to see more interesting syndicated sources of data that are out there. That means that the ways in which we want to analyze that data … We’re involved in advanced and predictive analytics. We’re involved in things like location intelligence. In fact, we’ve been tracking location for quite some time. Of course, location isn’t new. We’ve had GIS systems for a really long time. They’ve been highly specialized, but it is becoming increasingly mainstream. It’s been a sleeper topic. If you look at IoT, the value there is really understanding not only the state of various sensors and devices out there but also understanding location over time.
There are a number of technologies that we see as growing in importance. Once again, some of them are sleepers, but they’re all very much user-enabled. They’re all about self-service, once again, supported by whatever technologists or IT organization that may be available to them.
Andrew Brust: Makes sense. By the way, recently I guess one of the big BI players, Qlik, announced an acquisition of a GIS location analytics vendor, so I think that nicely corroborates some of what you were saying there. Before we go, could I ask you to just expand on what you had in mind when you were talking about user governance? That seems like a really intriguing concept, though I didn’t quite understand it. It’s possible the audience didn’t either. If I could get you to expound, and then we’ll wrap up.
Howard Dresner: I think it goes beyond governance, really. The reason we combined it or joined it with our collaborative research is because it’s about user governance. It’s, how do we share information? How do we control how information is shared in the organization? It goes beyond that, too. One of the things we’re starting to look at more closely is the notion of catalog. How do we find information within our organization? It does talk about things like certification. How do we certify a particular model? How do we determine which are the most used, which are the most popular, as well as those that are certified within our organization? How do we capture commentary around these things and share these things?
Collaboration and governance and things like catalog I think all go hand in glove, especially in a very large organization where finding things is a real problem and you want to reduce the amount of redundancy that you have in the organization or the duplication of effort.
Andrew Brust: Neat. That sounds also like we’re talking, this can be hard in large organizations, but we’re talking about … Yeah, you’re crowdsourcing things to an extent, but I think it’s also more about building consensus and getting people who have a collegial relationship to also reach a shared understanding of datasets that are important and accurate and reliable. I feel like that theme buzzed a few times in this conversation, and that’s pretty exciting, actually, because it’s not something I’ve seen discussed a lot or written about a lot, so I’m happy we worked that out and extracted it.
Howard, thank you so much. I knew this would be a good chat, but it was even better than I thought it would be. I wish you a great 2017.
Howard Dresner: Thanks, Andrew. Same to you.