Informatica’s Anil Chakravarthy and I continue our conversation around data security, this time discussing how risk management is a perfect example of a data-driven exercise. He elaborates that in the past it was either driven by human expertise or by process and increasingly, it’s becoming data-driven.
We also talk about the role Informatica plays and how cloud and data aggregation are their sweet spot.
Don’t miss it! Tune in below for part two of our Big Data & Brews with Informatica.
Stefan: But let’s talk a little bit about that “using data to secure” topic. Where do you see the opportunity in the market?
Anil: You mentioned Splunk earlier. You see a lot of companies now which have really changed the way security essentially happens. Or, I can even broaden the topic further to your earlier conversation about risk management. When you think of managing your risk, that is essentially a data-driven exercise right now. In the past, it was either human-expertise-driven or process-driven. I think increasingly we’ve seen that it is becoming data-driven. A great example is, think of just what is happening even at the network security level. In the past, it used to be that you had specific devices like routers and firewalls, etc., from which you collected logs, and you prerecorded what you were looking for and you basically said, “This is what a security attack looks like.” And then, you looked for patterns that matched that prerecorded knowledge that you had.
Now that world is changing very quickly even at the network level. You basically now collect logs not only from all the network devices, but from applications, Active Directory, user access. You pretty much collect all of that information and then you use big data techniques to find the pattern rather than say, “Hey, I already know the pattern of attack and I’m just going to go look for that pattern.” I say, “I don’t know the pattern of attack.” The assumption right now is, I have all this to defend and attackers only need one way to get in. Therefore, I don’t know what way they’re using to get in. So, let me get the data and see what the data tells me in terms of what may be abnormal and then use that to find if it’s really a security vulnerability, right? That, to me, is how data is being used to change the world and that’s happening in fraud detection. That is happening in cyber security. It is happening in, for example, insider threat detection.
So, in a variety of areas, it all used to be that I would have a pattern and then I’d go look for data that fit that pattern. It’s now like, “Let me get the data and identify what the pattern might be,” because there are just way too many ways to get in for a malicious person and so, I can’t predict what way they will use. So, it’s the mindset … The mindset has changed and the technology is now available to make that happen.
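The shift Anil describes, from matching logs against a prerecorded attack signature to letting aggregated data reveal what is abnormal, can be illustrated with a minimal sketch. This is not any vendor’s product or API; it is just a toy z-score test over aggregated event counts, with illustrative data, to show the “find the pattern from the data” idea:

```python
# Minimal sketch of data-driven anomaly detection: instead of looking
# for a known attack signature, build a baseline from aggregated log
# data and flag deviations from it. All names, data, and the threshold
# here are illustrative assumptions.

from statistics import mean, stdev

def flag_anomalies(event_counts, threshold=3.0):
    """Return indices of time windows whose event count deviates
    strongly from the historical baseline (simple z-score test)."""
    mu = mean(event_counts)
    sigma = stdev(event_counts)
    if sigma == 0:
        return []
    return [i for i, count in enumerate(event_counts)
            if abs(count - mu) / sigma > threshold]

# Hypothetical hourly login-failure counts aggregated from many
# sources (network devices, applications, Active Directory). The
# spike was never defined in advance as an "attack pattern" -- it
# simply stands out from the data itself.
counts = [12, 9, 11, 10, 13, 240, 12, 11, 10, 9, 12, 11]
print(flag_anomalies(counts))  # -> [5], the index of the spike
```

A real deployment would of course use far richer features and models than a single z-score, but the contrast holds: nothing in the code encodes what an attack looks like ahead of time.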
Stefan: Where is Informatica playing a role there? In the data aggregation part?
Anil: Yes, exactly. We are not the analytics provider. We will work with any analytics provider. We’ll work with any kind of visualization technology or user-oriented technology that the customer wants to use. We are a data infrastructure provider so it’s really data aggregation, data integration, cleansing and getting a 360 degree view of data so that if you identify, “This is the area that I need a 360 degree view of,” I can get that and then that can then be consumed by the applications. That’s really the role that we play.
Stefan: What are some of the biggest challenges to get all this data fabric into place and in big companies?
Anil: The challenges are … Right now, for example, this is all the different types of data. It’s not only structured data but machine data, unstructured data, semi-structured data, etc. So, there’s a lot of variety of different data types. Second is the latencies. Some data you need to get in batch. Some is in streaming, some real-time, etc. Third is the location of data, right? On premise versus cloud versus some combination of the two and making sure you get the access. And the fourth is, once you get all these different data types, how do you make sure that you process it efficiently, because you don’t want to get data and store it all in one place and that could become a security hole by itself. So, how do you process it most efficiently for all these different data types and keep up with different technologies? Think of NoSQL for example, changing so rapidly. For a customer, how do you build an abstraction layer so that, while you can make use of the new technologies, you don’t go out of fashion or you’re not stuck with the wrong technology. It’s helping the customer strike that balance. Those are really the challenges for us.
Stefan: Do you see that the data volume is a bigger problem or the data variety? Are companies challenged by the exponentially fast-growing number of data sources or is each data source just becoming bigger?
Anil: Yeah, I think it depends on the company. I’ve talked to customers where both hold. For example, we have customers who are now saying, “Look, rather than going to traditional data providers which are not collecting as many data sources, we want to go to as many sources as possible.” I’ll give you an example. The Weather Channel, for example, is a customer. Historically, to predict the weather and to provide weather forecasts, they would go to the government sources and get some data from there, etc. But, now what is happening is there is a ton of available data. For example, every town, every city, publishes certain weather data that they can use, right? For such companies, the volume is huge but the data types are relatively still the same.
Now on the other hand, think of someone like an insurance company today that is trying to use, for example, new types of techniques to predict risk, and now suddenly they’re getting some IoT type of data from sensors that may be deployed. They may be working with the reinsurers who are providing certain data. They have their own mainframe systems which have certain data. So, for them, variety is the problem. It’s not so much the volume; the volume is manageable. Variety is the problem. So, I think we’re still in the early days. We’re definitely seeing all these different kinds, and I think that’s part of the challenge in building … I don’t know if you can build a universal platform that can handle all of it. It’s very difficult to believe, but who knows? There may be some promising technologies that seem to be very flexible in being able to do that.
Stefan: From your perspective, where’s the biggest growth opportunity for your company?
Anil: We look at it as the intersection of what’s happening with the cloud and big data. Not only the movement of data between on-premise and cloud and from cloud to cloud, but also just the sheer growth of data in the cloud. This is a big opportunity. And if you look at the big data world, I think a lot of the value, especially for enterprise customers, comes from when they can derive insights by combining data that they have from their own systems, etc., with either third-party data, customer-generated data, or machine data that they can put together. So, that intersection is good for us, and we are a data infrastructure provider, so those are the two big areas where we see opportunity.
Stefan: Yeah, couldn’t agree more with you that really enriching more and more data will give you more context and therefore, better insights.