VMworld

 

Dana Gardner's BriefingsDirect

Listen to the podcast. Find it on iTunes/iPod. Read a full transcript or download a copy. Sponsor: AccelOps. Connect with AccelOps: LinkedIn, Twitter, Facebook, RSS.

The  latest BriefingsDirect podcast discussion centers on how new data and  analysis approaches are significantly improving IT operations  monitoring, as well as providing stronger security.

The conversation examines how AccelOps has developed technology that correlates events with relevant data across IT systems, so that operators can gain much better insights faster, and then learn as they go to better predict future problems before they emerge. That's because advances in big data analytics and complex event processing (CEP) can come together to provide deep and real-time, pattern-based insights into large-scale IT operations.

Here  to explain how these new solutions can drive better IT   monitoring and  remediation response -- and keep those critical systems   performing at  their best -- is Mahesh Kumar, Vice President of Marketing at AccelOps. The discussion is moderated by  Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: AccelOps is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:
Gardner: Is  there a  fundamental change in how we approach the data that’s  coming  from IT  systems in order to get a better monitoring and  analysis  capability?

Kumar: The data has to be analyzed in  real-time. By real-time I mean in  streaming mode before the data hits  the disk. You need to be able to  analyze it and make decisions. That's  actually  a very efficient way of  analyzing information. Because you  avoid a  lot of data sync issues and  duplicate data, you can react  immediately  in real time to remediate  systems or provide very early  warnings in  terms of what is going wrong.

The challenges in doing  this streaming-mode analysis are scale and speed. The traditional  approaches with pure relational databases alone are not equipped to  analyze data in this manner. You need new   thinking and new approaches to  tackle this analysis problem.
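As an editorial illustration of the streaming-mode idea described above, here is a minimal Python sketch that analyzes each reading as it arrives and keeps only a short in-memory window, so nothing needs to be written to disk before a decision is made. This is not AccelOps code: the class name `SlidingWindowMonitor`, the 30-second window, the 80 percent threshold, and the sample CPU readings are all assumptions made for the example.

```python
from collections import deque


class SlidingWindowMonitor:
    """Keeps only the last `window_seconds` of readings in memory (hypothetical helper)."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.readings = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def observe(self, timestamp, value):
        """Process one reading as it arrives; return an alert string or None."""
        self.readings.append((timestamp, value))
        self.total += value
        # Evict readings that have fallen out of the time window.
        while self.readings and self.readings[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.readings.popleft()
            self.total -= old_value
        average = self.total / len(self.readings)
        if average > self.threshold:
            return f"early warning: average {average:.1f} over last {self.window_seconds}s"
        return None


def analyze_stream(events, monitor):
    """Consume an iterable of (timestamp, value) events and yield alerts as they occur."""
    for timestamp, value in events:
        alert = monitor.observe(timestamp, value)
        if alert is not None:
            yield timestamp, alert


# Made-up CPU utilization readings: (seconds, percent).
cpu_stream = [(0, 40.0), (10, 55.0), (20, 95.0), (30, 98.0), (40, 97.0)]
monitor = SlidingWindowMonitor(window_seconds=30, threshold=80.0)
alerts = list(analyze_stream(cpu_stream, monitor))
```

With these sample readings the monitor raises its first warning at the 30-second mark, once the windowed average crosses the threshold. A production system would also have to handle the scale and speed challenges mentioned above, which this single-process sketch does not attempt.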

Gardner: Also for issues of security, offenders are trying different types of attacks. So this needs to be in real-time as well?

Kumar: You might be familiar with advanced persistent threats (APTs).    These are attacks where the attacker tries their best to be  invisible.   These are not the brute-force attacks that we have  witnessed in the   past. Attackers may hijack an account or gain access  to a server, and   then over time, stealthily, be able to collect or  capture the   information that they are after.

These kinds of threats cannot be   effectively  handled only by looking at data historically, because  these  are  activities that are happening in real-time, and there are  very,  very  weak signals that need to be interpreted, and there is a  time  element  of what else is happening at that time.   This too calls for  streaming-mode analysis.

If you notice, for   example, someone  accessing a server, a database administrator accessing a   server for  which they have an admin account, it gives you a certain   amount of  feedback around that activity. But if on the other hand, you   learn  that a user is accessing a database server for which they don’t   have  the right level of privileges, it may be a red flag.

You   need  to be able to connect this red flag that you identify in one   instance  with the same user trying to do other activity in different   kinds of  systems. And you need to do that over long periods of time in   order to  defend yourself against APTs.
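The idea of connecting one red flag with the same user's activity on other systems over a long period can be sketched as follows. This is an illustrative Python example under stated assumptions, not a description of any product: the `Event` fields, the `UserCorrelator` class, the 30-day window, and the rule "unauthorized access on three distinct systems" are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

DAY = 86400  # seconds


@dataclass(frozen=True)
class Event:
    timestamp: int     # seconds since an arbitrary start
    user: str
    system: str
    authorized: bool   # did the user hold the right privileges?


class UserCorrelator:
    """Links weak signals for the same user across systems (hypothetical rule)."""

    def __init__(self, window_seconds, min_systems):
        self.window_seconds = window_seconds
        self.min_systems = min_systems
        self.flags = defaultdict(list)  # user -> [(timestamp, system)]

    def observe(self, event):
        """Return True when this event pushes the user over the threshold."""
        if event.authorized:
            return False  # expected activity, not a red flag
        cutoff = event.timestamp - self.window_seconds
        # Keep only red flags that are still inside the long time window.
        recent = [(t, s) for t, s in self.flags[event.user] if t > cutoff]
        recent.append((event.timestamp, event.system))
        self.flags[event.user] = recent
        distinct_systems = {system for _, system in recent}
        return len(distinct_systems) >= self.min_systems


events = [
    Event(0 * DAY, "alice", "db-01", True),
    Event(1 * DAY, "bob", "db-01", False),
    Event(9 * DAY, "bob", "file-02", False),
    Event(20 * DAY, "bob", "hr-app", False),
    Event(21 * DAY, "carol", "db-01", False),
]
correlator = UserCorrelator(window_seconds=30 * DAY, min_systems=3)
suspects = [e.user for e in events if correlator.observe(e)]
```

Each individual event here is a weak signal; only the combination across three systems within the window marks the made-up user "bob" as worth investigating.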

Gardner: It's always been difficult to gain accurate analysis of large-scale IT  operations, but  it seems that this is getting more difficult. Why?

Kumar: If you look at trends, there are on average about 10 virtual machines (VMs) to a physical server.    Predictions are that this is going to increase to about 50 to 1,  maybe   higher, with advances in hardware and virtualization  technologies. The increase in density of VMs is a complicating   factor  for capacity planning, capacity management, performance   management,  and security.

In a very short period of time, you have in effect    seen a doubling of the size of the IT management problem. So there are  a   huge number of VMs to manage and that introduces complexity and a  lot   of data that is created.

Cloud computing

Cloud computing is another big trend. All analyst research and customer feedback suggests that we're  moving to a hybrid model, where you have some workloads on a public  cloud, some in a private cloud, and some running in a traditional data center. For this, monitoring has to work in a distributed environment, across multiple controlling parties.

Last but certainly not least, in a hybrid environment, there is absolutely no clear perimeter that you need to defend from a security perspective. Security has to be pervasive.

Given these new    realities, it's no longer possible to separate performance monitoring    aspects from security monitoring aspects, because of the distributed    nature of the problem. ... So change is happening much more quickly and  rapidly   than ever before. At the very least, you need monitoring and  management   that can keep pace with today’s rate of change.


The  basic problem you need to address is one of analysis.   Why is that? As we  discussed earlier, the scale of systems is really   high. The pace of  change is very high. The sheer number of   configurations that need to be  managed is very large. So there's data   explosion here.

Since you  have a plethora of information coming   at you, the challenge is no longer  collection of that information.  It's  how you analyze that information  in a holistic manner and provide   consumable and actionable data to your  business, so that you're able  to  actually then prevent problems in the  future or respond to any  issues  in real-time or in near real-time.

You  need to nail the   real-time analytics problem and this has to be the  centerpiece of any   monitoring or management platform going forward.

Advances in IT

Gardner: So we have the modern data center, we have issues of complexity and    virtualization, we have scale, we have data as a deluge, and we need to    do something fast in real-time and consistently to learn and relearn   and  derive correlations.

It turns out that there are some   advances  in IT over the past several years that have been applied to   solve  other problems that  can be brought to bear here. You've looked  at what's being done with big data and in-memory  architectures, and you've also looked at some of the great work that’s  been done in services-oriented architecture (SOA) and CEP, and you've put these together in an interesting way.


Kumar: Clearly there is a big-data angle to this.

Doug Laney, an analyst at META Group and later at Gartner, probably put it best when he highlighted that big data is about volume, the velocity or the speed with which the data comes in and out, and the variety or the number of different data types and sources that are being indexed and managed.

For    example, in an IT management paradigm, a single configuration setting    can have a security implication, a performance implication, an    availability implication, and even a capacity implication in some cases.    Just a small change in data has multiple decision points that are    affected by it. From our angle, all these different types of criteria    affect the big data problem.

Couple of approaches

There  are a couple of approaches.   Some companies are doing some really  interesting work around big-data   analysis for IT operations.

They primarily focus on gathering the data, heavily indexing it, and making it available for search, thereby deriving analytical results. It allows you to do forensic analysis that you were not easily able to do with traditional monitoring systems.

The challenge with that approach is that it swings the pendulum all the way to the other end. Previously we had very rigid, well-defined relational data models or data structures, and the index-and-search approach is much more free-form. So the pure index-and-search type of approach is sort of the other end of the spectrum.

What    you really need is something that incorporates the best of both  worlds   and puts that together, and I can explain to you how that can  be   accomplished with a more modern architecture. To start with, we  can't do   away with this whole concept of a model or a relationship  diagram or   entity relationship map. It's really critical for us to  maintain that.


I’ll    give you an example. When you say that a server is part of a  network   segment, and a server is connected to a switch in a particular  way,  it  conveys certain meaning. And because of that meaning, you can  now   automatically apply policies, rules, patterns, and automatically    exploit the meaning that you capture purely from that relationship. You    can automate a lot of things just by knowing that.

If you stick to a pure index-and-search approach, you basically zero out a lot of this meaning and you lose information in the process. Then it's the operators who have to handcraft queries to reestablish the meaning that’s already out there. That can get very, very expensive pretty quickly.

Our approach to this big-data   analytics  problem is to take a hybrid approach. You need a flexible and   extensible  model that you start with as a foundation, that allows you   to then  apply meaning on top of that model to all the extended data   that you  capture and that can be kept in flat files and searched and   indexed. You  need that hybrid approach in order to get a handle on this   problem.
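To make the hybrid idea concrete, the following Python sketch pairs a small relationship model with an index over flat log lines, so that a free-form search hit automatically picks up the meaning (here, segment-level policies) carried by the model. It is an editorial illustration only: the `MODEL` entries, the policy names, the log lines, and the helper functions are all invented for the example and do not reflect AccelOps internals.

```python
# Structured side: a small, hypothetical relationship model.
MODEL = {
    "web-01": {"type": "server", "segment": "dmz", "switch": "sw-a"},
    "db-01": {"type": "server", "segment": "internal", "switch": "sw-b"},
}

# Policies attach to relationships (the network segment), not to single devices.
SEGMENT_POLICIES = {
    "dmz": ["alert-on-outbound-ssh"],
    "internal": ["alert-on-unprivileged-db-access"],
}

# Unstructured side: made-up raw log lines kept as flat text.
RAW_LOGS = [
    "2011-11-01T10:00:00 web-01 sshd outbound connection to 203.0.113.7",
    "2011-11-01T10:05:00 db-01 login failed for user guest",
    "2011-11-01T10:06:00 unknown-host kernel: link up",
]


def build_index(lines):
    """Map each whitespace-separated token to the line numbers containing it."""
    index = {}
    for number, line in enumerate(lines):
        for token in line.split():
            index.setdefault(token, set()).add(number)
    return index


def search_with_meaning(term, lines, index):
    """Free-form search whose hits are enriched from the relationship model."""
    results = []
    for number in sorted(index.get(term, ())):
        host = lines[number].split()[1]  # second field is the host name
        entity = MODEL.get(host)         # None when the host is not modeled
        policies = SEGMENT_POLICIES.get(entity["segment"], []) if entity else []
        results.append({"line": lines[number], "host": host, "policies": policies})
    return results


INDEX = build_index(RAW_LOGS)
hits = search_with_meaning("sshd", RAW_LOGS, INDEX)
```

The search itself stays free-form, but the operator does not have to handcraft a query to rediscover that `web-01` sits in the DMZ; that relationship comes from the model. Hosts missing from the model still appear in results, just without policies attached.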

Gardner: Why do you need  to think about the architecture that  supports  this big data capability  in order for it to actually work in  practical  terms?

Kumar: You start with a fully  virtualized  architecture, because it allows  you not only to scale  easily, ... but you're able to reach  into these  multiple disparate environments and  capture and analyze and  bring that  information in. So virtualized  architecture is absolutely  essential.

Auto correlate

Maybe    more important is the ability for you to auto-correlate and analyze    data, and that analysis has to be distributed analysis. Because  whenever   you have a big data problem, especially in something like IT    management, you're not really sure of the scale of data that you need  to   analyze and you can never plan for it.

Think of it as applying a MapReduce type of algorithm to IT management problems, so that you can do    distributed analysis, and the analysis is highly granular or specific.    In IT management problems, it's always about the specificity with which    you analyze and detect a problem that makes all the difference  between   whether that product or the solution is useful for a customer  or not.
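A MapReduce-type analysis of IT events can be sketched in a few lines of Python. This is a single-machine illustration under stated assumptions: threads stand in for the worker VMs mentioned in the discussion, and the event tuples, function names, and the choice to count errors per host and component are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partition(events):
    """Map step: count error events per (host, component) in one partition."""
    counts = Counter()
    for host, component, severity in events:
        if severity == "error":
            counts[(host, component)] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def distributed_error_counts(partitions, workers=4):
    """Run the map step on each partition in parallel, then reduce the partials."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce(reduce_counts, partials, Counter())


# Made-up events, already split into two partitions.
partitions = [
    [("web-01", "nic", "error"), ("web-01", "disk", "info")],
    [("web-01", "nic", "error"), ("db-01", "disk", "error")],
]
totals = distributed_error_counts(partitions)
```

Keying the counts by host and component, rather than by host alone, is a small nod to the point about specificity: the result says which part of which machine is misbehaving.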


A major advantage of distributed  analytics is that you're freed from   the scale-versus-richness trade-off,  from the limits on the type of   events you can process. If I wanted to  do more complex events and   process more complex events, it's a lot  easier to add compute capacity   by just simply adding VMs and scaling  horizontally. That’s a big  aspect  of automating deep forensic analysis  into the data that you're   receiving.

I want to add a little bit  more about the richness  of  CEP. It's not just around capturing data and  massaging it or  looking  at it from different angles and events. When we  say CEP, we  mean it is  advanced to the point where it starts to capture  how people  would  actually rationalize and analyze a problem.

The  only way   you can automate your monitoring systems end-to-end and get  more of  the  human element out of it is when your CEP system is able to  capture   those nuances that people in the NOC and SOC would normally use to rationalize when they look at events. You not    only look at a stream of events, you ask further questions and then    determine the remedy.
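The pattern "look at a stream of events, ask a further question, then determine the remedy" can be illustrated with a small hand-written rule. The sketch below is plain Python, not a real CEP engine, and everything in it is an assumption made for the example: the event tuples, the rule (repeated failed logins followed by a success), the follow-up question (is the user a known administrator of that host?), and the remedy labels.

```python
def detect_failures_then_login(events, max_failures=3, window=60):
    """Yield (user, host, time) when at least `max_failures` failed logins are
    followed by a successful login for the same user and host, with all the
    failures no more than `window` seconds before the success."""
    failures = {}  # (user, host) -> timestamps of recent failures
    for time, kind, user, host in events:
        key = (user, host)
        recent = [t for t in failures.get(key, []) if time - t <= window]
        if kind == "login_failed":
            recent.append(time)
            failures[key] = recent
        elif kind == "login_ok":
            if len(recent) >= max_failures:
                yield user, host, time
            failures.pop(key, None)  # a success resets the failure history


def follow_up(match, context):
    """Ask the further question an operator would ask, then pick a remedy."""
    user, host, _time = match
    is_admin = user in context["admins"].get(host, set())
    return "notify-only" if is_admin else "investigate"


# Made-up events: (seconds, kind, user, host).
events = [
    (0, "login_failed", "eve", "db-01"),
    (10, "login_failed", "eve", "db-01"),
    (20, "login_failed", "eve", "db-01"),
    (25, "login_ok", "eve", "db-01"),
    (30, "login_failed", "dan", "db-01"),
    (35, "login_ok", "dan", "db-01"),
]
context = {"admins": {"db-01": {"dan"}}}
matches = list(detect_failures_then_login(events))
remedies = [follow_up(m, context) for m in matches]
```

The detection step alone only says that something unusual happened; the follow-up step encodes the judgment a NOC or SOC operator would apply before acting on it.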

No hard limits

To do this, you should have a rich data set to analyze, i.e., there shouldn’t be any hard limits placed on what data can participate in the analysis, and you should have the flexibility to easily add new data sources or types of data. So it's very important for the architecture to be able to event not only on data that is stored in traditional or well-defined relational models, but also against data that’s typically serialized and indexed in flat-file databases.

Gardner: What's the  payoff if you do this  properly?

Kumar: It is no surprise that our  customers don’t come to  us saying we have a big data problem, help us  solve a big data problem,  or we have a complex event problem.


Their   needs are really around  managing security, performance and   configurations. These are three  interconnected metrics in a virtualized   cloud environment. You can't  separate one from the other. And   customers say they are so  interconnected that they want these managed   on a common platform. So  they're really coming at it from a   business-level or outcome-focused  perspective.

What AccelOps does under the covers is apply techniques such as big-data analysis, complex event processing, etc., to then solve those problems for the customer. That is the key payoff -- that the customer’s key concerns that I just mentioned are addressed in a unified and scalable manner.

An   important factor for customer  productivity and adoption is the  product  user-interface. It is not of  much use if a product leverages  these  advanced techniques but makes the  user interface complicated --  you end  up with the same result as before.  So we’ve designed a UI that’s very easy to use, requires one or two clicks to get the    information you need; a UI-driven ability to compose rich events and   event  patterns. Our customers find this very valuable, as they do not   need  super-specialized skills to work with our product.

Key metrics

What we've built is a platform that monitors data center performance, security, and configurations, the three key interconnected metrics in virtualized cloud environments. Most of our customers really want that combined and integrated platform. Some of them might choose to start with addressing security, but they soon bring the performance management aspects into it also. And vice versa.

And we take a  holistic cross-domain perspective -- we span server, storage, network,  virtualization and applications. What   we've really built is a common  consistent platform that addresses  these  problems of performance,  security, and configurations, in a  holistic  manner and that’s the main  thing that our customers buy from  us today.

Free trial download

Most of our customers start off with the free trial download. It’s a very simple process. Visit www.accelops.com/download and download a virtual appliance trial that you can install in your data center within your firewall very quickly and easily.

Getting    started with the AccelOps product is pretty simple. You fire up the    product and enter the credentials needed to access the devices to be    monitored. We do most of it agentlessly, and so you just enter the    credentials, the range that you want to discover and monitor, and that’s    it. You get started that way and you hit Go.


The product then uses this information to determine what’s in the environment. It automatically establishes relationships between the devices it finds, automatically applies the rules and policies that come out of the box with the product, along with some basic thresholds that are already in the product, so that you can actually start measuring the results. Within a few hours of getting started, you'll have measurable results and trends and graphs and charts to look at and gain benefits from.

Gardner: It   seems that as we move toward cloud and mobile that at some point  or   another organizations will hit the wall and look for this  automation   alternative.

Kumar: It’s about automation and distributed analytics and about getting very specific with the information that you have, so that you can make absolutely more predictable, 99.9 percent correct decisions and do that in an automated manner. The only way you can do that is if you have a platform that’s rich enough and scalable and that allows you to then reach that ultimate goal of automating most of the management of these diverse and disparate environments.

That’s something  that's sorely lacking in products   today. As you said, it's all  brute-force today. What we have built is a   very elegant, easy-to-use  way of managing your IT problems, whether  it’s  from a security  standpoint, performance management standpoint, or   configuration  standpoint, in a single integrated platform. That's   extremely  appealing for our customers, both enterprise and cloud-service    providers.

I also want to take this opportunity to encourage those of you listening to or reading this podcast to come meet our team at the 2011 Gartner Data Center Conference, Dec. 5-9, at Booth 49 and learn more. AccelOps is a silver sponsor of the conference.
