
Build a Network Operations Center (NOC) Dashboard

Join us as we demonstrate the step by step process of setting up the SevOne all-in-one custom dashboard. Learn what key metrics IT professionals need to monitor and what type of reports can be generated.


Transcription:

Today's topic is how to build an effective dashboard. First, a quick recap for those who are new: a little high-level information about SevOne and why people choose it. A lot of the traditional environments that collect data today have many different pollers, which makes them complex. They take that data, aggregate it, and roll it up into a centralized database. Then there is a separate reporting engine, which in turn can make for slow reporting, multiple bottlenecks and points of failure. SevOne has come up with a solution that alleviates a lot of these bottlenecks and points of failure and is easier to follow.

With SevOne, we have what we call our cluster technology, and basically it's an all-in-one appliance. Within that appliance, it does everything. It does the collection, it's the database, it's the reporter, it's the analysis. It's like one-stop shopping. It's device aware, so each appliance knows about the other appliances in the system. That gives you some power to do automated parallel computing and data requests. When a user logs into one particular appliance, they can see the entire global infrastructure. What you wind up getting is this: I want to log in and say, “Show me my top ten most utilized interfaces across my infrastructure.” I don't have to know where that data is collected. If I log into North America and I run that request, it's going to go to every other appliance that it knows about and say, “Give me your top ten most utilized interfaces.” Each one is going to go off and run that query, get the results and send them back to the one that made the request.

Now, it's going to take all that information, sort it and display the top ten for me. In essence, I'm really getting that extra computing horsepower. The other thing with the cluster technology is that we also have a hot standby option. The failover box is automatically kept in sync with the primary, so if for whatever reason something happens to the primary, all the data is still being collected and is still available for reporting. The other thing is the distributed real-time database. It allows you to collect the data locally where you want to, without having to collect across large geographical environments. It is possible to do that, but you have the option of collecting locally as well. What we have found is that, for all of our SevOne customers, it really takes less than one full-time employee to maintain our system.
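The top-ten request described above is a scatter-gather pattern: every appliance ranks its own data, and the appliance that received the request merges the partial results and sorts again. A minimal Python sketch of the idea follows. The appliance names, interface names and utilization figures are made up for illustration, and this does not represent SevOne's actual internals.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical per-appliance data: (interface name, utilization percent).
APPLIANCES = {
    "north-america": [("nyc-gw Gi0/1", 82.0), ("del-core Te1/1", 64.5), ("nyc-sw1 Gi0/3", 12.1)],
    "europe": [("lon-gw Gi0/0", 91.3), ("par-sw2 Gi1/2", 40.2)],
    "asia": [("tok-gw Gi0/1", 77.8), ("sgp-sw1 Gi0/4", 5.0)],
}


def local_top_n(appliance, n):
    """Each appliance ranks only the interfaces it collects itself."""
    return heapq.nlargest(n, APPLIANCES[appliance], key=lambda row: row[1])


def global_top_n(n):
    """Fan the request out to every peer in parallel, then merge and re-sort."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda name: local_top_n(name, n), APPLIANCES)
        merged = [row for partial in partials for row in partial]
    return heapq.nlargest(n, merged, key=lambda row: row[1])
```

Each peer only has to return its own top n rows, because any interface in the global top n must also be in the top n of the appliance that collects it. That keeps the amount of data sent back to the requesting appliance small.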

A little bit about SevOne. We have an open architecture. It is an all-in-one appliance, and it's not just for the reporting, alerting and that capability. It's also for all the data collection. You don't buy individual modules. It's one appliance that has everything available to you. The most common is SNMP polling, which everybody does. We also support ICMP, XML, WMI, IP SLA. We support all versions of flow data, whether it be NetFlow, sFlow, J-Flow, IPv6. We can also get things for voice, whether it be around Cisco or Avaya, and do call detail records. I may want to pull in information from servers, whether it be through WMI, which I had mentioned already. I want to do process monitoring. I want to do response time. What's not on the screen is that we also have a plug-in for MySQL as well as Oracle. It's really going to be that one-stop shopping for multiple different data points, and they are immediately available for alerting and reporting.

We also have the ability to pull in third-party data. If you have data that you want to pull in that didn't come in from a normal plug-in, it can be done in one of two ways. We can do it through what we call our xStats engine: if you have a large amount of data from another product or another system that you want to pull into SevOne, you can pull it into xStats. Or, through our open API, you can write some scripts. You might ask, why is that valuable to me? What can I do with that? One use case is a particular customer who has three different models of a switch. Two of those models have an OID that we can poll over SNMP for CPU utilization. One of the models does not have that OID, but it's very important to get that CPU information from that particular switch. Basically, using our open API, we are able to log into that particular switch and leave a persistent connection open.

Then we do a show command every few seconds. In this case, we did it every 30 seconds. We pull in that data, so we can now have it and graph it with other information. It doesn't necessarily have to be information like that. You could be pulling in information about the weather outside by going to another website, which we have in our demo center as well. There's a lot of value in getting different data that's not part of one of these plug-ins. In addition to that, I've talked a little bit about the API for pulling data in. You can also use the API to integrate with other systems, whether you want to send data out to fault management systems or tie into a configuration management database. “Hey, I added a device to my CMDB, I want to make sure SevOne is monitoring it.”
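The show-command polling for the switch without a CPU OID could look roughly like the Python sketch below. Everything here is an assumption for illustration: the talk does not name the switch model or the command, so the output format is modeled on the first line of Cisco IOS `show processes cpu`, and `run_command` and `record` are hypothetical callables standing in for the persistent session and for whatever call hands the value to the monitoring system.

```python
import re
import time

# Assumed output format, modeled on the first line of Cisco IOS "show processes cpu".
CPU_LINE = re.compile(r"one minute:\s*(\d+)%")


def parse_cpu_percent(show_output):
    """Return the one-minute CPU percentage, or None if the line is missing."""
    match = CPU_LINE.search(show_output)
    return int(match.group(1)) if match else None


def poll(run_command, record, interval_seconds=30, iterations=3, sleep=time.sleep):
    """Run the show command on a fixed interval and hand each sample to record().

    run_command and record are hypothetical callables supplied by the caller.
    """
    for i in range(iterations):
        value = parse_cpu_percent(run_command("show processes cpu"))
        if value is not None:
            record(time.time(), value)
        if i < iterations - 1:
            sleep(interval_seconds)
```

Passing `sleep` in as a parameter makes the loop easy to exercise without actually waiting 30 seconds between samples.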

You can have a script that syncs it too. “Hey, I've added a device, let's make sure it's added into SevOne.” “Hey, we've decommissioned a device in our CMDB, let's decommission it in SevOne.” You can automate a lot of that process behind the scenes. Obviously there's some nice integration that we can do with the API with portals. There's the availability of instant reports, and today we're going to go into building some of these nice reports that you can do within SevOne, as well as event notification. We also now have an app for the iPhone that can forward a lot of alerts, and you can view those alerts right from your iPhone. Any questions so far before I get ready to begin the demo? As we go through, if you have questions, feel free to ask. Then, I will also open it up at the end if you have any additional questions.
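The core of such a CMDB sync script is a set comparison between what the CMDB lists and what is currently monitored. A minimal sketch in Python, assuming both sides can be exported as lists of device names; fetching those lists and applying the changes through the API are left out because the actual calls are not covered in this talk, and the function name is hypothetical.

```python
def plan_sync(cmdb_devices, monitored_devices):
    """Work out which devices to add to and remove from monitoring.

    Both arguments are iterables of device names (an assumption for illustration).
    """
    cmdb = set(cmdb_devices)
    monitored = set(monitored_devices)
    return {
        "add": sorted(cmdb - monitored),      # in the CMDB but not yet monitored
        "remove": sorted(monitored - cmdb),   # decommissioned in the CMDB
    }
```

For example, `plan_sync(["nyc-gw", "nyc-sw1"], ["nyc-sw1", "old-router"])` reports `nyc-gw` as a device to add and `old-router` as a device to decommission.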

A little bit before I begin. A little bit about the infrastructure and what we're going to do. Basically, I have two sites here. I have a remote site that's in New York and we have our headquarters, which is in Delaware. What I want to do, is I want to build an effective dashboard for monitoring my remote site. I want to get an idea. Really, the way it works is my remote site, they connect through an MPLS network back at the headquarters and if they want to go out through the internet, they have to go out through headquarters. Let's just say that we've done our job. We've got some information at our headquarters and I've just rolled out this new remote site and I want to build an effective dashboard. How would I go about doing that in SevOne? That's what we're going to do today.

With that, let me share out my browser. All right, so everybody hopefully sees my web browser. I'm already logged into SevOne. When I come in, this is the standard interface: my favorite reports and what's going on from an alerting perspective. What I want to do now is build a report. How do I do that? Within SevOne, I just go into my reporting menu and create an instant report. What this is going to do is pull up a nice wizard for me. It gives me all the different types of data, or all the different types of reports, that I can run and start building. What is it that I am interested in? I can pull in TopN reports. What are the TopN most utilized interfaces, or CPU, or whatever? We can do this from a capacity planning perspective. We can do it for what's happening today, yesterday, last month, and we'll show some of those examples.

Performance metrics. Give me the finer details over time with multiple different graphing types. FlowFalcon. If I'm capturing flow data, let's do some graphical representations. Alerts. What's happening from there? Telephony. Do I want to pull in CDR records? Device inventory. If I'm thinking about my New York site and I want to start monitoring, the first thing that probably comes to mind is that I want to know if I have any active alerts. Yes, I'm doing a good job, I'm being proactive. I'm sending email alerts. I'm forwarding SNMP traps. If I just want an overall health report, I'm going to start there. I'm going to click on my alerts, and then the question is: how do I want to look at my data? Do I want to look at it with no aggregation? Do I want to see the raw alerts, or do I want to aggregate it by device or device group?

In this particular case, I'm going to choose by device. You're just going to follow my wizard here and go next. What am I looking for? Well, I've already created a bunch of device groups. I've basically said, I want to look at all of my devices that are in my New York data center. We're going to go ahead and choose that New York data center. Choose next. All right, what's the resource limit? How many devices do I want to display? Let's just shorten it up. I don't think I have that many devices, but I'll just say, let's break it down to ten. I think I only have three devices right now in this data center anyway but I'll choose next. Do I want to do any filters? Is there certain types of alerts that I'm looking for? Do I want to set up any severities? Maybe I want to look for alerts that are only alerts or do I want to look for traps?

Let's see here, severities. Let's choose severities and equal to. Maybe I want to look only for alerts that are of the emergency level or critical. I'm not really concerned with information or debug messages, I just want something that's critical. Or I could start creating new filters. Right now, I don't have a lot going on, it's relatively new so I'm just going to leave it as the default. Choose next. What's the time parameter that I want to look for? Do I want to look for today's alerts, this week's alerts, or the month? For right now, I'm going to say let's build a dashboard for today and then I'll show you how we might be able to change it with basically one or two clicks to make it for a longer period of time.

Then it comes down to how do I want to visualize this? Well, do I like graphs? Do I like pie charts or tables? Or I could choose multiple. What I'm going to do here is I'm going to choose a table and alert summary and I'll show you what that looks like. If I choose next, it's going to give me a little bit of summary about what it is that I'm looking to do here. I'm looking to just confirm my New York device group. I want to basically break down my alerts and aggregate it by device. I'm going to set my resource limit to ten and we're looking for today. All right, so I'm going to choose finish. What that's going to do, it's going to draw out for me. I can look at it in one of two ways. In a tabular fashion, so I have basically as I have mentioned, I have three different devices that are in my New York. I have my tabular break down for the number of alerts. How many are affected out of how many possible devices. Or I can look at it broken down for today.
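The choices made in the wizard (device group, aggregation by device, severity filter, resource limit) amount to a filter-then-count over the alert list. A small Python sketch of that logic follows; the alert records, device names and severity labels are invented for illustration and do not reflect SevOne's data model.

```python
from collections import Counter


def alert_summary(alerts, device_group, severities=None, limit=10):
    """Count alerts per device for one device group.

    alerts: iterable of dicts with "device" and "severity" keys (assumed shape).
    severities: optional set of severities to keep; None keeps everything.
    limit: the resource limit, i.e. how many devices to return at most.
    """
    counts = Counter()
    for alert in alerts:
        if alert["device"] not in device_group:
            continue
        if severities is not None and alert["severity"] not in severities:
            continue
        counts[alert["device"]] += 1
    return counts.most_common(limit)


ALERTS = [
    {"device": "nyc-gw", "severity": "critical"},
    {"device": "nyc-gw", "severity": "info"},
    {"device": "nyc-voice-gw", "severity": "critical"},
    {"device": "del-core", "severity": "emergency"},
    {"device": "nyc-gw", "severity": "emergency"},
]
NEW_YORK = {"nyc-gw", "nyc-voice-gw", "nyc-sw1"}
```

Calling `alert_summary(ALERTS, NEW_YORK)` corresponds to leaving the filter at its default, while passing `severities={"emergency", "critical"}` corresponds to the severity filter discussed above.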

I'm just going to modify this report real quick. I don't really like that format. I'm just going to go back into my visualizations, and I'm going to go with alert summary. I just think that looks a little bit cleaner and nicer on the report. Basically, another quick thing that this is telling me is that these alerts have been persistent alerts. We haven't really fine-tuned them yet, so that makes a little bit of sense, but what I can see here is that I have some alerts that are going on. What are some of the other things that I might be interested in monitoring for New York? Let's go ahead and add a new one. It's going to pull up my wizard. Let's get some of those TopN reports. I'm going to go next here. What are the things we're interested in? The first thing is I want to make sure that I'm going against just my New York device group. All right, then we choose next.

Then what is it I'm looking for? Am I interested in most utilized interfaces, most discarded interfaces? Do I want CPU? We can do a little bit of everything above. Let's start off with CPU. Let's scroll down here. I know it's a Cisco device, so I'm going to choose Cisco CPU utilization. Then I'm going to limit my result to ten. We're going to go next. What am I looking for? Again, I can choose today. Am I interested in maybe looking at last week? Maybe I want to do some capacity planning while I'm in here. Let's start looking at where I expect it to be in the next three months. We'll just choose next. Visualization. I like tables. I'm just going to choose finish here. What I can see here is that two devices fall within this realm. The other one isn't a Cisco device, so I might want to create another type of report.
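The talk does not say how the three-month figure is calculated. Purely as an illustration of what a capacity projection can mean, the Python sketch below fits a straight line to past daily averages with ordinary least squares and extends it forward. The straight-line assumption, the function names and the sample values are all hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project(daily_averages, days_ahead):
    """Extend the fitted trend days_ahead days past the last sample."""
    xs = list(range(len(daily_averages)))
    slope, intercept = linear_fit(xs, daily_averages)
    return slope * (xs[-1] + days_ahead) + intercept
```

With daily CPU averages of `[10.0, 10.5, 11.0, 11.5]`, the fitted slope is half a percentage point per day, so `project(..., 90)` lands at 56.5 percent. A real capacity plan would want far more history than four days.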

Let's expand this a little bit so we have more of our screen put in here. What I can see here is what do I have for a one minute average for today, for last week, and then what I expect it to be for the three months. As well as the five minute averages going across. One of the other things that we can do is. It's nice that I have this for today and you know what? Let's maybe not put in, I'm not really interested in capacity planning. I just want an overall health of what does today look like. Again, let's go back and edit this report. It’s going to change my time. I'm going to get rid of the three months and last week and if I choose finish, I get a little bit prettier chart. A graph that's a little bit more about performance for today. That's nice that I have that this is my average for the day but what does that look like for the course of the day?

With SevOne, we have this really cool concept called chaining. What is chaining? Chaining is basically going to let me link one report to another report and use some of the data elements that are in it. I'm going to click on my gear box here and I'm going to click on chain. I can choose quick chain or custom chain. What's the difference between the two? A quick chain is it’s going to choose for me what it thinks that I'm looking for. Custom means I'm going to go with something specific. Let's start with custom, then I'll do a quick chain later on. If I go into custom. I want performance metrics. How do I run and limit my resources? I'm going to say to five even though I know I have two in this report but as my data center grows, I want to make sure I'm handling that growth. I'm going to limit the resource to five. Do I want to look at this data combined in one chart? Do I want to split them into two? I'm going to split them into two, it could be up to five. I choose next.

How do I want to display it? That's fine, a percentage. Let's go into my settings. Data aggregation. I don't want to aggregate the data. I want to see the raw data points. Because that's one of the nice features about SevOne, is we're storing all the raw data for up to a year. I want to look at those raw data. I don't want to look at the aggregation. Let's go look at analysis. I'm not really concerned about trending but I really want to look at baselines. What I also want to do is let's compare that to standard deviation. How am I doing compared to normal? Let's go up to two standard deviations. I don't want to use working hours for the device. I want to look for today. Sorry about that. All right and so I go next, how do I want to visualize this data? I like line charts. Whenever it comes to graphs I want to look at it as a line chart but I have all these other choices available to me. Choose next, again just go through my summary and I choose finish.

Now, what it's done for me is it's taken those two CPUs that I have listed above and it's graphed them out over time. It's really cool that I can actually see how I'm doing compared to normal. What I can see here is that right around the middle of the night, right after midnight, I had a huge spike. I went up to 50% utilization when I'm normally right around 15%, so I can see that it deviated from the normal baseline and I'm also outside two standard deviations. Maybe that's why my alert has been firing off. We didn't drill down into those alerts, but I have that capability in here. Let's add a few more things in here. Let's maybe do a TopN around utilization. Let's choose TopN. What we're going to do is go to device group again. We're going to choose our New York data center device group. This time, we're going to go with most utilized interfaces, and I'll go with all the defaults here and just hit finish.
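Comparing a reading against a baseline and a two-standard-deviation band can be sketched in a few lines of Python. This is a simplified illustration, not SevOne's baseline algorithm: it assumes the baseline is the plain mean of earlier readings for the same time slot, and the sample history is invented to match the roughly 15% normal level mentioned above.

```python
from statistics import mean, stdev


def deviation_report(history, current, max_sigma=2.0):
    """Compare one reading against the mean and spread of past readings.

    history: earlier readings for the same time slot (assumed input).
    Returns the baseline, the sample standard deviation, and whether the
    current reading falls outside baseline +/- max_sigma * sigma.
    """
    baseline = mean(history)
    sigma = stdev(history)
    outside = abs(current - baseline) > max_sigma * sigma
    return {"baseline": baseline, "sigma": sigma, "outside": outside}


# Hypothetical CPU readings (percent) taken just after midnight on earlier days.
HISTORY = [14, 15, 16, 15, 14, 16, 15]
```

With this history the baseline is 15, so a reading of 50 is flagged as outside the band while a reading of 16 is not.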

If I scroll down, it's added to my report. Now, I'm looking at my most utilized interfaces. Let's clean this up a little bit. Let's say that I want to put these side by side so we get a little bit more real estate. I can take this information and move it around. Now I have my most utilized interfaces. Okay, great. Now again, maybe I want to look at them over time. What I can do is a chain. If I do a quick chain, it's going to automatically assume that I want performance metrics and I want to graph them out over time. The one thing that it does do, though, is show me an average; it's aggregating the data. I'm going to do a quick edit of this. I'm going to go to aggregation and I'm going to turn it off. If I choose finish, now I'm going to get the raw data points. I can really see what type of spikes I have. It's pretty consistent traffic going across the wire. It's our one gateway, our MPLS link to corporate, so that makes sense. Right here in the morning we had a little bit of a spike. I've chained from one level to another level.
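The reason for turning aggregation off is that averaging flattens short spikes. A tiny Python illustration with invented utilization samples (the bucket size and values are assumptions, not SevOne defaults):

```python
def bucket_average(samples, bucket_size):
    """Average consecutive samples in fixed-size buckets."""
    averages = []
    for start in range(0, len(samples), bucket_size):
        bucket = samples[start:start + bucket_size]
        averages.append(sum(bucket) / len(bucket))
    return averages


# Hypothetical link utilization (percent), one sample per poll.
RAW = [20, 22, 21, 95, 23, 20, 21, 22]
AVERAGED = bucket_average(RAW, 4)
```

The raw series peaks at 95 percent, but after averaging four samples at a time the highest value left is 39.5 percent, so the spike is no longer visible on the graph.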

I can keep going further. If I have NetFlow data, which I'm pretty sure I do because I've turned that on, again I can chain from here. Now if I do a quick chain, it's going to assume that I want flow data. I'm going to do that. At the bottom of the report it's going to add yet another report for me, and by default it's going to choose top talkers. That's nice, but maybe I want to chain another report off of it because I don't want to look just at top talkers. What I want to do is come down here to where it's set up as top talkers and change it. Let's look at top applications with next top. Let's actually do top conversations and applications, and we're going to do ToS, because we've got a couple of different quality of service setups and I actually want to see how that's doing. Now, it's going to grab it and I can see there's my top talker. Most of that traffic that's going across is web traffic.
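Top talkers, top applications and a breakdown by ToS are all the same operation on flow records: group by one or more fields, sum the bytes, and sort. A short Python sketch with made-up flow records (the field names and addresses are assumptions for illustration, not a NetFlow export format):

```python
from collections import defaultdict

# Hypothetical flow records; ToS 184 corresponds to DSCP EF, commonly used for voice.
FLOWS = [
    {"src": "10.1.1.25", "dst": "10.0.0.10", "app": "http", "tos": 0, "bytes": 900_000},
    {"src": "10.1.1.25", "dst": "10.0.0.11", "app": "http", "tos": 0, "bytes": 600_000},
    {"src": "10.1.1.40", "dst": "10.0.0.20", "app": "sip", "tos": 184, "bytes": 50_000},
    {"src": "10.1.1.41", "dst": "10.0.0.10", "app": "http", "tos": 0, "bytes": 120_000},
]


def top_n(flows, key_fields, n=10):
    """Sum bytes per combination of key_fields and return the n largest groups."""
    totals = defaultdict(int)
    for flow in flows:
        key = tuple(flow[field] for field in key_fields)
        totals[key] += flow["bytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
```

Here `top_n(FLOWS, ["src"])` gives a top-talkers view, and `top_n(FLOWS, ["app", "tos"])` gives applications broken down by ToS, which in this sample data shows web traffic dominating.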

It looks like I really have one user in New York that's doing the bulk of the activity. He's probably got some scripts that are running through the night here. That's how simply I can build a quick dashboard. If I want to clean this up, I can add in a few other objects. Instead of choosing add, I can this time use a simple item. What do I want to add? Do I want to add in a separator? How many columns do I want to go across? Let's add a separator real quick. I'm going to put in here, maybe, call this utilization. It's going to automatically add it in down at the bottom. Let's take this and move it up right to where we started doing our utilization information. Maybe I want to add in a little bit of text description. Instead of a separator, let's choose a little text box.

I'm going to make this one column. I'm going to say basically my top ten most utilized links, and I'm going to say chained to performance over time, as well as flow data associated with those links. If I choose finish here, it's going to automatically put it at the bottom here. I want to realign this. I'm going to grab this. Let's put this right up, we go right here. It's going to find it. Now, I have some text. If I'm sharing this report with someone, I can go ahead and put in color commentary as needed. You might want to put in a link here that gives some information about the CPU. This is just a really quick way to build a report. I want to save this report. I don't want to lose this information. I'm going to come in and click in the report and give it a new name. We're going to call this the New York Data Center Health Dashboard.

One of the things that we'll notice down here is that this is going to list out all the reports that I have, that are in here. If I have anything that's changed. What I can see here is that they're automatically linked together. If I come in here, my topN has a split into performance results. Then I also have performance results, then I have two different flow reports that are chained to it. I can go ahead and I can decide if I want to share this with someone, or do I want to schedule it and have it emailed on a particular schedule? If I save this report, we can close it back there. The other thing I can do is, if I want to come in, I can rename some of these reports just by coming in here and say performance metrics. I can just come in here and say CPU utilization over time. The reason I'm saying over time instead of for today is remember I said that, “Hey, how do I change this across the board?”

If I don't want to go for today, I can click on my gear box. Hit chain. Change time frame. Let's just choose this week, or last week. Let's just choose this week, since we're on Friday. It's going to automatically, see I changed it here to this week and it automatically changes my graph, because they are linked together from a chaining perspective. The other thing is if I come down here. Let's say that I start from here and I want to do an investigation and look at this spike. I can choose my graph actions. Zoom into this little tiny spike that we had here. Then if I scroll down to the bottom. You'll notice that flow reports are automatically changing and there's my little spike. What I can see here is that, “Oh, we had a little bit of call manager going on.” Which wasn't in my top flow before. We can see that.
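Chaining, as used here, behaves like a parent-child link between reports: a change to the time frame on one report flows down to every report chained from it. A minimal Python sketch of that behavior follows; the class and method names are hypothetical and only model the idea, not SevOne's implementation.

```python
class Report:
    """A report that can have other reports chained from it."""

    def __init__(self, name, timeframe="today"):
        self.name = name
        self.timeframe = timeframe
        self.children = []

    def chain(self, child):
        """Link a child report; it starts out with the parent's time frame."""
        child.timeframe = self.timeframe
        self.children.append(child)
        return child

    def set_timeframe(self, timeframe):
        """Change the time frame here and on every report chained below."""
        self.timeframe = timeframe
        for child in self.children:
            child.set_timeframe(timeframe)


top_n_report = Report("Most utilized interfaces")
performance = top_n_report.chain(Report("Utilization over time"))
flows = performance.chain(Report("Top talkers"))
top_n_report.set_timeframe("this week")
```

After the last line, both the performance report and the flow report chained beneath it show "this week", which mirrors the one-click time frame change in the demo.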

Now, I had created this and saved it as a day. Maybe what we want to do is do the whole report as a week, so I could say this week. What I can now do is come in here and just put in this week. If I save this, now I'll have two reports, one for today and one for this week, so I don't have to reinvent the wheel and start over from scratch. If you notice from this week, you can even see that we did have an incident on my New York gateway and it's cleared up. We also had a critical incident on my voice gateway that has been cleared up. It's a quick way to do reporting and to create these dashboards. We're almost out of time, so one or two more things to show you here.

I can also choose to mark this as a favorite. Now, it's saved to my favorite reports. If I open this drop down, I have a list of all my favorite reports. Since I'm on it, it's here, but if I were to change reports, it would be in my favorites listing there. It would also show up on my home page. The other thing I can do is mark it as a dashboard. If I mark it as a dashboard, what that means is that every time I log into SevOne, this is the first report I'm going to see. I'm just going to turn this back off. The other thing we can do here too is export it out as a PDF, send it off to someone and save it. I wanted to leave at least five minutes for questions before we close out, but do you see how this could be helpful to you today in your environment, as well as how simple it is?