Creating a Capacity Planning Report

To meet the needs of today's rapidly growing business environment, organizations must ensure that their IT infrastructure is operating at its peak efficiency during both the slowest and the busiest of times. Join Dave Hegenbarth, SE Director of Global Strategic Alliances, as he provides insight on creating a capacity planning report by using SevOne technology.

 

Transcription:

Dave:
This is the welcome page for SevOne. Really it's just a huge collection of shortcuts. One of the things that we're interested in today is capacity planning. I've created a number of different capacity planning reports which I'm just going to come over to my report manager. We can see here I have one called network capacity planning. I'll click on that. I'll make that more dramatic. In real time we'll open up a new tab. We'll do a lot of computations very, very quickly. I'm putting up these charts in real time. We're going to look at some of the stats that we trended.

In this case we're looking at a TopN report. That means I want to rank something. In this case it's bandwidth utilization of all of the ports in my infrastructure. This is on my live network. I'm actually presented with a TopN ranked by the past month or actually, I'm sorry, ranked by today, I believe, next 1 month. I ranked by next 1 month. I'm projecting out. I have some supporting data here. This particular interface G4/7 today is at 34.42%. Over the last month he's been 39.38%. We're projecting somewhere around 61% utilization headed on out.
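[Editor's note] The TopN idea described above, ranking objects by a chosen column, can be sketched in a few lines of Python. This is only an illustration, not SevOne's implementation: the `InterfaceStats` type, the `top_n` helper, and every row except the G4/7 figures quoted in the talk are made-up assumptions.

```python
from typing import List, NamedTuple


class InterfaceStats(NamedTuple):
    name: str
    today_pct: float        # utilization today
    past_month_pct: float   # average over the past month
    next_month_pct: float   # projected utilization, next 1 month


def top_n(rows: List[InterfaceStats], n: int,
          rank_by: str = "next_month_pct") -> List[InterfaceStats]:
    """Return the n rows with the highest value in the rank_by column."""
    return sorted(rows, key=lambda r: getattr(r, rank_by), reverse=True)[:n]


# G4/7 uses the numbers from the talk; the other rows are invented.
sample_rows = [
    InterfaceStats("G4/7", 34.42, 39.38, 61.0),
    InterfaceStats("G1/1", 20.10, 22.50, 25.0),   # hypothetical
    InterfaceStats("G2/3", 45.00, 41.00, 38.0),   # hypothetical
]

for row in top_n(sample_rows, 2):
    print(row.name, row.next_month_pct)
```

Changing `rank_by` to `"today_pct"` or `"past_month_pct"` gives the other rankings Dave mentions.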

Then we have some graphs and charts below which are some of the similar data. They may take into account a little different metric. You can see here that its average is a little lower. Its projection's a lot lower. The time frame on these is a bit different. Time frame is a big deal in trying to calculate your projection outward. The more data points you have the more accurate your projection's going to be. In SevOne what we do is for any algorithm that we run we go back 6 times the chosen time frame. If I look at the performance report for the past month, what we're going to do is actually go back 6 times that, or 6 months, take those data points and use them in the projection out, which I think goes for a month.

SevOne has the ability to retain all the as-polled data for 1 year, meaning that we don't do any sort of roll-ups. We don't roll 5-minute polling to hourly polling to daily polling. We hold all the as-polled data. In this case I believe this interface gets polled every 60 seconds. What we've done is we've gone back 60, no 90 days, right? No, we went back 6 months, so went back 180 days and we project forward 30 days using that raw data set. We get fairly accurate data points when using raw data. A lot of folks in the industry don't provision enough disk space to hold that type of historical data.
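[Editor's note] The arithmetic behind the history window can be written out as a small Python sketch. It assumes only what the talk states: the window is 6 times the chosen time frame, and (per the Q&A later in the session) it is limited to however far back data actually exists. The function names and the example dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

LOOKBACK_MULTIPLIER = 6  # from the talk: go back 6 times the chosen time frame


def lookback_window(now: datetime, timeframe_days: int,
                    oldest_available: Optional[datetime] = None
                    ) -> Tuple[datetime, datetime]:
    """Return (start, end) of the history used for a projection."""
    start = now - timedelta(days=LOOKBACK_MULTIPLIER * timeframe_days)
    # If less history exists than requested, use as far back as we have.
    if oldest_available is not None and oldest_available > start:
        start = oldest_available
    return start, now


def expected_samples(window_days: int, poll_interval_s: int) -> int:
    """Number of as-polled data points in the window (no roll-ups)."""
    return window_days * 86_400 // poll_interval_s


start, end = lookback_window(datetime(2020, 6, 1), timeframe_days=30)
print((end - start).days)          # 180
print(expected_samples(180, 60))   # 259200
```

With a 30-day time frame and 60-second polling, the projection is therefore fitted to roughly a quarter of a million raw samples per object, which is why retention without roll-ups needs real disk space.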

We see a number of different utilization. We also can do the same thing for CPU. Here we have our busiest routers, CPU utilization 1 month projection. Now we're looking at the CPU across all of our different routers. We can see here that next month we're looking at 75.5. The past month is 86.02. It's a real network.

Sometimes we trend up. Sometimes we trend down. What we want to look for is that most of these guys look like they're fairly solid. This guy's even trending down just a little bit, 86 to 75, so that's good news. I know that I don't need to put a faster router in there. The fact is I have good confidence it's going to hold for quite a while.

We also have the same sort of thing just looking at it in terms of a bar graph, just showing us the past month utilization for the two CPUs in that switch. We see that nothing's gone to 100% so we're probably pretty good there. The last thing, we're even trending buffer hits. If you're a router guy you know that there's small, medium, medium large and huge buffers, and buffer hits and buffer misses are an important thing to look at from time to time when you're seeing troubled routers.

What I did was the same thing. I simply ranked these guys by any kind of buffer hit, who had the most and then where will we be next 1 week and the past 1 month. I've gone back 6 months worth of buffer utilization stats and deduced that over the month we've been at an average of 288 million hits. Yesterday we were at 289. Projection is 305 for next week. We'll see that we're trending slightly up. Again, really no cause for alarm. Lot of router stats in that one.

I do have a server report as well and it has some projections in it. Let me see which one's mine. In this particular server report I was more focused on some groups of servers. It starts with some alerting at the top which shows me by category where I've had some issues. I can see 3 of my 20 Linux servers have an issue. One of my Windows servers had an issue. One of my VMware servers had a potential issue. Down here we can see what some of those were. We can see that my agent, my WMI service that would serve up some Windows things, has gone down recently. I need to take a look at that one. I can see that backup storage used is 100%. We might want to look into these.

Alongside of that what I've done is I've plotted out my CPU load, both tabular and in a bar graph. Bar graph makes for pretty colors. We can see a change was made right around Thursday, May 23rd, because this server was just getting hammered, CPU's almost 100%. We made some adjustments right about May 23rd and all of a sudden now I have a much more reasonable load. Same thing we can take a look at is disk utilization. We can see that change maybe not so easy to make. This guy's been 100% almost all the way through.

Then back to capacity projections, which you wondered why I was showing this. I can show out of my servers I have my top 5's here. Who to look at? Web backups we saw. He was hovering around 100% in real time. We could see the next 30 days we're projecting him to be out of disk space. We're looking at CPU load. Nice thing is nobody ranks over 84.32%. Good news is I don't have any servers trending towards being out of CPU. Bad news here is memory. We're looking at these guys a set of box two is 7 to 208. If they continue to consume memory as they have over the past 180 days, they will be way out of memory in the next 30 days. We'll probably need a heads up look at that.
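[Editor's note] The "out of disk space in the next 30 days" reasoning can be illustrated with a straight-line extrapolation. This is a minimal sketch assuming a simple linear trend fitted by ordinary least squares; the talk does not say which projection type these server reports use, and the helper names and sample numbers below are invented.

```python
from typing import List, Optional, Tuple


def linear_fit(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days: List[float], used_pct: List[float],
                    limit: float = 100.0) -> Optional[float]:
    """Days after the last sample until the trend line reaches the limit.

    Returns None when usage is flat or trending down.
    """
    slope, intercept = linear_fit(days, used_pct)
    if slope <= 0:
        return None
    day_full = (limit - intercept) / slope
    return max(0.0, day_full - days[-1])


# Hypothetical memory usage sampled every 30 days over 180 days.
days = [0, 30, 60, 90, 120, 150, 180]
used = [40 + 0.3 * d for d in days]
print(days_until_full(days, used))
```

In this made-up series usage grows 0.3 points per day from 40%, so the line crosses 100% about 20 days after the last sample, inside a 30-day projection window.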

Number of different capacity planning projections there. How did we get some of these? Nice fancy dashboards might be the next one or way to look at it. A lot of what we did was we can either choose the TopN which is our tabular form or we can go over to performance metrics. I clicked on my performance metrics. Maybe I'm interested in just a particular router or CPU. As I click on that I can go find one of my favorite devices. How about my New York voice gateway, it's fine. We want to look at SNMP polling. Want to look at CPU maybe. Want to know. I certainly don't want my voice gateway to run out of CPU. If I do next I can take a look at some interesting settings. I might want to see it in CPU. I don't really need any aggregation but some of the data analysis we want to do.

We can either trend data on a graph. We can do a projected trend or historical trend. Historical trend is simply going back and trending the data that's on the graph with the average of the past 6 times the graph period. If I was graphing 1 week, it would go back 6 weeks but I would draw a line through the date it presented.

Projected is what we've been really looking at, which is that projection out. Where will I be in 30 days? Then I have my projection types. We have 4 types of math right now: exponential, linear, logarithmic and power are the 4 projection equations we can apply to a data set. I often get the question: Can we add our own? Currently you cannot add your own to the product. You can select 1 of these 4. You do always have the ability to export the data as a CSV and take that and do some other interesting math on it. Sometimes people combine it with other metrics and things like that in a spreadsheet.

Any of these guys, typically linear is good for things that don't change very quickly. Logarithmic and power are much better for things like CPU that happen to bounce up and down. I could add in percentiles if I wanted to but that might make us crazy right now. Let's just go with this and see what we get.
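[Editor's note] For readers who export the data and want to try the four curve families themselves, here is a minimal Python sketch. It assumes the usual textbook forms (linear y = a + b·x, exponential y = a·e^(b·x), logarithmic y = a + b·ln x, power y = a·x^b) fitted by least squares on log-transformed data. How SevOne computes its projections internally is not stated in the talk, so treat `fit_projection` and its arguments as hypothetical.

```python
import math
from typing import Callable, List, Tuple


def _least_squares(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_projection(kind: str, xs: List[float],
                   ys: List[float]) -> Callable[[float], float]:
    """Return f(x) for the fitted curve of the requested kind.

    Logarithmic and power fits need x > 0; exponential and power need y > 0.
    """
    if kind == "linear":
        a, b = _least_squares(xs, ys)
        return lambda x: a + b * x
    if kind == "exponential":
        a, b = _least_squares(xs, [math.log(y) for y in ys])
        return lambda x: math.exp(a) * math.exp(b * x)
    if kind == "logarithmic":
        a, b = _least_squares([math.log(x) for x in xs], ys)
        return lambda x: a + b * math.log(x)
    if kind == "power":
        a, b = _least_squares([math.log(x) for x in xs],
                              [math.log(y) for y in ys])
        return lambda x: math.exp(a) * x ** b
    raise ValueError(f"unknown projection type: {kind}")


# Hypothetical daily CPU averages for days 1..10, projected to day 40.
xs = [float(d) for d in range(1, 11)]
ys = [19.0 + 0.2 * d for d in xs]
for kind in ("linear", "exponential", "logarithmic", "power"):
    print(kind, round(fit_projection(kind, xs, ys)(40.0), 2))
```

Running all four on the same series shows how far the projections can diverge a month out, which is the practical reason for choosing the type per metric.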

I'm just going to click finish at this point and hopefully we get a graph. Has some data at the very beginning. Then it has a projection line out. Now I think I did today and projected out a month. That's maybe silly. I can come back in here and say, "You know what. Let's take a look at 4 weeks." Just that quickly I get my 4 weeks worth of data and now we can see my trend line alongside with that. We can see that the average is 19%. Given the math and where we started, the raw data points which are every 60 seconds for the past 4 weeks. Very quickly understand that we're going to go out to about 15% or so over time. We can add this.

If this is something I want to keep, I was interested, this becomes Demo with Dave Capacity. I can save this report. I could have this emailed to me every so often. I could add things; that happens to be my voice gateway router. I could come back in here and edit this. I could come to resources and I could even go in and say, "You know what. I want to look at my Windows server CPU right alongside that." I can't spell Windows but if I could. I could say, "Very dissimilar device type but I'm still interested in CPU data." We'll look at CPU load. We'll say finish. Now I'm graphing for whatever reason my router and my switch right alongside each other. We can see those projections as well. We can see that these guys are both trending down just slightly. Maybe coming into summertime people aren't working the servers hard or we migrated services off of these gateways and servers to wherever we need to.

Inside of this too we can go to a PDF. If I want to send it to my mom and show her what I did today, I can click on the PDF button. You get a what-you-see-is-what-you-get PDF. Hopefully, everybody can see that on my screen. I can mail this off to coworkers. I can also schedule this report. I can schedule this PDF to be delivered to me every month so I keep an eye on my capacity for my server and router just that easily, and in my email is a PDF. I can always log back in here and take a look at it here.

That's a whole little or a lot about how we do capacity planning inside of SevOne. Lots of things, you can do this on any metric we collect. We do have the ability to collect CPU stats, router stats. We have temperature, voltage off a UPS. Any stats that you can think of come into the system and then we apply that math to those things. I did mention as well I could even go get CSV and this will pop open a spreadsheet for me. You'll see that I have all the as-polled data here somewhere in my things and spreadsheets.

I click the share button. Now you can see you can get the comma-delimited data. The time stamps in this case are in Linux or Unix time stamps that you can add or multiply or divide by time. You do have that spreadsheet option if you want to take these numbers and make some different calculations with them.
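[Editor's note] Working with the exported Unix time stamps outside a spreadsheet might look like the Python sketch below. The column names `time` and `value` and the sample rows are assumptions for illustration; the actual export layout is not shown in the transcript.

```python
import csv
import io
from datetime import datetime, timezone
from typing import List, Tuple


def parse_export(text: str) -> List[Tuple[datetime, float]]:
    """Parse CSV text with Unix-epoch seconds into (UTC datetime, value)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        # Unix time stamps count seconds since 1970-01-01 00:00:00 UTC.
        ts = datetime.fromtimestamp(int(rec["time"]), tz=timezone.utc)
        rows.append((ts, float(rec["value"])))
    return rows


# Made-up sample: three polls, 60 seconds apart.
sample_csv = "time,value\n1700000000,19.2\n1700000060,18.7\n1700000120,19.9\n"

for ts, value in parse_export(sample_csv):
    print(ts.isoformat(), value)
```

In a spreadsheet the equivalent conversion is to divide the time stamp by 86400 (seconds per day) and add the date 1970-01-01, then format the cell as a date and time.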

With that, Alex, I think you can open it up and maybe someone has a question or two. That would be very good so I don't have to do all the talking.

Alex:
Everyone's un-muting.

Dave:
Excellent. Questions?

Matt:
Yeah, Dave, hi, this is Matt from Xerox.

Dave:
Hey, Matt.

Matt:
In your projections you went out 30 days. In our case we need to provide quarterly capacity plans so we'd like to set a target out of 90 days. How easy is that to accomplish?

Dave:
Sure, why don't we go back to data analysis. Why do 30 when you can do 90. Let that cook for a second but that should get you a look at your 90 days rather than 30. You can do 89, 92. You can put in the number of days that you want to.

Matt:
Okay. Then based on what you said earlier, that gives me 6 times 90 days of history.

Dave:
We'll try to go back. We'll consume all the data you have going back. Yeah, we will try to go back. If you've got 6 times 90 we're into almost a bunch of data.

Matt:
A year and a half.

Dave:
Yeah, a year and a half or so. As far back as you have, that's what we'll do. If you have a year and a half, great. Other questions?

As I did mention you can get your data into CSV if there's other math. I did mention the 4 math types. You can do the other thing I didn't show but I could do this in terms of a bar graph. I can do it as a bar graph. I can't do it as a pie graph. You could bring this into a bar graph. You could put multiple graphs in the same. If you guys are not familiar with SevOne, that's one graph I always have the ability to come back and add a different graph to this. You want to click there but a lot of different flexibility things you can do.

I could add a TopN as we did before. We could go get my most utilized interfaces. I'll just click through that. Most utilized interfaces and then in settings again we'll take a look at how we want to see that. I'm just making stuff up waiting for somebody to give me a question and then we'll go from there. Come on, where's my button? There we go. Within time span, maybe I want to do a past month. Then I can roll into my projected maybe next 6 months. Take today out of here. Just that easily we can create a table that has all that fun stuff in it. See there. Next 6 months ranking. I may or may not have enough data, depending on where you are, to get that, and then you can see the stats that are there.

Any other questions, guys, on a Friday before Memorial Day? Capacity Planning. All right. Well, thank you for joining Demo with Dave. We will convene again in 2 weeks' time and we're going to talk about a topic that we've done once or twice before, with maybe a little new twist: monitoring voice over IP. We'll talk about the quality of network, network transport. We'll also talk about the ability to monitor our PBXs and things like session border controllers if you run a large SIP organization. All of that put together into voice over IP in 2 weeks.

I hope everybody has a great Memorial Day weekend. I will talk to you guys in 2 weeks.

Alex:
Thanks, everyone.