Monitoring Your VMware Infrastructure Using Metrics, Flows, and Logs


This demo will discuss how SevOne can monitor your VMware infrastructure and help troubleshoot infrastructure issues using the metrics, flows, and logs technique.


Hello, everybody, and welcome to another SevOne demonstration. In this demonstration, we're going to go through how SevOne can monitor your VMware infrastructure using our metrics, flows, and logs techniques. SevOne can also troubleshoot issues you might be having with your virtual infrastructure using those same metrics, flows, and logs.

Just some background about the specific situation. In this demo, we're going to walk through how SevOne integrates standard metrics with your VMware infrastructure. We're able to see things like the health of your vCenter, as well as the network connections feeding information into or out of the virtual machines. We're also going to take a look into specific ESX hosts. In this specific instance, you'll notice that there is an issue: there's a spike in utilization on one of your hosts. What we're going to do is use the SevOne technique of metrics, flows, and logs to determine what the issue is and how we can go ahead and resolve it.

What I'm going to do is go ahead and switch over to our demo system. You'll see that status map again, showing our virtual center as well as our network connections. Then, directly below that, you have your development cluster, which consists, in this case, of four ESX hosts. What makes SevOne a great way to monitor virtual infrastructure, specifically VMware, is that we can combine multiple sections of your VMware infrastructure in one spot. If you were to monitor your VMware infrastructure using just VMware's monitoring system, you'd only be able to see what is in your vCenter. You wouldn't be able to see what is connected to your vCenter, the data moving in and out of your vCenter, or any other devices in your data center that aren't virtualized. SevOne has the ability to merge all three of those components together to give you one concise, easy, single-pane-of-glass look into your infrastructure.

For instance, if you want to see the specific health of your virtual center, it's as quick as clicking this node here. We can dive into the specific report. You're able to see things like your processor load, your most utilized interfaces, your memory utilization, as well as the average virtual machine CPU load on each device, disk I/O read and write, and so on. You're able to get the normal metrics that you would get from VMware's own vCenter monitoring tool, and you're also able to get other metrics. If we go back, we can see the VMware interface that is moving data to and from your virtual center. If you go ahead and click our VMware interface here, we pull up the utilization for your ESX hosts. You're able to see exactly what's moving to and from them. What we have here is our highest utilization, to and from the host.

The one thing that you'll notice here is that, if we look at the metric report for today, there was a spike in transfer and utilization around 3:30 in the afternoon. That's interesting. That's something that doesn't usually happen. As we can tell here, it's normally a flat line across the board, but here we had a spike in utilization. To try to determine what that utilization was, or what it affected, we're going to dive into one of the specific hosts of your vCenter.
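To make the idea of a spike against a flat line concrete, here is a minimal Python sketch. It is not how SevOne detects anomalies; the function name `find_spikes`, the threshold factor, and the sample values are all assumptions made up for illustration. It simply flags any sample that sits well above the median of the day's readings.

```python
from statistics import median


def find_spikes(samples, factor=3.0):
    """Return (timestamp, value) pairs whose value exceeds factor x the median.

    `samples` is a list of (timestamp, value) tuples. The median stands in
    for the flat baseline seen in the report (an assumption for this sketch).
    """
    baseline = median(value for _, value in samples)
    return [(ts, value) for ts, value in samples if value > factor * baseline]


# Made-up utilization samples (percent) at 30-minute intervals.
samples = [
    ("14:00", 11.0),
    ("14:30", 12.0),
    ("15:00", 10.5),
    ("15:30", 68.0),
    ("16:00", 11.5),
    ("16:30", 12.5),
]

spikes = find_spikes(samples)
print(spikes)
```

With these sample values, only the 15:30 reading stands out from the baseline, which mirrors the single afternoon spike in the report.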

I'm going to go back and take a look into our V-Host 01, just one of our ESX hosts. We're going to be able to see some of the metrics you can pull off of it, and we're going to see if anything on here is different from what usually happens. Just by looking at the virtual host one health, we can get metrics like your total CPU load, the total memory being used on the virtual host, the temperature of the host, and disk I/O read and write. We can also see the flow that is going across each one of the hosts.

As you'll notice at the beginning here, we see the same corresponding spike in net in and out for both physical and virtual, the virtual being your VM ins and outs. It matches the spike that we saw with our bandwidth. If we go down to our flow, we see that we didn't move a large number of packets. There wasn't a mass movement of packets. It doesn't seem like many things were transferred back and forth. That's interesting. We're still not entirely sure what caused that spike in our infrastructure's ins and outs.

What we're going to do is use our metrics-to-logs technology to look into the logs and see if any administrative changes were made on any other end. We go into our gear icon, go down, and select Log Analytics. Since we're specifically looking for our virtual infrastructure in SevOne, we're going to go into Contexts and select Lab V-Center. By selecting a context, we're more or less filtering our logging application to show us the specific data that we're looking for.

Now, you can do this in one of two ways. You can do this inside the logging application, like we're doing now. Or you can do it inside the reporting itself: if you filter on a specific time period or a specific device inside the metrics, the logging application will match that filter from the SevOne NMS. Or you can do what we're doing here, specifically selecting what we're looking for so we can piece it out bit by bit.
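The idea of one filter driving both the metrics view and the logs view can be sketched in Python as follows. This is only an illustration of the concept, not SevOne's API: the `ReportFilter` class, the record fields, the device names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ReportFilter:
    """Hypothetical shared filter: one device and one time window."""
    device: str
    start: datetime
    end: datetime

    def matches(self, record):
        # A record matches when it is for the same device and inside the window.
        return (
            record["device"] == self.device
            and self.start <= record["time"] <= self.end
        )


def apply_filter(records, report_filter):
    """Keep only the records that the filter matches."""
    return [r for r in records if report_filter.matches(r)]


# Made-up sample data; device names and values are illustrative only.
metrics = [
    {"device": "host-a", "time": datetime(2024, 1, 1, 15, 0), "mbps": 40},
    {"device": "host-a", "time": datetime(2024, 1, 1, 15, 30), "mbps": 900},
    {"device": "host-b", "time": datetime(2024, 1, 1, 15, 30), "mbps": 35},
]
logs = [
    {"device": "host-a", "time": datetime(2024, 1, 1, 15, 29), "message": "migration started"},
    {"device": "host-a", "time": datetime(2024, 1, 1, 9, 0), "message": "routine heartbeat"},
]

# The same filter object is applied to both data sets.
window = ReportFilter("host-a", datetime(2024, 1, 1, 15, 15), datetime(2024, 1, 1, 15, 45))
filtered_metrics = apply_filter(metrics, window)
filtered_logs = apply_filter(logs, window)
print(filtered_metrics)
print(filtered_logs)
```

Because both views are narrowed by the same device and time window, whatever shows up in the logs lines up with the spike you were looking at in the metrics.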

I'm going to go ahead and select our vCenter cluster. Then, on the left-hand side, I'm going to specifically look for a source IP. I want to narrow it down based on our specific virtual instance, so I'm going to go ahead and select our source host IP.

You'll notice on the right-hand side here that it brings up the two hosts that are associated with that specific V-Host. Logically, our virtual host will be hosted on the first host, so I'm going to go ahead and select the first host. What you'll notice is that there's the same corresponding spike that we saw in the metrics data. That's interesting.

Now we're going to go from the actual report inside the logging application to the specific logs themselves. I'm going to go ahead and select our scroll icon. If we take a look at the specific logs, you'll notice that we seem to be sending from one IP address to the other. It looks as if our virtual host was transferred. We see a vMotion prepare sending from the first source IP to the second source IP, that is, to the second host. It seems the spike was caused by transferring our virtual instance from our first host to our second host. That would be a clear indicator as to why there was a spike, because a migration like that is a reason for a large transfer.
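If you wanted to pull the "sending from one IP to the other" detail out of raw log text yourself, a small Python sketch might look like this. The log line format below is invented for illustration and is not the actual VMware or SevOne log format; the pattern, the function name `find_migrations`, and the addresses (taken from the documentation-only 192.0.2.0/24 range) are all assumptions.

```python
import re

# Hypothetical pattern: matches "sending from <ip> to <ip>" anywhere in a line.
MIGRATION_PATTERN = re.compile(
    r"sending from (?P<src>\d{1,3}(?:\.\d{1,3}){3}) "
    r"to (?P<dst>\d{1,3}(?:\.\d{1,3}){3})",
    re.IGNORECASE,
)


def find_migrations(lines):
    """Return (source_ip, destination_ip) pairs found in the given log lines."""
    results = []
    for line in lines:
        match = MIGRATION_PATTERN.search(line)
        if match:
            results.append((match.group("src"), match.group("dst")))
    return results


# Made-up log lines in an invented format.
log_lines = [
    "15:28:10 host-a heartbeat ok",
    "15:29:02 host-a migration prepare: Sending from 192.0.2.11 to 192.0.2.12",
    "15:41:55 host-b heartbeat ok",
]

migrations = find_migrations(log_lines)
print(migrations)
```

A source and destination pair like this is the kind of evidence that points to a virtual instance moving between two hosts.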

If that transfer did occur, there would be a spike in ins and outs, because everything had to be copied from one host to the other. According to the logs, that is the explanation for the spike in ins and outs as well as bandwidth: you're transferring the virtual instance from one host to the other. By using SevOne's metrics, flows, and logs, we were able to troubleshoot this issue and determine that the spike in bandwidth and the spike in transfers in and out were due to the fact that we had moved our virtual instance from one host to the other.