Solving Drops in Network Availability


This video demonstrates how the SevOne solution is used to improve troubleshooting techniques utilizing integrated metrics, flows, and log data.


Hello everybody, and welcome to this SevOne demonstration. In this demonstration we are going to go through how SevOne can improve your troubleshooting techniques using integrated metrics, flows, and log data.

Just some background for the demonstration. In this demonstration, we have a Windows server, a web application service, a switch, and a remote gateway. SevOne is polling data from the switch, which is connected to the gateway, the Windows server, and the application service from the cloud. During the polling, there is a drop in availability. What we end up doing is using our logging technique and our flow techniques to pull information off the remote gateway to determine if traffic is still coming across the gateway, as well as taking a look at the log data to see if there are any issues with the gateway itself.

What I'm going to do is go over and switch to our demo system. Here you can see that we are currently experiencing an issue with our application service, shown by the red dot. If you click the dot, it shows that we have a current alert. It specifies that we are not able to connect to this application service. We have had a 0% connection rate for over 15 minutes.

If we scroll down to the metric itself, you'll notice that we are periodically experiencing drop-offs in availability from our application service. The drop-offs are not occurring on our Windows server or our gateway, which leads us to believe the issue is not the server itself, but the application.
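The comparison described here can be sketched in a few lines of Python. This is only an illustration of the reasoning, not SevOne's own logic: the device names, the `(timestamp, availability)` sample format, and the `find_drop_windows` helper are all assumptions made up for the example.

```python
def find_drop_windows(samples, threshold=100.0):
    """Return (start, end) timestamp pairs where availability stays below threshold.

    `samples` is a list of (timestamp, availability_percent) tuples in time order.
    """
    windows = []
    start = None
    last_ts = None
    for ts, value in samples:
        if value < threshold:
            if start is None:
                start = ts          # a new drop-off begins here
            last_ts = ts            # extend the current drop-off
        elif start is not None:
            windows.append((start, last_ts))
            start = None
    if start is not None:           # series ended while still in a drop-off
        windows.append((start, last_ts))
    return windows


# Hypothetical polled data: only the application service shows drop-offs.
series = {
    "app-service": [(0, 100.0), (5, 100.0), (10, 0.0), (15, 0.0),
                    (20, 0.0), (25, 100.0), (30, 0.0)],
    "windows-server": [(0, 100.0), (10, 100.0), (20, 100.0), (30, 100.0)],
    "gateway": [(0, 100.0), (10, 100.0), (20, 100.0), (30, 100.0)],
}

app_drops = find_drop_windows(series["app-service"])
affected = [name for name, samples in series.items() if find_drop_windows(samples)]
```

If only the application service ends up in `affected`, that matches the conclusion above: the server and the gateway stay available, so the problem sits with the application.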

What we're going to do is dive a little bit deeper into this, and look at some of the flow data to see if traffic is still coming across the servers. Because if traffic is still coming across, then the issue is something other than a blockage of traffic on the servers.

What I'm going to do is go ahead and select a drop-off portion here. Now that I've selected the specific section, I'm going to use our metric-to-flow technique by going up to the UR-icon. I can select Chain and then select Quick Chain. As you can see, we still have traffic coming across the servers and the services. There is traffic coming across the HTTP port, which is our web application service. There is still traffic coming across these specific devices, so it has to be an issue other than a block in traffic.
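Conceptually, going from the metric to the flow data means reusing the selected time window as a filter on the flow records. A minimal sketch of that idea, assuming a made-up flow record format (`ts`, `dst_port`, `bytes`) and a hypothetical `bytes_by_port` helper rather than anything SevOne exposes:

```python
from collections import defaultdict


def bytes_by_port(flows, start, end):
    """Sum flow bytes per destination port for records inside [start, end]."""
    totals = defaultdict(int)
    for flow in flows:
        if start <= flow["ts"] <= end:
            totals[flow["dst_port"]] += flow["bytes"]
    return dict(totals)


# Hypothetical flow records; port 80 is the standard HTTP port.
flows = [
    {"ts": 12, "dst_port": 80, "bytes": 1500},
    {"ts": 14, "dst_port": 443, "bytes": 300},
    {"ts": 18, "dst_port": 80, "bytes": 700},
    {"ts": 40, "dst_port": 80, "bytes": 900},   # outside the selected window
]

# Reuse the drop-off window selected on the metric (here 10 to 20).
traffic = bytes_by_port(flows, 10, 20)
http_still_flowing = traffic.get(80, 0) > 0
```

A non-zero total on the HTTP port during the drop-off is what rules out a simple traffic blockage.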

What we're going to do is go ahead and use our other technique, a metric-to-log technique, to take a look at the log applications to see if there were any changes made to the devices that would have caused this issue. Once again, I'm just going to go up and do our simple one click from metric to log. As you'll notice, it has already selected the time period that I had selected on the metric itself. I don't have to make any changes. You can also see that the source IP is still the same. This is an exact log snippet of the report that we had before.

Now, one thing that you'll notice is that with our specific IP address, we had a user log in and change access to port 1112. Then, as you can see further down the list here, we have multiple denies from TCP port 1112. It looks as if the user "Dave H" has logged in and denied access to that specific port. That is a clear indicator as to why we were having a drop-off in availability for those specific time periods.
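The same correlation can be expressed as a small search over log entries: within the selected window and for the same source IP, look for an access change that is followed by denies on the same port. The sketch below is an assumption-heavy illustration. The log record fields, the `192.0.2.x` documentation addresses, and the `changes_followed_by_denies` helper are invented for the example and do not reflect the actual log format shown in the video.

```python
def changes_followed_by_denies(logs, start, end, source_ip):
    """Return (user, port, deny_count) for each access change in the window
    that is followed by deny events on the same port from the same source IP."""
    window = sorted(
        (e for e in logs if start <= e["ts"] <= end and e["source_ip"] == source_ip),
        key=lambda e: e["ts"],
    )
    results = []
    for change in (e for e in window if e["event"] == "config_change"):
        denies = [
            e for e in window
            if e["event"] == "deny"
            and e["port"] == change["port"]
            and e["ts"] > change["ts"]
        ]
        if denies:
            results.append((change["user"], change["port"], len(denies)))
    return results


# Hypothetical log entries for illustration only.
logs = [
    {"ts": 11, "source_ip": "192.0.2.10", "event": "config_change",
     "user": "Dave H", "port": 1112},
    {"ts": 12, "source_ip": "192.0.2.10", "event": "deny", "port": 1112},
    {"ts": 13, "source_ip": "192.0.2.99", "event": "deny", "port": 1112},  # other source
    {"ts": 15, "source_ip": "192.0.2.10", "event": "deny", "port": 1112},
    {"ts": 19, "source_ip": "192.0.2.10", "event": "deny", "port": 1112},
    {"ts": 30, "source_ip": "192.0.2.10", "event": "deny", "port": 1112},  # outside window
]

suspects = changes_followed_by_denies(logs, 10, 20, "192.0.2.10")
```

Each tuple in `suspects` names the user, the port, and how many denies followed the change, which is the same evidence read off the log list by eye.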

You can see how having access to metrics, flows, and logs together can give you the ability to dive deeper into an issue and figure it out much quicker. Now that we have this information, it's as simple as taking it and exporting it to somebody who is able to go ahead and take care of the issue.