SevOne Data Appliance for PLA: Introduction to Performance Logs
SevOne Product Manager Tom Grabowski explains how SevOne can help you monitor your log data more effectively. This demonstration shows how you can convert raw logs into actionable performance metrics.
Hi, this is Tom Grabowski, SevOne product manager. Today I want to introduce you to performance logs and show you how SevOne can help you visualize your log data more effectively. I'm logging into one of our demo servers with Apache log data streaming to it. At the top we can see how much log data we've collected: over the last 24 hours, about 33.5 GB. Within that log data we have something called key indicators. They might be fields, or they might be descriptive information about what's in the log data and how we want to view that information.
If I scroll over this log data you can see the names of the indicators. This is really Apache common log format, and we've added something called a duration field, which shows us how much time it's taken for the app server to respond to a web server request. I can click on any of these key indicators on the left-hand side and get a list with a trend; for example, on datasource, I can see the unique values and how many times each has occurred over the time frame we're looking at.
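To make the idea concrete, here is a minimal sketch of pulling indicator fields out of a common-log-format line with a duration value appended at the end. The field layout, names, and sample line are my own assumptions for illustration, not SevOne's actual parser.

```python
import re

# Apache common log format with one extra trailing field: the request
# duration in microseconds (an assumed layout for this sketch).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) (?P<duration_us>\d+)'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line doesn't match."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

sample = '10.0.0.1 - - [01/Jan/2024:12:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120 8342'
fields = parse_line(sample)
print(fields["status"], fields["duration_us"])  # → 200 8342
```

Each named group (host, status, duration, and so on) corresponds to one of the key indicators you can pivot on.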
We can look at other items here, like HTTP status codes, combined across all my web servers. Or look at site names, and again you'll see a combined list of the site names and a trend graph of what's happening. If I get rid of this and click on any of these indicators, I can drag and drop them onto the middle section here, view that data, and zoom in and out. I can click on a specific part of the data if I want to look at the raw logs for that specific time frame.
It's really easy to navigate through this information and pivot off of it. Even though we're collecting a lot of data, having those key indicators and pivoting off of them makes it easy to navigate your log data without having to understand the syntax or write the complex queries you otherwise would. I can also drag and drop any of the tags to view which data sources we're collecting from. If I click on any of the data sources, I can add more information, such as status and site names, and get a nice report about the log data I'm looking at.
We also have an Omnibar in here, so if you wanted to type in a search request you could; this is used in our API quite a bit. What I want to show you is how performance logs differ from your typical search of the log data. We have the ability to baseline and trend this information, and compare that data. For example, we have a field here, duration; in the Apache log format you can add %D to capture the duration. Across that time frame, we can see what the average or normal duration is across all my servers, along with the spikes and dips.
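For reference, this is what adding that field looks like on the Apache side: the `%D` directive in `mod_log_config` logs the time taken to serve the request, in microseconds. The format name and log path below are placeholders of my own choosing.

```apache
# Standard common log format plus %D (request duration in microseconds)
LogFormat "%h %l %u %t \"%r\" %>s %b %D" common_with_duration
CustomLog "logs/access_log" common_with_duration
```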
Seeing that in aggregate might not be as useful, but we can also take that duration and carve it up into duration categories, which can be a little more interesting: for example, which queries are taking over 60 seconds to return entries? I can look at this as a trend, but I'm actually going to look at it in my table view, with the data sources and, let's say, the site names dragged into my header. I can see the actual raw messages, but I might not want that level of detail.
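The carving-up step can be sketched as a simple bucketing function; the boundaries and labels below are illustrative choices, not SevOne's built-in categories.

```python
# Bucket boundaries (upper bound in seconds, label) for this sketch.
BUCKETS = [(1, "under 1s"), (10, "1-10s"), (60, "10-60s")]

def duration_category(seconds):
    """Map a raw duration to the first bucket it fits in."""
    for upper, label in BUCKETS:
        if seconds < upper:
            return label
    return "over 60s"

durations = [0.2, 4.8, 75.0, 61.3, 0.9]
for d in durations:
    print(d, "->", duration_category(d))
```

Trending the category counts rather than the raw durations is what makes the "which queries take over 60 seconds" question a one-glance answer.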
I can click on any of these headers to filter, or summarize this data. If I summarize, I get a quick summary of the data source, the site name it's serving, and a count, so I can see whether any specific data source or site is a problem area. From here, you'll notice I can drill up or drill down into the information. I could even go find other data that maybe we're not indexing on but that I'd like to see in my typical report, such as the URIs we're collecting and whether there's any similarity between them. There is some redundant information here; it would be useful to pull those items programmatically from my web server, see what's happening, and hand that information to my application developers and ask, "What's going on with this application?"
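That summarize step is essentially a group-by count over indicator pairs. A minimal sketch, with made-up record fields and values:

```python
from collections import Counter

# Toy log records; "datasource" and "site" stand in for the indexed
# indicator fields (values here are assumptions for illustration).
records = [
    {"datasource": "web01", "site": "store.example.com"},
    {"datasource": "web01", "site": "store.example.com"},
    {"datasource": "web02", "site": "api.example.com"},
]

# Count entries per (data source, site name) pair, most frequent first.
summary = Counter((r["datasource"], r["site"]) for r in records)
for (datasource, site), count in summary.most_common():
    print(datasource, site, count)
```

The resulting table (pair plus count) is exactly the view you drill up or down from when hunting for a problem data source or site.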
Now, you're not always watching the log data looking for this type of thing. One of the great things about treating it as performance data is the ability to alert on it. If we know our normal trends, we can track normal baselines over time and get alerts when there's a spike above the baseline. For example, when I see a spike in my 404 status codes, it will send me an alert saying it's spiking above the normal range. You'll see there's a blue line for the normal range, and when the data goes outside the red lines, whether on the low end or the high end, I get an alert. These thresholds are numbers or percentages that you can set and configure yourself for the alerts or the type of application data you're looking at.
So I can have that moving average and alert on what's going on across my application, whether it's HTTP status, which is a count of how many of each status there are, or duration, which again is a computed value. If we get an alert when the sum of durations goes up, we'll know when it's taking longer for our web servers to respond. We can see that duration over time, see what's typical, and catch when it goes above or below a certain percentage of the baseline.
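The baseline-band idea above can be sketched in a few lines: compare each new point to a trailing moving average, and flag it when it falls outside a band around that average. The window size and band width are tunable assumptions here, not SevOne's actual algorithm.

```python
from statistics import mean, stdev

def spikes(values, window=5, n_sigma=3.0):
    """Return indices of points outside the baseline band.

    The baseline is the mean of the previous `window` points; the band
    is `n_sigma` standard deviations of that same history.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        baseline, spread = mean(history), stdev(history)
        if abs(values[i] - baseline) > n_sigma * spread:
            alerts.append(i)
    return alerts

# Hourly 404 counts with one abnormal burst (toy data).
counts_404 = [4, 5, 3, 4, 5, 4, 6, 5, 4, 5, 90, 5, 4]
print(spikes(counts_404))  # → [10]
```

Note the burst itself then inflates the history window, which is why a production baseline would typically exclude or down-weight alerted points; this sketch keeps it simple.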
That's an example of how we look at log data as performance metrics: using those key indicators to monitor the data over time, look for abnormal activity, and highlight and visualize that data through alerts. It's about finding the unknowns in the data, the things you didn't know were happening, and alerting you when spikes, dips, or changes in that log data occur. That's really what we're trying to get to with the performance log appliance: combining this with how you typically look at performance metrics across your enterprise.