Before you start this article you should get your time sorted - read all about setting up NTP in IOS with our NTP article
. Remember logs are useful when taken in context. If the time is wrong on your router then you can forget using them in any meaningful way when it comes to diagnosis.
Just so we are clear, logs and debugging is useless without context - you need to have your networked devices synchronised with each other and with a common time source. If you time is out then your logs have little or no value for things like event analysis, trending, capacity planning and forensic analysis.
*** SO STOP READING THIS AND SORT YOUR TIME OUT FIRST ***
Time right now? Then we’ll begin...
Logging the messages created from your device is a very useful for troubleshooting a variety of issues we’ve already discussed like event analysis (i.e. a fault on your device caused x,y and z to occur at this time
) and forensics (e.g. during or after a security incident). So what happens when your router reboots and you want to see those logs? Well it’s likely you’ve only setup logging to the buffer (in volatile RAM) on the device so when it reboots you’ve lost all of that useful data. What if your device is being hammered by bad traffic or a flapping interface which fills up the available buffer space (the amount of RAM allocated to log messages). Wouldn’t it be great to keep those logs for historical purposes?
Damn right - here are some logging options.
First of all we need to enable logging with the global configuration mode command ‘logging on’. You should perhaps note that this is the default...hopefully nothing to worry about unless some unscrupulous or crap engineer has disabled logging.
In this screenshot the router has booted with a clean configuration. Basic things like hostname, interface IP and VTY access have been setup but rest assured, no logging configuration has been done...so what do we get enabled by default?
Syslog, enabled; Console, enabled; Monitor, enabled...what does all this mean?
Syslog monitoring is a double-edged sword. In this regard we could (if on a PIX/ASA) be talking about SNMP traps but in this case we’re saying what logging information should be sent over the network to a syslog server.
Monitor logging is all about remote sessions to the device. When you telnet or SSH onto the device you wouldn’t see any logging information even if this is turned on. To see this logging data the operators needs to issue the ‘terminal monitor’ command.
Console logging is all about local access to the device. By local we mean access from devices connected via their ‘console’ port. This is probably the best place to have logging turned on but you don’t want to have too much logging since a serial port can easily become swamped with logging information.
Buffer logging is a historical dump for logging data. Lets say an event occurred a few minutes ago and you want to see what happened. You can telnet to the device, issue a ‘show logging’ command under privileged EXEC mode and you’ll see a list of historical logging data.
So what can and should we tune for logging data? Here are the headlines:
- Syslog - send too much traffic to your syslog server and this means more network traffic which could cause a network capacity issue not to mention you are going to fill up your syslog server disk more quickly.
- Monitor - remember this is logging being sent to a telnet or SSH client. Again, as with syslog, you will be increasing your network utilisation which could be bad, particularly so if you are seeing the logging increase as a result of service issues related with network capacity issues. Logging too much over the monitor session and you will lock out that session causing you to dump the session and restart your connection.
- Console - unlike the monitor session the console is way slower normally running at 9600 baud not 100Mbps! If you set logging levels which are too high you are almost certainly going to lock yourself out of the console session. If you are already experiencing network issues then the console is your only hope of getting meaningful telemetry out of your failing device. Setting logging levels too high on the console WILL lockout your serial connection period! Avoid if you like your job. The only way to bring a router back from a lost console due to high logging levels is usually to reboot it.
- Buffers - this is a memory risk. The facts are that you want to go back in time to look at historical logs. Well thats great but logging for too long runs a risk of consuming more RAM which could be used for other things. It’s a fine line between keeping logs for as long as you can and not consuming previous RAM when you could actually store historical logs on an external device like a syslog server.
What about those logging levels? What does ‘logging debugging’ mean? Well we can tell the device to log more or less verbosely to include more or less functions. There are 8 different logging levels available and each one logs more and more data as you pass from a lower to a higher number i.e 0-7 with 7 being the highest or most verbose level.
- Emergency (severity 0)—The system is unusable
- Alert (severity 1)—Immediate action is needed
- Critical (severity 2)—Critical condition
- Error (severity 3)—Error condition
- Warning (severity 4)—Warning condition
- Notification (severity 5)—Normal but significant condition
- Informational (severity 6)—Informational message
- Debugging (severity 7)—Debugging message
So, now we know the defaults, we know the types and we know about logging levels. Lets consider the main logging types in more detail.
We’ll consider buffering
first since it’s the simplest to start with.
Take a look at the following output from a ‘show logging’ on a Cisco wireless AP.
This output is using something called the logging buffer. Notice the logging buffer is 1048576 bytes? Well we’ve increased the buffer here from the default of 4096 bytes or 4Kbytes. It is not uncommon to find this is one of the more obvious tweaks you would want to do on your IOS device. To explain the point imagine a very busy router being hit hard with a flapping port. The state of the interface goes up and down every few seconds or so and each of those logs are sent to the history buffer. Pretty soon the log is going to fill up and when it gets full we move to FIFO mode (first in first out) and we start to drop out the oldest messages. If you, as an engineer, are being asked to find out when this started happening and you are depending on the log output...you’d better hope your log buffer is big enough.
Now, what you shouldn’t do is change the buffer size without looking at your available RAM
. Remember the logging buffer sits in RAM so the more you reserve for logging, the less you have for any other processes. Right, lets take our vanilla configuration and turn on the buffering of log files to RAM.
Whats the default buffer size again?
To change the buffer size we simply issue the following command, we’ve changed this to 4 times the default or 16384 bytes
Thats great, we’ve got a bigger buffer, but what if the buffer still fills up? What if we choose to log more data by increasing out logging levels? The best way to make sure we catch all data and can keep an infinitely more long term history is to use syslog.
Lets configure our 2600 router for SYSLOG. We’ll tell the router to log at the highest level and send it’s logging data to our SYSLOG host running on IP address 192.168.1.34.
First of all lets configure the logging server. We use the command ‘logging host’ followed by the IP address of the SYSLOG server
Now, before we move on here I’m going to bring back my first statement in this article regarding time and logs. Logs need context - SORT YOUR TIME OUT ;-) Seriously, without time-stamping the log data it has very little use or meaning. So how can we fix that?
OK first lets make sure we are using NTP and we are synchronised (if you have not done this then follow our NTP ‘howto guide here
Here is a screenshot from our NTP. We are synchronised, all is ready to go.
So now we need to make sure that our logs get time stamped. Lets first look at a log file which has been written to the buffer. To demonstrate the issue we’re just going to perform a ‘shutdown’ then a ‘no shutdown’ on FastEthernet0/1.
Notice that the log in the buffer shows the state change to down and back again...but we have no way of knowing when this happened. Enter time-stamping!
The ‘service timestamps’ command allows us to prepend any logs with the current time and date. We have told the router to use the actual date for the log (rather than the number of seconds since the box was brought into service) and to include information like the timezone (in case we are going to export these logs to another country working with UTC when the logs were sourced in a country working with summer-time which it could be out by an hour. We’ve also told it to include the year which is off by default. Remember of course the more you add the more buffer space is used up....it’s all about compromise.
Lets take a look at the same ‘shut/no shut’ with the timestamp on.
Awesome, now we have prepended the time and date which is perfect for troubleshooting. Notice the first two log entries were still in there without the time and date.
When you connect to a cisco device be it a router, switch or firewall you may wish to turn on debugging to more closely watch what is going on. Now normally the debugs are not sent to the telnet or SSH session so we need to tell the device to override the normal behaviour and to send those debugs and other messages to the VTY console and not just the serial port of buffer and sysiog.
This type of redirect to screen is called ‘monitoring’ and is akin to the unix ‘tee’ or shell redirect ‘2>’ command where stderr
can be sent to another process.
Lets have a look at a normal session. We’ll telnet onto the 2600 router we already setup with buffering. We’ll telnet on and tell the router to debug HRSP. Then we’ll build an HSRP instance on the interface Fa0/0 and look for the console messages to scroll on the screen.
Firstly lets turn on debugging
OK great now lets turn HSRP on interface Fa0/0
I waited for a minute but there were no messages on the screen...thats what we expected right?
OK so now we’ll turn on monitoring. Now remember the level of information we will receive on the screen is related tot he level of debugging stated in the ‘show debug’ statement we saw earlier so currently we are at level ‘debugging’ which is more than enough.
Ok, we enabled the log redirect to screen with the ‘terminal monitor’ keyword. Then when we disabled the HSRP on the interface we got the debug information sent to the screen. Awesome.
OK the last subject to cover off in our brief stop off is syslog. Syslog is used by Unix systems as a method of “System Logging” for services or daemons running on the box. It evolved to include remote syslog where devices could sent information about the services and authentication etc across to one central unix box to manage the lot.
In our world, network engineers depend on syslog heavily to troubleshoot issues we see on the network. Syslog is probably one of the best tool in your box and should be ignored at your peril - you will need it one day.
Right lets setup some basic logging. I’ve got a syslog server running on host 192.168.1.34. I’ve enabled the fantastic syslog server tool from solarwinds.com for the test. We’ll begin by telling the cisco router where exactly to send it’s logging information:
The ‘logging host’ command simply tells the router where to send it’s syslog traffic. Right now lets bring down that fastethernet interface again
We see that the interface Down and Up is shown. Notice that the interface goes DOWN before the router has setup the logging to the host - it took that first log message to startup the syslog service. Once it is started though, we’re golden.
Now on the syslog server itself we don’t see the interface going DOWN because the logging wasn’t running at the time.
Notice one of the benefits of the remote syslog server. Do you see that the time and date that the event was received is recorded on the far left? Having the syslog servers time correct is clearly one of the more important things since ALL of your logs will be stored on this and having a fixed time reference for all of your logging is invaluable.
Facilities and Levels
I use facilities for all the wrong reasons I’m sure. Back in the days when I was a unix administrator I used to have syslog daemons pointing at one master syslog collector. The syslog collector was configured using facilities where each of the facilities available was used for different hosts. I’d have my SUN boxes pointing at local1 and SGI pointing at local2. Each facility would then have the different logging levels inside it. I used facilities like a pigeon hole for internal mail.
Now I’m not sure if that was right or wrong and I never bothered to find out, it just worked for me. There are more than a dozen facilities available for different uses including local, system wide, UUCP (unix to unix copy), mail, clock and more besides but I’d suggest you don’t move away from the default.
I guess this concludes our brief run into logging today
Thank you for reading, please leave us some feedback.