February 2001
Discovering the Unexpected
As our previous tips have shown, using common utilities alongside your protocol analyzer can greatly enhance your analysis capabilities. One powerful technique is to synchronize your protocol analyzer's time and date with those of any utility that writes a timestamped log. Recently we used the PerfMon tool on an NT server to uncover a symptom that ran counter to conventional wisdom.
In this particular network, users were experiencing very long delays (10+ seconds) when performing simple tasks such as opening files. Analyzer traces showed that one particular server, common to most users, would immediately acknowledge a client's Server Message Block (SMB, the Windows layer 7 application protocol) command with a TCP ACK packet, but the SMB response itself was often delayed by several seconds.
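The gap between the TCP ACK and the SMB response can be pulled out of an exported trace with a few lines of script. The following is only a minimal sketch: the record layout (a timestamp in seconds plus a kind of "request", "ack", or "response") and the function name response_gaps are hypothetical, not any analyzer's real export format, and the sample values are illustrative rather than taken from this case.

```python
def response_gaps(packets):
    """Pair each SMB request with its TCP ACK and its SMB response.

    packets: list of (timestamp_seconds, kind) for one client/server
    conversation, in capture order. This layout is an assumption.
    Returns a list of (request_time, ack_delay, response_delay).
    """
    gaps = []
    request_time = None
    ack_time = None
    for t, kind in packets:
        if kind == "request":
            # Start tracking a new request.
            request_time, ack_time = t, None
        elif kind == "ack" and request_time is not None and ack_time is None:
            # First ACK seen after the request.
            ack_time = t
        elif kind == "response" and request_time is not None:
            ack_delay = None if ack_time is None else ack_time - request_time
            gaps.append((request_time, ack_delay, t - request_time))
            request_time = None
    return gaps


# Illustrative values only: a fast ACK followed by a slow SMB response,
# then a normal exchange.
trace = [
    (0.000, "request"), (0.001, "ack"), (4.250, "response"),
    (5.000, "request"), (5.002, "ack"), (5.040, "response"),
]

for request_time, ack_delay, response_delay in response_gaps(trace):
    print(request_time, ack_delay, response_delay)
```

A small ACK delay next to a large response delay points away from the network path and toward the server itself, which is the pattern described above.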
By manually setting our analyzer's time to within one second of the server's, and using PerfMon to log critical server performance statistics, we noticed periods of fairly heavy CPU utilization followed by periods of low utilization. So our first inclination was to suspect an overloaded server, right? Wrong! By correlating the PerfMon log with the packet timestamps in our analyzer, we discovered that the worst response times occurred not when the CPU was busy but when CPU utilization was less than 2 percent! We also noted that virtual memory paging was extremely high and that the disk RAID subsystem remained busy even during periods of low CPU utilization, as if it were "catching up" on disk I/O, or to use a more technical term, "thrashing". In short, taking load off the disk subsystem by defragmenting files, freeing up space by archiving old files, moving certain files to the client, and so on improved the SMB response time dramatically.
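The correlation step can be sketched in a short script once both logs are exported. This is a minimal sketch under stated assumptions: the names cpu_at and correlate, the (time, CPU percent) sample layout, and the clock_offset parameter are all hypothetical, and the numbers are illustrative, not measurements from this server.

```python
from bisect import bisect_right


def cpu_at(samples, t, clock_offset=0.0):
    """Return the CPU percentage from the latest sample at or before t.

    samples: list of (server_time_seconds, cpu_percent), sorted by time.
    t: analyzer timestamp in seconds.
    clock_offset: server clock minus analyzer clock, in seconds
    (an assumed correction for clocks that were set by hand).
    """
    times = [sample_time for sample_time, _ in samples]
    i = bisect_right(times, t + clock_offset) - 1
    return samples[i][1] if i >= 0 else None


def correlate(samples, responses, clock_offset=0.0):
    """Attach the CPU reading in effect to each (timestamp, delay) pair."""
    return [(t, delay, cpu_at(samples, t, clock_offset))
            for t, delay in responses]


# Illustrative values only: counters sampled every 15 seconds, and two
# SMB response delays measured in the analyzer trace.
perfmon_samples = [(0, 85.0), (15, 90.0), (30, 1.5), (45, 1.8)]
smb_responses = [(10.0, 0.05), (33.0, 11.2)]

for t, delay, cpu in correlate(perfmon_samples, smb_responses):
    print(f"t={t:6.1f}s  delay={delay:5.2f}s  cpu={cpu}%")
```

Sorting the result by delay puts the slowest responses next to the CPU reading in effect at that moment, which is how a pattern such as slow responses at low CPU utilization becomes visible. The same lookup works for paging or disk counters if they are logged alongside the CPU figures.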