interview question and answer

March 20, 2012

Troubleshoot – Windows 2008 R2 Server Manager Roles and Features – I



Troubleshooting Event ID 333

Event ID 333 is logged when the system registry fails to flush its changes to disk. In most cases, Event ID 333 is a byproduct of another problem rather than the issue itself.
Event ID 333 occurs when there is a performance issue, or when memory or disk is not keeping up with the load. When it occurs, you will generally see other event IDs as well, pointing to the actual cause that triggered Event ID 333.
There are four likely causes of Event ID 333:
· Memory pressure – Physical or virtual memory bottleneck, low system PTEs, working set trimming, etc.
· Disk pressure – Bottleneck, performance issue, etc.
· Filter driver – A bad driver keeping the registry from being flushed.
· Lock Pages in Memory – This behaviour can result if the SQL Server service account is given the user right ‘Lock Pages in Memory’.

Troubleshooting


The following are the troubleshooting steps for this issue. Please note that not every step fits every scenario, and none of them should be applied as a silver bullet.

Event Log

The first thing is to check the event IDs. Look in the System log for any other event ID related to disk, memory or server (SRV). Key event IDs are: 2019, 2020, 51, 55, 52, 58.
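
To make the event check above concrete, here is a minimal Python sketch that scans a System log exported to CSV for the key event IDs. It assumes the export has a header row with "Event ID", "Source" and "Date and Time" columns; those column names, the function name find_related_events and the sample rows are assumptions for illustration, so adjust them to match your own export.

```python
import csv
import io

# Event IDs the text calls out as key companions of Event ID 333.
KEY_EVENT_IDS = {2019, 2020, 51, 55, 52, 58}


def find_related_events(csv_text):
    """Return (event_id, source, timestamp) for rows whose ID is a key ID.

    Assumes a header row with "Event ID", "Source" and "Date and Time"
    columns; rename the keys below if your export differs.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        try:
            event_id = int(row["Event ID"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable numeric event ID
        if event_id in KEY_EVENT_IDS:
            hits.append(
                (event_id, row.get("Source", ""), row.get("Date and Time", ""))
            )
    return hits


if __name__ == "__main__":
    # Made-up sample rows, only to show the call.
    sample = (
        "Level,Date and Time,Source,Event ID,Task Category\n"
        "Error,3/20/2012 10:01:00 AM,Application Popup,333,None\n"
        "Error,3/20/2012 10:00:30 AM,Srv,2020,None\n"
        "Warning,3/20/2012 09:59:10 AM,Disk,51,None\n"
    )
    for event_id, source, when in find_related_events(sample):
        print(event_id, source, when)
```

Events that show up shortly before the first 333 are usually the most useful ones to follow up on.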

Perfmon

· Look for key counters:
- Memory\% Committed Bytes In Use
- Memory\Available MBytes
- Memory\Cache Bytes
- Memory\Commit Limit
- Memory\Free System Page Table Entries
- Memory\Pool Nonpaged Bytes
- Memory\Pool Paged Bytes
Physical Disk or Logical Disk
- % Disk Time
- Avg. Disk Bytes/Transfer (Read and Write)
- Avg. Disk Queue Length
- Avg. Disk sec/Transfer
- Disk Bytes/sec
- Split IO/Sec
- Paging File\% Usage
- System\% Registry Quota In Use
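
One way to collect the counters above without clicking through the Perfmon UI is the built-in typeperf command. The Python sketch below only builds the counter file text and the command line; it does not run anything. The "(_Total)" instances, the 15-second interval, the 240-sample count and the file names are assumptions chosen for illustration, not recommendations from the KB articles.

```python
# Counter paths from the list above, written in typeperf path syntax.
# The "(_Total)" instance names are an assumption; use specific disks or
# page files if you need per-instance data.
COUNTERS = (
    r"\Memory\% Committed Bytes In Use",
    r"\Memory\Available MBytes",
    r"\Memory\Cache Bytes",
    r"\Memory\Commit Limit",
    r"\Memory\Free System Page Table Entries",
    r"\Memory\Pool Nonpaged Bytes",
    r"\Memory\Pool Paged Bytes",
    r"\PhysicalDisk(_Total)\% Disk Time",
    r"\PhysicalDisk(_Total)\Avg. Disk Bytes/Transfer",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Transfer",
    r"\PhysicalDisk(_Total)\Disk Bytes/sec",
    r"\PhysicalDisk(_Total)\Split IO/Sec",
    r"\Paging File(_Total)\% Usage",
    r"\System\% Registry Quota In Use",
)


def counter_file_text(counters=COUNTERS):
    """One counter path per line, the layout typeperf's -cf option reads."""
    return "\n".join(counters) + "\n"


def typeperf_command(counter_file, output_csv, interval_s=15, samples=240):
    """Build the typeperf argument list; interval and samples are placeholders."""
    return [
        "typeperf",
        "-cf", counter_file,     # file listing the counters to collect
        "-si", str(interval_s),  # seconds between samples
        "-sc", str(samples),     # number of samples to take
        "-f", "CSV",             # output format
        "-o", output_csv,        # output file
    ]


if __name__ == "__main__":
    print(counter_file_text())
    print(" ".join(typeperf_command("counters_333.txt", "perf_333.csv")))
```

Save the counter text to the file named in the command, then run the printed command on the affected server while the issue is occurring.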

Disk

· Enable disk write cache
Enable the disk write cache to increase disk performance (refer to KB 324446).
- This caches data in memory instead of writing it to disk immediately. It reduces the load (queue length) on the disk, and the system can schedule the flush to disk later.


· Perfmon
Monitor Avg. Disk sec/Transfer, % Idle Time, Split IO/Sec and Disk Bytes/sec.
- The Split IO/Sec counter can indicate how fragmented the drive is. It is best to defragment the drive, as fragmentation is a major hit on disk performance.
- Avg. Disk sec/Transfer is the average time a transfer takes. It reflects disk latency.
· Configure RegistryLazyFlushInterval to 60 seconds (reference: KB317357 and KB324446).
- Setting the value to 60 tells the system to write registry changes to disk after 60 seconds. More frequent writes mean more disk I/O. The value 60 is recommended by Microsoft.

· Event logs
Check for any disk-related event IDs. The most common sources are ftdisk and disk. Common causes are corrupt or bad sectors, controller issues or driver issues.
- Upgrade the firmware and drivers for the controller.
- Run chkdsk if an event points to a corrupt sector or cluster on the disk.
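
For the RegistryLazyFlushInterval setting mentioned above, a small Python sketch can generate a .reg file to review before importing. The registry path used here is my reading of KB317357 and should be treated as an assumption; confirm it against the KB before touching a production server. The function name lazy_flush_reg_file is made up for this example.

```python
# Registry location as I understand it from KB317357 (an assumption here;
# confirm against the KB before importing anything on a production server).
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Configuration Manager"
)
VALUE_NAME = "RegistryLazyFlushInterval"


def lazy_flush_reg_file(seconds=60):
    """Return .reg file text that sets the lazy flush interval as a DWORD."""
    if not 1 <= seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must fit in a positive 32-bit DWORD")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{KEY_PATH}]\n"
        f'"{VALUE_NAME}"=dword:{seconds:08x}\n'  # .reg DWORDs are 8 hex digits
    )


if __name__ == "__main__":
    print(lazy_flush_reg_file(60))
```

The script only produces text; nothing is written to the registry until you save the output as a .reg file and import it yourself. Back up the registry first.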

 

Memory

There could be contention in either physical or virtual memory on the system. There can be several causes, and troubleshooting them is not straightforward. It is recommended to understand the memory concepts before making changes, as they can easily make the system unstable.

· Boot.ini
- On a Windows Server 2003 x86 server, check whether Boot.ini has the /3GB switch in place, keeping the role of the server in mind. If it does, try adding /USERVA to give more room to kernel memory. Visit this link to understand the /3GB and /USERVA switches: http://technet.microsoft.com/fr-fr/library/cc784475(WS.10).aspx
- On Windows Server 2008 there is no Boot.ini; boot options are stored in the BCD store and managed with bcdedit.
- Use of /PAE and /3GB is not recommended, as it has an adverse effect on system performance.

· SQL Server Consideration
- Configure SQL Server to use less memory for the buffer pool.
- SQL Server has its own memory manager (MM) and does not use the Windows MM. It can be set to reserve X amount of memory, which Windows cannot use.
- Configure Perfmon with the SQL Server objects and monitor the memory-specific counters. This applies when the Windows system is low on physical memory.
- KB 918483: "How to reduce paging of buffer pool memory in the 64-bit version of SQL Server 2005". You can enable the Lock Pages in Memory permission to prevent the SQL Server 2005 64-bit buffer pool memory from being paged out of physical memory. Since this user right is also listed above as a likely cause of Event ID 333, weigh it carefully.
http://support.microsoft.com/?id=918483


· Disable Hot Add memory
- When the Hot Add Memory feature is enabled, the operating system pre-allocates kernel resources to handle any future memory that may be added to the computer. Kernel resources are allocated based on the capabilities of the computer instead of on the RAM that is actually installed. The kernel may allocate significant resources to RAM that may never be installed. Therefore, the Hot Add Memory feature may cause the maximum size of the paged pool to be much smaller than expected.
- To disable the feature: http://support.microsoft.com/?id=913568

· Pool memory leak
Look for Event ID 2020 or 2019, which indicate paged-pool or nonpaged-pool exhaustion. Configure poolmon.exe with an appropriate interval and monitor the tag with the highest consumption at the time of the issue.
- There are a few articles on pool memory exhaustion, but applying them without poolmon data is not recommended. KB 312362 covers maximizing the paged-pool limit on the box in case of Event ID 2020. However, it helps only when the problem is high memory consumption, not a memory leak.

· Increase page file
- Again, this helps only if perfmon data confirms the need.


· Apply patch
- Apply the latest patch for NTOSKRNL.EXE, since the memory manager is implemented in the Windows kernel and ntoskrnl.exe is its executable.
[KB 935926: A Windows Server 2003-based computer stops responding when the registry is in heavy use]

· Free system PTEs
- Check the value of the perfmon counter Memory\Free System Page Table Entries.
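
For the pool memory leak check described above, what separates a leak from plain high consumption is growth over time. The Python sketch below ranks pool tags by how much they grew between two readings. It assumes you have already transcribed the per-tag byte counts from two poolmon snapshots into dictionaries; the function name fastest_growing_tags and the sample numbers are hypothetical, and the sketch does not parse poolmon output itself.

```python
def fastest_growing_tags(before, after, top=5):
    """Rank pool tags by byte growth between two snapshots.

    `before` and `after` map a pool tag (for example "MmSt") to bytes in
    use, transcribed from two poolmon readings taken some interval apart.
    Tags that did not grow are left out of the result.
    """
    growth = {tag: after[tag] - before.get(tag, 0) for tag in after}
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    return [(tag, delta) for tag, delta in ranked[:top] if delta > 0]


if __name__ == "__main__":
    # Made-up numbers purely to show the call; real values come from poolmon.
    first = {"MmSt": 40_000_000, "Ntff": 12_000_000, "Xyz1": 5_000_000}
    later = {"MmSt": 41_000_000, "Ntff": 12_000_000, "Xyz1": 95_000_000}
    for tag, delta in fastest_growing_tags(first, later):
        print(f"{tag}: +{delta} bytes")
```

A tag that keeps climbing across several snapshots is the one to trace back to its driver.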

Filter driver

Check the box for outdated third-party drivers. You can use msinfo32 or the Microsoft MPS utility to list the drivers.

Last Resort – Complete memory dump

If the above troubleshooting does not help, configure the box to generate a complete memory dump manually and trigger it when the issue occurs. Send the dump to Microsoft for analysis.
