Today my DBA reported that the server she was working on was spitting out “too many open files” errors and no new processes could be started.

This is a common problem on DB servers with heavy transaction loads. In my environment there are 6 DB instances running on the server. Not quite the optimal setup, I would say.

The fix was to raise the kernel's system-wide file descriptor limit (fs.file-max) in the /etc/sysctl.conf file. I doubled my limit from 8192 to 16384.

The walkthrough:

1. Find out what the current open file descriptor limit is.

~# more /proc/sys/fs/file-max

8192

or

~# sysctl -a | grep fs.file-max

fs.file-max = 8192

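2. View how many file descriptors are currently allocated. This file actually reports three values (allocated handles, allocated but unused handles, and the maximum); the first is the one of interest here.

~# more /proc/sys/fs/file-nr

8191

3. View how many files are open. The number returned might differ, as lsof also lists entries that do not consume a file descriptor (memory-mapped libraries, each process's current directory and so on).

~# lsof | wc -l

10325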

4. Edit the kernel parameter file /etc/sysctl.conf and add the line “fs.file-max = [new value]” to it.

~# vi /etc/sysctl.conf

fs.file-max = 331287

5. Apply the changes.

~# sysctl -p

fs.file-max = 331287

Problem fixed.
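
To catch this before the limit is hit again, steps 1 and 2 can be rolled into a small check. What follows is only a minimal sketch: the function name fd_usage is made up for illustration, and it assumes the three-field layout of /proc/sys/fs/file-nr (allocated, unused, maximum).

```shell
#!/bin/sh
# fd_usage [FILE]: print the percentage of system-wide file handles in use.
# FILE defaults to /proc/sys/fs/file-nr; a different path can be passed
# to try the function against saved numbers. (Hypothetical helper name.)
fd_usage() {
    file="${1:-/proc/sys/fs/file-nr}"
    # The file holds three whitespace-separated fields:
    # allocated handles, allocated-but-unused handles, maximum (fs.file-max)
    read -r allocated unused max < "$file"
    # Integer arithmetic only, so the result is rounded down
    echo $(( (allocated - unused) * 100 / max ))
}
```

Calling fd_usage with no argument reads the live counters. With the figures from the walkthrough (8191 allocated out of 8192) it prints 99, which is about as close to the ceiling as it gets.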

17 Dec, 2007  |  Posted by Danesh  |  in Linux

Another issue popped up tonight. The clock on a payroll server seemed to be running slower than usual. Further troubleshooting on the box revealed that it took 4 real seconds for the server clock to advance 1 second. This caused the payroll servers to stop communicating with each other, as time sync is part of a security measure built into the payroll software we run here.

Some googling later, it seemed to be a bug in the kernel. The suggested fixes were to update the kernel, recompile the kernel, or add some kernel parameters in GRUB. I decided to go with the kernel parameters because this was a production server and the downtime window was very slim.

The fix:

  1. vi /boot/grub/grub.conf (on Red Hat style systems /etc/grub.conf is a symlink to this file)
  2. Add “clock=pit noapic nolapic” to the end of the kernel line.
  3. Reboot and check the time with “watch date”.
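
After the reboot it is worth confirming that the kernel really booted with the new parameters, since a typo in the GRUB file is easy to miss. The kernel exposes its boot command line in /proc/cmdline. The helper below is a rough sketch only; the name has_boot_params is invented for this example.

```shell
#!/bin/sh
# has_boot_params FILE PARAM...: succeed only if every PARAM appears as a
# whole word in the kernel command line stored in FILE.
# (Hypothetical helper; FILE would normally be /proc/cmdline.)
has_boot_params() {
    cmdline_file="$1"
    shift
    for param in "$@"; do
        # -F: treat the parameter as a fixed string, not a regex
        # -w: match whole words only, so a longer token that merely
        #     contains the parameter does not count
        grep -qwF -- "$param" "$cmdline_file" || return 1
    done
    return 0
}
```

On the rebooted server this could be run as: has_boot_params /proc/cmdline clock=pit noapic nolapic && echo "parameters active"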
