"Too many open files" on Puppetmaster shouldn't corrupt client system state
|Status:||Needs More Information||Start date:||04/21/2011|
|Assignee:||Nigel Kersten||% Done:|
|Affected Puppet version:||2.6.7||Branch:|
We have been experiencing the bug related to abandoned MySQL connections that has been covered in other bug reports. As a result, the Puppetmaster occasionally runs out of available file handles until it is restarted.
Usually this just prevents new Puppet runs from occurring, and nothing on the managed servers is significantly impacted. However, I just noticed that in one such instance Puppet on the client side actually replaced a previously valid system file with the contents “Too many open files - /path/to/puppetmaster/modules/modulename/files/expected_file.sh”. Fortunately, this did not render the system unusable, but we use Puppet to manage certain critical system files that, if corrupted in this way, WOULD render the system unusable until someone boots into single-user mode and repairs them manually.
The Puppetmaster should check the result of its attempt to read and serve a file, and if it can’t open the file for any reason, it should not return the error message from that attempt as the file contents for the Puppet client to apply.
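To make the requested behaviour concrete, here is a minimal Ruby sketch. It is not Puppet's actual code: read_served_file and replace_if_verified are hypothetical helpers, and the sketch assumes the client already knows the expected MD5 checksum of the file before it fetches the contents. The idea is that a read failure is raised as an error rather than returned as a body, and that the client only replaces the target when the received content matches the checksum.

```ruby
require "digest/md5"

# Hypothetical server-side helper (not Puppet's API): return the file's bytes,
# or raise. The error text is never returned as if it were file content.
def read_served_file(path)
  File.binread(path)
rescue SystemCallError => e
  raise IOError, "could not serve #{path}: #{e.message}"
end

# Hypothetical client-side guard (not Puppet's API): replace the target only
# when the received content matches the checksum the client expected.
# Assumption: expected_md5 was obtained separately from the content itself.
def replace_if_verified(target, content, expected_md5)
  actual = Digest::MD5.hexdigest(content)
  unless actual == expected_md5
    raise ArgumentError, "checksum mismatch for #{target}; file left untouched"
  end

  # Write next to the target, then rename, so a failed write cannot leave
  # a half-written file in place.
  tmp = "#{target}.tmp.#{Process.pid}"
  File.binwrite(tmp, content)
  File.rename(tmp, target)
  target
end
```

With a guard like this, a body such as “Too many open files - ...” would fail the checksum comparison and the existing file would be left alone.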
This is with Puppet 2.6.7 on the Puppetmaster and Puppet 2.6.4 on the client.
#1 Updated by Nigel Kersten about 2 years ago
- Status changed from Unreviewed to Needs More Information
- Assignee set to Nigel Kersten
I can’t reproduce this at all. With judicious use of ulimit, I can get a 2.6.7 master and a 2.6.4 node complaining about too many open files:
err: Could not render to raw: Too many open files - /var/lib/puppet/exercise_puppet/environments/dev/modules/base/files/etc/apache2/mods-available/authz_default.load
err: /File[/tmp/etc/apache2/mods-available/authz_default.load]/ensure: change from absent to file failed: Could not set 'file on ensure: Error 400 on SERVER: Could not render to raw: Too many open files - /var/lib/puppet/exercise_puppet/environments/dev/modules/base/files/etc/apache2/mods-available/authz_default.load
but none of the files have been replaced with the actual error message.
Can we get more details about your Ruby version, OS, etc? Are you able to reproduce this? How is your master set up? Apache? nginx? Passenger?
On my master, ulimit -n 9 seems to get me to about the right level: I can invoke puppet master --no-daemonize --verbose --debug with enough handles to read all my startup files, but several simultaneous client runs against it will exhaust the available limit.
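For anyone who wants to see the same exhaustion outside of Puppet, a rough Ruby sketch follows. It is an illustration only, not part of the reproduction above: opens_until_emfile is a hypothetical helper, and the limit of 16 in the usage line is arbitrary. It lowers the soft open-file limit in a forked child (so the parent process is unaffected) and opens files until the operating system refuses.

```ruby
# Hypothetical helper: fork a child, lower its soft RLIMIT_NOFILE, and count
# how many files it can open before Errno::EMFILE is raised.
# Returns [count, error_message]. Requires a platform that supports fork.
def opens_until_emfile(soft_limit)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    _soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
    Process.setrlimit(Process::RLIMIT_NOFILE, soft_limit, hard)
    handles = []
    begin
      # Keep every handle referenced so nothing is closed by the GC.
      loop { handles << File.open(File::NULL) }
    rescue Errno::EMFILE => e
      writer.puts(handles.size)
      writer.puts(e.message)
    end
    writer.close
    exit!(0)
  end
  writer.close
  count, message = reader.read.split("\n", 2)
  reader.close
  Process.wait(pid)
  [count.to_i, message.to_s.strip]
end

count, message = opens_until_emfile(16)
puts "opened #{count} files before failing: #{message}"
```

The count comes out lower than the limit because descriptors that are already open (standard streams, the pipe) count against it, which is the same reason the master needs some headroom just to read its startup files.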