The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #7127

prerun_command don't stop puppet on error

Added by Victor Mora about 3 years ago. Updated almost 3 years ago.

Status:ClosedStart date:04/15/2011
Priority:UrgentDue date:
Assignee:Josh Cooper% Done:

0%

Category:plumbing
Target version:2.6.9
Affected Puppet version: Branch:
Keywords:

We've Moved!

Ticket tracking is now hosted in JIRA: https://tickets.puppetlabs.com

This issue is currently not available for export. If you are experiencing the issue described below, please file a new ticket in JIRA. Once a new ticket has been created, please add a link to it that points back to this Redmine ticket.


Description

When an error occurs in prerun_command, puppet continues execution instead of stopping.

We can see the error at the log file, but puppet continues doing changes:

puppetd[18430]: Failed to prepare catalog: Could not run command from prerun_command: Execution of ‘/path/to/command’ returned 1:

We want to prevent changes when an error occurs in prerun_command.

Our puppet.conf:

[main]

    server = server.domain
    runinterval = 300
    summarize = true

    listen = true
    client = false

    logdir = /var/log/puppet
    vardir = /var/lib/puppet
    ssldir = /var/lib/puppet/ssl
    rundir = /var/run/puppet

    prerun_command = /server/host/manage integrity check
    postrun_command = /server/host/manage integrity update 

We’ve tried with /bin/false as prerun_command with the same result. You can see and error at log file, but it continues with changes:

Mar 16 08:44:27 sistemes puppetd[26894]: Failed to prepare catalog: Could not run command from prerun_command: Execution of ‘/bin/false’ returned 1: Mar 16 08:44:43 sistemes puppetd[26894]: (//nagios-service/File[/etc/ nagios3]/checksum) checksum changed ‘{mtime}Fri Mar 04 12:08:06 +0100 2011’ to ‘{mtime}Wed Mar 16 08:41:07 +0100 2011’ Mar 16 08:44:58 sistemes puppetd[26894]: Finished catalog run in 17.95 seconds

Our puppet version: puppet —version 0.25.4


Related issues

Related to Puppet - Feature #2914: Transactions should have before and after hooks Closed 12/10/2009
Related to Puppet - Bug #1054: Reports should be sent for runs that fail entirely Closed
Duplicated by Puppet - Bug #8195: postrun_command should run after reports and summaries Closed 07/01/2011

History

#1 Updated by Rudy Gevaert about 3 years ago

I’m seeing this on 2.6.7 (and earlier too):

puptest:~# puppet agent --no-daemonize --verbose --onetime        
info: Retrieving plugin
err: Failed to prepare catalog: Could not run command from prerun_command: Execution of '/root/ugent-pre-puppet' returned 1: 
info: Loading facts in ethernetcontroller
info: Loading facts in raidcontroller
info: Loading facts in configured_ntp_servers
info: Loading facts in raidtype
info: Loading facts in ugentinfo
info: Loading facts in ethernetcontroller
info: Loading facts in raidcontroller
info: Loading facts in configured_ntp_servers
info: Loading facts in raidtype
info: Loading facts in ugentinfo
info: Caching catalog for puptest.ugent.be
info: Applying configuration version '1302861551'
notice: Finished catalog run in 0.93 seconds
puptest:~# puppet --version
2.6.7


#2 Updated by James Turnbull about 3 years ago

  • Category set to plumbing
  • Status changed from Unreviewed to Accepted
  • Target version set to 2.6.9

I can replicate this – it appears execute_prerun_command doesn’t exit on fail.

#3 Updated by James Turnbull about 3 years ago

  • Status changed from Accepted to Needs Decision
  • Assignee set to Nigel Kersten

I think this is a bug but I can’t find the original pre_run ticket – it’s certainly not as advertised in the docs.

#4 Updated by Nigel Kersten about 3 years ago

  • Status changed from Needs Decision to Accepted
  • Assignee deleted (Nigel Kersten)

This fix should be targeted at 2.6.x and 2.7.x

#5 Updated by Rudy Gevaert almost 3 years ago

  • Target version changed from 2.6.9 to 2.7.0

As 2.7 is now in RC state, could you still handle this bug please. It is own or our team’s requirement for using puppet. Thanks!

#6 Updated by Nigel Kersten almost 3 years ago

  • Priority changed from Normal to Urgent

#7 Updated by Nigel Kersten almost 3 years ago

Docs:

    # A command to run before every agent run.  If this command returns a non-zero
    # return code, the entire Puppet run will fail.
    # The default value is ''.
    # prerun_command = 

From examining the code, I can’t see that this was ever the case.

#8 Updated by Nigel Kersten almost 3 years ago

Another possible issue is that we run the prerun_command after doing a pluginsync.

That doesn’t seem desirable at all.

#9 Updated by Nigel Kersten almost 3 years ago

  • Target version changed from 2.7.0 to 2.6.x

This will actually be targeted at 2.6.x and 2.7.x.

Rudy, I have a very minimal patch with no tests yet that appears to resolve the issue with prerun at least.


--- a/lib/puppet/configurer.rb
+++ b/lib/puppet/configurer.rb
@@ -122,18 +122,20 @@ class Puppet::Configurer
   # which accepts :tags and :ignoreschedules.
   def run(options = {})
     begin
+      options[:report] ||= Puppet::Transaction::Report.new("apply")
+      report = options[:report]
+      Puppet::Util::Log.newdestination(report)
       prepare(options)
     rescue SystemExit,NoMemoryError
       raise
+    rescue CommandHookError => detail
+      Puppet.err detail
+      return
     rescue Exception => detail
       puts detail.backtrace if Puppet[:trace]
       Puppet.err "Failed to prepare catalog: #{detail}"
     end
 
-    options[:report] ||= Puppet::Transaction::Report.new("apply")
-    report = options[:report]
-    Puppet::Util::Log.newdestination(report)
-
     if catalog = options[:catalog]
       options.delete(:catalog)
     elsif ! catalog = retrieve_catalog

#10 Updated by Rudy Gevaert almost 3 years ago

Hi Nigel, your patch works. I have tested it! Thanks!!

#11 Updated by Nigel Kersten almost 3 years ago

Given the problems with trying to test exit status on a process that can possibly be daemonized, it feels like the right approach for the tests is to construct a report and check its status.

#12 Updated by Josh Cooper almost 3 years ago

  • Status changed from Accepted to Merged - Pending Release
  • Assignee set to Josh Cooper

After much discussion, it was decided that the desired behavior was:

if prerun fails, don’t apply catalog, include errors in the report, don’t alter the report status, send the report, and exit(1) if postrun fails, include errors in the report, don’t alter the report status, send the report, and exit(1)

Part of the reasoning for this behavior is that in the use case where the prerun command checks to see if the master is under too much load, and aborts puppet if it is, it is always desirable to send the report, since it can be running on a different server than the puppet master.

Along the way I fixed several issues:

  1. If the prerun command fails, we no longer apply the catalog, but we do run the postrun command, if present, and send the report (as it did before).
  2. If the postrun command files, puppet would not send the report. This is now fixed (it will always send the report)
  3. Errors during the prepare step, which includes plugin/fact download and prerun commands were not appended to the report. Now the report log destination is registered as early as possible, and unregistered as late as possible to ensure Configurer errors that occur in the run method are included in the report.
  4. The transaction was closing the Configurer’s report destination out from underneath it. As a result, postrun errors were not included in the report.

See commmit:f8c11329f1e0e0fb5648e26c7a5c58fc2341319e to 2.6.x

#13 Updated by Rudy Gevaert almost 3 years ago

Hi Nick, I’m happy with this solution! Thanks to you and your colleges for giving this a thorough solution. I’m happy to test it when the next rc comes out.

#14 Updated by James Turnbull almost 3 years ago

  • Status changed from Merged - Pending Release to Closed
  • Target version changed from 2.6.x to 2.6.9

Also available in: Atom PDF