The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #5876

Require and Subscribe on the same refreshonly exec doesnt work

Added by R.I. Pienaar over 3 years ago. Updated 4 months ago.

Status:Needs More InformationStart date:01/13/2011
Priority:NormalDue date:
Assignee:Nick Lewis% Done:

0%

Category:exec
Target version:3.x
Affected Puppet version:0.25.4 Branch:
Keywords:customer

We've Moved!

Ticket tracking is now hosted in JIRA: https://tickets.puppetlabs.com

This ticket is now tracked at: https://tickets.puppetlabs.com/browse/PUP-1106


Description

Given this manifest:

exec{"moo":
    command => "/usr/bin/cowsay 'fail :('",
    refreshonly => true,
    logoutput => true,
    require  => Exec["false"],
    subscribe => [ File["/tmp/1"], File["/tmp/2"], File["/tmp/3"] ]
}

file{"/tmp/1": content => 1}
file{"/tmp/2": content => 2}
file{"/tmp/3": content => 3}
exec{"false": command => "/bin/false"}

The Exec[moo] shouldn’t run it requires Exec[false] which will always fail, but it gets notified by the file resources via its subscribes and then runs anyway regardless of the state of the required resources.

In version 2.6.5 this might be related to #5670 but I am filing a new bug since I think its not as this bug is also present in 0.25.x while the one in #5670 is 2.6.x only

notice: //File[/tmp/1]/content: defined content as 'unknown checksum'
notice: //File[/tmp/3]/content: defined content as 'unknown checksum'
err: //Exec[false]/returns: change from notrun to 0 failed: /bin/false returned 1 instead of one of [0] at /home/rip/test.pp:12
notice: //File[/tmp/2]/content: defined content as 'unknown checksum'
notice: //Exec[moo]: Dependency exec[/bin/false] has 1 failures
warning: //Exec[moo]: Skipping because of failed dependencies
notice: //Exec[moo]: Triggering 'refresh' from 3 dependencies
notice: //Exec[moo]/returns:  _________ 
notice: //Exec[moo]/returns: < fail :( >
notice: //Exec[moo]/returns:  --------- 
notice: //Exec[moo]/returns:         \   ^__^
notice: //Exec[moo]/returns:          \  (oo)\_______
notice: //Exec[moo]/returns:             (__)\       )\/\
notice: //Exec[moo]/returns:                 ||----w |
notice: //Exec[moo]/returns:                 ||     ||

Related issues

Related to Puppet - Bug #3788: subscribe overrides tags Rejected 05/18/2010
Related to Puppet - Bug #5670: Failing resources should not notify Closed 12/24/2010
Related to Puppet - Feature #6911: Flexible, deterministic, unguessable application order Closed 03/30/2011
Related to Puppet - Bug #7486: Service refreshed despite failed dependencies Needs Decision 05/11/2011
Related to Puppet - Bug #14797: "require => Exec" ignored when using template as content Accepted 06/04/2012
Related to Puppet - Bug #18701: If a resource is triggered by a refresh, and it fails, pu... Investigating
Related to Puppet - Feature #19878: notice refreshonly ability Duplicate
Related to Puppet - Bug #20865: (Now) consistent unexpected behavior when notifying exec ... Accepted
Duplicated by Puppet - Bug #8372: files that require execs land changes even when the exec ... Duplicate 07/12/2011
Blocks Puppet - Feature #651: Exec checks should be turned into metaparameters Accepted

History

#1 Updated by James Turnbull about 3 years ago

  • Status changed from Unreviewed to Needs More Information
  • Assignee set to Nigel Kersten

Nigel – thoughts? I can’t replicate this in stand-alone 2.6.x.

#2 Updated by James Turnbull about 3 years ago

Okay I can now replicate it. Annoying.

#3 Updated by Nigel Kersten about 3 years ago

  • Status changed from Needs More Information to Accepted
  • Assignee deleted (Nigel Kersten)
  • Target version set to 2.6.x

#4 Updated by R.I. Pienaar about 3 years ago

I am not sure if this is related – i can make a new bug if you think it isn’t – but I am seeing similar notify issues:

puppet-agent[1723]: (/Stage[main]/Mcollective::Install/Puppet::Pkg[rubygem-stomp]/Package[rubygem-stomp]/ensure) change from 1.1.7-1.el5 to 1.1.8-1.el5 failed: Could not update: Failed to update to version 1.1.8-1.el5, got version 1.1.7-1.el5 instead at /etc/puppet/manifests/common/modules/puppet/manifests/pkg.pp:7
puppet-agent[1723]: (/Stage[main]/Mcollective::Install/Puppet::Pkg[rubygem-stomp]/Package[rubygem-stomp]) Scheduling refresh of Service[mcollective]
puppet-agent[1723]: (/Stage[main]/Mcollective::Service/Service[mcollective]) Triggered 'refresh' from 1 events

So here a package fails to install, it then still notifies the service and the service restarts

Additionally I question the ‘normal’ priority of this ticket, the dependency and notification system in puppet is key to everything, I’d consider these issues critical.

The above is from 2.6.6

#5 Updated by Joe McDonagh about 3 years ago

Just to chime in, also affecting me on 2.6.6:

    
puppet-agent[9787]: (/Stage[main]/Exim4/Service[exim]) Dependency File[/etc/exim/exim.conf] has failures: true
puppet-agent[9787]: (/Stage[main]/Exim4/Service[exim]) Skipping because of failed dependencies
puppet-agent[9787]: (/Stage[main]/Exim4/Service[exim]) Triggered 'refresh' from 1 events

#6 Updated by Nigel Kersten about 3 years ago

  • Priority changed from Normal to Urgent

#7 Updated by Nick Lewis about 3 years ago

  • Assignee set to Nick Lewis

#8 Updated by Nick Lewis about 3 years ago

  • Status changed from Accepted to Needs Decision

It’s unclear what to do about this in the case of tags and schedules. It looks like #6911 will touch this code in ways that will make it easier to fix only the case of failed dependencies without affecting tags and schedules, if that’s the way we choose to go. I sent an email asking for feedback about this issue to the puppet-dev list, but it hasn’t garnered any attention.

#9 Updated by Nigel Kersten about 3 years ago

  • Priority changed from Urgent to Normal

We have a clearer idea of what’s going on here, and the Exec resource example has complicated things somewhat due to behaving the same way when applied as when it refreshes.

We may need to rethink the semantics of “require”, and be clearer about how skipped resources (such as those that have a failed dependency) issue their refresh action when they receive a notify event, but we don’t have a broken model here.

#10 Updated by Nick Moffitt almost 3 years ago

I’m also confused by the “normal” priority of this ticket. This seems like a pretty fundamentally broken piece of a pretty fundamental mechanism in puppet!

#11 Updated by R.I. Pienaar almost 3 years ago

+1, heavily contesting the assertion the model is not broken. I think if you were to come up with a order of precedence then require should always win.

#12 Updated by John Florian almost 3 years ago

I too would like to contest the assertion that the model is not broken. This is so fundamental to how I need to use puppet that I may need to consider alternate products if this model is truly deemed correct. I concur that require should trump all and that it should be treated as a hard dependency, not just affect the “order of operations”.

Please also see #7486, which is looking more and more like a duplicate of this ticket.

#13 Updated by Jeff McCune almost 3 years ago

Related issues

This has been a long standing issue that continually surprises people about Puppet.

Added related tickets:

#14 Updated by Chris Hirsch almost 3 years ago

I would also like to add that this is a problem in our environment (2.6.2-1). I’m trying to install a package and if package installs correctly it should notify the service. Unfortunately the package is not currently available and the install fails but the services are still being restarted. This is a Bad Thing :–)

#15 Updated by Nigel Kersten over 2 years ago

Does the service require the package? The require should do the right thing.

#16 Updated by Jason Slagle over 2 years ago

Came across this class during training here at puppetconf attempting to validate the sudoers file in an example.

At least 3 seperate people in my class hit this issue – it’s very surprising and frustrating.

+1 on raising the priority back to urgent – this isn’t a POLA thing here – it seems to have been broken and I can’t see how fixing it would surprise people?

Here’s another simple test case that models what we were doing in class:

file{"/tmp/a": ensure=>present, content=>"Test", notify=>Exec['/bin/false']}
exec{"/bin/false": notify => File["/tmp/b"], refreshonly=>true}
file{"/tmp/b": ensure=>present, source=>'/tmp/a', require=>Exec['/bin/false']}

[root@localhost manifests]# puppet apply -v 5876.pp
info: Loading facts in facter_dot_d
info: Applying configuration version '1316532415'
notice: /Stage[main]//File[/tmp/a]/ensure: created
info: /Stage[main]//File[/tmp/a]: Scheduling refresh of Exec[/bin/false]
err: /Stage[main]//Exec[/bin/false]: Failed to call refresh: /bin/false returned 1 instead of one of [0] at /etc/puppetlabs/puppet/manifests/5876.pp:2
notice: /Stage[main]//File[/tmp/b]/ensure: defined content as '{md5}0cbc6611f5540bd0809a388dc95a615b'
notice: Finished catalog run in 0.24 seconds

#17 Updated by Ken Barber over 2 years ago

  • Category set to exec

I concur, I’ve also seen this in classes. It only seems to happen with ‘refreshonly => true’.

Of course one would ponder what is the right thing to do in cases like this:

#file { "file1":
#  path => "/tmp/broken1",
#  content => "broken",
#  notify => Exec["exec1"],
#}
exec { "exec1":
  command => "/usr/bin/false",
  refreshonly => true,
  before => File["file2"],
}
file { "file2":
  path => "/tmp/broken2",
  content => "broken",
}

By commenting out File[‘file1’] I’m trying to emulate the behaviour of an exec that isn’t refreshed due to perhaps its file not being changed. Exec[‘exec1’] is invalid and will fail, but without a proper run how would File[‘file2’] determine this?

My point is even if you fixed that behaviour – the pattern would be useless. Right? If you made a require force an exec as a check first that would break the inherent meaning of ‘refreshonly’ on the exec so this is not a solution either.

I think the problem is more complicated, but regardless it is surprising when you do get a failure as Jason is showing that the file installs regardless.

So therefore for these reasons I think for this use-case at least, the model is broken not just the implementation.

Jason Slagle wrote:

Came across this class during training here at puppetconf attempting to validate the sudoers file in an example.

At least 3 seperate people in my class hit this issue – it’s very surprising and frustrating.

+1 on raising the priority back to urgent – this isn’t a POLA thing here – it seems to have been broken and I can’t see how fixing it would surprise people?

Here’s another simple test case that models what we were doing in class:

[…]

#18 Updated by Nigel Kersten over 2 years ago

  • Status changed from Needs Decision to Needs More Information

(we’re past the point of needing a decision here, even if we could still do with a better summary of how it should work)

#19 Updated by Michael Stahnke over 2 years ago

  • Target version changed from 2.6.x to 2.7.x

2.6.x is closed. Moving to 2.7.x

#20 Updated by John Bollinger about 2 years ago

This issue seems to be complicated by the overlay of signaling / refreshing on top of conventional resource synchronization. Tags and schedules are just cream on top.

Suppose, then, we look first at conventional syncronization. It seems reasonable and expected that

(1) no attempt should be made to synchronize a resource if any of its direct or transitive dependencies fail to synchronize.

Furthermore,

(1a) if no attempt is made to sync a resource, then it is not synchronized, even if it has an otherwise trivial synchronization action (e.g. an Exec with refreshony => true).

Since we’re ignoring signaling for the moment, the dependencies in question are expressed only via ‘require’, ‘before’, and the non-signaling chain operators. I think this much works reliably.

It also seems reasonable and expected that:

(2) a resource that is not synchronized should not emit signals, whether it failed itself, a dependency failed, or it was excluded based on tags or schedules.

I think this also works.

As we move a bit more into signaling, I think it is reasonable and expected that:

(3) if a resource is scheduled to be refreshed, then the refresh happens immediately following its successful synchronization

It follows, however, that:

(4) a resource that is not synchronized cannot be refreshed, regardless of what resource notifies it or to what other resource it may be subscribed.

That stands to reason, but it seems not to be consistent with Puppet’s current behavior, at least under some circumstances. That’s the subject of this issue.

Now we come to a potential point of contention:

*should it be possible for resource A to be synchronized if it subscribes to or is notified by failed resource B, but has no non-signaling relationship with B? *

The current documentation suggests not, inasmuch as it describes subscribe and notify as augmented require and before, respectively.

I’m fine with that, but those constraints could be loosened to demand only an attempt at synchronization. That is, suppose Service[‘A’] subscribes to File[‘B’], but there are no other relationships between them.

Then certainly the agent must try to synchronize File[‘B’] before Service[‘A’], but if File[‘B’] fails then the agent could attempt to sync Service[‘A’] anyway (though it would not receive a signal from File[‘B’]).

In that case, the currently-documented behavior could still be achieved by naming File[‘B’] in both kinds of relationships (e.g. A could both require => File[‘B’] and subscribe => File[‘B’]). That would allow for gentler failure behavior in some cases, without precluding more stringent behavior where that is desired. There are therefore two possible rules here:

(5') No attempt is made to synchronize a resource if it has a signaling dependency on a resource that fails, or

(5") When one resource has a signaling dependency on another, any attempt to sync the dependency will be performed before any attempt to sync the dependent resource, but the latter attempt is not conditional on the success of the former.

Execs are a bit of a special case, though they might serve as a model for certain custom resources or perhaps future resource types. For them

(6) Execs should not be construed as having the same synchronization and refresh actions. Instead, they should be construed as having only one or the other.

Thus, when an Exec has refreshonly => true that should mean its command is a refresh action (and the sync action is equivalent to /bin/true), whereas when it has refreshonly => false, its command is the sync action. That will produce the same behavior as Exec current exhibits in the vast majority of cases — I think in all the cases that aren’t considered buggy.

There remains the question of what to do with dependencies on resources that sync successfully but fail to refresh. This is a second potential point of contention, but I suggest that

(7) When resource A has only a non-signaling dependency on resource B, then failure of resource B to refresh does not prevent resource A from being synchronized or refreshed.

The alternative, of course, is

(7') When resource A has any type of dependency on resource B, and resource B fails to refresh, then no attempt is made to sync resource A. (There’s no need to distinguish different types of dependencies in this case.)

If (7) is adopted, then that leaves the question of what to do when A has a signaling dependency on B, and B fails to refresh. I suggest that the rule should be the same as (7), but it would be possible to refuse to sync A in that case.

Wow, sorry for being long-winded (even for me). I hope this at least gets some meaningful discussion going on what the desired behavior is.

#21 Updated by Nigel Kersten about 2 years ago

John Bollinger wrote:

Wow, sorry for being long-winded (even for me). I hope this at least gets some meaningful discussion going on what the desired behavior is.

That was great John. I did some minor reformatting to bold out the premises and points of contention.

re-reading.

#22 Updated by Garrett Honeycutt almost 2 years ago

Pinging for update, as this affects the Puppet Master training materials.

#23 Updated by Andrew Parker over 1 year ago

  • Target version deleted (2.7.x)

#24 Updated by Andrew Parker over 1 year ago

As the 2.7.x line is winding down, I am removing the target at 2.7.x from tickets in the system. The 2.7 line should only receive fixes for major problems (crashes, for instance) or security problems.

#25 Updated by Celia Cottle about 1 year ago

  • Keywords set to customer

#27 Updated by Charlie Sharpsteen 12 months ago

  • Target version set to 3.x

Re-targeting at 3.x due to the number of duplicates and watchers.

#28 Updated by R.I. Pienaar 4 months ago

Redmine Issue #5876 has been migrated to JIRA:

https://tickets.puppetlabs.com/browse/PUP-1106

Also available in: Atom PDF