The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

This issue tracker is now in read-only archive mode and automatic ticket export has been disabled. Redmine users will need to create a new JIRA account to file tickets using https://tickets.puppetlabs.com.

Feature #3537

It should be possible to trigger (exec) resources with require

Added by Kjetil Torgrim Homme about 6 years ago. Updated almost 3 years ago.

Status:                  Closed
Priority:                Normal
Assignee:                eric sorenson
Category:                metaparameters
Target version:          -
Affected Puppet version: 0.25.4
Branch:
Keywords:
Start date:              04/12/2010
Due date:
% Done:                  0%

Description

When an Exec has conditions associated with it (unless, creates, onlyif), it can be useful to state prerequisites which are only run when the exec itself is run.

Consider this simple example:

  exec { "prereq":
      command => "/bin/echo prereq",
      refreshonly => true
 }
  
  exec { "main":
      command => "/bin/echo main",
      onlyif  => "/bin/grep foobar /etc/issue",
      require => Exec["prereq"]
 }

Here, the refreshonly will cause “prereq” never to run, since a require isn’t enough to trigger it. Without refreshonly it will run every time, but the desired behaviour is that “prereq” runs if and only if the onlyif command succeeds.

Obviously the behaviour of “refreshonly => true” can’t change, and I can’t think of a good name for a tri-state alternative — “refreshonly => ‘requires-too’” ? “allevents” may be more workable.

My preferred solution would be a new parameter, “requireonly”. The name is perhaps slightly misleading, since “before” should trigger execution too, but I think most people will understand that require and before are inherently intertwined. This could later be generalised into a metaparameter that works for more types; e.g. you could have a parent File which is only checked/updated/created when some other File requires it.
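
Roughly, the idea would read like this (purely hypothetical sketch; “requireonly” does not exist, it is just the name proposed above):

  exec { "prereq":
      command     => "/bin/echo prereq",
      # hypothetical: run only when a resource that requires this one
      # is itself going to run
      requireonly => true,
  }

  exec { "main":
      command => "/bin/echo main",
      onlyif  => "/bin/grep foobar /etc/issue",
      require => Exec["prereq"],
  }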


Related issues

Related to Puppet - Feature #2198: Install multiple package within a single call to the pack... Investigating 04/25/2009

History

#1 Updated by Luke Kanies about 6 years ago

  • Status changed from Unreviewed to Rejected

Can’t you just use ‘subscribe’ here instead of require?

If I’m missing something, please reopen the ticket.

#2 Updated by Kjetil Torgrim Homme about 6 years ago

  • Status changed from Rejected to Re-opened

As far as I understand it, that would state the relationship in the wrong direction. There can be many “main” (e.g., via a define), but only one “prereq”. With subscribe, the instances of “main” have to be listed explicitly in the “prereq”. In addition, “prereq” would run after the “main”(s), which is kind of a showstopper, too :–)
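
For illustration, a subscribe-based rewrite of the example from the description would have to look roughly like this (sketch only), which shows both problems:

  exec { "prereq":
      command     => "/bin/echo prereq",
      refreshonly => true,
      # every instance of "main" must be enumerated here, and "prereq"
      # only refreshes *after* they have already run
      subscribe   => Exec["main"],
  }

  exec { "main":
      command => "/bin/echo main",
      onlyif  => "/bin/grep foobar /etc/issue",
  }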

#3 Updated by micah - over 5 years ago

I have an example. Let’s say I have an exec which updates apt:

  exec { 'refresh_apt':
      command     => '/usr/bin/apt-get update && sleep 1',
      refreshonly => true,
      subscribe   => [ File['/etc/apt/apt.conf.d'], Config_file['/etc/apt/sources.list'] ],
  }

When I install packages, I want to make sure that my apt information is up to date before I do the install. So how can I trigger this exec before a package resource is realized? I cannot do “Package { require => Exec[refresh_apt] }” because ‘refreshonly’ is only triggered by ‘notify’ or ‘subscribe’. I can’t use ‘notify’ because that will be run after the package resource has been realized. Subscribing to the exec isn’t going to do anything either.

#4 Updated by Nigel Kersten over 5 years ago

So most people seem to be OK with the cost of a single apt-get update on each run. I take it this is not acceptable, Micah, and that you have a slow update process, hence wanting the “refreshonly”?

Another method people seem to use is to have an onlyif/unless that checks a timestamp you drop at the end of the exec.

Are either of these workarounds acceptable?

#5 Updated by Kjetil Torgrim Homme over 5 years ago

Nigel, it is not sufficient to run “apt-get update” at the start of the run; new repos may have been added in the interim. The simplest workaround is to put all repo handling in a separate stage.

My case isn’t related to Apt at all; I just want a way to have a chain of Execs without duplicating the “onlyif” terms in each and every instance.

#6 Updated by Nigel Kersten over 5 years ago

So this is exactly the problem I used to solve by making sure that repositories were added before apt-get update, and that packages were added after, like:

AptRepo { before => Class["apt::update"] }
Package { require => Class["apt::update"] }

You can use stages to solve this; you can also use resource defaults or defined types.
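
A rough sketch of the run-stages variant (the stage name is illustrative, and it assumes an apt::update class like the one referenced in the defaults above):

  stage { 'pre': before => Stage['main'] }

  class { 'apt::update': stage => 'pre' }

  # everything declared in Stage['main'], including all Package resources,
  # is applied only after the 'pre' stage, and thus after apt::update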

#7 Updated by Kjetil Torgrim Homme over 5 years ago

Great. Now solve the problem in the issue description :–)

#8 Updated by Nigel Kersten over 5 years ago

:) That’s why I said:

So most people seem to be OK with the cost of a single apt-get update on each run. I take it this is not acceptable, Micah, and that you have a slow update process, hence wanting the “refreshonly”?

We don’t currently have a solution other than:

  • wear the cost of the single run each time
  • make the apt-get update dump a timestamp file, and use onlyif/unless to run it only if that file is older than X hours (see the sketch after this list)
  • dump the complex logic around apt-get update into a script that does all the checking internally if you have multiple conditions; the script runs each time, but it may not actually run apt-get update
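
A minimal sketch of the timestamp approach, assuming an arbitrary stamp path and a 4-hour threshold (both just for illustration):

  exec { 'apt_update':
      # update, then record when we last did so
      command => '/bin/sh -c "apt-get update && touch /var/tmp/puppet-apt-stamp"',
      # skip entirely while the stamp file is less than 240 minutes old
      unless  => '/bin/sh -c "find /var/tmp/puppet-apt-stamp -mmin -240 2>/dev/null | grep -q ."',
      path    => ['/usr/bin', '/usr/sbin', '/bin', '/sbin'],
  }

  # packages that need fresh metadata can then be pointed at it, e.g.
  # Package { require => Exec['apt_update'] }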

#9 Updated by Kjetil Torgrim Homme over 5 years ago

My use case has nothing to do with “apt-get”. Exec[“prereq”] can have destructive qualities if the onlyif in Exec[“main”] is not satisfied. This means that the onlyif has to be duplicated, and if there is a long chain of such Execs, the onlyifs will progressively become more complicated.

The current workaround is to put the logic in a shell script, since it is so much simpler than reams of duplicated code, but IMHO this is not a bad fit for the Puppet DSL to handle on its own.

#10 Updated by Nigel Kersten over 5 years ago

Can you provide a more concrete example? It feels as if there is a better way to achieve what you want, but it’s a little difficult at a purely theoretical level.

#11 Updated by Alan Barrett over 5 years ago

I think I understand what Kjetil wants. Given these resources:

exec { "A":
    command => "A.command",
    onlyif => "A.onlyif",
    require => Exec["B"],
}
exec { "B":
    command => "B.command",
}

the current behaviour is:

run B.command;
run A.onlyif;
if A.onlyif was successful {
    run A.command
}

I believe that Kjetil wants a way to express the following behaviour:

run A.onlyif;
if A.onlyif was successful {
    run B.command;
    run A.command;
}

#12 Updated by Peter Meier over 5 years ago

I believe that Kjetil wants a way to express the following behaviour:

[…]

If you didn’t leave out the refreshonly on B by accident, I don’t think that this change, as you have stated it, would be a good idea.

If you did leave out the refreshonly by accident, then yes, this could probably be changed and would make sense, although the name refreshonly would then be wrong.

#13 Updated by Alan Barrett over 5 years ago

Peter Meier wrote:

If you didn’t leave out the refreshonly on B by accident, I don’t think that this change, as you have stated it, would be a good idea.

I was simply attempting to explain the behaviour Kjetil wants, and that he needs some way to express it; I did not intend to suggest what syntax should be used for the new behaviour, and I certainly did not intend to imply that the existing behaviour should be changed.

I think that the existing behaviour with the existing syntax is sensible. If new behaviour is desired, then I think it needs new syntax, but I have no suggestions for the new syntax. I do not think that overloading “refreshonly” for this purpose would be a good idea.

#14 Updated by micah - over 5 years ago

Nigel Kersten wrote:

So most people seem to be OK with the cost of a single apt-get update on each run. I take it this is not acceptable, Micah, and that you have a slow update process, hence wanting the “refreshonly”?

I don’t mind a single apt-get update run before a package is to be installed, but having an apt-get update run on every puppet run is a bit much when it’s not needed. There are various problems it exacerbates:

  a. offline development is impossible if apt-get update is mandatory;
  b. slow internet connections make puppet runs take a very long time;
  c. unnecessary querying of the archive on every run by every client is not being a nice netizen; I don’t like to involuntarily DDoS the upstream servers, especially with hundreds of machines doing this on every puppet run;
  d. while puppet is doing its pointless apt-get update, no other apt operations can be done, either through scripts/cronjobs on the system or by administrators, because the package db is locked. This is annoying if you are running something like cron-apt, which periodically can’t do its work because puppet is running, so you get cron errors. It’s annoying as an admin because you can’t do anything until the puppet run is finished; it would be acceptable if the run were actually doing the update because it was going to do a package operation, but to do it because of a limitation in puppet is annoying.

Another method people seem to use is to have an onlyif/unless that checks a timestamp you drop at the end of the exec.

I’m not sure what this solves except to make it so the exec is only run at certain intervals, which is what cron-apt does. What we want puppet to do is to run an apt-get update before a package is installed, not at random times, and not at every run.

Are either of these workarounds acceptable?

I admit that this is a bit of a corner case… but it is surprising that this is not possible, when it seems like a reasonable thing to want to do.

#15 Updated by James Turnbull over 4 years ago

  • Category set to metaparameters
  • Status changed from Re-opened to Needs Decision
  • Assignee set to Nigel Kersten

#16 Updated by Nigel Kersten almost 4 years ago

  • Assignee changed from Nigel Kersten to eric sorenson

#17 Updated by Alex Hewson over 3 years ago

I am also building a platform expected to scale to hundreds of nodes and would dearly love a way to trigger an apt-get update before new packages are fetched. Skipping the update entirely isn’t an option: the mirror (EC2’s) is very aggressive about dropping outdated packages, so odds are that if you don’t apt-get update before installing, the packages you want won’t be there anymore.

Doing it the other way – an apt-get update happening whenever Puppet runs – would hammer our local Ubuntu mirror and create unnecessary load on all our hosts.

A ‘requireonly’ parameter would be the most generalised solution but I would be happy if we could simply tell the package provider to refresh its database before fetching a new package. We could do this with a ‘refreshdb’ parameter in package…

package { 'bsd-mailx':
  ensure     => present,
  refreshdb  => true,       # causes provider to run apt-get update or equivalent before fetching pkg
}

#18 Updated by David Burke over 3 years ago

On the apt-get update use case, there is also the annoyance of every puppet check-in counting as a “change”. I don’t consider checking for an updated package to be a change if the package was determined to be up to date already. If I want to make use of package, it seems I have to accept either that all puppet node check-ins count as “changes”, or that it will fail on occasion from not having up-to-date sources.

I also agree with Alex’s suggestion of refreshdb. This seems a very common use case for puppet. It isn’t obvious to me (a very new user) that you would need to exec apt-get update for package to work right, but you really do, or you will get errors occasionally.

#19 Updated by Paul Hinze almost 3 years ago

Hi Eric, no update from you in a while; what’s the status on this issue?

I’ve been watching this issue mostly concerned about the apt-get update use case. I definitely don’t want to eat an update on every puppet run for my environment, but I’d like to be able to trigger it once Puppet has decided it should be installing a package. It seems like this is something that Puppet should be able to handle nicely for me.

Currently we’re using an Exec['apt-get-update'] with an unless that causes it to run only if the cache is over 24h old [1], but this does not help us in the package upgrade use case. We have to choose between hitting apt servers with update requests tens of thousands of times a day (one every 30 minutes times the number of nodes), or manually running around and kicking apt-get update when we need to upgrade packages.

There must be a better way!

[1] Something similar to this: http://stackoverflow.com/a/14754159/105022

#20 Updated by eric sorenson almost 3 years ago

Let me rephrase Alan’s note 11, because it is a useful point of discussion but has an error in the code:

exec { "A":
    command => "A.command",
    onlyif => "A.onlyif",
    require => Exec["B"],
}
exec { "B":
    command => "B.command",
    refreshonly => true,
}

the current behaviour is:

run A.onlyif;
if A.onlyif was successful {
    run A.command
    run B.command
}

I believe that Kjetil wants a way to express the following behaviour:

run A.onlyif;
if A.onlyif was successful {
    run B.command;
    run A.command;
}

But to do that means we’d have to pause evaluating resource A, go to B, then come back to A. A much simpler way to model this would be not to use the onlyif at all, but instead to make its command a top-level exec resource too. It will be run every time either way, so there’s no performance hit, and the savings in complexity seem quite worthwhile. So you’d have:

exec { "prereq":
    command => "formerly A.onlyif"
    notify => Exec["B"]
}
exec { "A":
    command => "A.command",
    refreshonly => true,
}
exec { "B":
    command => "B.command",
    refreshonly => true,
    notify => Exec["A"]
}

Which, as far as I can tell, gets you to exactly the same state without weird runarounds.

More on the apt-get situation in a moment.

#21 Updated by eric sorenson almost 3 years ago

Sorry for the lack of updates. This is one that Nigel transferred over to me when I started, and it was lost in a flood of other similar transfers, so I didn’t know I was on the hook for it. Thanks Nigel!! :-D

On the specific problem of running apt-get update before installing a package: the requirement from Micah in note 14 might not be feasible:

I’m not sure what this solves except to make it so the exec is only run at certain intervals, which is what cron-apt does. What we want puppet to do is to run an apt-get update before a package is installed, not at random times, and not at every run.

The problem with this is that it would work for installs but not for upgrades: apt would never know, given the infinite lifetime of downloaded metadata, that there were upgrades available. So if there were a new metaparameter, or if the provider were twiddled to run apt-get update internally only when the ensure were found to be out of sync (as someone here proposed), it would have to build logic to satisfy ANY of the following conditions:

  1. if any package is specified to be present and isn’t, then apt-get update
  2. if any package is specified to be a particular version and there’s an older installed version, then apt-get update
  3. if any package is specified to be ensure=>latest, then apt-get update

It’s that last one that gets me. Some questions for the watchers:

  • are you (micah, alex hewson, etc) specifying every version for every package?
  • Do you do ensure=>latest anywhere?
  • There are a couple of comments about the load on mirrors; is it really that bad? (Not snark, I honestly don’t know. I come from a Red Hat background and have had thousands of servers hitting metadata on repository servers at 30-minute intervals without negative effect, but that is only 3 files with if-modified-since headers, so perhaps it’s much worse on apt servers.)

#22 Updated by micah - almost 3 years ago

  1. if any package is specified to be present and isn’t, then apt-get update
  2. if any package is specified to be a particular version and there’s an older installed version, then apt-get update
  3. if any package is specified to be ensure=>latest, then apt-get update

It’s that last one that gets me. Some questions for the watchers:

What is it about #3 that is problematic?

  • are you (micah, alex hewson, etc) specifying every version for every package?

No, I have a mixture of settings, both specific versions and plain installs.

  • Do you do ensure=>latest anywhere?

Yes.

  • There are a couple of comments about the load on mirrors; is it really that bad? (Not snark, I honestly don’t know. I come from a Red Hat background and have had thousands of servers hitting metadata on repository servers at 30-minute intervals without negative effect, but that is only 3 files with if-modified-since headers, so perhaps it’s much worse on apt servers.)

I’m a Debian Developer, and I don’t think it is such a big deal. However, I don’t know why people would think that this would increase the load on the mirrors.

It seems to me that if an apt-get update were to be done for any of those three conditions, the apt-get update would be scheduled once for that puppet run, not once for each package that is installed. In other words, it seems sufficient to have 5 different package resources, each satisfying one of the three conditions you outlined above and each notifying an apt-get update resource that would then run one time before those operations.

For my purposes, doing an apt-get update run 5 different times in that example is overkill (and really slows down manifest application).

micah

#23 Updated by Kjetil Torgrim Homme almost 3 years ago

eric sorenson wrote:

Let me rephrase Alan’s note 11 because it is a useful point of discussion but has an error in the code

the current behaviour is:

  run A.onlyif;
  if A.onlyif was successful {
    run A.command
    run B.command
  }

Is that so? I don’t have a 3.x Puppet to test on, so… prior to 3.0, B.command is not run.

I believe that Kjetil wants a way to express the following behaviour:

[…]

True, that seems more intuitive given the require’d ordering.

But to do that means we’d have to pause evaluating resource A, go to B, then come back to A.

Good point. I naively thought we could do the onlyif evaluation while building the dependency graph, but that would be wrong: the require in the exec may be a prerequisite for the onlyif to run, which points to why the “require” becomes ambiguous w.r.t. the ordering of A and B.

A much simpler way to model this would be not to use the onlyif at all, but instead to make its command a top-level exec resource too. It will be run every time either way, so there’s no performance hit, and the savings in complexity seem quite worthwhile. So you’d have […]

Unfortunately this gives error messages in the log when Exec[‘prereq’] fails.

#24 Updated by eric sorenson almost 3 years ago

micah :

What is it about #3 that is problematic?

Because you need to apt-get update periodically for ensure => latest to work.

It seems to me that if an apt-get update were to be done for any of those three conditions, the apt-get update would be scheduled once for that puppet run, not once for each package that is installed. In other words, it seems sufficient to have 5 different package resources, each satisfying one of the three conditions you outlined above and each notifying an apt-get update resource that would then run one time before those operations.

OK, a maximum of 1 update per puppet run, at the beginning of the run, is exactly what Nigel’s two lines in note 6 accomplish. That seems OK to me, but other comments here said it was too frequent.

#25 Updated by eric sorenson almost 3 years ago

Kjetil Torgrim Homme wrote:

True, that seems more intuitive given the require’d ordering.

This code works as intended:

Notice: /Stage[main]//Exec[prereq]/returns: executed successfully
Notice: /Stage[main]//Exec[B]: Triggered 'refresh' from 1 events
Notice: /Stage[main]//Exec[A]: Triggered 'refresh' from 1 events
Notice: Finished catalog run in 0.32 seconds

Unfortunately this gives error messages in the log when Exec[‘prereq’] fails.

That’s true. logoutput => false is a little better but yeah, that resource is failing and is logged as such.
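
For completeness, that would look roughly like this on the prereq exec from note 20 (sketch only; logoutput suppresses the command’s output, but the failed resource is still reported):

  exec { "prereq":
      command   => "formerly A.onlyif",
      notify    => Exec["B"],
      # hide the command's stdout/stderr in the report; the failure event
      # for this resource is still logged
      logoutput => false,
  }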

#26 Updated by micah - almost 3 years ago

eric sorenson wrote:

micah :

What is it about #3 that is problematic?

Because you need to apt-get update periodically for ensure => latest to work.

It is sufficient to do one apt-get update during a puppet run if any package resource requires that it be done. If something changes in the repository during a puppet run, we should not have to worry about that; the next run can handle it.

It seems to me that if an apt-get update were to be done for any of those three conditions, the apt-get update would be scheduled once for that puppet run, not once for each package that is installed. In other words, it seems sufficient to have 5 different package resources, each satisfying one of the three conditions you outlined above and each notifying an apt-get update resource that would then run one time before those operations.

OK, a maximum of 1 update per puppet run, at the beginning of the run, is exactly what Nigel’s two lines in note 6 accomplish. That seems OK to me, but other comments here said it was too frequent.

I think that one update per puppet run is fine, but only if it is necessary based on the package resources. I’m not sure about it happening at the beginning of the run, because I would think it better for the update to happen only when it is actually needed, and that requires inspecting the package resources first.

For clarity’s sake, let’s describe an example. Let’s pretend we have the following package resources:

package { 'foo': ensure => '2.6.5-2' }
package { 'bar': ensure => latest }
package { 'baz': ensure => '1.2.2-6' }

OK, now puppet runs: it has a look at the system and finds that ‘foo’ is installed and its version is ‘2.6.5-2’, so it doesn’t need to do anything for that package resource. Move along, nothing to see here.

When it looks at ‘bar’ it sees that it needs to be the latest version, so it schedules an apt-get update to make sure that it has the latest package state before doing anything.

Then it looks at package ‘baz’ and sees that the version installed on the system is ‘1.1-1’, so it says OK, I need to install a different version. Puppet can do one of two things here: a) try to determine whether 1.2.2-6 can be installed right now (this can be done by querying the package database); if so, go ahead with the install, and if not, check whether an apt-get update has been run and, if not, schedule one; or b) check whether an apt-get update has been run and, if not, schedule one no matter what, to make sure that we have the latest package state before doing anything.

So now that all package resources have been inspected, we have an apt-get update scheduled (perhaps twice, depending on what is done with package ‘baz’); the apt-get update then happens, and those package resources are processed again, this time aware that an apt-get update has happened.

#27 Updated by eric sorenson almost 3 years ago

  • Status changed from Needs Decision to Closed

OK, thanks Micah. I’m going to mark this bug closed, because:

  1. the original request for a generalized ‘requireonly’ or the evaluate A – pause A – evaluate B – resume A functionality is, I think everyone agreed, not a great idea
  2. the discussion morphed into making as few apt-get update runs as possible, and there’s no way to get the exact functionality you’re talking about in Update 26 in the current provider system without adding a ton of complexity. I think what you’re asking for is part of #2198, because we’d need to coalesce all the out-of-sync resources together and perform an action before attempting to bring any of them into sync.

(You can get close to this today — a predictable 1 apt-get update per run — without any additional complexity following the approach in note 6. As Micah says above, loading the mirrors should not be a huge concern.)
