The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #2773

buglet in provider/service/launchd.rb

Added by eric sorenson almost 5 years ago. Updated over 1 year ago.

Status:ClosedStart date:11/03/2009
Priority:HighDue date:
Assignee:Gary Larizza% Done:

0%

Category:service
Target version:2.7.10
Affected Puppet version:0.25.1 Branch:https://github.com/puppetlabs/puppet/pull/324
Keywords:osx launchd customer

We've Moved!

Ticket tracking is now hosted in JIRA: https://tickets.puppetlabs.com

This issue is currently not available for export. If you are experiencing the issue described below, please file a new ticket in JIRA. Once a new ticket has been created, please add a link to it that points back to this Redmine ticket.


Description

I added a persistent postfix plist to replace the run-on-demand one distributed with OSX. (Which causes our nagios mailqueue check to fail annoyingly because qmgr doesn’t run all the time). So upon copying the plist, process status should have been ‘enabled’ but ‘stopped’. But it wouldn’t start up by itself:

debug: Service[postfix](provider=launchd): Executing 'launchctl load /System/Library/LaunchDaemons/org.postfix.master.plist'
err: //mail::nullclient/Service[postfix]/ensure: change from stopped to running failed: Unable to start service: 
   org.postfix.master at path: /System/Library/LaunchDaemons/org.postfix.master.plist

I manually got it working with ‘launchctl load -w’ , but the conditional in the start() method was not being triggered. After resetting my system back to previous state and changing the check as below, I get the expected result:

debug: //mail::nullclient/Service[postfix]: Changing ensure
debug: //mail::nullclient/Service[postfix]: 1 change(s)
debug: Service[postfix](provider=launchd): Executing 'launchctl load -w /System/Library/LaunchDaemons/org.postfix.master.plist'
notice: //mail::nullclient/Service[postfix]/ensure: ensure changed 'stopped' to 'running'

So it seems ‘enabled’ (in puppet’s view of things) but ‘stopped’ may still need -w to overcome a ‘disabled’ key in launchd’s little mind. What do you think?

--- a/lib/puppet/provider/service/launchd.rb
+++ b/lib/puppet/provider/service/launchd.rb
@@ -163,7 +163,7 @@ Puppet::Type.type(:service).provide :launchd, :parent => :base do
         did_enable_job = false
         cmds = []
         cmds << :launchctl << :load
-        if self.enabled? == :false  # launchctl won't load disabled jobs
+        if self.enabled? == :false || self.status == :stopped # launchctl won't load disabled jobs
             cmds << "-w"
             did_enable_job = true
         end

postfix-launchd.plist (423 Bytes) eric sorenson, 11/03/2009 05:28 pm

History

#1 Updated by James Turnbull almost 5 years ago

  • Status changed from Unreviewed to Accepted
  • Target version set to 0.25.2

#2 Updated by Nigel Kersten almost 5 years ago

Can you post the contents of the plist too please eric?

Was this on 10.5 or 10.6 ?

#3 Updated by eric sorenson almost 5 years ago

Ack, sorry I should have included this straight off. Attached.

It’s on 10.6, I’ll try to repro on 10.5 as well.

#4 Updated by Nigel Kersten almost 5 years ago

That would be awesome Eric. There are some differences in 10.5 and 10.6 with launchd that mean this might only be a 10.6 bug.

#5 Updated by eric sorenson almost 5 years ago

Leopard appears to behave correctly. Same plist, same manifest, no code changes, produced this on first run:

debug: Puppet::Type::Service::ProviderLaunchd: Executing '/bin/launchctl list'
notice: //mail::nullclient/Service[postfix]: Triggering 'refresh' from 1 dependencies
debug: Puppet::Type::Service::ProviderLaunchd: Executing '/bin/launchctl list'
debug: Service[postfix](provider=launchd): Executing 'launchctl unload -w /System/Library/LaunchDaemons/org.postfix.master'
debug: Service[postfix](provider=launchd): Executing 'launchctl load /System/Library/LaunchDaemons/org.postfix.master'
debug: Finishing transaction 14181260 with 1 changes
debug: Storing state

qmgr and pickup are running as expected at the end of it.

#6 Updated by Nigel Kersten almost 5 years ago

(sent via email to Eric while projects.reductivelabs was down)

There were a couple of things that struck me about your plist.

a) you’re replacing a system plist, this is a bad idea due to OS updates potentially stomping over it. Better to leave the system one disabled and make a new one. b) you should use your own reverse domain notation for the label so it’s a separate job to the system one. c) It should go into /Library/LaunchDaemons, not /System/Library/LaunchDaemons

Even so… I can’t seem to replicate this, although I messed up while testing and deleted the original postfix plist, so I need to go do some digging to retrieve it.

Did you possibly have the system version already running and then dropped your plist on top of it? That’s about the only scenario I can think of that might produce this behaviour that I haven’t tested.

The provider should cope with the actual job plist changing, but I can see there might be problems with certain kinds of changes…

Eric is hoping to get to this next week, and this weekend I’ll run through some more tests with the various scenarios.

#7 Updated by Luke Kanies almost 5 years ago

Any progress on this? Sounds like you guys haven’t quite decided on the ultimate answer.

#8 Updated by Nigel Kersten almost 5 years ago

  • Target version changed from 0.25.2 to 4

I can’t reproduce it, but am not yet sure that there isn’t an issue somewhere.

I’ve switched the target version to unplanned until I get more info back or can reproduce the issue, and will make a target decision then.

#9 Updated by Nigel Kersten almost 5 years ago

To clarify, I feel like there might be a cluster of issues around replacing an existing plist, where the original job is loaded and has completely different characteristics to the new plist, but I haven’t had the time to go through all the possible permutations.

We may be able to do some fiddling around with getting launchctl info for a particular job (for 10.5 and higher) before we decide how to apply desired state, but it’s going to be rather complicated, so I’d like to find the common cases first.

#10 Updated by eric sorenson almost 5 years ago

I think Nigel’s update in #4 is right in that I am doing enough questionable things to make this an edge-case. His point about not replacing a system plist is a fair one, but rather disabling it and adding our own in the non-OS-controlled /Library/LaunchDaemons area is a good one.

I’d be OK with closing this as unable-to-repro or PEBCAK or whatever.

–=Eric

#11 Updated by eric sorenson almost 5 years ago

eric sorenson wrote:

I think Nigel’s update in Comment 4 is right in that I am doing enough questionable things to make this an edge-case.

I’ve spent some more time with this (and just clobbered my own update to the ticket! argh) and now I do think we ought to be more aggressive about using ‘launchctl load -w’ because, at least on snow leopard, my newly-added non-conflicting plists are not getting loaded otherwise. Using ‘-w’ doesn’t hurt anything if it’s already enabled AFAIK.

debug: Puppet::Type::Service::ProviderLaunchd: Executing '/bin/launchctl list'
debug: //puppet-client::service/Service[puppetd]: Changing ensure
debug: //puppet-client::service/Service[puppetd]: 1 change(s)
debug: Service[puppetd](provider=launchd): Executing 'launchctl load /Library/LaunchDaemons/com.reductivelabs.puppet.plist'
err: //puppet-client::service/Service[puppetd]/ensure: change from stopped to running failed: Unable to start service:
 com.reductivelabs.puppet at path: /Library/LaunchDaemons/com.reductivelabs.puppet.plist

Here’s the shell equivalents and the plist:

[root@leterel /usr/local/sbin]# launchctl load /Library/LaunchDaemons/com.reductivelabs.puppet.plist 
nothing found to load
[root@leterel /usr/local/sbin]# echo $?
1
[root@leterel /usr/local/sbin]# launchctl load -w /Library/LaunchDaemons/com.reductivelabs.puppet.plist
[root@leterel /usr/local/sbin]# cat /Library/LaunchDaemons/com.reductivelabs.puppet.plist 




    EnvironmentVariables
    
        PATH
        /usr/local/bin:/usr/local/sbin:/bin:/usr/bin:/sbin:/usr/sbin
    
    Label
    com.reductivelabs.puppet
    OnDemand
    
    ProgramArguments
    
        puppetd
        --verbose
        --logdest=/var/puppet/log/puppetd.log
        --no-daemonize
                --color=false
    
    RunAtLoad
    
    StandardErrorPath
    /var/puppet/log/puppetd.err
    StandardOutPath
    /var/puppet/log/puppetd.out



#12 Updated by Nigel Kersten almost 5 years ago

I’m completely swamped at work at the moment Eric, and I’d be overjoyed to take a patch from you on this with tests :)

#13 Updated by Nigel Kersten over 4 years ago

Eric, you sent a patch to fix this didn’t you?

#14 Updated by eric sorenson over 4 years ago

I’ve just been using the one-line patch inline at the problem description with no ill effects – I haven’t pushed a real git commit.

#15 Updated by Gary Larizza over 4 years ago

Eric,

Have you seen any ill-effects out of the change in launchd.rb? I’m seeing this pop up and am wondering if you’ve had good luck with that fix.

#16 Updated by eric sorenson over 4 years ago

Gary Larizza wrote:

Eric,

Have you seen any ill-effects out of the change in launchd.rb? I’m seeing this pop up and am wondering if you’ve had good luck with that fix.

Nope it’s been fine.

#17 Updated by James Turnbull over 3 years ago

  • Status changed from Accepted to Needs Decision

#18 Updated by Nigel Kersten about 3 years ago

  • Status changed from Needs Decision to Tests Insufficient
  • Priority changed from Normal to High
  • Target version changed from 4 to 2.7.x

It looks to me like we need some tests around this?

#19 Updated by James Turnbull almost 3 years ago

  • Assignee changed from Nigel Kersten to Gary Larizza

Gary – any chance you can look at the tests for this in your copious spare time? :)

#20 Updated by Gary Larizza almost 3 years ago

Sure – thanks for the reminder on it :) I just submitted a bunch of tests to launchd, and I guess I should probably work on it while it’s fresh in my head :)

#21 Updated by Gary Larizza almost 3 years ago

  • Branch set to https://github.com/puppetlabs/puppet/pull/324

I updated the code and created spec tests to account for this. Eric, take a look and see if it meets the condition you’re encountering.

#22 Updated by Anonymous almost 3 years ago

  • Status changed from Tests Insufficient to Merged - Pending Release
  • Target version changed from 2.7.x to 2.7.10

#23 Updated by Michael Stahnke almost 3 years ago

  • Status changed from Merged - Pending Release to Closed

released in 2.7.10rc1

#24 Updated by Charlie Sharpsteen over 1 year ago

  • Keywords changed from osx launchd to osx launchd customer

Also available in: Atom PDF