Bug #12082

systemu returns nil in some cases which causes run() to raise an exception

Added by David Gordon 4 months ago. Updated 3 months ago.

Status:Closed Start date:01/23/2012
Priority:Normal Due date:
Assignee:R.I. Pienaar % Done:

0%

Category:-
Target version:1.3.x
Keywords: Affected mCollective version:
Branch:
Votes: 0

Description

shell.rb has a def, runcommand which uses systemu to run a shell command. The systemu utility seems to do a fair amount of setup and eventually run the command, and also print the child PID to the pipe, which gets fed back to shell.rb.

The def launches a second thread to sleep for 1 sec and then wait on the child PID (as supplied by systemu). I believe that sometimes the systemu hasn’t finished setting up the child process by that time, the wait returns, then the def returns without valid status.

Essentially I get the following errors in the client when running commands that take a little while to start (WAN factor):

"undefined method `exitstatus' for nil:NilClass {hostname}                            : OK"

If I change the sleep to e.g. 30 seconds, it seems to work more reliably.

MCollective version is 1.3.2

History

Updated by R.I. Pienaar 4 months ago

  • Status changed from Unreviewed to Accepted
  • Assignee set to R.I. Pienaar
  • Target version set to 1.3.x

Updated by R.I. Pienaar 4 months ago

  • Status changed from Accepted to Needs More Information

I looked at this now in some more detail than when I first put in the sleep 1 and still cannot really figure out why that sleep actually helps matters.

Looking at systemu source I can see that at times it would return nil as exitstatus which would explain your behavior. But I cannot see how sleeping really helps here as in systemu it seems there is no way the thread with the waitpid will be created while the setup is still ongoing – it’s going to happen after.

But if anything goes wrong with setting up the child process the block does not get called, @status never get set and we have a nil that #exitstatus will be called on.

Maybe you had a different insight into the systemu code – its pretty complex code – can you clarify why you think its specifically the setup thats slow and causing it?

Does setting it to 30 seconds solve it or make it more stable? I want to add some debugging code to see if we can narrow it down or not but it seems we just have to find a way to gracefully handle nil return code from systemu anyway

Updated by David Gordon 3 months ago

Hi,

I’m new to Ruby, so had a bit of difficulty really grasping what was going on in systemu. It did seem to me that given the pipes, flushing, there was scope for there to be a little time between the shell cmd setup and the pid being filtered back to the calling process..

I was writing a puppeta plugin (to support puppet runs performed on many clients as puppet agents, rather than as puppetd. Puppetd itself didn’t seem to support the args I needed, like alternative var dirs, specific puppet environments). Our environment was a little odd at the time, in that the puppetmaster is in London, as are our dev servers but our app’s puppet modules were on a home directory on a Zürich-based filer. As a result, the puppet runs would take a little time to get going – most of the ruby stuff was in London, but the puppet agent running in London needed longer to load the app’s manifests & modules from the Zürich filer. Other, simpler MCollective agents running shell commands worked just fine, and when I saw that sleep, warning bells went off. Changing from 1 sec to 30 made the difference between it working 10-20% of the time to 100% of the time. I can’t say why unfortunately.

As you say, I think graceful handling of a nil from systemu is required, but at the same time, I don’t see why it should return nil.

Hope this helps a little

Dave


From: tickets@puppetlabs.com [mailto:tickets@puppetlabs.com] 
Sent: 01 February 2012 20:10
To: rip@devco.net; james@lovedthanlost.net; nan@puppetlabs.com; Gordon, David
Subject: [MCollective - Bug #12082] (Needs More Information) shell command synchronization problem in RPC::Agent?


Issue #12082 has been updated by R.I. Pienaar. 

*   Status changed from Accepted to Needs More Information 

I looked at this now in some more detail than when I first put in the sleep 1 and still cannot really figure out why that sleep actually helps matters.

Looking at systemu source I can see that at times it would return nil as exitstatus which would explain your behavior. But I cannot see how sleeping really helps here as in systemu it seems there is no way the thread with the waitpid will be created while the setup is still ongoing - it's going to happen after.

But if anything goes wrong with setting up the child process the block does not get called, @status never get set and we have a nil that #exitstatus will be called on.

Maybe you had a different insight into the systemu code - its pretty complex code - can you clarify why you think its specifically the setup thats slow and causing it?

Does setting it to 30 seconds solve it or make it more stable? I want to add some debugging code to see if we can narrow it down or not but it seems we just have to find a way to gracefully handle nil return code from systemu anyway

Bug #12082: shell command synchronization problem in RPC::Agent? <https://projects.puppetlabs.com/issues/12082> 


*   Author: David Gordon 
*   Status: Needs More Information 
*   Priority: Normal 
*   Assignee: R.I. Pienaar 
*   Category: 
*   Target version: 1.3.x 
*   Keywords: 
*   Branch: 
*   Affected mCollective version: 

shell.rb has a def, runcommand which uses systemu to run a shell command. The systemu utility seems to do a fair amount of setup and eventually run the command, and also print the child PID to the pipe, which gets fed back to shell.rb.

The def launches a second thread to sleep for 1 sec and then wait on the child PID (as supplied by systemu). I believe that sometimes the systemu hasn't finished setting up the child process by that time, the wait returns, then the def returns without valid status.

Essentially I get the following errors in the client when running commands that take a little while to start (WAN factor):

"undefined method `exitstatus' for nil:NilClass {hostname}                            : OK"


If I change the sleep to e.g. 30 seconds, it seems to work more reliably.

MCollective version is 1.3.2

You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here: http://projects.puppetlabs.com/my/account

Updated by Anonymous 3 months ago

Hi David,

Would it be possible for you to give us the any stack traces in your /var/log/mcollective.log?

Updated by David Gordon 3 months ago

Hi,

I augmented shell.rb with a couple of log statements, resulting in:

    # Actually does the systemu call passing in the correct environment,
stdout and stderr
    def runcommand
      opts = {"env"    => @environment,
        "stdout" => @stdout,
        "stderr" => @stderr,
        "cwd"    => @cwd}
 
      opts["stdin"] = @stdin if @stdin
 
      # Running waitpid on the cid here will start a thread
      # with the waitpid in it, this way even if the thread
      # that started this process gets killed due to agent
      # timeout or such there will still be a waitpid waiting
      # for the child to exit and not leave zombies.
Log.info("running #@command")      
      @status = systemu(@command, opts) do |cid|
        begin
         pre_cid = cid
         sleep 1
         Process::waitpid(cid)
Log.info("waited on #{cid} #{pre_cid}")      
        rescue SystemExit
Log.info("SystemExit..")      
        rescue Errno::ECHILD
Log.info("Errno..ECHILD")      
        rescue Exception => e
          Log.info("Unexpected exception received while waiting for
child process: #{e.class}: #{e}")
        end
Log.info("returned2 #@status")      
      end
 

..gave the following output:

fxcomsd@xldn1290vdap ~/mcollective-1.3.2$  mc-puppeta -I xldn1290vdap
xldn1290vdap                            : undefined method `exitstatus'
for nil:NilClass
    undefined method `exitstatus' for nil:NilClass
/home/fxcomsd/git/mcollective/scripts/mc-puppeta:12:in `block in
': undefined method `[]' for nil:NilClass (NoMethodError) from /home/fxcomsd/git/mcollective/scripts/mc-puppeta:10:in `each' from /home/fxcomsd/git/mcollective/scripts/mc-puppeta:10:in `
'

..and the following log:

 
# Logfile created on 2012-02-06 10:56:05 +0000 by logger.rb/25413
W, [2012-02-06T10:56:05.074198 #5994]  WARN -- : config.rb:119:in `block
in loadconfig' Cannot find libdir:
D, [2012-02-06T10:56:05.081009 #5994] DEBUG -- : pluginmanager.rb:107:in
`loadclass' Loading Mcollective::Facts::Yaml_facts from
mcollective/facts/yaml_facts.rb
D, [2012-02-06T10:56:05.226961 #5994] DEBUG -- : pluginmanager.rb:44:in
`<<' Registering plugin facts_plugin with class
MCollective::Facts::Yaml_facts single_instance: true
D, [2012-02-06T10:56:05.227118 #5994] DEBUG -- : pluginmanager.rb:107:in
`loadclass' Loading Mcollective::Connector::Stomp from
mcollective/connector/stomp.rb
D, [2012-02-06T10:56:05.250944 #5994] DEBUG -- : pluginmanager.rb:44:in
`<<' Registering plugin connector_plugin with class
MCollective::Connector::Stomp single_instance: true
D, [2012-02-06T10:56:05.251195 #5994] DEBUG -- : pluginmanager.rb:107:in
`loadclass' Loading Mcollective::Security::Psk from
mcollective/security/psk.rb
D, [2012-02-06T10:56:05.273737 #5994] DEBUG -- : pluginmanager.rb:44:in
`<<' Registering plugin security_plugin with class
MCollective::Security::Psk single_instance: true
D, [2012-02-06T10:56:05.278802 #5994] DEBUG -- : pluginmanager.rb:107:in
`loadclass' Loading Mcollective::Registration::Agentlist from
mcollective/registration/agentlist.rb
D, [2012-02-06T10:56:05.302985 #5994] DEBUG -- : pluginmanager.rb:44:in
`<<' Registering plugin registration_plugin with class
MCollective::Registration::Agentlist single_instance: true
D, [2012-02-06T10:56:05.311563 #5994] DEBUG -- : pluginmanager.rb:47:in
`<<' Registering plugin global_stats with class MCollective::RunnerStats
single_instance: true
I, [2012-02-06T10:56:05.311629 #5994]  INFO -- : mcollectived.rb:31:in
`
' The Marionette Collective 1.3.2 started logging at debug level D, [2012-02-06T10:56:05.311679 #5994] DEBUG -- : mcollectived.rb:43:in `
' Starting in the background (true) D, [2012-02-06T10:56:05.336140 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin global_stats with class MCollective::RunnerStats D, [2012-02-06T10:56:05.336433 #6002] DEBUG -- : pluginmanager.rb:80:in `[]' Returning new plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:05.336786 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin global_stats with class MCollective::RunnerStats D, [2012-02-06T10:56:05.336938 #6002] DEBUG -- : pluginmanager.rb:80:in `[]' Returning new plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.337186 #6002] DEBUG -- : stomp.rb:123:in `connect' Connecting to xldn1290vdap:6163 D, [2012-02-06T10:56:05.368970 #6002] DEBUG -- : agents.rb:26:in `loadagents' Reloading all agents from disk D, [2012-02-06T10:56:05.373625 #6002] DEBUG -- : agents.rb:99:in `block in findagentfile' Found deja at /home/fxcomsd/git/mcollective/plugins/mcollective/agent/deja.rb D, [2012-02-06T10:56:05.373752 #6002] DEBUG -- : pluginmanager.rb:107:in `loadclass' Loading MCollective::Agent::Deja from mcollective/agent/deja.rb D, [2012-02-06T10:56:05.429231 #6002] DEBUG -- : agent.rb:147:in `activate?' Starting default activation checks for deja D, [2012-02-06T10:56:05.429382 #6002] DEBUG -- : pluginmanager.rb:44:in `<<' Registering plugin deja_agent with class MCollective::Agent::Deja single_instance: false D, [2012-02-06T10:56:05.435953 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.436076 #6002] DEBUG -- : stomp.rb:201:in `subscribe' Subscribing to /topic/mcollective.deja.command D, [2012-02-06T10:56:05.437005 #6002] DEBUG -- : agents.rb:99:in `block in findagentfile' Found puppeta at /home/fxcomsd/git/mcollective/plugins/mcollective/agent/puppeta.rb D, [2012-02-06T10:56:05.437126 #6002] DEBUG -- : pluginmanager.rb:107:in `loadclass' Loading MCollective::Agent::Puppeta from mcollective/agent/puppeta.rb D, [2012-02-06T10:56:05.446353 #6002] DEBUG -- : agent.rb:147:in `activate?' Starting default activation checks for puppeta D, [2012-02-06T10:56:05.446510 #6002] DEBUG -- : pluginmanager.rb:44:in `<<' Registering plugin puppeta_agent with class MCollective::Agent::Puppeta single_instance: false D, [2012-02-06T10:56:05.446609 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.446715 #6002] DEBUG -- : stomp.rb:201:in `subscribe' Subscribing to /topic/mcollective.puppeta.command D, [2012-02-06T10:56:05.454751 #6002] DEBUG -- : agents.rb:99:in `block in findagentfile' Found rpcutil at /home/fxcomsd/mcollective-1.3.2/plugins/mcollective/agent/rpcutil.rb D, [2012-02-06T10:56:05.454876 #6002] DEBUG -- : pluginmanager.rb:107:in `loadclass' Loading MCollective::Agent::Rpcutil from mcollective/agent/rpcutil.rb D, [2012-02-06T10:56:05.458178 #6002] DEBUG -- : agent.rb:147:in `activate?' Starting default activation checks for rpcutil D, [2012-02-06T10:56:05.458284 #6002] DEBUG -- : pluginmanager.rb:44:in `<<' Registering plugin rpcutil_agent with class MCollective::Agent::Rpcutil single_instance: false D, [2012-02-06T10:56:05.458359 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.458438 #6002] DEBUG -- : stomp.rb:201:in `subscribe' Subscribing to /topic/mcollective.rpcutil.command D, [2012-02-06T10:56:05.460593 #6002] DEBUG -- : agents.rb:99:in `block in findagentfile' Found discovery at /home/fxcomsd/mcollective-1.3.2/plugins/mcollective/agent/discovery.rb D, [2012-02-06T10:56:05.460684 #6002] DEBUG -- : pluginmanager.rb:107:in `loadclass' Loading MCollective::Agent::Discovery from mcollective/agent/discovery.rb D, [2012-02-06T10:56:05.463579 #6002] DEBUG -- : agents.rb:86:in `activate_agent?' MCollective::Agent::Discovery does not have an activate? method, activating as default D, [2012-02-06T10:56:05.463671 #6002] DEBUG -- : pluginmanager.rb:44:in `<<' Registering plugin discovery_agent with class MCollective::Agent::Discovery single_instance: true D, [2012-02-06T10:56:05.463756 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.463844 #6002] DEBUG -- : stomp.rb:201:in `subscribe' Subscribing to /topic/mcollective.discovery.command D, [2012-02-06T10:56:05.465437 #6002] DEBUG -- : agents.rb:99:in `block in findagentfile' Found shellcmd at /home/fxcomsd/mcollective-1.3.2/plugins/mcollective/agent/shellcmd.rb D, [2012-02-06T10:56:05.465522 #6002] DEBUG -- : pluginmanager.rb:107:in `loadclass' Loading MCollective::Agent::Shellcmd from mcollective/agent/shellcmd.rb D, [2012-02-06T10:56:05.468129 #6002] DEBUG -- : agent.rb:147:in `activate?' Starting default activation checks for shellcmd D, [2012-02-06T10:56:05.468215 #6002] DEBUG -- : pluginmanager.rb:44:in `<<' Registering plugin shellcmd_agent with class MCollective::Agent::Shellcmd single_instance: false D, [2012-02-06T10:56:05.468300 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.468377 #6002] DEBUG -- : stomp.rb:201:in `subscribe' Subscribing to /topic/mcollective.shellcmd.command D, [2012-02-06T10:56:05.468493 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:05.468561 #6002] DEBUG -- : stomp.rb:201:in `subscribe' Subscribing to /topic/mcollective.mcollective.command D, [2012-02-06T10:56:05.468693 #6002] DEBUG -- : stomp.rb:169:in `receive' Waiting for a message from Stomp D, [2012-02-06T10:56:11.450405 #6002] DEBUG -- : runnerstats.rb:49:in `received' Incrementing total stat D, [2012-02-06T10:56:11.450537 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:11.450762 #6002] DEBUG -- : runnerstats.rb:38:in `validated' Incrementing validated stat D, [2012-02-06T10:56:11.450834 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:11.450963 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:11.451077 #6002] DEBUG -- : base.rb:146:in `validate_filter?' Message passed the filter checks D, [2012-02-06T10:56:11.451137 #6002] DEBUG -- : runnerstats.rb:26:in `passed' Incrementing passed stat D, [2012-02-06T10:56:11.451200 #6002] DEBUG -- : runner.rb:87:in `agentmsg' Handling message for agent 'discovery' on collective 'mcollective' D, [2012-02-06T10:56:11.451258 #6002] DEBUG -- : agents.rb:130:in `dispatch' Dispatching a message to agent discovery D, [2012-02-06T10:56:11.451339 #6002] DEBUG -- : stomp.rb:169:in `receive' Waiting for a message from Stomp D, [2012-02-06T10:56:11.451445 #6002] DEBUG -- : pluginmanager.rb:80:in `[]' Returning new plugin discovery_agent with class MCollective::Agent::Discovery D, [2012-02-06T10:56:11.451663 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:11.451736 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:11.451828 #6002] DEBUG -- : base.rb:161:in `create_reply' Encoded a message for request 41b0782e37c96b70b51e76caea5d3a25 D, [2012-02-06T10:56:11.451960 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:11.452121 #6002] DEBUG -- : stomp.rb:190:in `publish' Sending a broadcast message to STOMP target '/topic/mcollective.discovery.reply' D, [2012-02-06T10:56:11.452733 #6002] DEBUG -- : runnerstats.rb:56:in `block in sent' Incrementing replies stat D, [2012-02-06T10:56:27.042116 #6002] DEBUG -- : runnerstats.rb:49:in `received' Incrementing total stat D, [2012-02-06T10:56:27.042263 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:27.042402 #6002] DEBUG -- : runnerstats.rb:38:in `validated' Incrementing validated stat D, [2012-02-06T10:56:27.042480 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:27.042581 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:27.042733 #6002] DEBUG -- : base.rb:110:in `block (2 levels) in validate_filter?' Passing based on agent puppeta D, [2012-02-06T10:56:27.042819 #6002] DEBUG -- : base.rb:135:in `block in validate_filter?' Passing based on identity D, [2012-02-06T10:56:27.042881 #6002] DEBUG -- : base.rb:146:in `validate_filter?' Message passed the filter checks D, [2012-02-06T10:56:27.042936 #6002] DEBUG -- : runnerstats.rb:26:in `passed' Incrementing passed stat D, [2012-02-06T10:56:27.042991 #6002] DEBUG -- : runner.rb:87:in `agentmsg' Handling message for agent 'puppeta' on collective 'mcollective' D, [2012-02-06T10:56:27.043044 #6002] DEBUG -- : agents.rb:130:in `dispatch' Dispatching a message to agent puppeta D, [2012-02-06T10:56:27.043109 #6002] DEBUG -- : stomp.rb:169:in `receive' Waiting for a message from Stomp D, [2012-02-06T10:56:27.043185 #6002] DEBUG -- : pluginmanager.rb:88:in `[]' Returning new plugin puppeta_agent with class MCollective::Agent::Puppeta I, [2012-02-06T10:56:27.043322 #6002] INFO -- : agent.rb:59:in `initialize' timeout=60 D, [2012-02-06T10:56:27.054505 #6002] DEBUG -- : ddl.rb:59:in `block in findddlfile' Found puppeta ddl at /home/fxcomsd/git/mcollective/plugins/mcollective/agent/puppeta.ddl I, [2012-02-06T10:56:27.093575 #6002] INFO -- : shell.rb:75:in `runcommand' running /usr/bin/puppet agent --vardir=/sbclocal/app/fxcoms/var --environment=fxcoms --server=puppetmaster --verbose --test --report --color=false I, [2012-02-06T10:56:28.460579 #6002] INFO -- : shell.rb:88:in `rescue in block in runcommand' Errno..ECHILD I, [2012-02-06T10:56:28.460863 #6002] INFO -- : shell.rb:92:in `block in runcommand' returned2 D, [2012-02-06T10:56:28.461559 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:28.461667 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:56:28.461814 #6002] DEBUG -- : base.rb:161:in `create_reply' Encoded a message for request 7027ef8df5da7bc4aadd7d1fd15a193f D, [2012-02-06T10:56:28.461963 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:56:28.462065 #6002] DEBUG -- : stomp.rb:190:in `publish' Sending a broadcast message to STOMP target '/topic/mcollective.puppeta.reply' D, [2012-02-06T10:56:28.462284 #6002] DEBUG -- : runnerstats.rb:56:in `block in sent' Incrementing replies stat D, [2012-02-06T10:57:26.847634 #6002] DEBUG -- : runnerstats.rb:49:in `received' Incrementing total stat D, [2012-02-06T10:57:26.847803 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:57:26.847991 #6002] DEBUG -- : runnerstats.rb:38:in `validated' Incrementing validated stat D, [2012-02-06T10:57:26.848079 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:57:26.848230 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:57:26.848404 #6002] DEBUG -- : base.rb:110:in `block (2 levels) in validate_filter?' Passing based on agent puppeta D, [2012-02-06T10:57:26.848505 #6002] DEBUG -- : base.rb:135:in `block in validate_filter?' Passing based on identity D, [2012-02-06T10:57:26.848571 #6002] DEBUG -- : base.rb:146:in `validate_filter?' Message passed the filter checks D, [2012-02-06T10:57:26.848630 #6002] DEBUG -- : runnerstats.rb:26:in `passed' Incrementing passed stat D, [2012-02-06T10:57:26.848715 #6002] DEBUG -- : runner.rb:87:in `agentmsg' Handling message for agent 'puppeta' on collective 'mcollective' D, [2012-02-06T10:57:26.848786 #6002] DEBUG -- : agents.rb:130:in `dispatch' Dispatching a message to agent puppeta D, [2012-02-06T10:57:26.848872 #6002] DEBUG -- : stomp.rb:169:in `receive' Waiting for a message from Stomp D, [2012-02-06T10:57:26.848991 #6002] DEBUG -- : pluginmanager.rb:88:in `[]' Returning new plugin puppeta_agent with class MCollective::Agent::Puppeta I, [2012-02-06T10:57:26.849151 #6002] INFO -- : agent.rb:59:in `initialize' timeout=60 D, [2012-02-06T10:57:26.849298 #6002] DEBUG -- : ddl.rb:59:in `block in findddlfile' Found puppeta ddl at /home/fxcomsd/git/mcollective/plugins/mcollective/agent/puppeta.ddl I, [2012-02-06T10:57:26.850575 #6002] INFO -- : shell.rb:75:in `runcommand' running /usr/bin/puppet agent --vardir=/sbclocal/app/fxcoms/var --environment=fxcoms --server=puppetmaster --verbose --test --report --color=false I, [2012-02-06T10:57:27.941580 #6002] INFO -- : shell.rb:84:in `block in runcommand' waited on 6250 6250 I, [2012-02-06T10:57:27.941955 #6002] INFO -- : shell.rb:92:in `block in runcommand' returned2 D, [2012-02-06T10:57:27.943477 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:57:27.943635 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Psk D, [2012-02-06T10:57:27.943895 #6002] DEBUG -- : base.rb:161:in `create_reply' Encoded a message for request 8d835a6714c04a9bf83082c2491faab5 D, [2012-02-06T10:57:27.944135 #6002] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp D, [2012-02-06T10:57:27.944273 #6002] DEBUG -- : stomp.rb:190:in `publish' Sending a broadcast message to STOMP target '/topic/mcollective.puppeta.reply' D, [2012-02-06T10:57:27.944740 #6002] DEBUG -- : runnerstats.rb:56:in `block in sent' Incrementing replies stat

mc-puppeta looks like:

#!/usr/bin/env ruby
require 'mcollective'
 
include MCollective::RPC
 
mc = rpcclient("puppeta")
mc.progress = false
 
exitcodes = 0
mc.runonce().each do |resp|
   printrpc(resp, :verbose => true)
   exitcodes = exitcodes | resp[:data][:exitstatus]
end
 
printrpcstats
mc.disconnect
 
if exitcodes == 2
  exit(0)
else
  exit(exitcodes)
end

The puppeta agent itself looks like:

module MCollective
  module Agent
    # An agent to manage the Puppet Daemon
    #
    # Configuration Options:
    #    puppetd.splaytime - Number of seconds within which to splay; no
splay
    #
    class Puppeta "Puppet Controller Agent",
                  :description => "Run puppet agent",
                  :author      => "....",
                  :license     => "....",
                  :version     => "1.0",
                  :url         => "...",
                  :timeout     => 60
 
       def startup_hook
         @puppet = @config.pluginconf["puppeta.puppet"] ||
"/usr/bin/puppet"
         @puppet_vardir = @config.pluginconf["puppeta.vardir"] ||
"/var/puppet"
         @puppet_environment = @config.pluginconf["puppeta.environment"]
|| "??"
         @puppet_server = @config.pluginconf["puppeta.server"] ||
"puppetmaster"
       end
 
      action "runonce" do
        vardir = "--vardir=#@puppet_vardir"
        environment = "--environment=#@puppet_environment"
        server = "--server=#@puppet_server"
 
        cmd = [@puppet, "agent", vardir, environment, server,
"--verbose", "--test", "--report", "--color=false"]
        cmd = cmd.join(" ")
        
        reply[:exitstatus] = run(cmd, :stdout => :out, :stderr =>
:error)
      end
    end
  end
end

Updated by R.I. Pienaar 3 months ago

  • Subject changed from shell command synchronization problem in RPC::Agent? to systemu returns nil in some cases which causes run() to raise an exception
  • Status changed from Needs More Information to Accepted

hello,

We cannot reproduce this, i have a few comments in general though:

  • the puppetd agent can take the arguments you mention, there really is no need for you to make a new agent, you can simply pass in the config option to change the puppetd command for the existing agent into “puppet agent” with your arguments
  • you are running puppet agent —test, you should not do this, this runs the puppet in the foreground attached to the mcollective process, this is not recommended and probably why you are getting strange results. The puppetd agent runs the puppet daemon dissociated from mcollective and you should do the same

We will though handle a nil there and just return -1 in the case where the command that gets run terminated unexpectedly. But we can’t figure out why increasing the sleep would do anything useful, I think your puppet agent command is dying or raising some exception that systemu just is not handling and so it returns nil which is expected behavior

Updated by R.I. Pienaar 3 months ago

  • Status changed from Accepted to Closed

We now handle failures from systemu and just return a status code of -1, also added test coverage for MC::RPC::Agent#run

http://git.io/8y9qOw 5 files changed, 101 insertions(+), 27 deletions(–)

Also available in: Atom PDF