Bug #11372
Race condition between certificate request and signing
| Status: | Investigating | Start date: | 12/13/2011 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | | % Done: | 0% |
| Category: | cloudpack | Spent time: | - |
| Target version: | - | | |
| Keywords: | aws, cloudpack, certificate, signing | Branch: | |
Description
After launching an instance through the cloud provisioner, it completes nearly all the steps (installs Puppet and some other stuff since we have a custom script) until the signing happens, which results in the following error:
err: Signing certificate ... Failed
err: Signing certificate error: Could not render to pson: private method `gsub' called for nil:NilClass
err: exit
err: Try 'puppet help node_aws bootstrap' for usage
The certificate request is still pending on the puppet master. If I leave it pending and launch another instance with the same name, it signs successfully. This appears to be caused by a delay before the request shows up on the puppet master: the cloud provisioner tries to sign the certificate before the request is available.
As a workaround, I edited the start_puppet function in the cloudpack script to sleep for 10 seconds after the startup script runs. This probably isn’t the most elegant solution. Perhaps cloudpack could check whether the certificate request is available and, if it is not, delay and check again before exiting with a friendly error?
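The check-and-retry idea suggested above could look roughly like the following sketch. It is not cloudpack's code: `wait_until` is a hypothetical helper, and the block passed to it stands in for whatever check the provisioner would use to see whether the certificate request has reached the master. The timeout and interval values are arbitrary.

```ruby
# Hypothetical helper, not part of cloudpack: poll a block until it
# returns a truthy value or the timeout (in seconds) expires.
# Returns true on success, false on timeout.
def wait_until(timeout = 60, interval = 2)
  deadline = Time.now + timeout
  loop do
    return true if yield
    return false if Time.now >= deadline
    sleep interval
  end
end

# Example with a stand-in check that succeeds on the third attempt.
# A real check would ask the master whether the request is pending.
attempts = 0
found = wait_until(5, 0.01) do
  attempts += 1
  attempts >= 3
end
puts "request found: #{found} after #{attempts} attempts"
```

Compared with a fixed sleep, polling returns as soon as the request appears and still allows a clear error message when the timeout is reached.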
History
#1
Updated by Daniel Pittman over 1 year ago
- Description updated (diff)
- Status changed from Unreviewed to Investigating
- Assignee set to Jeff McCune
Jeff, can you take a look at this, please?
#2
Updated by Jeff McCune over 1 year ago
Ben,
Could you let me know the command(s) you’re using to launch the instance and sign the certificate? This will help me reproduce your problem quickly.
Thanks, -Jeff
#3
Updated by Ben Whaley over 1 year ago
Sorry Jeff, I missed this update.
I think you’ll have a hard time reproducing this. It’s intermittent and could possibly occur due to latency within AWS. However, per your request, this command caused the issue today when using an unpatched deployment of cloudpack:
puppet node_aws bootstrap
--credential
#4
Updated by Jonathan Spinks about 1 year ago
Just want to add a +1 to this error. We are reliably seeing it occur when provisioning hosts in an AWS VPC in Singapore.
Adding a sleep in /opt/puppet/lib/site_ruby/1.8/puppet/cloudpack.rb before signing the certificate seems to have fixed the problem.
I’ll report back if it reappears.
#5
Updated by John Painter 12 months ago
+1 as detailed above by Jon. We have hit this issue again while provisioning new environments in AWS Singapore.
#6
Updated by Colin van Niekerk 12 months ago
+1 please
#7
Updated by Mark Stanislav 11 months ago
Running into this as well.
#8
Updated by Jeff Blaine 6 months ago
I’ve hit the same problem in us-east. Adding a ‘sleep 20’ to the cloudpack.rb file only bought me a new error.
sudo RUBYLIB=/etc/puppet/modules/cloud_provisioner/lib puppet node_aws bootstrap \
--group=hadoop-nodes \
--keyname=jblaine \
--image=ami-3d4ff254 \
--type=t1.micro \
--puppet-version=2.7.20 \
--login=ubuntu \
--keyfile=/home/jblaine/.ssh/jblaine-bld.pem \
--server=<REDACTED> \
--verbose \
--debug
...
debug: Puppet installation finished!
debug: SSH Command Exit Code: 0
info: Executing remote command ... Done
info: Executing remote command ...
debug: Command: sudo puppet agent --configprint certname
debug: ec2-...blahblah
debug: SSH Command Exit Code: 0
info: Executing remote command ... Done
notice: Puppet is now installed on: ec2-...blahblah
notice: No classification method selected
[ SLEEPING 20 ]
notice: Signing certificate ...
...
warning: Signing certificate ... Failed
err: Signing certificate error: Error 400 on SERVER: Cannot sign for host ec2-...blahblah without a certificate request
And I am using ‘autosign = true’ on the master, BTW, to keep things simple right now.
#9
Updated by Jeff Blaine 6 months ago
It would appear there is some double-signing attempt happening? My Puppet run on the newly bootstrapped node continues to work fine (resources installed, etc) even with any of the errors above.
If I comment out the whole block below from cloudpack.rb, I get no error (of course), and everything continues to work fine, indicating that the certs are all fine and the master is pleased.
# Puppet.notice "Sleeping 20 ..."
# sleep 20
#
# Puppet.notice "Signing certificate ..."
# begin
#   Puppet::Face[:certificate, '0.0.1'].sign(certname, cert_options)
#   Puppet.notice "Signing certificate ... Done"
# rescue Puppet::Error => e
#   # TODO: Write useful next steps.
#   Puppet.err "Signing certificate ... Failed"
#   Puppet.err "Signing certificate error: #{e}"
#   exit(1)
# rescue Net::HTTPError => e
#   # TODO: Write useful next steps
#   Puppet.warning "Signing certificate ... Failed"
#   Puppet.err "Signing certificate error: #{e}"
#   exit(1)
# end
The bootstrap:
jblaine@master$ sudo RUBYLIB=/etc/puppet/modules/cloud_provisioner/lib puppet node_aws bootstrap \
--group=hadoop-nodes \
--keyname=jblaine \
--image=ami-3d4ff254 \
--type=t1.micro \
--puppet-version=2.7.20 \
--login=ubuntu \
--keyfile=/home/jblaine/.ssh/jblaine-bld.pem \
--server=<REDACTED> \
--verbose \
--debug
...
debug: Puppet installation finished!
debug: SSH Command Exit Code: 0
info: Executing remote command ... Done
notice: Puppet is now installed on: ec2-23-20-205-245.compute-1.amazonaws.com
notice: No classification method selected
jblaine@master$
The run on the agent node, for kicks:
ubuntu@agentnode$ sudo puppet agent --onetime --no-daemonize --verbose --debug
[ bunch of resources that need no action because they are already done ]
[ java already installed by bootstrap run, hadoop as well, etc. ]
[ no cert issues ]
#10
Updated by Jeff Blaine 6 months ago
Confirmed this a few times. At least in my case, cloudpack is trying to get my node’s certificate signed after it has already been signed and the CSR has been cleared out from the master.
When I bootstrap a new node with that whole signing block commented out of cloudpack.rb, a new cert shows up on the master under /var/lib/puppet/ssl/ca/signed. So something is handling it already.
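One way to avoid the redundant signing attempt described here would be to skip it when the master already holds a signed certificate for the node. The sketch below is only an illustration: it assumes the signed certificate is stored as `<certname>.pem` under the `ca/signed` directory mentioned above, `already_signed?` is a hypothetical helper rather than an existing cloudpack or Puppet method, and the hostname is a placeholder.

```ruby
# Hypothetical helper: true when a signed certificate file for certname
# exists in signed_dir. Assumes the "<certname>.pem" naming convention.
def already_signed?(certname, signed_dir = '/var/lib/puppet/ssl/ca/signed')
  File.exist?(File.join(signed_dir, "#{certname}.pem"))
end

certname = 'agent.example.com' # placeholder name, not from the report
if already_signed?(certname)
  puts "Certificate for #{certname} is already signed; skipping sign step"
else
  puts "No signed certificate for #{certname} yet; signing would proceed here"
end
```

A check like this only works when the bootstrap command runs on the master itself, since it reads the CA directory from the local filesystem.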