The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #3910

Server is not authoritative over client environment when specified in an ENC

Added by Nigel Kersten almost 4 years ago. Updated about 1 year ago.

Status:ClosedStart date:
Priority:UrgentDue date:
Assignee:Patrick Carlisle% Done:

100%

Category:plumbing
Target version:3.0.0
Affected Puppet version:0.25.4 Branch:https://github.com/puppetlabs/puppet/pull/691
Keywords: customer

We've Moved!

Ticket tracking is now hosted in JIRA: https://tickets.puppetlabs.com

This issue is currently not available for export. If you are experiencing the issue described below, please file a new ticket in JIRA. Once a new ticket has been created, please add a link to it that points back to this Redmine ticket.


Description

See: http://groups.google.com/group/puppet-dev/browse_thread/thread/b609965e377392ec

To summarize, when the client specifies one environment and the classifier specifies another, classes are evaluated from the server-specified environment, and yet files are retrieved from the client-specified environment.

3 environments defined, each with a single class “base”. /etc/puppet/puppet.conf

<...snip...> 
[one] 
  modulepath = /etc/puppet/environments/one/modules 
[two] 
  modulepath = /etc/puppet/environments/two/modules 
[three] 
  modulepath = /etc/puppet/environments/three/modules 

/etc/puppet/environments/one/modules/base/manifests/init.pp

class base { 
  notify { "hardwired one": } 
  notify { "variable $environment": } 
  file { "/tmp/environment_test": 
    source => "puppet:///base/tester", 
  } 
} 

/etc/puppet/environments/two/modules/base/manifests/init.pp

class base { 
  notify { "hardwired two": } 
  notify { "variable $environment": } 
  file { "/tmp/environment_test": 
    source => "puppet:///base/tester", 
  } 
} 

/etc/puppet/environments/three/modules/base/manifests/init.pp

class base { 
  notify { "hardwired three": } 
  notify { "variable $environment": } 
  file { "/tmp/environment_test": 
    source => "puppet:///base/tester", 
  } 
} 
$ cat /etc/puppet/environments/{one,two,three}/modules/base/files/tester 
one 
two 
three 

Right? So we have two notify resources and a file resource. – The “hardwired” notify is to illustrate which class is being loaded. – The “variable” notify is to illustrate what $environment evaluates to in the manifests. – The file source is to illustrate which file is being sourced. I also have an external node classifier that always returns this:

--- 
classes: 
 - base 
environment: one 

So our classifier always includes base, and always sets the environment. I then invoke a puppet run on a client, specifying the environment to be different to the classifier. Between all of these runs I delete cached client yaml info on the server. (find /var/puppet/yaml -type f -delete)

# puppetd -t --environment two 
notice: hardwired one 
notice: //base/Notify[hardwired one]/message: defined 'message' as 
'hardwired one' 
notice: variable two 
notice: //base/Notify[variable two]/message: defined 'message' as 'variable 
two' 
notice: Finished catalog run in 0.18 seconds 
# cat /tmp/environment_test 
two 

So we have the class being evaluated in environment “one”, but the file being sourced coming from environment “two” ! And less importantly, $environment evaluates to “two”. * * Now, to throw the big spanner in the works…. we try not specifying an environment at all.

# puppetd -t 
notice: hardwired one 
notice: //base/Notify[hardwired one]/message: defined 'message' as 
'hardwired one' 
notice: variable production 
notice: //base/Notify[variable production]/message: defined 'message' as 
'variable production' 
err: //base/File[/tmp/environment_test]: Failed to retrieve current state of 
resource: Error 400 on SERVER: Not authorized to call find on 
/file_metadata/base/tester Could not retrieve file metadata for 
puppet:///base/tester: Error 400 on SERVER: Not authorized to call find on 
/file_metadata/base/tester at 
/etc/puppet/environments/one/modules/base/manifests/init.pp:6 
notice: Finished catalog run in 0.08 seconds 

As we don’t have an environment “production” defined at all, the server tries to read the metadata from a non-existent environment and fails.


Related issues

Related to Puppet - Bug #2748: config file takes priority over external_nodes in 0.25.x Duplicate 10/22/2009
Related to Puppet - Bug #2834: external node classifier should take client's idea of the... Closed 11/18/2009
Related to Puppet Documentation - Feature #7595: Document undesirable behavior of setting environment in a... Closed 05/19/2011
Related to Puppet - Bug #11268: puppet master --compile ignores environments. Accepted 12/07/2011
Related to Puppet - Bug #15189: Default permissions need to allow 'find' on 'node' Closed 06/23/2012
Related to Puppet - Bug #16275: $environment in manifests is wrong when an enc overrides ... Closed 09/06/2012
Related to Puppet - Bug #16789: server overridding client set environments Closed 10/04/2012
Related to Puppet - Bug #16753: Need the ability to list all nodes Closed 10/03/2012
Related to Puppet - Bug #16698: external node classifier script is not being called when ... Closed 10/02/2012
Related to Puppet - Bug #22925: Cached catalogs no longer work in PE 3.0.0 and above. Merged - Pending Release
Duplicated by Puppet - Bug #5973: environment configuration parameter used inconsistently w... Duplicate 01/22/2011
Duplicated by Puppet - Feature #11520: external_nodes should optionally be per environment Duplicate 12/20/2011

History

#1 Updated by James Turnbull almost 4 years ago

  • Category set to plumbing
  • Status changed from Unreviewed to Accepted
  • Assignee set to Markus Roberts
  • Priority changed from Normal to High
  • Target version set to 2.6.0

This is more than the normal counter-intuitive – this has the potential to break things nastily.

#2 Updated by Markus Roberts almost 4 years ago

  • Target version changed from 2.6.0 to 49

Fix will be wanted in the 0.25.x series as well.

#3 Updated by Nigel Kersten over 3 years ago

What’s the progress on this bug? Are we still committing to pushing this out in 0.25.x?

I care much more about it being fixed in 2.6.x than I do about 0.25.x assuming the latter is a more difficult patch to backport.

#4 Updated by Markus Roberts over 3 years ago

Nigel —

It’s part of a nest of related issues that are stalled on a design question: what is the authoritative source of the environment? I’ll take a look & see if there’s a better answer, but off the top of my head the problem is that the puppetmaster isn’t required to abide by the client’s environment selection but never informs the client of this. The client thus requests files in the environment it thinks it’s in (the environment is part of the request) and the puppetmaster (when fileserving) does abide by the client’s selection so…

— Markus

#5 Updated by Markus Roberts over 3 years ago

  • Status changed from Accepted to Needs Decision
  • Assignee deleted (Markus Roberts)
  • Target version changed from 49 to 2.7.x

#6 Updated by Alan Barrett over 3 years ago

If the client began by asking the server “what environment do you think I should be in?” (issue #2748) then I think it would solve this problem.

#7 Updated by Nigel Kersten over 3 years ago

  • Assignee set to Nigel Kersten

Some people want to be able to make the client authoritative over the environment, some people want the server.

I think this is something we need to allow people to choose behavior, but in my opinion the default should be for the server to be authoritative.

#8 Updated by Patrick Mohr over 3 years ago

I don’t actually use envirements. I’m just throwing out an idea.

What if you can set on the client what the environment is and one of the options is “auto” or “let_server_choose” or something similar. Then you make the the default on the client.

Would that be helpful?

#9 Updated by Nigel Kersten over 3 years ago

I think that if you want the server to be authoritative over the client environment, you just allow the client to not define an environment, and if it does and the server overrides it, then a warning message is printed.

If you want the client to be authoritative over the environment, then you just don’t set it in your external node classifier.

#10 Updated by Felix Frank over 3 years ago

How much compatibility is needed here? My gut says “a lot”.

I propose an option for puppet.conf that can configure the puppetmaster to behave as authoritative, non-authoritative, or legacy (a.k.a. broken), which should be the default.

However, when set to authoritative mode, old clients must be entertained, and I don’t suppose there is a way for the puppetmaster to tell them the environment they are to use. So when set to authoritative, the puppetmaster should detect old clients and switch their sessions to non-authoritative instead.

This way, existing setups aren’t broken, and users that choose to set the option get a coherent (although potentially confusing) result.

#11 Updated by Nigel Kersten over 3 years ago

I don’t understand what about the current behavior needs to be preserved. Why do you think we need a legacy mode Felix?

#12 Updated by Nigel Kersten over 3 years ago

  • Subject changed from Class/File source mismatch when client/node classifier disagree on facts. to Class/File source mismatch when client/node classifier disagree on environment.

#13 Updated by Nigel Kersten over 3 years ago

Let me flesh out the proposal.

We have a setting for whether the master is authoritative over the client environment.

Server authoritative:

  • Server sets environment and client sets environment, produce warning client/server side, use server env.
  • Server does not set environment, client does, use client env.
  • Server sets environment, client does not, use server env.

Server not authoritative:

  • Server sets environment and client sets environment, produce warning server-side, use client env.
  • Server does not set environment, client does, use client env.
  • Server sets environment, client does not, use server env.

This does all assume that the idea of a “default” environment goes away client-side. I don’t see that being a huge problem, as users can simply set their default environment on the server, and they can still choose whether or not the server is authoritative.

Am I missing some use cases ?

#14 Updated by John Warburton over 3 years ago

Case “Server not authoritative” is what I am looking for. We use an external node classifier which always sets the environment. I’d like the ability to occasionally to override the server from the client for testing/debugging

#15 Updated by R.I. Pienaar over 3 years ago

Nigel Kersten wrote:

Let me flesh out the proposal.

We have a setting for whether the master is authoritative over the client environment.

Server authoritative:

  • Server sets environment and client sets environment, produce warning client/server side, use server env.
  • Server does not set environment, client does, use client env.
  • Server sets environment, client does not, use server env.

Why do you say the 2nd here? We shouldn’t trust the client ever if the master is set authoritive even if there’s been human error – forgot to set it on the master – it should default to production if master authoritive that seems to be the sane combo and means the 2nd point above can go

Server not authoritative:

  • Server sets environment and client sets environment, produce warning server-side, use client env.
  • Server does not set environment, client does, use client env.
  • Server sets environment, client does not, use server env.

This does all assume that the idea of a “default” environment goes away client-side. I don’t see that being a huge problem, as users can simply set their default environment on the server, and they can still choose whether or not the server is authoritative.

Why would the client when set authoritive not default to production? If we ship with this as default there’s no surprises for users when they upgrade but the people who wants new behavior can enable it, no surprises to anyone.

#16 Updated by Nigel Kersten over 3 years ago

To bring up another point that is probably out of scope for this bug, but others may think differently.

The config file option “environment=foo” means completely different things in [master] and [agent].

In the latter case it means “set my environment to this”. In the former case it means “set the default node environment to this”.

The problem is when users set the value in [main], it flows down to the other blocks and means different things in both.

#17 Updated by Alan Barrett over 3 years ago

Nigel Kersten wrote:

We have a setting for whether the master is authoritative over the client environment.

Why do you need such a setting? It seems to add needless complexity.

Here’s my proposal.

From the client’s perspective:

  1. The client/server interaction always begins with the client asking the server “I think my environment should be , and here are my facts and certificates and stuff; what do you think my environment should be?”

  2. In response to (1), the server informs the client of what environment to use. The server is free to believe the client’s environment, or not, according to server configuration, which the client doesn’t need to know about.

  3. The environment reported by the server to the client in (2) above is always authoritative. If the client doesn’t like it, then its only recourse is to fail.

From the server’s perspective:

  1. No new configuration settings are needed.

  2. If there’s an external node classifier, and if it specifies an environment, then that wins.

  3. The external node classifier should have access to the client’s idea of the environment, and all the facts (see issue #2834), so the external node classifier is free to let the client’s idea of the environment win, or not, according to arbitrarily complex conditions.

  4. If there’s no external node classifier, or if the external node classifier is silent about the environment, then the client’s environment is used, but if the client didn’t specify an environment, then the server’s default environment is used.

From the administrator’s perspective:

  1. If you want the client’s environment to win, then don’t use an external node classifier, or use an external node classifier that doesn’t override the client’s idea of the environment.

  2. If you want the server’s environment to win, then use an external node classifier that overrides the client’s environment.

  3. If you have more complex requirements, then use a more complex external node classifier that sometimes overrides the client’s environment and sometimes does not override it.

#18 Updated by James Turnbull over 3 years ago

I agree with Alan. That’s the simplest and principle of least surprise approach.

#19 Updated by John Warburton over 3 years ago

Alan’s proposal is fine, although I feel arbitrarily puts you at a disadvantage if you use an external node classifier.

I can live with this proposal, if and only if server point 2 is implemented: “The external node classifier should have access to the client’s idea of the environment, and all the facts

#20 Updated by Nigel Kersten over 3 years ago

I like that proposal too, but it doesn’t cover all the use cases I’ve had laid out in front of me over this issue.

I’m going to put together a more fleshed out post with all the use cases after the holiday season, as we’re going to have to either swallow complexity or deny some use cases.

#21 Updated by Ashley Penney over 3 years ago

I’ve been having similar problems and Nigel asked me to update this bug with what we’re looking for.

Alan’s proposal is absolutely fine for us, but I would suggest an alternative mode whereby if puppet is run without a puppet.conf then it simply attempts to connect to the hostname ‘puppet’ and if it is able to make the connection it retrieves all configuration from the external node classifier. This would allow foreman to be my authoritative source of configuration for clients and remove the need to negotiate anything between server/client.

#22 Updated by Thomas Bellman about 3 years ago

Alan Barrett wrote:

  1. The client/server interaction always begins with the client asking the server “I think my environment should be , and here are my facts and certificates and stuff; what do you think my environment should be?”

In order for the client to present its facts, it must first have downloaded any custom facts from the server. Which environment should it get the facts from?

  1. The environment reported by the server to the client in (2) above is always authoritative. If the client doesn’t like it, then its only recourse is to fail.

Really? You’re not very paranoid… I can think of another thing the client can do: continue to use the environment it wants, and see if it can trick the server into giving out things it shouldn’t.

The only way the server can be really authoritative about which environment a client gets, is for the server to decide the environment itself, every time the client asks for something, be it a file (including plugins) or a catalog. It must not trust that the next request from the client will be asking for the environment the server told it to use a nanosecond earlier. That basically means the server must call the node classifier for every client request. (It may be acceptable to cache the results of the classifier for a short time.)

Of course, if you only want to protect against mistakes (someone happened to write “test” instead of “production” in puppet.conf a late night), then your scheme will do. But I suspect some people will want protection against deliberate circumvention attempts as well.

#23 Updated by Mikael Fridh about 3 years ago

I honestly like the fact that I can run a client with puppet agent —environment=development and have that machine enter that environment.

In fact, I even do something similar to this: puppet.conf.erb:

[agent]
environment = <%= environment %>

After all, I do have the option of “hard coding” an environment on my server with PUPPET_EXTRA_OPTS=“—environment=production” from /etc/sysconfig/puppet

What I don’t like however is that I have to symlink modules from my development modulepath into the production modulepath when I’m testing new modules and providers because the server looks in the wrong environment.

#24 Updated by Marcello de Sousa about 3 years ago

I’m hitting this bug now with 2.6.6

    ...when the client specifies one environment and the classifier specifies another,
 classes are evaluated from the server-specified environment, 
 and yet files are retrieved from the client-specified environment

Our biggest problem now is not which one is authoritative, but the fact that it’s half working when using an external node classifier to set the environment. So it would be nice to have them asap at least consistent (classes and files evaluated from the same environment).

Now we get classes silently evaluated with the wrong files (with nasty results).

#25 Updated by Marcello de Sousa about 3 years ago

Btw, my node classifier is configured to tell “myhost” it should belong to “mycustomenv” environment. Then on the server I’ve noticed this :

> grep environment /var/lib/puppet/yaml/facts/myhost.yaml

  environment: production

> grep environment /var/lib/puppet/yaml/node/myhost.yaml

  environment: mycustomenv
    environment: production

Which just doesn’t seem quite ok to me (although I can’t really judge the impact of having 2 environment entries in the node yaml file).

#26 Updated by Nigel Kersten about 3 years ago

Thomas Bellman wrote:

The only way the server can be really authoritative about which environment a client gets, is for the server to decide the environment itself, every time the client asks for something, be it a file (including plugins) or a catalog.

I agree, but that’s a much larger change, and I think we can get to a very useful place without implementing that, even though we should probably aim for that in the future.

#27 Updated by Nigel Kersten about 3 years ago

How about this for a proposal?

We add the client-specified-environment as $2 for the ENC, so people can choose to implement logic around server-specified-environment vs client-specified-environment in their ENC.

If the server specifies an environment, that is what we use.

This seems to solve the most common problem, and we can debate the utility of “is server authoritative?” settings later.

#28 Updated by Nigel Kersten about 3 years ago

  • Subject changed from Class/File source mismatch when client/node classifier disagree on environment. to Server is not authoritative over client environment when specified in an ENC

#29 Updated by Nigel Kersten about 3 years ago

  • Status changed from Needs Decision to Accepted
  • Assignee deleted (Nigel Kersten)

#30 Updated by Nick Lewis about 3 years ago

  • Assignee set to Jacob Helwig

#31 Updated by Jacob Helwig about 3 years ago

  • Assignee changed from Jacob Helwig to Nick Lewis

#32 Updated by Aaron Grewell almost 3 years ago

Can somebody put a mention of this bug in the ENC documentation? As a new user trying to setup using environments, modules, and an ENC it was very confusing to have the fileserver broken out of the box. Right now we have http://docs.puppetlabs.com/guides/external_nodes.html which says:

“The value of the environment key is a string representing the master’s preferred environment for this agent node. The interaction between agent-specified and master-specified environments is currently under active design consideration.”

I suggest adding “For now you must set the environment in the client’s puppet.conf in order for the fileserver to work as expected.”

#33 Updated by Nigel Kersten almost 3 years ago

Thank you for that reminder Aaron. #7595

This is the current first step proposal:

If the server sets the environment for a client, that environment is embedded in the catalog the client receives. If the client sees an environment in a catalog it is due to apply, that’s the one it uses.

Additionally, we add the client-specified environment as $2 for the ENC.

#34 Updated by Daniel Pittman over 2 years ago

Nigel Kersten wrote:

Additionally, we add the client-specified environment as $2 for the ENC.

Passing additional arguments is an ABI change for the ENC script: previously we documented that the script was invoked with a single argument, the node name. However, an easy and compatible work around is to establish the data in the process environment. An aware ENC script can collect from there, and an unaware script will not see the change.

This makes the change safe for the 2.7.x tree.

#35 Updated by Nigel Kersten over 2 years ago

+1

#36 Updated by Bryan Seitz over 2 years ago

This is definitely a high priority fix for us as well. I am currently working around this bug with a major hack :(

#37 Updated by Mario Verbelen over 2 years ago

What is the current status? I’m having the same issue … Only work-around that we use currently is templating

I hope that this will be fixed in 0.25 as well –> Markus Roberts 1 year ago “Fix will be wanted in the 0.25.x series as well.” (we are still on version 0.25.5 currently)

#38 Updated by Jerome Loyet over 2 years ago

Hi there,

we are using puppet on a very large number of nodes and we are facing the exact same problem. This is really a major issue for us.

so long and thx all the puppets ;)

++ fat

#39 Updated by John De Stefano over 2 years ago

+1: the work-around is kind of a pain when dealing with more than a few servers, not to mention having to remember that you’ve done the work-around in the first place and need to change environments.

#40 Updated by Lars Solberg over 2 years ago

Is there any ETA for a fix for this bug? We are also having the same issue in a production environment. I haveto swap environment temporary on servers to work around this bug; and I know its gonna bite me a day..

#41 Updated by Peter Meier over 2 years ago

If you’re struggling with environment changes and your server should be authoritative over the environments: I have pushed a workaround, that is in my opinion not a major hack, to: https://github.com/duritong/handle-puppet-environment

It is simply extending the current dashboard ENC script and is using 2 puppet runs (where the first one is a really safe one) to properly switch the environment. Comments are welcomed on github as pull-requests or whatever.

And as a side-note: Although, making the environment available to the ENC will safe us from storing the previous ENC-data to compare the two environments, we will still have to use 2 puppet runs to safely switch an environment. Otherwise, plugins such as facts and providers will still be synced with the wrong environment and also client file sources will be wrong.

#42 Updated by Nan Liu over 2 years ago

Not sure if this should be a separate bug:

Passing environment on master for compilation results in it being completely ignored:

puppet master --compile raiden.local --environment=test
notice: Scope(Node[default]): scope production
notice: Scope(Node[default]): production

This works from the client, but it’s ineffective to compile large number of catalogs by changing the puppet master configuration:

info: Expiring the node cache of raiden.local
info: Not using expired node for raiden.local from cache; expired at Fri Dec 02 10:58:06 -0500 2011
info: Caching node for raiden.local
notice: Scope(Node[default]): scope production 
notice: Scope(Node[default]): production
notice: Compiled catalog for raiden.local in environment production in 0.03 seconds
info: Expiring the node cache of raiden.local
info: Not using expired node for raiden.local from cache; expired at Fri Dec 02 10:58:18 -0500 2011
info: Caching node for raiden.local
notice: Scope(Node[default]): scope test 
notice: Scope(Node[default]): test

#43 Updated by Daniel Pittman over 2 years ago

I just wanted to give the watching world an update on this: this is actually a fairly hard problem, because fixing it requires a change to the protocol between the Puppet master and the ENC.

We are currently working on some design around that change, and how to keep it compatible, but ultimately it should empower the ENC to fully own all the data required to make this work.

There is no committed timeline to a fix, but we hope it will be addressed soon – and, ideally, in a way that we can deploy as part of the 2.7 series, rather than forcing everyone to wait on a new major version.

#44 Updated by Nigel Kersten about 2 years ago

  • Target version changed from 2.7.x to 3.x

The plan is:

If the server sets the environment for a client, that environment is embedded in the catalog the client receives. If the client sees an environment in a catalog it is due to apply, that’s the one it uses.

We understand there are other use cases where people want flexibility over taking the client-specified environment into account. This will need to be handled in your ENC itself by querying the inventory service API for the client environment information.

#45 Updated by Timur Batyrshin about 2 years ago

The plan is: If the server sets the environment for a client, that environment is embedded in the catalog the client receives. If the client sees an environment in a catalog it is due to apply, that’s the one it uses.

In that case an attack is possible of retrieving information (and possibly sensitive information like passwords) from side environments.

#46 Updated by Jeff McCune about 2 years ago

On Wed, Jan 25, 2012 at 10:08 AM, tickets@puppetlabs.com wrote:

Issue #3910 has been updated by Timur Batyrshin.

In that case an attack is possible of retrieving information (and possibly sensitive information like passwords) from side environments.

Timur,

Could you please explain this attack you’re thinking of in more detail? I don’t understand the attack and I’m trying to understand this better.

-Jeff

#47 Updated by Timur Batyrshin about 2 years ago

I’m sorry I seem to have misunderstood you at first.

If the server is able to override environment sent by client it would be ok.

#48 Updated by Nigel Kersten about 2 years ago

Yes, but to be clear, if the agent is applying a cached catalog, the environment in the catalog will be trusted as the authoritative source, and the ENC will not be consulted.

This doesn’t expose any more than we have now, where strictly speaking you could retrieve a file that isn’t in your catalog or your environment.

In the future we’ll be looking to enforce better partitioning there, but if you have particularly sensitive data, you can restrict access using auth.conf and/or fileserver.conf with static mountpoints and explicit controls.

Another relatively simple option is to not expose such passwords as files served by the fileserver, but instead use functions to insert them into manifests to avoid leakage.

#49 Updated by Nigel Kersten about 2 years ago

  • Priority changed from Normal to Urgent

#50 Updated by Patrick Carlisle about 2 years ago

  • Assignee changed from Nick Lewis to Patrick Carlisle

#51 Updated by Patrick Carlisle about 2 years ago

  • Status changed from Accepted to In Topic Branch Pending Review
  • Branch set to https://github.com/puppetlabs/puppet/pull/569

#52 Updated by Michael Stahnke about 2 years ago

I am so happy to see code on this :)

#53 Updated by Nigel Kersten about 2 years ago

+1 !

#54 Updated by Nick Fagerlund about 2 years ago

(Capturing something I was talking w/ Patrick about out in the park—)

With this change set, the agent will remember its server-assigned environment for as long as the process sticks around, but if you’re running agent from cron, it will have to re-learn it on each run. This may as much as double the compilation workload on your puppet master. (Depends on the percentage of nodes that aren’t in the production environment or have a wrong environment in puppet.conf.)

Managing puppet.conf with a template is the smart way to handle that, but that’s tribal knowledge/best practices, and the default behavior should be more forgiving than this. Agent should be able to remember its environment across runs.

We probably shouldn’t edit puppet.conf ourselves, but maybe we could read the server-set environment from the cached catalog when the agent starts up. Also, Nick0 brought up the possibility of treating production-the-default differently from production-set-on-purpose. (Like, always read from the catalog if there’s no environment in puppet.conf, but if there’s an environment in puppet.conf, try it first.)

#55 Updated by Nigel Kersten about 2 years ago

“Agent should be able to remember its environment across runs”

I disagree if we’re talking about the same thing. The server must be able to reassign a new environment for the agent across runs, which seems to conflict with this.

I was under the impression we were producing a relatively simple model.

  • The server can set the environment for the agent.
  • If it does so, that is in the catalog, and persists for any application of that catalog.
  • When the agent gets another catalog, the server can set the environment.

“the agent will remember its server-assigned environment for as long as the process sticks around,”

What does “remember” mean here?

#56 Updated by Andrew Forgue about 2 years ago

Nigel Kersten wrote:

“the agent will remember its server-assigned environment for as long as the process sticks around,”

What does “remember” mean here?

I hope it doesn’t mean that the agent, once started and receives a catalog, will always be bound to that environment in the catalog. The net effect seems to require that you’ll need to restart the puppet agent on all of your hosts if you change the server-enforced environment. I can see that being a pain in the ass.

#57 Updated by Nigel Kersten about 2 years ago

Andrew Forgue wrote:

Nigel Kersten wrote:

“the agent will remember its server-assigned environment for as long as the process sticks around,”

What does “remember” mean here?

I hope it doesn’t mean that the agent, once started and receives a catalog, will always be bound to that environment in the catalog. The net effect seems to require that you’ll need to restart the puppet agent on all of your hosts if you change the server-enforced environment. I can see that being a pain in the ass.

It would seem reasonable to be bound to a catalog-specified environment if you’re applying a locally cached catalog, but yes, your last point is the one that concerns me.

#58 Updated by Nick Fagerlund about 2 years ago

Nigel Kersten wrote:

What does “remember” mean here?

Patrick is the one to clarify what it currently means; what he told me was that the environment a long-running agent process received from the master on the previous run will be the environment it tries first on the next run; I think it can still get overridden, but it’s less likely to, since the last assigned environment is probably still valid.

But to clarify what I was asking for: Yes, the server should be able to override the agent’s environment on any run. But if the agent’s config file got overridden last time, it’s probably going to get overridden again, right? I’m just suggesting that we can save compilation resources on the master if the agent checks what happened last time, and tries the most recently assigned environment first instead of trying the environment in the config file first.

Mostly I’m just worried about people’s puppet masters getting slammed if they’re running agent from cron and assigning environments server-side, since that could potentially as much as double the number of catalogs it has to compile.

#59 Updated by R.I. Pienaar about 2 years ago

Nick Fagerlund wrote:

Mostly I’m just worried about people’s puppet masters getting slammed if they’re running agent from cron and assigning environments server-side, since that could potentially as much as double the number of catalogs it has to compile.

From reading this ticket I share this concern, people who do not run the daemon – say they prefer cron or mcollective based scheduling, this can be quite bad on the master resources.

#60 Updated by Oliver Hookins about 2 years ago

What worries me most about this current line of discussion is that it is based on entirely practical concerns of load avoidance rather than approaching the core issue of configuration management.

Performance issues can always be worked around in a number of interesting ways but designing the system around a concept of previously known state is entirely avoiding the spirit of minimising entropy of the client systems, and ensuring the configuration of the client system matches what we want it to be.

So I’d like to see this resolved in a way that the master can always force the client into a certain environment, and force whatever configuration is desired to be achieved from that environment. Anything less seems like we are leaving way too much to eventual consistency (where eventually may be never).

#61 Updated by Nigel Kersten about 2 years ago

“What worries me most about this current line of discussion is that it is based on entirely practical concerns of load avoidance rather than approaching the core issue of configuration management.”

I think it’s actually the other way around Oliver.

The current code as I understand it will deal with an environment mismatch between agent and server by throwing away the catalog that was compiled using the agent environment, and instead re-doing the whole process of pluginsync + catalog retrieval with the server specified environment.

I am concerned about load here however, particularly in situations where the agent doesn’t specify an environment (and thus defaults to ‘production’) and the server sets a different environment.

It feels like a better approach would be to do something like:

  • agent requests catalog with environment “A”.
  • agent pluginsyncs with environment “A”
  • server notes the mismatch as it is assigning environment “B”
  • agent re-does pluginsync with environment “B”
  • server delivers catalog with environment “B”

so we still maintain the server being able to operate as an authoritative source of truth, but remove the second catalog compilation.

It’s entirely possible I’m missing something here, so I’ll chat to Patrick today.

#62 Updated by Oliver Hookins about 2 years ago

That’s still a huge amount of steps during which the server is attempting to reach environment “sync” with the client, when effectively the server knows which environment the client is attempting to be configured for as soon as the catalog request comes in.

I stand by my original statement that this is somewhat getting bogged down in the details of implementation rather than viewing the problem holistically. The server should be the source of truth for configuration, and based on at least principles of REST (and probably others) should guide the client immediately through the business process of achieving the correct configuration without violating that process (in this case, plugin syncing with the wrong environment or anything else).

#63 Updated by R.I. Pienaar about 2 years ago

Nigel Kersten wrote:

  • agent requests catalog with environment “A”.
  • agent pluginsyncs with environment “A”
  • server notes the mismatch as it is assigning environment “B”

why do we get this far before doing something about it? Why not redirect to a different environment in the first catalog request already? Without compiling the catalog completely

Actually pluginsync happens first, so just redirect to the right environment then already before the catalog enters the picture

I am guessing it is because we only call the ENC on the catalog request so we wont know the required environment till the compile.

I see the first thing a client does is:

GET /production/file_metadatas/plugin

If there’s some way to figure out its wrong there we can respond with either a temp or perm redirect (which will instruct the agent to store or temp change) and then the following request and subsequent compile request should be in that environment.

I’m probably just missing some discussions held in person there so this is mostly asking for clarification on the thinking involved in creating hte list of actions

#64 Updated by Nick Fagerlund about 2 years ago

Something I just noticed which is unrelated to the above conversation:

The “exec” node terminus ALWAYS inserts an environment into the node object; if the ENC script doesn’t return one, it inserts “&id001 production”. (Note that the “plain” terminus will return node objects with no environment, so it should be possible for ENC nodes to have no environment specified; it just wasn’t implemented that way.) I haven’t tested what the ldap node terminus does.

That behavior seems worth reconsidering — not specifying an environment on the server should mean “let the client’s environment win,” not “force every client into ‘production’.”

I took a quick look at the acceptance tests, and none of them would catch that. The test_name "Agent should use agent environment if there is no enc-specified environment" test is using the plain terminus; there should be another test that checks the exact same thing but uses the exec terminus with a script that leaves the environment key out.

#65 Updated by Daniel Pittman about 2 years ago

  • Status changed from In Topic Branch Pending Review to Merged - Pending Release

#66 Updated by Daniel Pittman about 2 years ago

R.I. Pienaar wrote:

Nigel Kersten wrote:

  • agent requests catalog with environment “A”.
  • agent pluginsyncs with environment “A”
  • server notes the mismatch as it is assigning environment “B”

why do we get this far before doing something about it?

Because one of the explicit promises is that the fact data will have been stored in the inventory service and through any other “write somewhere” terminus by the time the ENC is invoked for a node.

Since we have a strict RPC model in the agent/master interaction, and fact submission necessarily gets a catalog back, we are obliged to follow through and compile – since we can’t store facts without getting facts, and getting facts forces a compile and new catalog.

Actually pluginsync happens first, so just redirect to the right environment then already before the catalog enters the picture

…and because of the promise about the facts, we can’t just ask the ENC here either.

#67 Updated by Nick Fagerlund about 2 years ago

A possibility I mentioned to Nigel but probably not to anyone else: If the master spots an environment mismatch, it could skip the “real” compilation process and spit back a stub catalog containing an environment but no resources. That would

  • not break the RPC model
  • save compilation resources — throwing together a largely empty catalog object is probably pretty cheap, right?
  • probably be catastrophic for previous agent versions, which wouldn’t know to loop through the catalog request with a new environment in the event of a mismatch.

…well, not totally perfect. :/

#68 Updated by Oliver Hookins about 2 years ago

That almost entirely describes the current behaviour of our ENC. It looks at the environment that is being requested, and if it doesn’t match what we expect to be assigning it, we send back a minimal catalog which just updates the puppet.conf with the correct environment and schedules another Puppet run to occur a couple of minutes later.

It’s not pretty, but it works. If a similar process could be orchestrated within the one runtime of the agent it would be ideal as opposed to waiting for the timer loop to trigger again, or a second cron/at-run process to start.

#69 Updated by Daniel Pittman about 2 years ago

After talking to Nigel today, we are agreed that the double compilation is undesirable – and that we are generally happy to ask the ENC, before we begin, what the environment should be.

This is motivated, in part, by the fact that the stored facts that the ENC references would be wrong – drawn from the pluginsync facts in the incorrect environment – in the current model. It is also motivated by the performance costs that people highlight here.

The current proposal is:

  1. In 2.7 we changed the authentication rules to allow an agent to query the ENC for node definition, for itself. We can use that to perform an initial, and relatively inexpensive, query to determine what the environment should be.
  2. We will then proceed as defined: use the current idea of the environment – updated by step 1 – to pluginsync.
  3. We will then submit facts, which will trigger an ENC query after they are stored, and compile the catalog.

This uses a less expensive operation to determine the correct environment, and means that even a cron driven “run and exit” agent with the wrong environment specified won’t cost very much it terms of load on the master.

There are two points of concern around this:

It is possible that the ACL to allow the agent to query the ENC directly will be absent; in that case the effort to ask the master what the environment should be will fail. The proposal is to preserve the current behaviour – in most cases, we can avoid the restart, but if that is not possible we should eventually converge on the correct environment and succeed in running.

It is possible that the ENC query is expensive, and doubling the number of queries there will cause trouble for external services. The proposal is to add a variable to the environment to indicate the type of ENC query: ENC_QUERY_FOR set to either query or compile.

#70 Updated by John Warburton about 2 years ago

Does this solution implement Nigel’s setting in note-13 for whether the master is authoritative over the client environment?

#71 Updated by Daniel Pittman about 2 years ago

  • Status changed from Merged - Pending Release to Code Insufficient

John Warburton wrote:

Does this solution implement Nigel’s setting in note-13 for whether the master is authoritative over the client environment?

I believe it should do, but will get Patrick to confirm that before we consider the ticket fully closed.

#72 Updated by Patrick Carlisle about 2 years ago

Nick Fagerlund wrote:

Something I just noticed which is unrelated to the above conversation:

The “exec” node terminus ALWAYS inserts an environment into the node object; if the ENC script doesn’t return one, it inserts “&id001 production”. (Note that the “plain” terminus will return node objects with no environment, so it should be possible for ENC nodes to have no environment specified; it just wasn’t implemented that way.) I haven’t tested what the ldap node terminus does.

That behavior seems worth reconsidering — not specifying an environment on the server should mean “let the client’s environment win,” not “force every client into ‘production’.”

I took a quick look at the acceptance tests, and none of them would catch that. The test_name "Agent should use agent environment if there is no enc-specified environment" test is using the plain terminus; there should be another test that checks the exact same thing but uses the exec terminus with a script that leaves the environment key out.

Thanks. I’ve added this in my branch. It seems to work, I am seeing the catalog come back with the agent-specified environment instead of production. Take a look at https://github.com/pcarlisle/puppet/commit/d0012efe141c37f22aabd4d5c1bd2ae6b821dc11 and see if that does what you think it should, since it seems different from the results you were reporting.

#73 Updated by Patrick Carlisle about 2 years ago

John Warburton wrote:

Does this solution implement Nigel’s setting in note-13 for whether the master is authoritative over the client environment?

No. The implementation is based on what appeared to be the most recent specification in note-44.

#74 Updated by Patrick Carlisle about 2 years ago

I’ve been a little busy but changes that implement the below proposal should be coming soon. This ticket should remain open until then.

Daniel Pittman wrote:

The current proposal is:

  1. In 2.7 we changed the authentication rules to allow an agent to query the ENC for node definition, for itself. We can use that to perform an initial, and relatively inexpensive, query to determine what the environment should be.
  2. We will then proceed as defined: use the current idea of the environment – updated by step 1 – to pluginsync.
  3. We will then submit facts, which will trigger an ENC query after they are stored, and compile the catalog.

#75 Updated by Patrick Carlisle about 2 years ago

  • Status changed from Code Insufficient to In Topic Branch Pending Review
  • Branch changed from https://github.com/puppetlabs/puppet/pull/569 to https://github.com/puppetlabs/puppet/pull/691

New pull request at https://github.com/puppetlabs/puppet/pull/691

I’ve commented on it asking that it be left open for now, so that it can get enough review.

#76 Updated by Daniel Pittman almost 2 years ago

  • Status changed from In Topic Branch Pending Review to Merged - Pending Release

This has been merged, including my correction for a set of bad stubbing in some other agent tests that this revealed.

#77 Updated by James Turnbull almost 2 years ago

This is going to need a swath of docs updates too. Hello Nick and Fred… :)

#78 Updated by Nick Fagerlund almost 2 years ago

K, I’m putting a placeholder in pre-docs called “telly/environments.md.” It should also get updated with whatever’s going on with the plugins in environments for the master, which I have kinda lost track of.

#79 Updated by Nigel Kersten almost 2 years ago

James Turnbull wrote:

This is going to need a swath of docs updates too. Hello Nick and Fred… :)

Generally it shouldn’t need that much. It basically will mean that the ENC environment setting we’ve documented will work as promised.

Troubleshooting might be required for people who don’t have the node-retrieve-their-own-definitions stanza in place.

#80 Updated by James Turnbull almost 2 years ago

Nick – also http://docs.puppetlabs.com/guides/external_nodes.html.

#81 Updated by Daniel Pittman almost 2 years ago

  • Target version changed from 3.x to 3.0.0

#82 Updated by Jerome Loyet over 1 year ago

Hi

just a question:

In the case the environment set in the client is different from the environment returned from the ENC. The client is switching to the environment set by ENC:

Warning: Local environment: "production" doesn't match server specified node environment "mutu_ads__puppet_server", switching agent to "mutu_ads__puppet_server".

this works as expected. But if I want to use the $environment variable in any of my manifest, it’s then equal to the agent environment (aka production in my example). I’ve added the following line in the node definition to demonstrate it:

notify{"Environment var seen in the manifest: ${environment}":}

And when it executes, $environment equals “production” instead of the environment returned by the ENC. Environment var seen in the manifest: production

It’s like the $environment variable is overrided by the environment value returned from facter.

To fix this, I’ve used the following workaround: my ENC also returns a parameter “real_environment” so that I can use the $real_environment variable in my manifests. I’m using it to generate the file /etc/puppet/puppet.conf to avoid the warning.

Is this a bug or a feature ? If it’s a feature, how can I retrieve the environment from the ENC without passing it as a parameter extra variable ?

thx for the answer.

++ fat

#83 Updated by Nigel Kersten over 1 year ago

That feels like a bug to me.

#84 Updated by Patrick Carlisle over 1 year ago

Jerome, that is definitely a bug. I am tracking it in #16275.

#85 Updated by Jerome Loyet over 1 year ago

Patrick Carlisle wrote:

Jerome, that is definitely a bug. I am tracking it in #16275.

Thanks you very much !

#87 Updated by Matthaus Owens over 1 year ago

  • Status changed from Merged - Pending Release to Closed

Released in Puppet 3.0.0rc1

#88 Updated by Nigel Kersten over 1 year ago

and there was much rejoicing :)

#89 Updated by Charlie Sharpsteen about 1 year ago

  • Keywords set to customer

Also available in: Atom PDF