The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Feature #11915

Segregate client facts, server facts and ENC params in topscope hashes

Added by Brice Figureau over 2 years ago. Updated 8 months ago.

Status:Needs DecisionStart date:01/12/2012
Priority:NormalDue date:
Assignee:J.D. Welch% Done:

0%

Category:language
Target version:-
Affected Puppet version:2.7.9 Branch:
Keywords:ux

We've Moved!

Ticket tracking is now hosted in JIRA: https://tickets.puppetlabs.com

This ticket is now tracked at: https://tickets.puppetlabs.com/browse/PUP-1040


Description

Having to use $operatingsystem (and soon $::operatingsystem) in our manifests is:

  • confusing for new users
  • prone to name-clashing

Those variables are really specific in Puppet because they come from the exterior.

My proposal would be to move them to separate Puppet hashes of names:

  • $facts
  • $server_facts
  • $parameters

So usage would be:

 ...
 firewall { "http": protocol => "TCP", src => $facts['ipaddress'] }
 
 file { "/etc/issue.net": 
   content => "This host is in ${server_facts['environment']} environment"
 }
 ...

We could also have some custom methods in the template wrappers so accessing facts in templates could be even easiers, like facts['ipaddress'].

Of course to help migrate users, the first release would also put the facts/server facts and parameters in the node top scope (and issue a deprecation warning on lookup).


Related issues

Related to Puppet - Feature #16937: Improved Ruby DSL Closed 10/11/2012
Related to Puppet - Feature #19514: Provide validated clientcert name variable for use in man... Closed

History

#1 Updated by R.I. Pienaar over 2 years ago

Have brought this up with Nigel before, I’m all for it. Makes everything more obvious

#2 Updated by Anonymous over 2 years ago

  • Status changed from Unreviewed to Needs Decision
  • Assignee set to Anonymous

This is technically achievable, and can go through a period in which we have both versions of the facts, etc, work correctly, so migration can be safe.

I am generally inclined to think that this is appropriate; even if the current hash access syntax is cumbersome, that can be improved independently of this, and it certainly reduces the ambiguity about where values come from – and, importantly, what “wins” when two sources supply the same variable.

#3 Updated by R.I. Pienaar over 2 years ago

should also consider doing @server_settings[“foo”] to replace the current $settings::foo weird thing

#4 Updated by Anonymous over 2 years ago

R.I. Pienaar wrote:

should also consider doing @server_settings[“foo”] to replace the current $settings::foo weird thing

Arri, do you really mean @server_settings, or was that a typo for $server_settings?

#5 Updated by Anonymous over 2 years ago

As I understand it there are several issues here:

  1. Make access to facts more convenient (i.e., “$::” sucks). This is slated for Telly and no one loves it.
  2. Facts being in the same namespace as variables is bad.
  3. Should we expose server_facts? We don’t now, do we? What’s the use case?
  4. What are parameters?

Exactly how we expose facts is an interesting question. What we’re doing now with the hash syntax is awfully heavy; there are many characters that no one cares about and must type only because we say so ($, facts, [, ], ‘). This is worse when they’re interpolated. I’d rather find something simpler.

#6 Updated by Anonymous over 2 years ago

Thinking about this, I have a few things to note:

I can’t see a good reason to expose either the server facts, or the server settings, to the manifest. These might make some sort of sense in a simple, one machine deployment today, but as we move to, eg, having dedicated compiler pools, or changing the data flows that lead from facts to catalogs, the “machine that compiled this” becomes a lot less … predictable, or useful, or sensible to talk about in any way.

It seems to me that we have chosen syntax as the approach of our DSL, which is awesome.

It also seems that the single most important “variable” in the language are the client facts – almost every use of variables otherwise is to set configuration or move data about, or to mutate or reprocess facts to some other form.

The former seems like it is better solved by way of Hiera, and data/model separation. The later seems like a lot of the demand will go away with structured facts. In both cases it seems a poor decision to treat the current – if critical – need to work around other shortfalls as a good basis for design going forward.

I propose something different:

  • facts are dynamic, as they were in earlier versions.
  • facts have their own syntax: %foo, or @foo, or &foo, or *foo
  • facts are read-only

This seems to have all the advantages and none of the drawbacks:

  • facts, the most important data, are immediately accessible with a minimum of ceremony.
  • facts are available everywhere.
  • facts are visually distinguished from variables.

Especially the last point allows us to have different rules – variables are lexically scoped, but facts are dynamically scoped and global. It also makes it clearer to users what sort of data they are handling, and how.

#7 Updated by Ben Hughes over 2 years ago

This sounds like a much stronger idea.

If we’re going to do something ugly with facts, like $:: because they’re special, why not make them special and not ugly as you propose.

I imagine they will be “Puppet Labs” are changing the syntax yet again! But hopefully this time we’ll do it because it’s sane, and not because “this is what lexical scoping dictates one should do”. We’re still not Python.

#8 Updated by Anonymous over 2 years ago

Ben Hughes wrote:

This sounds like a much stronger idea.

If we’re going to do something ugly with facts, like $:: because they’re special, why not make them special and not ugly as you propose.

Technically, $:: is “less special”, in that it treats them like any other variable. I think that is a mistake because they are actually much more special than other variables, if for no other reason than they come from a special source. (The syntax obviously also removes the clash of names between variables and facts, because $foo and %foo or distinct entities…)

I imagine they will be “Puppet Labs” are changing the syntax yet again! But hopefully this time we’ll do it because it’s sane, and not because “this is what lexical scoping dictates one should do”. We’re still not Python.

I am much less afraid of complaints that we are changing syntax than complaints we are making everyone type three, or five, extra characters without substantial gains.

I appreciate the feedback, though, on how this hangs together.

#9 Updated by Ben Hughes over 2 years ago

Technically, $:: is “less special”, in that it treats them like any other variable. I think that is a mistake because they are actually much more special than other variables, if for no other reason than they come from a special source. (The syntax obviously also removes the clash of names between variables and facts, because $foo and %foo or distinct entities…)

Yes, I know technically it’s just a variable, but I was meaning special in teh sense of “looks ugly and not just like a normal dollar variable”.

I am much less afraid of complaints that we are changing syntax than complaints we are making everyone type three, or five, extra characters without substantial gains. I appreciate the feedback, though, on how this hangs together.

I think a symbol such as @ or % for facts would be far nicer than putting it as part of some hash. Seeing as they’re going to be commonly used.

In ERB templates would they all still be referenced the same? How will that deal with the possible name clashes?

#10 Updated by Tim Stoop over 2 years ago

Daniel Pittman wrote:

I can’t see a good reason to expose either the server facts, or the server settings, to the manifest.

The server settings are used in our environment, in which we have a “testpuppetmaster”, where we can test if code compiles and deploys and a normal production environment that only runs tested code. For example, we export firewall settings, but not all test environments have all exported resources, therefore, when the run is started from a testpuppetmaster, we don’t want to clean old rules. We use things like $server_name and $server_port to determine if we’re on a testpuppetmaster or not.

#11 Updated by Nigel Kersten over 2 years ago

I haven’t seen use cases for exposing server facts to the manifests, but there are certainly a few cases for the server settings.

Daniel, can you elaborate on this?

“facts are dynamic, as they were in earlier versions.”

Even with the plans to remove dynamic scoping in Telly?

I’m not convinced it makes sense for facts to have a special syntax compared to variables, but I can see the attraction. I’d love to see more people weigh in on that point.

#12 Updated by Anonymous over 2 years ago

Ben Hughes wrote:

I think a symbol such as @ or % for facts would be far nicer than putting it as part of some hash. Seeing as they’re going to be commonly used.

In ERB templates would they all still be referenced the same? How will that deal with the possible name clashes?

Actually, using @fact would be ideal. We can absolutely make variables in ERB templates work with the same literal syntax:

puppet apply <<EOT
notice("fact value is @fact")
notice(inline_template("fact value is <%= @fact %>"))
EOT

That would return the same value, with the same syntax, in both cases. A nice win for everyone.

#13 Updated by Anonymous over 2 years ago

Nigel Kersten wrote:

I haven’t seen use cases for exposing server facts to the manifests, but there are certainly a few cases for the server settings.

What are those, please? Several people have indicated that they are useful, and nobody can tell us what those cases are. If we can get the story from a user who wants to do that, we can absolutely support it, but without it …

Daniel, can you elaborate on this?

“facts are dynamic, as they were in earlier versions.” Even with the plans to remove dynamic scoping in Telly?

Absolutely. I am specifically suggesting that facts and variables are different things, and can have different rules. Facts should have “dynamic” scope – @factname – that are available everywhere. Variables should have lexical scope.

I’m not convinced it makes sense for facts to have a special syntax compared to variables, but I can see the attraction. I’d love to see more people weigh in on that point.

It is my view that facts should have a special syntax, and special behaviour, because they are the single most commonly accessed data in the system. They are universal things that most every part of the tree – manifests in modules you install, for example – want to reference.

#14 Updated by R.I. Pienaar over 2 years ago

Daniel Pittman wrote:

Nigel Kersten wrote:

I haven’t seen use cases for exposing server facts to the manifests, but there are certainly a few cases for the server settings.

What are those, please? Several people have indicated that they are useful, and nobody can tell us what those cases are. If we can get the story from a user who wants to do that, we can absolutely support it, but without it …

I’ve never found a use for them either. Here are some I’ve seen in #puppet based on a search through my logs

  • template() doesnt work like source which is what people want. So people hack around this by referencing $settings::modulepath with file()
  • scripts that interact with the exported resources database, someone wrote the config file for that based on $settings::dbpassword etc
  • while generating mcollective facts.yaml they use $settings::dynamicfacts to exclude the variables that changes all the time
  • they want to place files into the puppet config dir when managing puppet master using puppet so $settings::configdir is a good variable to have
  • testing infrastructure that check puppet code might run the manifests in a odd place so people want to find $settings::manifestdir to base checks off. Don’t really get what this guy was doing, suspect its related to the first point
  • someone wanted to know if pluginsync is enabled, presumably to figure out if they can use some custom type.
  • they want to configure the puppet.conf of the master using a template and want to use these variables there…thats just a terrible idea

And thats every time that variable ever came up in #puppet since its release. Nothing of this sounds like something we cant do without and most of them sounds like people working around other deficiencies or design issues.

Daniel, can you elaborate on this?

“facts are dynamic, as they were in earlier versions.” Even with the plans to remove dynamic scoping in Telly?

Absolutely. I am specifically suggesting that facts and variables are different things, and can have different rules. Facts should have “dynamic” scope – @factname – that are available everywhere. Variables should have lexical scope.

I’m not convinced it makes sense for facts to have a special syntax compared to variables, but I can see the attraction. I’d love to see more people weigh in on that point.

It is my view that facts should have a special syntax, and special behaviour, because they are the single most commonly accessed data in the system. They are universal things that most every part of the tree – manifests in modules you install, for example – want to reference.

I agree.

More though I think special syntax tells you when reading code that what is required is a fact about the system and not someting else. This is important information to know when evaluating code. Having special syntax means thats obvious. It’s important that the syntax we choose is also obvious in templates and not too far from each other. $::foo vs scope.lookupvar(“::foo”) is an example of where the disconnect is too big. $facts[“foo”] vs @facts[“foo”] maybe not.

Having them in a hash aids discovery of information, there are legit reasons to ask “what are all the facts for the node I am compiling” having a hash rather than a series of variables that you cant iterate using something like keys() means that’s something thats doable vs just outright a pain.

We seem to have a number of special types of variables here:

  • facts
  • settings
  • global variables such as those set in site.pp or provided by a ENC
  • server facts
  • node level variables
  • variables passed in via paramters such as parameters to a define or class

If we wanted to call more than one of these out in some way to donate their specialness I think we’d be better served by hash dynamics than just something like *variable is a fact while %variable is a server settings. that would be too confusing to new users and too alien for people coming from other systems.

Hashes are a universal concept even in bash scripts. I think it would be very obvious for any newcomer to know what $facts[“foo”] means once the concept of a fact has been explained, but something like *foo will just forever increase the confusion around the language

I’d specifically disagree with the statement that hash syntax is “awfully heavy” especially in the context of that very same syntax being something many people already know and would feel comfortable with. A specific example of this is the $settings::foo syntax which is alien to everyone and people are just not remembering how it works. We do have hashes in the language they’re widely used and understood and serves the purpose of storing all facts well.

Combine with this that data in hiera will be a hash using JSON hash syntax, nested facts will be a hash using ruby hash syntax and retrievable on the command line using something that approximates ‘pp’ or a hash or a simple JSON dump. Mcollective is all about hashes and displays output in JSON representation of a hash, and has tooling around selecting/searching/interacting with hashes. I’d say that hashes are pretty heavily entrenched in our world and will not go away, so rather than add more types of hash like syntax – or finding ways to avoid them where they make sense to people familiar with programming languages – we should just embrace hashes.

I’d be interested to see a main stream language that went with the conclusion that hash like syntax is too much of a burdon and came up with an alternative approach though. I would not want to see Puppet as the only or one of a very small handful that went down that route though.

#15 Updated by Brice Figureau over 2 years ago

R.I. Pienaar wrote:

So many good things

I’m more than 200% with R.I. on this. We don’t need to reinvent the wheel with new syntax, just exploit what we currently have. It just feels natural to use hash for this. I’d prefer the current status quo than the route with variable decorators (did we really learn the lessons of the spaceship operator?).

#16 Updated by Nigel Kersten over 2 years ago

Something that may not be obvious to everyone is that the whole requirement of facts to be referred to as $::factname is actually a bug in the deprecation warning system.

It only occurs in classes that are included via an ENC and not via any other means.

That may color this discussion somewhat.

I have no problem with not exposing the server settings to the manifests anymore. I’ve never needed to use it myself, and think Daniel is right in that people use it to work around other limitations.

#17 Updated by R.I. Pienaar over 2 years ago

Nigel Kersten wrote:

Something that may not be obvious to everyone is that the whole requirement of facts to be referred to as $::factname is actually a bug in the deprecation warning system.

It only occurs in classes that are included via an ENC and not via any other means.

That’s not true. $::fact says get fact from the top scope, if you set $fact in a node block and then include a class you get the warning, no ENC needed.

#18 Updated by Nigel Kersten over 2 years ago

Adding Nick as a watcher, as that’s the report I got from him debugging the deprecation warning.

#19 Updated by Ryan Coleman over 2 years ago

Nigel Kersten wrote:

I’m not convinced it makes sense for facts to have a special syntax compared to variables, but I can see the attraction. I’d love to see more people weigh in on that point.

When instructing new Puppet users, everyone is on-board with the variable syntax until we hit scoping. Then, the difference in their heads between variables they set and Facter facts becomes very grey and I’m usually forced to discuss how best to set Facter facts as students will either try or be confused by the idea of setting variables in top scope that match facts.

I feel that giving Facter variables a different syntax makes it immediately clear that these are special and makes it immediately clear how users should reference facts they set in environment, facts.d, ruby code, whatever. It also allows someone to reference a top scope variable and know that they’re never going to conflict with facts. In short, our users must understand less of how Puppet works and just use the software.

#20 Updated by Cody Herriges over 2 years ago

From the point of view of someone that has to teach all this stuff and answer the questions about why…hashes to group the differences between types of variable collections is going to be substantially easier to teach in our future education classes than some special arbitrary identifier.

#21 Updated by Cody Herriges over 2 years ago

Actually, I changed my mind. Hashes would be easier to teach than something arbitrary in our current education courses, not sure about our future ones.

#22 Updated by Cody Herriges over 2 years ago

Originally I thought $settings namespace was agent settings. Now that I know it is server settings I question its usefulness and now desire the ability to obtain agent settings. I currently have two use cases.

  1. Want to know where my agent’s SSL certs and cached server CA is being kept so I can reuse them. This came up when I wanted to setup stunnel between to machines for encrypting rsync.

  2. For doing file fragment patterns I would like to know where the agents vardir is so I can place my spool directories there and keep them out of the way.

#23 Updated by Anonymous over 2 years ago

So, bringing this together:

There seems to be a general feeling that additional syntax, in our syntax focused DSL, is not good. Using a hash is generally the right path; so we end up with expressions that look like this:

notice("hello from $facts[osrelease]")
if $facts[puppetversion] < 1.2 { … }
file { "example.conf": path => $facts[customfact] ? { … } }

We have some reasonably convincing use cases for the server facts, but mostly related to other headaches in usability – people want it to make file() work more like template(), for example, or people who want to do database queries because we don’t offer sufficient cross-machine communication capabilities.

We have some very reasonable use cases on the client, which are not radically different in cause than the server ones.

We have an absolute case of confusion, in that settings are naturally assumed to be the agent settings, and a different name for master settings is strongly preferable. We should make that change, even if it is potentially breaking, though an ideal implementation can detect and warn about the bad name being used.

#24 Updated by R.I. Pienaar over 2 years ago

If we are agreed to start having a bunch of hashes in default use we should consider picking some functions from stdlib like keys() and make them available as core. These would be things most people would expect to just be there in a language that uses hashes widely

#25 Updated by Nigel Kersten over 2 years ago

I don’t have a significantly more constructive suggestion right now, but I feel quite uneasy about pushing our users from:

notice("hello from $osrelease")
if $puppetversion < 1.2 { … }

to:

notice("hello from $facts[osrelease]")
if $facts[puppetversion] < 1.2 { … }

particularly in a world without dynamic scoping.

#26 Updated by R.I. Pienaar over 2 years ago

Nigel Kersten wrote:

particularly in a world without dynamic scoping.

can you elaborate? I think these var names should effectively become reserved words that raise an error if ever anyone tries to create a var called $facts etc

#27 Updated by Nigel Kersten over 2 years ago

R.I. Pienaar wrote:

Nigel Kersten wrote:

particularly in a world without dynamic scoping.

can you elaborate? I think these var names should effectively become reserved words that raise an error if ever anyone tries to create a var called $facts etc

I have a strong feeling that many of our problems with the current intermingling of facts, node-scope, top-scope variables as well as in-manifest variables are due to the complexities of dynamic scoping, and that with that resolved, we may not have as big a usability problem as we do today.

#28 Updated by Anonymous almost 2 years ago

  • Assignee changed from Anonymous to J.D. Welch

#29 Updated by J.D. Welch almost 2 years ago

  • Keywords changed from scope facts lookup hash to scope facts lookup hash backlog dsl

#30 Updated by J.D. Welch almost 2 years ago

  • Keywords changed from scope facts lookup hash backlog dsl to backlog

#31 Updated by eric sorenson almost 2 years ago

  • Keywords changed from backlog to ux

#32 Updated by Chris Spence over 1 year ago

Based on the $clientcert conversation we also need to bear in mind issues around whether node supplied data is trustable – we need some data that isn’t fact based to use to securely identify nodes whilst using conditionals/hierarchy in manifests. This data should not go in a fact namespace. I added a feature request for this: https://projects.puppetlabs.com/issues/19514

#33 Updated by Henrik Lindberg 11 months ago

Throwing one more option into the fire…

If access to facts should be via [] (“hash access”), it could be an operator on a Fact type. e.g.

Fact[osfamily]

It can also be a function call, and when used with future parser’s “inline” call, it can be written

osfamily.fact

I could also imagine supporting both a notation that is guaranteed to get the original fact (i.e. $fact[name], Fact[name], fact(name), name.fact, or whatever) at the same time as facts are also available as variables (which abide by the language scoping rules Du Joure. This would make it backwards compatible, while those that want more strict behavior can use the longer form.

There is also an expectancy that the fact variables are always strings. The new syntax could be smarter and provide proper types (String, Boolean, Number, Array, Hash, etc.). Again, this to be backwards compatible. In addition to this, there is also the discussion of trusted facts to take into account. (There may be the need for the user to accept only a trusted fact, vs. untrusted).

#34 Updated by Erik Dalén 8 months ago

Redmine Issue #11915 has been migrated to JIRA:

https://tickets.puppetlabs.com/browse/PUP-1040

Also available in: Atom PDF