The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

This issue tracker is now in read-only archive mode and automatic ticket export has been disabled. Redmine users will need to create a new JIRA account to file tickets using https://tickets.puppetlabs.com. See the following page for information on filing tickets with JIRA:

Bug #16753

Need the ability to list all nodes

Added by James Turnbull over 3 years ago. Updated over 3 years ago.

Status: Closed
Start date: 10/03/2012
Priority: Immediate
Due date: (none)
Assignee: -
% Done: 0%
Category: indirector
Target version: 3.1.0
Affected Puppet version: 3.0.0
Branch: https://github.com/puppetlabs/puppet/pull/1317
Keywords: backlog


Description

In Puppet 3.0 we’ve disabled the default YAML node cache (see https://github.com/puppetlabs/puppet/commit/5a79d9abd96e73ff166527cdee69a30da8ab0f87).

I use this code (and a number of others in the community use similar) to return a list of nodes:

    Puppet[:clientyamldir] = Puppet[:yamldir]
    if Puppet::Node.respond_to? :terminus_class
      Puppet::Node.terminus_class = :yaml
      nodes = Puppet::Node.search("*")
    else
      Puppet::Node.indirection.terminus_class = :yaml
      nodes = Puppet::Node.indirection.search("*")
    end

This now doesn’t work.

We need a method of returning the current list of nodes the master knows about.

Currently available is:

    puppet node find 'hostname'

If we had:

    puppet node search '*'

That would meet my needs.


Related issues

Related to Puppet - Bug #3910: Server is not authoritative over client environment when ... Closed
Related to Puppet - Bug #16698: external node classifier script is not being called when ... Closed 10/02/2012
Related to Puppet - Bug #17692: Puppet doesn't report environment as fact Duplicate

History

#1 Updated by Anonymous over 3 years ago

  • Status changed from Unreviewed to Needs Decision
  • Assignee set to eric sorenson

The reason this was removed was to support the changes that made the ENC authoritative over the agent environment. As part of that we had a bootstrapping problem: the agent had an idea of the environment to request, used that in pluginsync, and then used it again as part of the request for the catalog.

If that idea was wrong, the catalog would be returned from the correct, ENC specified environment, but it would have been generated with the wrong set of plugins – including custom facts. So, the agent would detect that, pluginsync to the new environment in the catalog, and compile a new catalog.

That fixed the problem, but was inefficient – every agent run with an incorrect environment would mean two catalog compilations, and doubling master load in a common situation (ENC says !production, agent run from cron) was pretty unacceptable.

So, instead, the agent was changed to query the master for node data about itself – and to use the environment that came back from that.
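
The cost difference between the two flows described above can be shown with a toy model. This is only an illustration of the reasoning, not Puppet code; the method names and the idea of counting compilations per run are assumptions made for the example.

```ruby
# Toy model (hypothetical, not Puppet code): count catalog compilations
# for a single agent run under each flow.

# Old flow: the agent pluginsyncs and requests a catalog using its own idea
# of the environment. If the ENC disagrees, the catalog was built from the
# wrong plugins, so the agent re-syncs and a second catalog is compiled.
def compilations_old_flow(agent_env, enc_env)
  agent_env == enc_env ? 1 : 2
end

# New flow: the agent first asks the master for its node data and adopts
# the environment it gets back, so a single compilation is enough.
def compilations_new_flow(_agent_env, _enc_env)
  1
end
```

In this model the extra compilation appears only when the agent and the ENC disagree, which is the case the comment calls common (ENC says !production, agent run from cron).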

This had a side effect: it also changed the sequence of indirection calls on the master. Previously the first thing to ask for a node explicitly disabled the cache, so it went to the back-end, and then any subsequent queries would use the updated YAML cache data.

Now that the agent was asking for the same information, it was getting stale YAML cache data (a problem observed in the real world), leading to the same inefficiency every time the ENC changed the catalog definition.

The root cause was cache invalidation: the client couldn’t know if the master was configured with a different (and important) node data cache, so it couldn’t just bypass the cache entirely. We could have added more special case code to handle that, and made the terminus implementations aware of it, but that seemed uglier than just eliminating the cache entirely.

The primary consumers of this don’t seem to care about data being read from the YAML cache as part of the operation of the master; they just want to interact with it outside the normal compilation processes.

It should be practical to use, instead of a real YAML cache, a “write only” YAML cache over the node terminus in the master. This would store the node data in YAML form on disk, but would never return anything from find. This terminus should probably be named something other than “YAML”.

That allows the following:

  1. Users can just use the yaml terminus to read this data
  2. The master writes yaml terminus data, but never reads from the cache
  3. The agent never sees stale data
  4. The yaml terminus can search, even if the node backend terminus is plain
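
The "write only" cache idea can be sketched in a few lines of plain Ruby. This is a minimal illustration under stated assumptions, not the terminus that was eventually merged: the class name WriteOnlyYamlCache, the lookup_node helper, and the one-file-per-node layout are all hypothetical.

```ruby
require "yaml"
require "fileutils"

# Hypothetical stand-in for a "write only" cache (not Puppet's implementation).
class WriteOnlyYamlCache
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # Persist the node data as YAML so external tools can read it later.
  def save(name, data)
    File.write(File.join(@dir, "#{name}.yaml"), YAML.dump(data))
  end

  # Always report a miss, so callers fall through to the real backend
  # and never see stale data.
  def find(_name)
    nil
  end
end

# Hypothetical lookup that consults the cache first, then the backend,
# and writes whatever the backend returned back to the cache.
def lookup_node(name, cache, backend)
  cached = cache.find(name)
  return cached if cached

  fresh = backend.call(name)
  cache.save(name, fresh)
  fresh
end
```

Because find never returns anything, every lookup reaches the backend, while the files on disk stay current for anything that wants to read or search them.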

Other approaches exist, but are not awesome. (Ideally, of course, we will just push users to consult PuppetDB instead of this cache hack, but until that is everywhere this is likely the best course.)

Also of note: the YAML cache was not user configurable unless they used the routes.yaml facility, so it is unlikely that many users have changed this away from the default behaviour.

#2 Updated by eric sorenson over 3 years ago

  • Target version changed from 3.x to 3.0.1

We need to restore the master-side yaml writing in the first point release. This was pretty clearly an unintended (and therefore unannounced) consequence of another change.

#3 Updated by eric sorenson over 3 years ago

  • Status changed from Needs Decision to Accepted

#4 Updated by Anonymous over 3 years ago

The YAML cache for the node terminus was removed in order to make the ENC authoritative in #3910. Commit https://github.com/puppetlabs/puppet/commit/5a79d9abd96e73ff166527cdee69a30da8ab0f87 is the one that made this change.

This removal also caused #16698; however, simply putting the node cache back in would not work since we would regress on #3910, so I think we need to follow Daniel’s suggestion and create a special terminus that is just for writing out YAML and will never read it back in.

#5 Updated by Anonymous over 3 years ago

  • Target version changed from 3.0.1 to 3.0.x

This isn’t making the cut for 3.0.1. Retargeting at 3.0.x.

#6 Updated by Anonymous over 3 years ago

After some looking into this there is a discussion on the mailing list at https://groups.google.com/forum/?fromgroups=#!topic/puppet-users/5c7bg1thdIQ in which it is mentioned that puppet node find node2.mylocal --terminus plain --render-as yaml produces the same output as looking at the YAML files. If this is true, then is using that a viable alternative?

I would prefer to push integrations with puppet to use these commands and the REST API rather than relying on files that puppet happens to write to disk.

#7 Updated by Nigel Kersten over 3 years ago

Note too that another option is to use the PuppetDB API to list nodes.

http://docs.puppetlabs.com/puppetdb/1/spec_q_nodes.html

#8 Updated by Nigel Kersten over 3 years ago

Andrew Parker wrote:

After some looking into this there is a discussion on the mailing list at https://groups.google.com/forum/?fromgroups=#!topic/puppet-users/5c7bg1thdIQ in which it is mentioned that puppet node find node2.mylocal --terminus plain --render-as yaml produces the same output as looking at the YAML files. If this is true, then is using that a viable alternative?

Unless James and other integrations know the node name before the call, I don’t think so, as you only have find, not list or search, and can’t search on ‘*’ with that API.

#9 Updated by James Turnbull over 3 years ago

  • Subject changed from YAML node cache disabled to Need the ability to list all nodes
  • Description updated (diff)

#10 Updated by eric sorenson over 3 years ago

  • Keywords set to backlog

#11 Updated by eric sorenson over 3 years ago

  • Priority changed from Normal to High

#12 Updated by Anonymous over 3 years ago

  • Target version changed from 3.0.x to 3.1.0

This is part of our work for the 3.1 release.

#13 Updated by Henrik Lindberg over 3 years ago

  • Assignee changed from eric sorenson to Henrik Lindberg

#14 Updated by Henrik Lindberg over 3 years ago

  • Status changed from Accepted to In Topic Branch Pending Review
  • Branch set to https://github.com/puppetlabs/puppet/pull/1305

An implementation of Daniel Pittman’s idea is now in https://github.com/puppetlabs/puppet/pull/1305

By default, the master application will now “cache” nodes using the Write Only Yaml (woy) terminus. If this is not wanted (since one file will be written for every node that has ever talked to the master), the setting node_cache_terminus can be set to nothing.

With the woy terminus in place, it is possible to use find and search using the yaml terminus. (The original problem of reading stale data from the cache is avoided since the write only cache never finds anything).

The recommended way to query for node information is to use PuppetDB and set node_cache_terminus to nothing.
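
For scripts that only need node names, the files written by the write-only cache can also be listed directly. The sketch below is an illustration, not a supported interface: it assumes one <name>.yaml file per node in a single directory (for example a node subdirectory of the master's yamldir, which is an assumption about the on-disk layout), and the helper name cached_node_names is hypothetical.

```ruby
# Assumption: node_dir contains one "<certname>.yaml" file per node.
# Only file names are inspected; the YAML is not parsed, so no Puppet
# classes need to be loaded.
def cached_node_names(node_dir)
  Dir.glob(File.join(node_dir, "*.yaml"))
     .map { |path| File.basename(path, ".yaml") }
     .sort
end
```

Reading only the file names avoids depending on the structure of the serialized node data, but it still relies on files the master happens to write to disk.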

#15 Updated by Anonymous over 3 years ago

  • Branch changed from https://github.com/puppetlabs/puppet/pull/1305 to https://github.com/puppetlabs/puppet/pull/1317

Fixed a few issues in Henrik’s changes. New pull request submitted.

#16 Updated by Anonymous over 3 years ago

The new implementation can be turned off by setting the following in puppet.conf:

    [master]
    node_cache_terminus =

The code given in the original report should work, but you can also use the puppet node subcommand by creating a routes file:

routes.yaml

    ---
    node:
      node:
        terminus: yaml
        cache: yaml

and then executing:

    puppet node search '*' --route_file routes.yaml --clientyamldir `puppet master --configprint yamldir`

Using this approach is more likely to continue working than setting the terminus directly since the test that was written for this validates using the puppet node search command.

I don’t think that the structure of that data is actually documented anywhere, so the name element at the top of each hash is probably the only thing that can be relied on at the moment.

#17 Updated by Anonymous over 3 years ago

Turns out that the routes file isn’t needed at all. The same thing can be achieved by using the --node_terminus parameter:

    puppet node search '*' --node_terminus yaml --clientyamldir `puppet master --configprint yamldir`
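
An integration that currently sets the terminus in Ruby could shell out to this command instead. The following is a rough sketch, assuming the puppet executable is on the PATH; the helper names are hypothetical, and the raw output is returned unparsed because its structure is not documented.

```ruby
require "open3"

# Build the argument list for the search command shown above. Passing an
# array (rather than one shell string) keeps '*' from being expanded by a shell.
def node_search_argv(yamldir)
  ["puppet", "node", "search", "*",
   "--node_terminus", "yaml",
   "--clientyamldir", yamldir]
end

# Ask the master configuration where its YAML files live.
def master_yamldir
  out, status = Open3.capture2("puppet", "master", "--configprint", "yamldir")
  raise "could not determine yamldir" unless status.success?
  out.strip
end

# Run the search and return whatever puppet printed, without parsing it.
def search_nodes_raw
  out, status = Open3.capture2(*node_search_argv(master_yamldir))
  raise "puppet node search failed" unless status.success?
  out
end
```

Calling search_nodes_raw on the master would run both commands; nothing is executed when the methods are merely defined.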

#18 Updated by Anonymous over 3 years ago

  • Status changed from In Topic Branch Pending Review to Merged - Pending Release
  • Assignee deleted (Henrik Lindberg)

#19 Updated by eric sorenson over 3 years ago

  • Priority changed from High to Immediate

#20 Updated by Matthaus Owens over 3 years ago

  • Status changed from Merged - Pending Release to Closed

Released in Puppet 3.1.0-rc1
