The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

This issue tracker is now in read-only archive mode and automatic ticket export has been disabled. Redmine users will need to create a new JIRA account to file tickets using https://tickets.puppetlabs.com, which also has information on filing tickets with JIRA.

Feature #3669

Make puppet honor DNS SRV records

Added by Martin Marcher about 6 years ago. Updated over 3 years ago.

Status: Closed
Start date: 04/25/2010
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: 3.0.0
Affected Puppet version: development
Branch: https://github.com/jhelwig/puppet/tree/ticket/master/3669-make-puppet-honor-DNS-SRV-records
Keywords:



Description

I’d like to be able to define where puppet looks for the master server.

I propose the following:

By default try in the following order:

  1. Look for an “_x-puppet._tcp.example.com” SRV record (or any name that you think is appropriate, but keep it an SRV record)
  2. For backwards compatibility, if no SRV record is present, look for puppet.example.com as a fallback, or any value that is configured in the puppet config file
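A minimal sketch of that lookup order, using Ruby’s stdlib Resolv::DNS (the pick_server/lookup_srv names are illustrative, not Puppet code, and for brevity the sketch selects by priority only, ignoring weight):

```ruby
require 'resolv'

# Illustrative sketch of the proposed order: SRV record first, then the
# conventional hostname from the config file. Not actual Puppet code.
def lookup_srv(domain)
  Resolv::DNS.open do |dns|
    dns.getresources("_x-puppet._tcp.#{domain}",
                     Resolv::DNS::Resource::IN::SRV)
  end
end

def pick_server(srv_records, fallback_host, fallback_port = 8140)
  # Backwards compatible: no SRV record means use the configured host
  return [fallback_host, fallback_port] if srv_records.empty?
  # Lowest priority value wins per the SRV specification (weight ignored here)
  best = srv_records.min_by { |r| r.priority }
  [best.target.to_s, best.port]
end

# Usage: pick_server(lookup_srv("example.com"), "puppet.example.com")
```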

Reasoning:

A System Administrator can easily spread out the load over multiple puppet servers in this way or define some split horizon which answers with the “correct” hostname to use as a puppet master.

Thanks, Martin


Related issues

Related to Puppet - Bug #12623: Long timeout for SRV DNS resolution Accepted 02/14/2012
Duplicated by Puppet - Feature #4967: Support _srv dns records for locating puppet master and p... Duplicate 10/08/2010

History

#1 Updated by James Turnbull about 6 years ago

  • Status changed from Unreviewed to Needs Decision
  • Assignee set to Luke Kanies

Comments Luke?

#2 Updated by Martin Marcher about 6 years ago

Hi,

(DISCLAIMER: I haven’t looked at the puppet code at all!)

as for “design decision” I don’t think that much of the Puppet design is affected. I’d think the naive approach would be to:

  • let an admin configure it (say: “srv_record = _x-puppetcluster12”) to override the default
  • provide a default value to look for (regarding the DNS SRV)
  • if the “srv_record” config entry is not present and the DNS SRV lookup fails in any way (not configured, querying default value, not receiving results) proceed in the “old” way

To me that seems like:

  • people not using SRV records wouldn’t be affected at all
  • people that want the new feature can very simply upgrade.

Leads me to one open question:

Behaviour if srv_record and server are defined. Options are:

  • fail the puppet client – least desirable IMHO as one could easily break LOTS of machines
  • favor server over srv_record – just the opposite of the above. It would be the conservative option. But that doesn’t mean it’s a safe option. By still using the “old” server I could get unexpected results like erroneously decommissioning the puppet.example.org server and the clients still expecting it to exist
  • favor srv_record over server – also not desirable. I could for example simply add the srv_record config option and NOT remove the server. Since it’s a new feature I’d expect it to pick up and probably change the behaviour… (get config from a new master)

Another option:

  • don’t use 2 different config switches, rather introduce new syntax in the server config option – this would make the config itself slightly more “complex” but would ensure that there’s no possibility of confusion how puppet finds the host to connect to…

Examples:

server = puppet          # old style
server = SRV:_x-puppet   # new style

Personally I’d use either:

  • favor srv_record over server, or
  • new syntax in the server config option

#3 Updated by Luke Kanies about 6 years ago

  • Status changed from Needs Decision to Accepted
  • Assignee deleted (Luke Kanies)

#4 Updated by Ohad Levy almost 6 years ago

  • will this support weight as well (e.g. try local server first, fallback to a remote server)?
  • what happens on a failure (timeout, manifest error etc)

#5 Updated by Martin Marcher almost 6 years ago

Ohad Levy wrote:

  • will this support weight as well (e.g. try local server first, fallback to a remote server)?
  • what happens on a failure (timeout, manifest error etc)

In my opinion it should. Not supporting weight would mean that actually there’s no support for SRV records as it’s part of the SRV record definition…

#6 Updated by Ken Barber almost 6 years ago

Great feature request Martin.

what happens on a failure (timeout, manifest error etc)

I’d love to see this handled properly. This would obviate the need for LBs, removing some complexity by making the client do the work.

For example – equal weight/priority should have RR behaviour while failing back to lower priority when necessary.

Timeout and failed connection scenarios are obviously handled this way.

Manifest errors are deeper and changing server won’t guarantee success so probably fall outside of the scope of such a mechanism IMHO. Of course this could be an option in config for those who wanted this behaviour. This is the beauty of doing this in the client above just a basic LB – application awareness.

#7 Updated by Martin Marcher almost 6 years ago

I’d expect “standard” RR behaviour.

  • Honour the weight and priority of the SRV record:
      • same weight causes simple round-robin distribution
      • different weight causes requests to be distributed according to standard SRV weight behaviour
  • Failure of all SRV records is the same as now

The wikipedia entry about SRV records is quite complete and links the RFC: http://en.wikipedia.org/wiki/SRV_record – also mentions how weight and priority are supposed to work.
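As a hedged illustration of that weight/priority behaviour, here is a sketch of the RFC 2782 selection algorithm (not Puppet code; order_srv is a hypothetical name, and record objects only need priority and weight accessors):

```ruby
# Order SRV records per RFC 2782: lower priority values are tried first,
# and within a priority group records are drawn randomly in proportion
# to their weight. Illustrative sketch only.
def order_srv(records, rng = Random.new)
  ordered = []
  records.group_by { |r| r.priority }.sort.each do |_priority, group|
    group = group.shuffle(random: rng)
    until group.empty?
      total = group.sum { |r| r.weight }
      pick = rng.rand(total + 1)  # 0..total, so zero-weight entries can still win
      running = 0
      idx = group.index { |r| (running += r.weight) >= pick } || 0
      ordered << group.delete_at(idx)
    end
  end
  ordered
end
```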

I’d even say: do not try to fall back to _x-puppet SRV record (or the default puppet A/CNAME record). Just fail the client and do nothing.

Manifest errors are deeper and changing server won’t guarantee success so probably fall outside of the scope of such a mechanism IMHO. Of course this could be an option in config for those who wanted this behaviour. This is the beauty of doing this in the client above just a basic LB – application awareness.

I agree if you load balance between servers by using DNS records and you have different manifests on them (working + non-working) you have a huge problem where load balancing can’t help you in any way. Manifest errors are just that: Manifest errors (read: actually code bugs)

#8 Updated by Silviu Paragina almost 6 years ago

The main problem, for load balancing on fail, aren’t the manifests but related files. I’ve noticed that sometimes loading facts and/or files fails despite the manifest being ok. I haven’t had the time to investigate that issue further, to check if it’s a timeout or something else, but this may lead to some nasty problems, when, for example, a fact isn’t loaded.

So imho:

  • the client should be able to find out if the error is a puppet server error (no memory, timeout etc) or a configuration one (missing files or any other type of user/administrator error)
  • I’m not sure if the client should send all the requests in a puppet run to a single server or should load balance each one. Load balancing each request sounds great, but in fact may do more damage than good as it adds an authentication overhead for each request (just wanted to point this out in any case).
  • if the client encounters a puppet server error it should try to repeat the request on another server.

Is this kind of behavior implementable without changing half of Puppet’s infrastructure, or is it far-fetched? Or might it lead to bad behavior I have not foreseen?

Maybe this should be a different ticket, but the note above about the manifest code not being the same made me think about the synchronization issues between servers. Most notably, there should be some type of manifest version checking on the client, in case, say, a “git pull” fails on only one server. If I remember right, the client can currently tell the version of the currently applied manifest.

#9 Updated by Luke Kanies almost 6 years ago

  • Target version set to 2.7.x

We’ve got a few related tickets that are already slated for Statler, so I’ll throw this in there for now, and we’ll see if we can get it done then.

#10 Updated by Andrew Forgue over 5 years ago

I hacked a bit tonight on this for a proof of concept. It works for me, but I haven’t done any tests or tried to break it. I may be going about it the wrong way with regard to the indirectors (currently I only modified the REST indirector).

I have a branch on github, here

It uses Resolv::DNS to find SRV records for the host specified as Puppet[:server] and just blindly overrides masterport with the one provided in DNS. If no SRV records are found, it just passes the supplied hostname up so it’ll check A/CNAME records. I defaulted to _puppet._tcp.domain, where domain is the domain supplied by facter. It supports the priority and weighting functionality of the SRV spec.

Here’s some output of a sample run:

  debug: Loaded state in 0.00 seconds
  debug: Searching for SRV records for _puppet._tcp.bosboot.com
  debug: Found 4 SRV records.
  debug: Yielding next server of puppet1.bosboot.com:8140
  debug: Using cached certificate for ca
  debug: Using cached certificate for centauri.bosboot.com
  debug: Using cached certificate_revocation_list for ca
  debug: catalog supports formats: b64_zlib_yaml dot marshal pson raw yaml; using pson
  warning: Error connecting to puppet1.bosboot.com:8140: No route to host - connect(2)
  debug: Remaining servers: 3
  debug: Yielding next server of puppet3.bosboot.com:8140
  debug: catalog supports formats: b64_zlib_yaml dot marshal pson raw yaml; using pson
  warning: Error connecting to puppet3.bosboot.com:8140: No route to host - connect(2)
  debug: Remaining servers: 2
  debug: Yielding next server of puppet2.bosboot.com:8140
  debug: catalog supports formats: b64_zlib_yaml dot marshal pson raw yaml; using pson
  warning: Error connecting to puppet2.bosboot.com:8140: No route to host - connect(2)
  debug: Remaining servers: 1
  debug: Yielding next server of centauri.bosboot.com:8140
  debug: catalog supports formats: b64_zlib_yaml dot marshal pson raw yaml; using pson
  info: Caching catalog for centauri.bosboot.com
  debug: Creating default schedules
  debug: Loaded state in 0.00 seconds
  info: Applying configuration version '1292147144'
  debug: Finishing transaction 70221121596680
  debug: Storing state
  debug: Stored state in 0.00 seconds
  notice: Finished catalog run in 0.01 seconds

I’ll post something to the -dev list tomorrow, but I’m going to bed now. Any comments or input (your code sucks, etc) are appreciated.

#11 Updated by Andrew Forgue over 5 years ago

I messed with it some more and made the file resource type able to use SRV records. I also wrote some spec tests for it.

Here’s how it works: it modifies the REST indirector and the file resource type to resolve the host via a new Puppet::Network::Resolver class. This class does a DNS query for SRV record types. If it finds none, it just returns the hostname and port specified, for other libraries to resolve the A record. Otherwise, it will yield a list of hostnames, ports, and the number of hosts remaining (to decide whether or not to hard fail). The basic way to connect to a host is:

Puppet::Network::Resolver.servers(hostname, port) do |server, port, remaining|
  begin
    # .. connect to host ..
  rescue SystemCallError => e
    # We only want to catch things like connection refused, no route to host, etc.
    # Re-raise the last exception if there are no more hosts
    raise unless remaining > 0
  end
end
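For context, a servers-style iterator along these lines might look roughly like this (a sketch under assumptions, not Andrew’s actual implementation; each_server is a hypothetical name, and weight handling is omitted):

```ruby
# Yield each candidate (host, port) in SRV priority order, plus the count
# of hosts remaining, falling back to the plain hostname when no SRV
# records exist. Hypothetical sketch, not the real Puppet::Network::Resolver.
def each_server(srv_records, fallback_host, fallback_port)
  candidates =
    if srv_records.empty?
      [[fallback_host, fallback_port]]
    else
      srv_records.sort_by { |r| r.priority }.map { |r| [r.target.to_s, r.port] }
    end
  candidates.each_with_index do |(host, port), i|
    yield host, port, candidates.length - i - 1  # hosts left after this one
  end
end
```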

Some questions about the behavior that I have and need input for:

  1. Should the client ‘stick’ to the first successful server from the SRV records, or should it find a new one for every request?
  2. The default host should be _puppet._tcp.$domain, is this acceptable? It’s what the RFC seems to say. I didn’t actually change the default.
  3. Is my weighting method accurate?
  4. Is there anything that connects somewhere else besides the rest indirector and file type that should use SRV records?

I haven’t modified any configuration variables, since I would think that the host used in the server configuration variable can be either an A record or an SRV record, with SRV records taking precedence: if you have them, you obviously set them up on purpose. I’ll try this in my environment this week before submitting the patch to -dev, but for now you can get it from my github branch: https://github.com/ajf/puppet/commits/feature%2F2.6.x%2F3669

#12 Updated by Anonymous about 5 years ago

  • Assignee set to Anonymous

We have some work in progress integrating this with the next branch, but it is not yet complete. This is required because there has been some API drift since, and because testing identified some minor defects in the algorithms for managing server selection. Further work is required to validate the changes to file serving before this is able to merge.

The WIP branch is available here: https://github.com/daniel-pittman/puppet/tree/feature/next/3669-make-puppet-honor-DNS-SRV-records

Be warned that this will have history rewritten; if you intend to base code off this please communicate with us. :)

#13 Updated by Matt Robinson about 5 years ago

  • Assignee changed from Anonymous to Jacob Helwig

Jacob and Jesse did some review of this code and have comments that need to be reflected in this ticket, so I’m assigning this to Jacob.

#14 Updated by Jacob Helwig about 5 years ago

  • Assignee deleted (Jacob Helwig)
  • Branch set to https://github.com/jhelwig/puppet/tree/ticket/next/3669-make-puppet-honor-DNS-SRV-records

My branch has another “Work-In-Progress” commit on top of the work Daniel had done on the branch. My WIP commit fixes a couple of missed method renames, and adjusts the weighted shuffle used for picking servers to better match what is described in RFC 2782 for handling 0 weighted entries.

There is still some work left to do that Jesse and I discovered in our review of the branch.

SRV record support shouldn’t be turned off if you specify a server to use, since you should be able to specify a fall-back server to use other than the default of “puppet”. Also, the CA, reports, and file-serving look like they would be broken by this, unless you had “puppet” resolvable by DNS.

We should probably open up discussion about whether the full SRV record name should be specified in the setting, or just the domain to use for the SRV record. Jesse and I are of the opinion that only the domain should be specified in the config.

It would be really nice to have SRV services for _puppet_ca, _puppet_report, and _puppet_fileserver (all of which fall back to _puppet, then the server setting), though these don’t seem absolutely necessary for a first-round feature, provided you can still adjust the server setting when SRV record support is enabled.

#15 Updated by Jacob Helwig about 5 years ago

  • Status changed from Accepted to Code Insufficient

#16 Updated by Jacob Helwig almost 5 years ago

  • Assignee set to Jacob Helwig

#17 Updated by Jacob Helwig almost 5 years ago

  • Status changed from Code Insufficient to In Topic Branch Pending Review

I’ve had some time to work on my WIP branch, and address the concerns that were raised previously. I’d appreciate it if people could test this branch out, and provide feedback.

(#3669) Find servers via DNS SRV records

This adds two new configuration variables:

  * use_srv_records: Whether to attempt to look up servers via SRV
                     records on srv_domain (default: true)

  * srv_domain: The domain that will be queried for SRV records,
                (default: $domain)

If use_srv_records is set to true, then Puppet will attempt to find
the list of servers to use from SRV records on the domain specified
via srv_domain.  The CA, report, and file servers can all be specified
via independent SRV records from the SRV records to use for looking up
the catalog server.

The SRV records must be for hosts in the form:

  _puppet._tcp.$srv_domain
  _puppet_ca._tcp.$srv_domain
  _puppet_report._tcp.$srv_domain
  _puppet_fileserver._tcp.$srv_domain

If no records are found for the _puppet_ca, _puppet_report, or
_puppet_fileserver services, then the SRV records for the _puppet
service will be used.  However, if records exist for any of the more
specific services, Puppet will not attempt to use the _puppet service
to find an applicable server.

If Puppet is unable to connect to any of the servers specified in the
SRV records, then it will attempt to connect to the "normal" servers
settable via puppet.conf.
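The per-service fallback described in this commit message could be sketched roughly like this (hypothetical names; lookup stands in for whatever performs the actual DNS SRV query):

```ruby
# Map Puppet services to the SRV service names from the commit message above.
SERVICE_NAMES = {
  :ca         => '_puppet_ca',
  :report     => '_puppet_report',
  :fileserver => '_puppet_fileserver',
  :catalog    => '_puppet',
}

# A specific service falls back to the generic _puppet records when it
# has none of its own; an empty result here would then fall through to
# the "normal" server settings from puppet.conf. Illustrative sketch only.
def servers_for(service, srv_domain, lookup)
  name = SERVICE_NAMES.fetch(service, '_puppet')
  records = lookup.call("#{name}._tcp.#{srv_domain}")
  if records.empty? && name != '_puppet'
    records = lookup.call("_puppet._tcp.#{srv_domain}")
  end
  records
end
```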

#18 Updated by Martin Marcher almost 5 years ago

  • Status changed from Code Insufficient to In Topic Branch Pending Merge

I love you guys :)

#19 Updated by Jacob Helwig almost 5 years ago

  • Target version changed from 2.7.x to 3.x

Daniel reminded me that the service name goes off of Assigned Numbers. Unless someone goes through the trouble of registering those services with IANA, we’ll need to keep the s/_puppet/_x-puppet/ on the service names (also replacing the internal _ with -).

I’ve updated the branch to reflect this, and the updated SRV hostnames to use would follow the form:

_x-puppet._tcp.$srv_domain
_x-puppet-ca._tcp.$srv_domain
_x-puppet-report._tcp.$srv_domain
_x-puppet-fileserver._tcp.$srv_domain

#20 Updated by Luke Kanies almost 5 years ago

Isn’t this released now?

#21 Updated by Jacob Helwig almost 5 years ago

Not yet. It hasn’t been reviewed & merged.

#22 Updated by Nigel Kersten over 4 years ago

Jacob, should we take this to the community to try and get more testing?

I’d love to see some feedback from folks actually running with SRV records in the real world.

#23 Updated by Jacob Helwig over 4 years ago

  • Status changed from In Topic Branch Pending Review to Code Insufficient

Yes, but not quite yet. It needs to be updated with all the changes that have happened in master since it was last worked on; it’s currently causing all tests to fail once it’s rebased onto the tip of master. Also, the code currently in the branch will cause problems for people who explicitly specify a master in their puppet://[$master]/$path sources, since it will ignore the specified master. Once these two issues have been resolved, I completely agree that having more public testing of it would be great.

#24 Updated by Jacob Helwig over 4 years ago

It looks like commit:f898749946195cc4e27c7502b07a25aa57e5abd9 is where things start failing across the board. Will be investigating further.

#25 Updated by Jacob Helwig over 4 years ago

  • Status changed from Code Insufficient to In Topic Branch Pending Review
  • Branch changed from https://github.com/jhelwig/puppet/tree/ticket/next/3669-make-puppet-honor-DNS-SRV-records to https://github.com/jhelwig/puppet/tree/ticket/master/3669-make-puppet-honor-DNS-SRV-records

Test failures, and overriding specified servers in puppet://$master/... URIs should be resolved now.

Opened pull request 163.

#26 Updated by Josh Cooper over 4 years ago

So, just to document some decisions we’ve reconfirmed:

  1. The functionality is enabled by default, otherwise, it’s not zero conf. The behavior can be disabled by setting the following on the agent: use_srv_records=false

  2. SRV lookups are chatty, partly because a new Puppet::Network::Resolver is created for every REST terminus method called, e.g. find. And sometimes the same method is called multiple times for a single request. I would have thought the OS would cache SRV lookups in the same way that it caches A records, but this does not appear to be the case. The output below shows the agent trying to find the ca certificate and in the process issuing 3 SRV lookups:

$ sudo tcpdump -i en1 'udp port 53'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en1, link-type EN10MB (Ethernet), capture size 65535 bytes
17:02:53.490502 IP puppetmaster.31993 > imana.puppetlabs.lan.domain: 32207+ SRV? _x-puppet-ca._tcp.perlninja.com. (49)
17:02:53.536354 IP imana.puppetlabs.lan.domain > puppetmaster.31993: 32207 1/1/0 CNAME jacob.ath.cx. (136)
17:02:53.539332 IP puppetmaster.11017 > imana.puppetlabs.lan.domain: 45795+ SRV? _x-puppet._tcp.perlninja.com. (46)
17:02:53.554410 IP imana.puppetlabs.lan.domain > puppetmaster.11017: 45795 1/0/0 SRV sfa-5.perlninja.com.:8140 20 0 (85)
17:02:53.558851 IP puppetmaster.57268 > imana.puppetlabs.lan.domain: 46337+ AAAA? jacob.ath.cx. (30)
17:02:53.637247 IP imana.puppetlabs.lan.domain > puppetmaster.57268: 46337 0/1/0 (91)
17:02:53.891668 IP puppetmaster.58364 > imana.puppetlabs.lan.domain: 64554+ A? jacob.ath.cx. (30)
17:02:53.944065 IP imana.puppetlabs.lan.domain > puppetmaster.58364: 64554 1/0/0 A 50.53.17.227 (46)
17:02:54.049313 IP puppetmaster.64864 > imana.puppetlabs.lan.domain: 15898+ PTR? 1.100.168.192.in-addr.arpa. (44)
17:02:54.050980 IP imana.puppetlabs.lan.domain > puppetmaster.64864: 15898* 1/0/0 PTR imana.puppetlabs.lan. (78)
17:02:54.737792 IP puppetmaster.16716 > imana.puppetlabs.lan.domain: 18454+ SRV? _x-puppet-ca._tcp.perlninja.com. (49)
17:02:54.753520 IP imana.puppetlabs.lan.domain > puppetmaster.16716: 18454 1/1/0 CNAME jacob.ath.cx. (136)
17:02:54.755431 IP puppetmaster.39986 > imana.puppetlabs.lan.domain: 17417+ SRV? _x-puppet._tcp.perlninja.com. (46)
17:02:54.771359 IP imana.puppetlabs.lan.domain > puppetmaster.39986: 17417 1/0/0 SRV sfa-5.perlninja.com.:8140 20 0 (85)
17:02:55.155234 IP puppetmaster.17065 > imana.puppetlabs.lan.domain: 64870+ SRV? _x-puppet-ca._tcp.perlninja.com. (49)
17:02:55.198347 IP imana.puppetlabs.lan.domain > puppetmaster.17065: 64870 1/1/0 CNAME jacob.ath.cx. (136)
17:02:55.200282 IP puppetmaster.37872 > imana.puppetlabs.lan.domain: 56142+ SRV? _x-puppet._tcp.perlninja.com. (46)
17:02:55.215672 IP imana.puppetlabs.lan.domain > puppetmaster.37872: 56142 1/0/0 SRV sfa-5.perlninja.com.:8140 20 0 (85)

This does mean that within a single transaction, the agent could connect to different file servers. If this becomes an issue, we can investigate creating one resolver per request or per transaction (somewhat analogous to load-balancer affinity/stickiness).
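The one-resolver-per-transaction idea could be sketched as a simple memoizing wrapper (hypothetical, not actual Puppet code):

```ruby
# Cache SRV answers for the lifetime of one resolver instance, so that
# repeated indirector calls within a run reuse the first DNS answer
# instead of re-querying. Illustrative sketch only.
class CachingResolver
  def initialize(&lookup)
    @lookup = lookup  # e.g. a block wrapping a Resolv::DNS SRV query
    @cache  = {}
  end

  def records_for(name)
    @cache.fetch(name) { @cache[name] = @lookup.call(name) }
  end
end
```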

#27 Updated by Josh Cooper over 4 years ago

  • Status changed from In Topic Branch Pending Review to Merged - Pending Release

This was merged in commit:26ce9c79672d578e9aa03d8341d8c315fcf30c8b in master https://github.com/puppetlabs/puppet/commit/26ce9c79672d578e9aa03d8341d8c315fcf30c8b

NOTE: make sure when documenting this feature that it is enabled by default, in the sense that agents will make additional SRV DNS queries, but will gracefully fall back to the old behavior, e.g. connecting to Puppet[:server]. You must set use_srv_records=false in order to disable DNS SRV lookups.

#28 Updated by Trevor Vaughan about 4 years ago

Any guesses as to when this will be in a full release?

Thanks.

#29 Updated by Anonymous about 4 years ago

Trevor Vaughan wrote:

Any guesses as to when this will be in a full release?

It is in Telly, which we expect to release Q2 this year.

#30 Updated by Ben Hughes about 4 years ago

  • Status changed from Merged - Pending Release to Re-opened
  • Assignee deleted (Jacob Helwig)

I can’t seem to get this to work.

Running head, 32172d5:

[root@hackday:puppet]# ./ext/envpuppet puppet agent -t -v
info: Retrieving plugin
err: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate: getaddrinfo: Name or service not known
err: /File[/var/lib/puppet/lib]: Could not evaluate: getaddrinfo: Name or service not known Could not retrieve file metadata for puppet://puppet/plugins: getaddrinfo: Name or service not known

and the DNS side of things:

[root@hackday:puppet]# tcpdump -ntpi eth0 -v port 53
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
IP (tos 0x0, ttl 64, id 4940, offset 0, flags [DF], proto UDP (17), length 71)
    10.0.1.135.33341 > 10.0.1.20.53: 45256+ A? puppet.example.org. (43)
IP (tos 0x0, ttl 64, id 11813, offset 0, flags [none], proto UDP (17), length 116)
    10.0.1.20.53 > 10.0.1.135.33341: 45256 NXDomain* 0/1/0 (88)
IP (tos 0x0, ttl 64, id 4941, offset 0, flags [DF], proto UDP (17), length 71)
    10.0.1.135.33341 > 10.0.1.20.53: 50664+ AAAA? puppet.example.org. (43)
IP (tos 0x0, ttl 64, id 11814, offset 0, flags [none], proto UDP (17), length 116)
    10.0.1.20.53 > 10.0.1.135.33341: 50664 NXDomain* 0/1/0 (88)
IP (tos 0x0, ttl 64, id 4941, offset 0, flags [DF], proto UDP (17), length 52)
    10.0.1.135.42901 > 10.0.1.20.53: 57120+ A? puppet. (24)
IP (tos 0x0, ttl 64, id 4942, offset 0, flags [DF], proto UDP (17), length 52)
    10.0.1.135.42901 > 10.0.1.20.53: 12381+ AAAA? puppet. (24)
IP (tos 0x0, ttl 64, id 11815, offset 0, flags [none], proto UDP (17), length 127)
    10.0.1.20.53 > 10.0.1.135.42901: 57120 NXDomain 0/1/0 (99)
IP (tos 0x0, ttl 64, id 11816, offset 0, flags [none], proto UDP (17), length 127)
    10.0.1.20.53 > 10.0.1.135.42901: 12381 NXDomain 0/1/0 (99)

My puppet.conf even has the following just in case:

[agent]
use_srv_records = true

Am I doing it wrong?

#31 Updated by Nigel Kersten about 4 years ago

Ben, any chance you could roll back master to the original commit from Jacob and see whether that works? Then we can whip up a quick git-bisect test to find out when we broke it.

#32 Updated by Josh Cooper about 4 years ago

When issuing a request to a specific server, e.g. puppet://server/path, we do not perform SRV lookups https://github.com/puppetlabs/puppet/commit/26ce9c79672d578e9aa03d8341d8c315fcf30c8b#diff-4

  # We were given a specific server to use, so just use that one.
  # This happens if someone does something like specifying a file
  # source using a puppet:// URI with a specific server.

However, the default pluginsource is puppet://$server/plugins, which explains why the SRV lookups are being skipped. I didn’t notice this earlier, because pluginsync used to be off by default.

With that said, I’m not sure what the correct thing to do here is. Omitting the server from the URI, e.g. puppet:///plugins, will likely allow pluginsync to use SRV lookups, but factsource will have the same problem.

#33 Updated by Ben Hughes about 4 years ago

Excuse my ignorance, but isn’t this before pluginsync? This is just talking to the master to try to get anything? Before it reads its catalogue and the likes?

#34 Updated by Nigel Kersten about 4 years ago

FWIW ‘factsource’ should be deprecated, and I’d actually thought it already was… ?

pluginsync replaces factsource as a more general solution for delivering plugins of all kinds, not just facts.

As pluginsync happens first, it should use the SRV records if no server has been explicitly specified by the user.

#35 Updated by Ben Hughes about 4 years ago

Checking out 26ce9c79672d578e9aa03d8341d8c315fcf30c8b and trying it doesn’t yield any different results for me. So I must be doing something wrong I imagine.

#36 Updated by Ben Hughes about 4 years ago

Ah-ha! I can’t use env puppet:

[root@hackday:puppet]# tcpdump -ntpi eth0 -v port 53 | grep SRV
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
    10.0.1.135.28940 > 10.0.1.20.53: 15878+ SRV? _x-puppet._tcp.example.org. (44)
    10.0.1.135.28940 > 10.0.1.20.53: 35005+ SRV? _x-puppet._tcp.example.org.example.org. (56)
    10.0.1.135.32001 > 10.0.1.20.53: 53744+ SRV? _x-puppet-report._tcp.example.org. (51)
    10.0.1.135.32001 > 10.0.1.20.53: 18766+ SRV? _x-puppet-report._tcp.example.org.example.org. (63)
    10.0.1.135.36640 > 10.0.1.20.53: 45958+ SRV? _x-puppet._tcp.example.org. (44)
    10.0.1.135.36640 > 10.0.1.20.53: 43641+ SRV? _x-puppet._tcp.example.org.example.org. (56)

With the 26ce9c79672d578e9aa03d8341d8c315fcf30c8b code base.

and with head:

[root@hackday:puppet]# tcpdump -ntpi eth0 -v port 53 | grep SRV 
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
    10.0.1.135.37645 > 10.0.1.20.53: 51039+ SRV? _x-puppet._tcp.example.org. (44)
    10.0.1.135.37645 > 10.0.1.20.53: 360+ SRV? _x-puppet._tcp.example.org.example.org. (56)
    10.0.1.135.4455 > 10.0.1.20.53: 4292+ SRV? _x-puppet-report._tcp.example.org. (51)
    10.0.1.135.4455 > 10.0.1.20.53: 24676+ SRV? _x-puppet-report._tcp.example.org.example.org. (63)
    10.0.1.135.39931 > 10.0.1.20.53: 6178+ SRV? _x-puppet._tcp.example.org. (44)
    10.0.1.135.39931 > 10.0.1.20.53: 24194+ SRV? _x-puppet._tcp.example.org.example.org. (56)

Putting back in the SRV record…

[root@hackday:~]# bash  ~/src/puppet/ext/envpuppet puppet agent  -t                                                                                                               
info: Retrieving plugin
err: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate: getaddrinfo: Name or service not known
err: /File[/var/lib/puppet/lib]: Could not evaluate: getaddrinfo: Name or service not known Could not retrieve file metadata for puppet://puppet/plugins: getaddrinfo: Name or service not known
info: Loading facts in /var/lib/puppet/lib/facter/myfact.rb
 info: Caching catalog for hackday.example.org
....

and see

[root@hackday:puppet]# tcpdump -ntpi eth0 -v port 53 | grep SRV 
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
   10.0.1.135.49497 > 10.0.1.20.53: 14566+ SRV? _x-puppet._tcp.example.org. (51)
   10.0.1.20.53 > 10.0.1.135.49497: 14566* 1/1/2 _x-puppet._tcp.example.org. SRV puppetmaster.example.com.:8140 0 0 (144)
   10.0.1.135.60249 > 10.0.1.20.53: 47303+ SRV? _x-puppet-report._tcp.example.org. (58)

#37 Updated by Anonymous almost 4 years ago

  • Status changed from Re-opened to Closed
  • Target version changed from 3.x to 3.0.0

#38 Updated by eric sorenson over 3 years ago

Since this is the original bug for SRV record support, I’ll note here that, contra Josh’s comments 26 and 27, SRV record lookup is disabled by default in 3.0. It’s true that this makes it not zero-config, but there were a raft of issues, including substantial slowdowns when the records are not available.

https://github.com/puppetlabs/puppet/commit/ac1b9d55a95f27a54ff924081d68687792dbd194
