The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #13244

Class and stage dependencies lead to expensive graph computation

Added by Oliver Hookins about 2 years ago. Updated almost 2 years ago.

Status: Needs Decision
Start date: 03/20/2012
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: stages
Target version: -
Affected Puppet version: 2.6.7
Branch:
Keywords:

Description

It’s an odd problem and I’m not 100% sure of the cause, but I’m throwing it out there in case someone knows for sure that it has been fixed, or thinks it sounds like a legitimate bug. Gathering repeatable data for testing may not be straightforward, as the problem is intrinsically linked to our code, which I can’t release.

We have three classes and three stages which seem to be playing a role:

Classes:

  • common
  • yum::base
  • platform::users (has a ‘require’ on common)

Stages:

  • common (contains the common and platform::users classes)
  • yum (contains the yum::base class)
  • main (contains everything else)

The stage ordering is this: Stage['yum'] -> Stage['common'] -> Stage['main']
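For anyone trying to picture the layout, here is a minimal sketch of the setup described above. It is not the reporter's code: the class bodies are hypothetical placeholders, and the 'require' is assumed to be the require metaparameter on the class declaration (the ticket does not say whether it was the metaparameter or the require function).

```puppet
# Sketch only: class bodies are hypothetical placeholders, not the real code.
stage { 'yum': }
stage { 'common': }

# Stage ordering as described in the ticket ('main' exists by default).
Stage['yum'] -> Stage['common'] -> Stage['main']

class common {
  notify { 'common placeholder': }
}

class yum::base {
  notify { 'yum::base placeholder': }
}

class platform::users {
  notify { 'platform::users placeholder': }
}

class { 'yum::base': stage => 'yum' }
class { 'common':    stage => 'common' }

# Assumed form of the class-level dependency discussed in the ticket.
class { 'platform::users':
  stage   => 'common',
  require => Class['common'],
}
```

Dropping the `require => Class['common']` line gives the variant without the class dependency.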

When this scenario is in place, we see an extremely long processing time on the client before any resources are run (perhaps 10 minutes). The run itself takes about 45 minutes.

When we remove the require on common from platform::users, this problem goes away and the catalog run takes about 20 seconds.

For some comparison, in the former case the extended resource graph output size is 11MB and in the latter about 800KB. I can’t attach them right now for confidentiality reasons but if a test scenario can be generated of course I’ll attach all output.

Does this sound like a bug?

foo.pp (104 KB) Oliver Hookins, 03/28/2012 12:13 pm

History

#1 Updated by Daniel Pittman about 2 years ago

  • Status changed from Unreviewed to Needs More Information

I can’t say for sure that it is a bug qua bug, but it is absolutely something we would like to fix if at all possible. If you could cut down the data so that we could have access to at least the catalogs, and ideally the input manifests, that would be awesome. If you can’t do the cutting but might entertain giving us private access to the data, we might be able to work something out WRT confidentiality.

#2 Updated by Oliver Hookins about 2 years ago

Actually it is very easy to duplicate – see the attached file.

Everything in the ‘main’ stage:

$ puppet apply foo.pp 
warning: Could not retrieve fact fqdn
notice: Finished catalog run in 5.11 seconds

With class ‘a’ in stage ‘pre’ which is set to come before ‘main’:

$ puppet apply foo.pp 
warning: Could not retrieve fact fqdn
notice: Finished catalog run in 1224.36 seconds

During this time, Puppet is simply chewing CPU cycles (assembling/traversing/something the graph, I’d imagine).

It seems like run stages are just computationally expensive by nature, and when the number of resources grows over a few hundred we really start to notice the scaling problems.
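The two configurations compared in this comment have roughly the following shape. This is a hypothetical stand-in, not the attached foo.pp (its contents are not reproduced here); class 'b' and all the notify resources are invented placeholders, and a real reproduction would need a few hundred resources rather than four.

```puppet
# Hypothetical stand-in for the attached foo.pp; resource names are invented.
stage { 'pre': before => Stage['main'] }

class a {
  # Placeholder resources; the slowdown was noticed with a few hundred.
  notify { 'a-1': }
  notify { 'a-2': }
}

class b {
  notify { 'b-1': }
  notify { 'b-2': }
}

# Slow variant: class 'a' is assigned to stage 'pre'.
class { 'a': stage => 'pre' }

# Everything else stays in the default 'main' stage.
include b
```

For the "everything in the ‘main’ stage" variant, the `class { 'a': stage => 'pre' }` declaration would be replaced with a plain `include a`.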

#3 Updated by Oliver Hookins about 2 years ago

FWIW it actually looks like the problem is at least partially mitigated in 2.7.12 (the only 2.7.x version I’ve tested with):

No extra run stages:

$ puppet apply foo.pp 
warning: Could not retrieve fact fqdn
notice: Finished catalog run in 7.52 seconds

Separate run stages:

$ puppet apply foo.pp 
warning: Could not retrieve fact fqdn
notice: Finished catalog run in 25.64 seconds

#4 Updated by Oliver Hookins about 2 years ago

  • Status changed from Needs More Information to Needs Decision

#5 Updated by eric sorenson almost 2 years ago

  • Category set to stages

Sweeping stage-related bugs. Oliver, please update if you’ve got further information on this (i.e. is it still affecting you in late 2.7.x, where x >= 15?)
