Bug #13244
Class and stage dependencies lead to expensive graph computation
| Status: | Needs Decision | Start date: | 03/20/2012 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | stages | | |
| Target version: | - | | |
| Affected Puppet version: | 2.6.7 | Branch: | |
| Keywords: | | | |
Description
It’s an odd problem and I’m not 100% sure of the cause, but I’m throwing it out there in case someone knows for sure it has been fixed (or if it sounds like it could be a legitimate bug). Gathering repeatable data for testing may not be straightforward, as it’s intrinsically linked to our code, which I can’t release.
We have three classes and three stages which seem to be playing a role:
Classes:
- common
- yum::base
- platform::users (has a ‘require’ on common)

Stages:
- common (contains the common and platform::users classes)
- yum (contains the yum::base class)
- main (contains everything else)

The stage ordering is this: Stage['yum'] -> Stage['common'] -> Stage['main']
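For readers who want to picture the layout, here is a minimal sketch of a manifest with the same shape. It is not the reporter's code: the class bodies are hypothetical placeholders, and the `require` is written as a metaparameter on the class declaration, which is an assumption (the original may have used the `require` function inside the class instead). With only a handful of placeholder resources it mirrors the structure but should not be expected to show the slowdown by itself.

```puppet
# Hypothetical stand-ins for the real (unreleased) classes.
class common {
  notify { 'common placeholder': }
}

class yum::base {
  notify { 'yum::base placeholder': }
}

class platform::users {
  notify { 'platform::users placeholder': }
}

# Stage ordering from the report: yum, then common, then main.
# Stage['main'] exists by default and needs no declaration.
stage { 'yum':    before => Stage['common'] }
stage { 'common': before => Stage['main'] }

# Assign classes to stages as described above.
class { 'yum::base': stage => 'yum' }
class { 'common':    stage => 'common' }

# Assumption: the 'require' on common expressed as a metaparameter.
class { 'platform::users':
  stage   => 'common',
  require => Class['common'],
}
```

Dropping the `require => Class['common']` line corresponds to the fast case described below, since both classes are still ordered relative to the other stages by their stage assignment.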
When this scenario is in place, we see an extremely long processing time on the client before any resources are run (perhaps 10 minutes). The run itself takes about 45 minutes.
When we remove the require from platform::users to common, this problem goes away and the catalog run takes about 20 seconds.
For some comparison, in the former case the extended resource graph output size is 11MB and in the latter about 800KB. I can’t attach them right now for confidentiality reasons but if a test scenario can be generated of course I’ll attach all output.
Does this sound like a bug?
History
#1
Updated by Daniel Pittman about 1 year ago
- Status changed from Unreviewed to Needs More Information
I can’t say for sure that it is a bug qua bug, but it is absolutely something we would like to fix if at all possible. If you could cut down the data so that we could have access to at least the catalogs, and ideally the input manifests, that would be awesome. If you can’t do the cutting yourself but would consider giving us private access to the data, we might be able to work something out WRT confidentiality.
#2
Updated by Oliver Hookins about 1 year ago
- File foo.pp added
Actually it is very easy to duplicate – see the attached file.
Everything in the ‘main’ stage:

    $ puppet apply foo.pp
    warning: Could not retrieve fact fqdn
    notice: Finished catalog run in 5.11 seconds

With class ‘a’ in stage ‘pre’ which is set to come before ‘main’:

    $ puppet apply foo.pp
    warning: Could not retrieve fact fqdn
    notice: Finished catalog run in 1224.36 seconds

During this time, Puppet is simply chewing CPU cycles (assembling/traversing/something the graph, I’d imagine).
It seems like run stages are just computationally expensive by nature, and when the number of resources grows over a few hundred we really start to notice the scaling problems.
#3
Updated by Oliver Hookins about 1 year ago
FWIW it actually looks like the problem is at least partially mitigated in 2.7.12 (only version I’ve tested with in 2.7.x):
No extra run stages:

    $ puppet apply foo.pp
    warning: Could not retrieve fact fqdn
    notice: Finished catalog run in 7.52 seconds

Separate run stages:

    $ puppet apply foo.pp
    warning: Could not retrieve fact fqdn
    notice: Finished catalog run in 25.64 seconds

#4
Updated by Oliver Hookins about 1 year ago
- Status changed from Needs More Information to Needs Decision
#5
Updated by eric sorenson 12 months ago
- Category set to stages
Sweeping stage-related bugs. Oliver, please update if you’ve got further information on this (i.e. is it still affecting you in late 2.7.x, where x >= 15?)