The Puppet Labs Issue Tracker has Moved:

Bug #14759

state.yaml ballooning when using dictionary as resource $name

Added by Rich Rauenzahn almost 3 years ago. Updated over 2 years ago.

Status: Needs More Information
Start date: 05/30/2012
Priority: Normal
Due date:
Assignee: Rich Rauenzahn
% Done:
Target version: -
Affected Puppet version:
Branch:


We had some puppet processes ballooning and I spent some time trying to figure out why. Turns out our state.yaml is something like 500MB after the run.

Examining the state.yaml shows a particular resource ballooning that we use for iterating over a moderately large list of dictionaries.

We use a custom function to generate a list of dictionaries and pass that list to the resource. The resource picks apart each dictionary, filters on some keys, and creates other resources.

I verified the custom function is behaving appropriately: it creates around 180 items. The state.yaml balloons that to about 400,000 entries, a kind of Cartesian product, I think.

oversimplified example:

$list = generate_list()

define process_list {
   if $list['foo'] == 'xxxx' {
      file { ... }
   }
}

What I found in the state.yaml was each element (a dict) of $list stringified and listed as Process_List[] multiple times, but with the dictionary's keys in a different order each time, so the same entry was duplicated under different stringifications of the dict.

My guess is that internally Puppet is stringifying the dictionary in multiple locations and this causes duplicate resources to be defined when creating the state to be saved in the state.yaml (I don’t think it is doing that for determining resources to be created; otherwise we’d be seeing duplication errors).
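The suspected mechanism can be illustrated outside Puppet. The sketch below is not Puppet's code: the `state` dictionary is a hypothetical stand-in for a store keyed by resource title, and the titles are built by plain string formatting. It assumes Python 3.7 or later, where a dict's string form follows insertion order. It shows only that two equal dictionaries can stringify differently, and that sorting the keys before serializing gives one stable title.

```python
import json

# Two dictionaries with identical contents, built in a different key order.
a = {"foo": "xxxx", "bar": 1}
b = {"bar": 1, "foo": "xxxx"}

# Hypothetical stand-in for a state store keyed by the resource title string.
state = {}
for d in (a, b):
    # The title embeds the dict's string form, which depends on key order.
    state["Process_list[%s]" % d] = {"seen": True}

# Same store, but the title is built from a canonical (key-sorted) form.
canonical = {}
for d in (a, b):
    title = "Process_list[%s]" % json.dumps(d, sort_keys=True)
    canonical[title] = {"seen": True}

print(a == b)          # the dictionaries compare equal
print(len(state))      # number of titles built from the raw string form
print(len(canonical))  # number of titles built from the canonical form
```

Here the raw string form produces two titles for what is logically one entry, while the key-sorted form produces one.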

Anyway, I know it is kind of an odd case we’re using here, but the ballooning is so severe (~4GB) I wanted to report it.


#1 Updated by Rich Rauenzahn almost 3 years ago

$ puppet --version
2.7.1

#2 Updated by Rich Rauenzahn almost 3 years ago

For anyone looking for a workaround: instead of passing a list of dicts as the resource title in order to iterate over the data and create resources, like this…

define a_resource {
   $foo = $name['key1']
   $bar = $name['key2']
}

a_resource { $my_list_of_dicts: }

Write a custom function that transforms your dict into something like:

$enumed = [ $list_of_keys, $dict ]

Where $list_of_keys is something like [ 1,2,3,4 ] and $dict is { 1: { … }, 2: {…} }

Then change a_resource to

define a_resource($key=$name, $dict) {
   $foo = $dict[$key]['key1']
   $bar = $dict[$key]['key2']
}

a_resource { $enumed[0]: dict => $enumed[1] }

This is really the better way to do it anyway, rather than relying on Puppet allowing $name to be a dict.
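The transformation itself is small. A real Puppet custom function would be written in Ruby, so the Python below is only a sketch of the data shape. The name `enumerate_dicts` and the sample data are hypothetical. String keys are an assumption, chosen because resource titles are strings.

```python
def enumerate_dicts(list_of_dicts):
    """Return [list_of_keys, dict], mapping a simple key to each original dict."""
    # Assumption: keys are strings ("1", "2", ...) so they can serve as titles.
    keys = [str(i) for i in range(1, len(list_of_dicts) + 1)]
    table = dict(zip(keys, list_of_dicts))
    return [keys, table]


# Hypothetical sample data standing in for $my_list_of_dicts.
my_list_of_dicts = [
    {"key1": "a", "key2": "b"},
    {"key1": "c", "key2": "d"},
]

enumed = enumerate_dicts(my_list_of_dicts)

# Mirrors a_resource: each title is a short key, and the data is looked up
# in the shared dict instead of being carried in the title itself.
for key in enumed[0]:
    entry = enumed[1][key]
    print(key, entry["key1"], entry["key2"])
```

With this shape each resource title is a short scalar, so the title no longer depends on how a dictionary is stringified.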

#3 Updated by Kelsey Hightower over 2 years ago

  • Description updated (diff)
  • Status changed from Unreviewed to Needs More Information
  • Assignee set to Rich Rauenzahn


Thanks for reporting this. You’ve got me really curious now. I’ve noticed that you’re running Puppet 2.7.1; are you in a position to test this on newer versions of Puppet? I would like to know whether the issue persists there.

#4 Updated by Rich Rauenzahn over 2 years ago

Let me see if I can make an isolated test case. Is there a way to get a state.yaml created by running puppet /tmp/foo.pp?

Also, I should mention that this list of hashes is often created by reading in some JSON in a custom function, parsing it with PSON, and returning the object to Puppet.

#5 Updated by Rich Rauenzahn over 2 years ago

So, I had to use the PSON parser to get this to reproduce:

I added this to one of our nodes temporarily:

$json = '[{"lookforme9": 9, "lookforme8": 8, "lookforme3": 3, "lookforme2": 2, "lookforme1": 1, "lookforme0": 0, "lookforme7": 7, "lookforme6": 6, "lookforme5": 5, "lookforme4": 4},
 {"lookforme9": 19, "lookforme8": 18, "lookforme3": 13, "lookforme2": 12, "lookforme1": 11, "lookforme0": 10, "lookforme7": 17, "lookforme6": 16, "lookforme5": 15, "lookforme4": 14},
 {"lookforme9": 29, "lookforme8": 28, "lookforme3": 23, "lookforme2": 22, "lookforme1": 21, "lookforme0": 20, "lookforme7": 27, "lookforme6": 26, "lookforme5": 25, "lookforme4": 24},
 {"lookforme9": 39, "lookforme8": 38, "lookforme3": 33, "lookforme2": 32, "lookforme1": 31, "lookforme0": 30, "lookforme7": 37, "lookforme6": 36, "lookforme5": 35, "lookforme4": 34},
 {"lookforme9": 49, "lookforme8": 48, "lookforme3": 43, "lookforme2": 42, "lookforme1": 41, "lookforme0": 40, "lookforme7": 47, "lookforme6": 46, "lookforme5": 45, "lookforme4": 44},
 {"lookforme9": 59, "lookforme8": 58, "lookforme3": 53, "lookforme2": 52, "lookforme1": 51, "lookforme0": 50, "lookforme7": 57, "lookforme6": 56, "lookforme5": 55, "lookforme4": 54},
 {"lookforme9": 69, "lookforme8": 68, "lookforme3": 63, "lookforme2": 62, "lookforme1": 61, "lookforme0": 60, "lookforme7": 67, "lookforme6": 66, "lookforme5": 65, "lookforme4": 64},
 {"lookforme9": 79, "lookforme8": 78, "lookforme3": 73, "lookforme2": 72, "lookforme1": 71, "lookforme0": 70, "lookforme7": 77, "lookforme6": 76, "lookforme5": 75, "lookforme4": 74},
 {"lookforme9": 89, "lookforme8": 88, "lookforme3": 83, "lookforme2": 82, "lookforme1": 81, "lookforme0": 80, "lookforme7": 87, "lookforme6": 86, "lookforme5": 85, "lookforme4": 84},
 {"lookforme9": 99, "lookforme8": 98, "lookforme3": 93, "lookforme2": 92, "lookforme1": 91, "lookforme0": 90, "lookforme7": 97, "lookforme6": 96, "lookforme5": 95, "lookforme4": 94}]'
$items = our_json_parse($json)
define iterate_list {
   $x = $name['lookforme0']
   file { "/tmp/tmp.$x":
      ensure => file,
      content => "hello world",
   }
}

iterate_list { $items: }

our_json_parse is basically this, although we add checks for exceptions, etc.


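module Puppet::Parser::Functions
   newfunction(:our_json_parse, :type => :rvalue, :doc => "
Parse json string and return result.
") do |args|
      # The body is missing from the original post; presumably it parses the
      # first argument with PSON and returns the result, as described above.
      PSON.parse(args[0])
   end
end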
Here is the state file, which has duplicates:

$ sudo grep lookforme /var/lib/puppet/state/state.yaml
$ sudo grep lookforme /var/lib/puppet/state/state.yaml | wc -l

The JSON was generated with:

import json
keys = []
for k in range(0, 10):
    keys.append("lookforme%d" % k)
items = []
v = 0
for count in range(0, 10):
    d = {}
    for key in keys:
        d[key] = v
        v += 1
    # Collect each dictionary; without this the output is an empty list.
    items.append(d)

print(json.dumps(items))
