The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Feature #11331

Add 'foreach' structure in manifests

Added by Steve Shipway over 2 years ago. Updated over 1 year ago.

Status:ClosedStart date:
Priority:ImmediateDue date:
Assignee:J.D. Welch% Done:

100%

Category:language
Target version:3.2.0
Affected Puppet version: Branch:https://github.com/puppetlabs/puppet/pull/1435
Keywords:ux backlog

We've Moved!

Ticket tracking is now hosted in JIRA: https://tickets.puppetlabs.com

This issue is currently not available for export. If you are experiencing the issue described below, please file a new ticket in JIRA. Once a new ticket has been created, please add a link to it that points back to this Redmine ticket.


Description

I doubt this would be simple to do, but it would be very useful to be able to do something like this:

$variable = [ 'a', 'b', 'c' ]
foreach $loop $variable {
  file { "/tmp/$loop": content=>"I am file $loop"; }
}

While it is already possible to use an array as a namevar to get something similar, it doesnt allow you to have calculated parameters as well.

This would not be expected to break the declarative nature of puppet, though it would bring in a new variable scope in the loop.

Using a define with an array for the namevar would work provided the top level could not be called multiple times.

We want to have something like this:

define firewall($users,$port) {
  iptables::open { $users: port=>$port; }
}
node foo {
  $webusers = [ 'fred', 'sid' ]
  $sshusers = [ 'fred', 'joe' ]
  firewall { port80: users=>$webusers, port=>80; }
  firewall { port22: users=>$sshusers, port=>22; }
}

This example would fail because the iptables::open define is called with user ‘fred’ two times (although with a different port parameter). If we could instead have a foreach iteration then something like this would be useable:

define firewall($users,$port) {
  foreach $users {
    iptables::open { "$loop:$port": user=>$loop, port=>$port; }
  }
}

This would ensure a unique namevar for iptables::open. We would also be able to do things like initialise an array of users with different metadata parameters (eg, their full names pulled form LDAP)


Subtasks

Feature #18764: Add support for enumerable methods/functionsClosed

History

#1 Updated by Anonymous over 2 years ago

  • Status changed from Unreviewed to Needs Decision
  • Assignee set to Anonymous

I am certain this is a duplicate of another ticket, but I can’t locate it.

This is a common, and fairly valid request, but I would strongly prefer something that modelled the semantics of creating, and applying, an anonymous define on an array instead of an explicit “loop” construct in the language.

That “expansion” model rather than looping is more declarative, and more in keeping with the desires we have for the language long term.

Randall, ultimately this is your baby, along with other DSL work. Can you take ownership of it and, ideally, merge it with the ticket I couldn’t find?

#2 Updated by Anonymous almost 2 years ago

  • Assignee changed from Anonymous to J.D. Welch

#3 Updated by Henrik Lindberg almost 2 years ago

  • Keywords changed from foreach language to foreach language collections

This is part of a larger theme: collections – added a keyword.

#4 Updated by J.D. Welch almost 2 years ago

  • Keywords changed from foreach language collections to foreach language collections dsl backlog

#5 Updated by J.D. Welch almost 2 years ago

  • Keywords changed from foreach language collections dsl backlog to backlog

#6 Updated by eric sorenson almost 2 years ago

  • Keywords changed from backlog to ux

#7 Updated by eric sorenson over 1 year ago

  • Priority changed from Normal to High
  • Target version set to 3.x
  • Keywords changed from ux to ux backlog

#8 Updated by Henrik Lindberg over 1 year ago

I would like to see something like this:

define firewall($users,$port) {
  foreach $users { |$u|
    iptables::open { "$u:$port": user=>$u, port=>$port; }
  }
}

i.e. that the loop variable needs to be declared (instead of always being a reserved variable name), and instead of being restricted to picking just one value at a time.

Picking more than one value is valuable when iterating over a hash – e.g.

define firewall($a_hash) {
  foreach $a_hash { |$key, $value|
    ...
  }

Naturally, the foreach variables are not available outside of the foreach scope.

Also suggest that if only one variable is defined for hash, the key is applied. If more than one variable is used for an array, the loop picks more than one value per iteration (array must be of size % loop_variable_count == 0)

I think this would be unsurprising syntax for people used to lambdas in Ruby and other languages.

grammar could probably be something like

FOREACH rvalue { | parameters | statements } 

Thus allowing to iterate directly the result of a variable, function, or literal.

#10 Updated by Henrik Lindberg over 1 year ago

Thought a bit more about this, and that it is also wanted to be able to perform other operations on collections. At the same time as the previously suggested syntax would be easy to implement, it also adds one keyword (foreach) – and if other similar operations are wanted (e.g. collect, exists, all, first, etc.) it becomes quite a problem.

Instead, I propose that foreach (and similar) are added as an expression on the form: expr . funcname { |parameters| statements }

Thus, the earlier example would be written:

define firewall($users,$port) {
  $users.foreach { |$u|
    iptables::open { "$u:$port": user=>$u, port=>$port; }
  }
}

This is better because it does not interfere with any existing syntax, and does not introduce any new keywords. A first implementation can be hard wired to only support the forach function, but opens up for exploring additional functions, user supplied functions etc. (There are quite a few issues to deal with if the mechanism is generalized).

#11 Updated by Henrik Lindberg over 1 year ago

This was much easier to implement that I first imagined. (Pull request coming soon).

By introducing a “parameterized block” (a Puppet::Parser::AST::Lambda) and some minor improvements to the ephemeral scope support I was able to quickly make progress on this.

The syntax for calling a “method” is:

lhs.funcname
lhs.funcname()
lhs.funcname {|...| ... }
lhs.funcname() {|...| ... }

This is equivalent to calling the same function with the lhs as the first parameter, and the lambda (if present) as the last argument). (This means any existing function can be called using this syntax – e.g:

$a = $b.split('[.:]')

Support for foreach is then simply implemented as a function.

The syntax for the LAMBDA is:

{||  }
{|| = expression }

Although the later is not required for the foreach function, it is required to be able to support lambdas that function as predicates e.g.

 $b = $a.reject {|$x| = $x =~ /ugly/ }

The requirement to use a syntax marker (i.e. =) is a bit unfortunate, but it is currently not possible to support statements | expression without this marker unless a major refactoring is done of the grammar / evaluation as there are ambiguities between function calls with and without parentheses. It is also not possible to implement this as an eval function, since functions invokable as statements are not allowed to produce an rvalue. (The use of = to turn an expression into a statement is only allowed at the opening of a lambda, and this may later be made optional (when/if grammar is refactored).

The supports default values, but if used, they must be at the end of the list as call is made “by position” and not “by name”.

The foreach function allows iterating array slices of 1 or more elements (i.e. pairs, triplets, etc). and a hash can be iterated over keys or keys and values. e.g:

$a = [1, present, 2, absent]
$a.foreach { |$x, $y| file ( "/file_$x: ensure => $y}

$a = {'1' => present, '2' => absent}
$a.foreach { |$x, $y| file ( "/file_$x: ensure => $y}

To produce /file_1 with ensure present, and /file_2 with ensure absent.

Other examples of iterating over hashes.

# iterate over keys
$a = {'1' => present, '2' => absent}
$a.foreach { |$x| file ( "/file_$x: ensure => present}

# access variable outside of lambda
$a = {'1' => present, '2' => absent}
$a.foreach { |$x| file ( "/file_$x: ensure => $a[$x]}

It is not allowed to use class, define, nor node-statements in the lambda, and variables that are assigned in the lambda are local and can not be referenced outside of the scope of the lambda (but they are visible to any nested inner lambdas). Lambda parameters shadow parameters with the same name in outer scopes.

Since the implementation makes it possible to easily add custom methods (i.e. done just like any other custom function), the implementation of lambda evaluation is also very simple. If passed a lambda it is evaluated by:

a_lambda.call(scope, *arguments)

A lambda has two additional useful methods #parameter_count and ´#optional_parameter_count`.

#12 Updated by Henrik Lindberg over 1 year ago

Also see #18764 for additional methods making use of lambda.

#13 Updated by Henrik Lindberg over 1 year ago

  • Branch set to https://github.com/puppetlabs/puppet/pull/1420

The implementation is now ready for review. There are some fixes required for Ruby 1.8.7, but what is in PL 1420 works on 1.9.3

The additional methods discussed in #18764 are currently available in https://github.com/hlindberg/puppet/tree/18764-enumerable-methods.

#14 Updated by Steve Shipway over 1 year ago

Many thanks for your work on this feature, which will help us no end — once we fix our current modules to be v3.x compatible and we upgrade to v3 on our puppet server, of course. The new syntax looks convenient and understandable, and more Ruby-ish (I’m from a Perl background so my initial example was perl-ish)

#15 Updated by Henrik Lindberg over 1 year ago

Pull request updated with code that works for Ruby 1.8.7 (seems that Array#zip produces different result under some conditions).

#16 Updated by Henrik Lindberg over 1 year ago

After some discussion on #puppet-dev an idea was raised that a more function call style (as opposed to method call style) may be more appreciated. The pull request is now updated with this.

These are now equivalent:

$a.foreach {|$x| ... }
foreach($a) {|$x| ... }

It is still the same foreach function, only difference is in who gets to make the call (Function, or MethodCall).

#17 Updated by Brice Figureau over 1 year ago

Henrik Lindberg wrote:

After some discussion on #puppet-dev an idea was raised that a more function call style (as opposed to method call style) may be more appreciated. The pull request is now updated with this.

These are now equivalent: […]

It is still the same foreach function, only difference is in who gets to make the call (Function, or MethodCall).

I didn’t fully the development nor this ticket, but I really think the syntax should be a bit more distinct from ruby blocks. I mean puppet is a different language, and should always be (or we should all use the ruby dsl).

I’d favor something like this that more closely looks like puppet code to me:

FOREACH rvalue { parameters => statements } 

Also, I’m not sure I got the underlying reasons for the syntax with an expression instead of a statement (unless it is a requirement from the existing grammar and racc being fiddly). But if you could merge both syntax to my aforementioned example that’d be perfect :)

#18 Updated by Anonymous over 1 year ago

It would be really nasty to be limited to only a single expression in the loop body. That still requires writing an explicit define for any non-trivial case, such as needing two resource specifications.

It would be much nicer to treat the whole thing like the “use an array and a define” expansion trick, so that you get the same capabilities as the current model – just a nicer syntax.

(I think the foreach name is a bit ugly, too, and would prefer something like expand, but whatever. :)

#19 Updated by Henrik Lindberg over 1 year ago

Brice Figureau wrote:

Henrik Lindberg wrote:

After some discussion on #puppet-dev an idea was raised that a more function call style (as opposed to method call style) may be more appreciated. The pull request is now updated with this.

These are now equivalent: […]

It is still the same foreach function, only difference is in who gets to make the call (Function, or MethodCall).

I didn’t fully the development nor this ticket, but I really think the syntax should be a bit more distinct from ruby blocks. I mean puppet is a different language, and should always be (or we should all use the ruby dsl).

I’d favor something like this that more closely looks like puppet code to me:

FOREACH rvalue { parameters => statements } 

Also, I’m not sure I got the underlying reasons for the syntax with an expression instead of a statement (unless it is a requirement from the existing > grammar and racc being fiddly). But if you could merge both syntax to my aforementioned example that’d be perfect :)

To support the exact syntax you propose, FOREACH must be made into a keyword. The proposed implementation would require adding () around the parameters (or it becomes ambiguous).

FOREACH(rvalue) { parameters => statements}

There are additional difficulties with recognizing { $a => statements } as it is a potential conflict with hash (currently hash only allows NAME or quoted text, but there is no reason (other than obvious grammar conflicts) why it should not allow variable value. I am more or less convinced that a syntactic marker is needed to make the lambda part recognizable (which it is if FOREACH is made into a keyword).

IMO, making foreach into a keyword requires waiting to puppet 4 as it is a backwards incompatible change, and it is also neither extensible nor general. If it is ok to forever after not being able to evaluate a variable as a hash entry key, then the proposed syntax can be supported with the small change that the argument must be enclosed in parentheses.

Finally, the quirk in the current implementation (that a = is required before an expression lambda to turn it into a statement), can be removed when/if the grammar is transformed from being statement oriented to being expression oriented (this is exploratory work that I am doing right now with very promising results).

You are right that Puppet Language is not Ruby, but still, a concrete syntax has to be used, and Ruby happen to have one that fits very well. Whatever we do, the functor (“foreach”), the LHS value, what to pick (one or several things at a time), what to call these parameters, and then one or several statements/expressions are naturally needed whatever way we choose. Since I felt that stealing $ as the opening of a {} to mean it is a parameterized puppet block was stealing too much “grammatical space”, I opted for a known syntax from Ruby as opposed to inventing something different.

I did consider placing the fat arrow first i.e. {=>parameters, but that list need a terminator as the first statement may start with a variable, there are several to choose from `‘:’, ‘;’, ‘|’ – the one I considered the best of these were ‘;’ as it has less of a tendency to cause confusion with ‘::’ in qualified variables even if it would work technically – i.e. that would mean your example would look like this:

foreach(rvalue) { =>parameters; statements}

If you want to use a block without parameters you would have to do something like:

foreach(rvalue) { ; statements}
foreach(rvalue) { =>; statements}
foreach(rvalue) { => statements} # does not work as statements may start with variable

To be compared with what I propose:

foreach(rvalue) { |parameters| statements }
rvalue.foreach { |parameters| statements }

foreach(rvalue) { || statements }
rvalue.foreach { || statements }

I also considered placing the block parameters before the block but this leads to other clashes.

#20 Updated by Henrik Lindberg over 1 year ago

Daniel Pittman wrote:

It would be really nasty to be limited to only a single expression in the loop body. That still requires writing an explicit define for any non-trivial case, such as needing two resource specifications.

The proposal allows multiple statements.

It would be much nicer to treat the whole thing like the “use an array and a define” expansion trick, so that you get the same capabilities as the current model – just a nicer syntax.

With the current proposal it would look like this (implies using a defined ‘my_thing’) . showing both proposed equivalent styles:

data.forearch  {|$x| my_thing { 'somename_$x': param_1 => $x }}
foreach(data)  {|$x| my_thing { 'somename_$x': param_1 => $x }}

(I think the foreach name is a bit ugly, too, and would prefer something like expand, but whatever. :)

“foreach” is not my favorite either, that is probably the easiest thing of them all to change and at the same time be what will take the longest to decide :)

How would you like to see the “expand” taking place; how would you specify what to map from LHS structure (array or hash) onto a define, what if you want/need to make a transformation, would you have to write another define for that? (i.e. comparable to being able to map to only named functions as opposed to mapping to unnamed lambdas).

#21 Updated by eric sorenson over 1 year ago

  • Priority changed from High to Immediate

#22 Updated by Henrik Lindberg over 1 year ago

The proposal for this issue is now found here

#23 Updated by Poul H. Sørensen over 1 year ago

I think that “.foreach” is way too “programmatic”. I suggest “.each” or “.list” as in:

# either:
data.each { |$x| thing { 'somename_$x': param_1 => $x }}
# or:
data.list { |$x| thing { 'somename_$x': param_1 => $x }}

This page http://template-toolkit.org/docs/manual/VMethods.html is worth a glance – you may steal some ideas…

#24 Updated by Henrik Lindberg over 1 year ago

Based on other conversations as well; using each do seem to be favored over foreach.

#25 Updated by Andrew Parker over 1 year ago

Support for rgen, which is needed for the prototype of this, was merged in 493e9208b77feb2d2b3931a694c39c98dfef8b00.

#26 Updated by Andrew Parker over 1 year ago

  • Status changed from Needs Decision to In Topic Branch Pending Review
  • Branch changed from https://github.com/puppetlabs/puppet/pull/1420 to https://github.com/puppetlabs/puppet/pull/1435

There is a new PR that implements the various proposals. The current plan is to get this merged in as an optional feature for puppet 3.2

#27 Updated by Andrew Parker over 1 year ago

  • Status changed from In Topic Branch Pending Review to Merged - Pending Release
  • Target version changed from 3.x to 3.2.0

The experimental parse has been merged into master in https://github.com/puppetlabs/puppet/commit/c8182d68ff179956efc34dc8d0f79084582aa6f3. Note that this is not complete.

#28 Updated by Matthaus Owens over 1 year ago

  • Status changed from Merged - Pending Release to Closed

Released in Puppet 3.2.0-rc1

#29 Updated by Matthaus Owens over 1 year ago

Released in Puppet 3.2.0-rc1

Also available in: Atom PDF