The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Feature #2198

Install multiple packages within a single call to the package manager

Added by Stéphan Gorget almost 5 years ago. Updated 4 months ago.

Status: Investigating
Start date: 04/25/2009
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: transactions
Target version: -
Affected Puppet version: 0.25.0
Branch: http://github.com/phantez/puppet/commit/51ff88c950c172e6060ae63c1c71968e7898b462
Keywords: communitypatch customer

This ticket is now tracked at: https://tickets.puppetlabs.com/browse/PUP-1061


Description

While a configuration is being applied, the package manager is called once for each package installation. The number of calls can be reduced by gathering package installations, delaying some of them so that several packages are installed in a single call. Naturally, this modification must not break the dependency graph.

ready-queue.txt - Ready queue traced over time (1.94 MB) Andrew Parker, 09/17/2013 09:47 am


Related issues

Related to Puppet - Bug #3156: batchable yum and RPM transactions should be batched (Needs More Information, 02/06/2010)
Related to Puppet - Feature #4983: Remove packages at same time (Duplicate, 10/11/2010)
Related to Puppet - Bug #1935: Unable to handle 2 packages with a circular dependency (Closed, 02/05/2009)
Related to Puppet - Feature #3537: It should be possible to trigger (exec) resources with re... (Closed, 04/12/2010)
Duplicated by Puppet - Feature #4797: Providers should be able to process more than one resourc... (Duplicate, 09/16/2010)
Duplicated by Puppet - Bug #4846: uninstalling packages gives failed dependencies errors (Duplicate, 09/27/2010)
Duplicated by Puppet - Bug #18238: Upgrading multilib versions (i686 + x86_64) not working (Duplicate)

History

#1 Updated by Luke Kanies almost 5 years ago

  • Category set to transactions
  • Status changed from Unreviewed to Accepted

#2 Updated by Stéphan Gorget almost 5 years ago

I’d like this feature request to be assigned to me.

#3 Updated by James Turnbull almost 5 years ago

  • Assignee set to Stéphan Gorget
  • Target version set to 4

#4 Updated by Stéphan Gorget almost 5 years ago

  • Status changed from Accepted to Needs Decision

Discussion about implementation and architecture design : http://groups.google.com/group/puppet-dev/browse_thread/thread/584c9db44f5e2253

#5 Updated by Stéphan Gorget over 4 years ago

Comments about implementation : http://groups.google.com/group/puppet-dev/browse_thread/thread/424b7cbfe52ccfd0

#6 Updated by James Turnbull about 4 years ago

Stephan – is this code current – any updates to it?

#7 Updated by Stéphan Gorget about 4 years ago

The last update is here: http://github.com/phantez/puppet/tree/features/master/2198 However, I guess many changes have been made to resource transactions, so the code might have to be updated.

#8 Updated by Stéphan Gorget about 4 years ago

I’ve just rebased the branch against master (no conflicts appeared), but I haven’t tested that the patch still works. (I no longer have a CentOS/RedHat-based OS around.)

#9 Updated by Stéphan Gorget about 4 years ago

Is anyone still interested in this patch? I now have some time to go over it and rewrite it. The patch can be found here: http://github.com/phantez/puppet/commit/51ff88c950c172e6060ae63c1c71968e7898b462 and here: http://groups.google.com/group/puppet-dev/browse_thread/thread/424b7cbfe52ccfd0

I’ll be pleased to have some comments and help.

#10 Updated by Peter Meier about 4 years ago

> Is anyone still interested in this patch? I now have some time to go over it and rewrite it.

Yes, I think it would be useful!

#11 Updated by Peter Meier about 4 years ago

You might also want to have a look at #3156.

#12 Updated by Nigel Kersten about 4 years ago

absolutely this would be useful!

#13 Updated by Stéphan Gorget about 4 years ago

I’ve tried the patch rebased on master, and it still works fine on CentOS 5.4:


package { "w3m":
  ensure  => installed,
  combine => true,
}
package { "elinks":
  ensure  => installed,
  combine => true,
}
package { "lynx":
  ensure  => installed,
  combine => true,
}

The patch is only active when combine is defined. If not, the classic behaviour is applied.

The modified behaviour is as follows: each time a transaction has to be applied to an element that implements combine, it looks in the dependency graph for other elements that have the combine argument and whose dependencies are in a state that allows them to be applied; those elements are then combined. It only gathers packages that can be installed (dependency graph resolved and OK) and that have the combine flag set to true.
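The gathering rule described above can be sketched roughly as follows. This is not the code from the patch: `Resource`, `gather_combinable`, and the `applied` set are hypothetical names used only for illustration, and the real patch works against Puppet's transaction and relationship graph rather than plain arrays.

```ruby
require 'set'

# Hypothetical stand-in for a catalog resource; not Puppet's real class.
Resource = Struct.new(:name, :combine, :deps)

# Return the resources that may be installed together with +current+:
# those with the combine flag whose dependencies have all been applied.
# +current+ itself is assumed to be ready to apply.
def gather_combinable(current, pending, applied)
  return [current] unless current.combine # classic one-at-a-time behaviour

  batch = [current]
  pending.each do |res|
    next if res.equal?(current)
    next unless res.combine
    next unless res.deps.all? { |dep| applied.include?(dep) }
    batch << res
  end
  batch
end

applied = Set.new(['repo'])
w3m    = Resource.new('w3m',    true,  ['repo'])
elinks = Resource.new('elinks', true,  ['repo'])
lynx   = Resource.new('lynx',   true,  ['proxy']) # dependency not applied yet
vim    = Resource.new('vim',    false, [])        # combine not set

batch = gather_combinable(w3m, [w3m, elinks, lynx, vim], applied)
puts batch.map(&:name).join(' ') # prints "w3m elinks"
```

In this sketch lynx is left out because one of its dependencies has not been applied, and vim is left out because it does not set combine, so both would be handled later in the usual way.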

#14 Updated by Stéphan Gorget about 4 years ago

I need someone to review the patch and tell me what has to be improved or rethought to make it acceptable. I have not resent it to the mailing list, since it is already there: http://groups.google.com/group/puppet-dev/browse_thread/thread/424b7cbfe52ccfd0

#15 Updated by Mike Lococo almost 4 years ago

Any progress on reviewing this patch? Being able to batch package transactions would also solve Bug #1935 as a side effect.

#16 Updated by Nigel Kersten over 3 years ago

ugh. How did this slip by for so long? Reminder being sent on dev list.

#17 Updated by Matt Robinson over 3 years ago

  • Status changed from Needs Decision to In Topic Branch Pending Review
  • Keywords set to communitypatch
  • Branch set to http://github.com/phantez/puppet/commit/51ff88c950c172e6060ae63c1c71968e7898b462

Making this easy to find in the backlog of community patches to review.

#18 Updated by James Turnbull over 3 years ago

  • Target version changed from 4 to 2.7.x

#19 Updated by Nigel Kersten almost 3 years ago

Did we test this? I didn’t see it get merged in.

#20 Updated by Stéphan Gorget old account almost 3 years ago

It has not been merged yet, and I don’t think anyone has tested it since I rebased it one year ago. Nobody has contacted me.

#21 Updated by Stéphan Gorget almost 3 years ago

Updated with the wrong account, but if you have questions, don’t hesitate to contact me.

#22 Updated by Nigel Kersten almost 3 years ago

I’m sorry Stephan, I’ll make sure this is chased up.

#23 Updated by John Florian almost 3 years ago

Any plans for this to make it into a release? Will this ‘combine’ feature work for package removal as well?

After looking at #1935, #2833 and #3707, all paths seem to lead here. RHEL is purportedly the most popular enterprise Linux and that means yum/rpm is the most popular package provider in the enterprise.

It appears that lots of people are hacking up Exec resources to work around this deficiency.

#24 Updated by Nigel Kersten almost 3 years ago

erk. This is completely my fault. I was incorrectly filtering and failed to chase it up.

#25 Updated by Mike Lococo over 2 years ago

RPM circular deps came up on the mailing list again, and will be resolved as a side-effect when this feature lands: http://groups.google.com/group/puppet-users/browse_thread/thread/610cba4ead88480a/

Please please please review and merge this patch unless it causes puppet to punch you in the nose on every run.

  • It has a patch submitted (and has had one for 2 years).
  • With 16 votes it’s the fifth top-voted issue in the bug tracker.
  • It’s the oldest issue in the top 5.
  • If I read the history correctly, it’s simply been awaiting patch review this whole time.

No technical objection has been identified at this point; it may be ready for merge today if it gets the right pair of eyeballs on it. But I suppose it’s more likely that it will need a third cleanup because it’s been bit-rotting so long.

Please merge: between circular deps and the performance improvements of batched transactions, this is a much desired feature that consistently hits the mailing list every few months.

#26 Updated by Michael Rooney over 2 years ago

This would be fantastic indeed, and save us a lot of time on spin-ups. Looking at the linked branch, I notice it seems to only add support for this to yum; it would be great to get this in for apt at the same time if possible! Let me know if I can be of any help there.

#27 Updated by Janardhan Molumuri over 2 years ago

Recently I was looking at the same feature and I was tempted to write something similar but found that there is already a patch available.

This patch will be really useful. Puppet team, Do you have any timelines on when this will be integrated into the upstream?

#28 Updated by Bill Tong over 2 years ago

This bug is causing problems with a puppet deployment of mine.

The idea is that puppet takes control once the minimal base system has been installed. The need for puppet to run three thousand separate transactions rather than one transaction is causing a lot of grief at the moment.

An ETA on this would be really helpful. Thanks a lot.

#29 Updated by Nick Lewis over 2 years ago

  • Assignee changed from Stéphan Gorget to Nick Lewis

I’ll have a look at this, and see what’s happened in the… long… time since it was submitted. Given the changes we’ve recently made around dynamically traversing the graph and deferring resources, I would really like to finally make this go.

#30 Updated by Bill Tong about 2 years ago

Thanks.

#31 Updated by Joshua Hoblitt about 2 years ago

I’ve run into problems with circular dependencies when trying to uninstall packages that are part of RHEL6.x @base. I haven’t seen any discussion of what’s holding up merging this patch (I haven’t tried it myself yet). Could someone summarize in this ticket what the problems with the patch are?

#32 Updated by Ryan Conway about 2 years ago

Came across this when trying to debug a series of related packages, and thought I would also up-vote.

Does anyone know what the hold up is with merging? If it’s close to being released, it would be great to hear, as I need to either put in a temporary hack using an Exec or make a larger change if we’re unlikely to see this released within, say, the next month.

Any ideas?

Ta!

#33 Updated by Daniel Pittman about 2 years ago

Ryan Conway wrote:

Came across this when trying to debug a series of related packages, and thought I would also up-vote.

Does anyone know what the hold up is with merging? If it’s close to being released, it would be great to hear, as I need to either put in a temporary hack using an Exec or make a larger change if we’re unlikely to see this released within, say, the next month.

Really, what needs to happen with this is that someone grabs it, checks that it actually works with the next major release, and turns it into a formal pull request to get it reviewed, as documented in http://github.com/puppetlabs/puppet/tree/master/CONTRIBUTING.md

The last time this was looked at predated my time leading the platform team, and it wasn’t on the list of “active” tickets at that point, so it dropped from sight, unfortunately. Worse, we are unlikely to get time to look at it ourselves – and work out if, for example, the API extension is sufficient to meet more than just “install packages in a batch” needs, ensure it doesn’t violate any ordering constraints, and so forth.

Absent any other action we will absolutely get around to looking at this, but it isn’t likely to happen on the platform team side any time soon, I fear.

#34 Updated by Stéphan Gorget about 2 years ago

transaction.rb has changed a lot since I worked on this patch, and the apply_changes function no longer exists. This was where the magic used to happen. This patch was written for 0.24.x and 0.25.x in June 2009, and I think it will be easier to rewrite the patch completely than to try to merge it into the next version of puppet.

If someone is interested in rewriting it, I will be happy to help.

#35 Updated by Daniel Pittman almost 2 years ago

  • Status changed from In Topic Branch Pending Review to Investigating
  • Assignee deleted (Nick Lewis)
  • Target version deleted (2.7.x)

#36 Updated by Jo Rhett over 1 year ago

FYI, I’d like to note some experiences I had working on the package management for cfengine, which dealt with this situation properly.

Honestly, I see no reason why the yum provider could not stack up the list of packages to add and the packages to remove and do them separately. This is what cfengine did, although we limited the size of the array because we found that rpm got stupid slow when the list was too big.

What we did not implement at that time, was the idea of a group operation. This could easily be a generic thing on the resource. You could then run the entire list of installs or removes for a specific group. In reality, the “undefined” group would be one normal one, and then people could tag certain packages to be installed or uninstalled together with their own group name. This works very well for distributed groups, who can test that their package installs work properly together but aren’t sure how they would relate to another group’s changes.

In short, very simply:

  1. For package providers who can handle more than one package at a time (new attribute) build an array of things to remove.
  2. Each “group” (or whatever) would have its own array. Each array would be processed independently.
  3. Uninstalls and Installs from the same group would be processed together, before the same from another group.

This change should be fairly simple to implement, consistent in usage across all of the providers (group/whatever could be used even by single-package providers), and be an improvement over what was available in cfengine2 — which is ahead of puppet at this point in time.
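A minimal sketch of the grouping idea above, under several assumptions: `Request`, `plan_batches`, and `MAX_BATCH` are hypothetical names, the cap of 2 is only for illustration (no actual limit is given above), and running removals before installs within a group is an arbitrary choice made for the example.

```ruby
# Hypothetical request record: package name, :install or :remove, optional group.
Request = Struct.new(:name, :action, :group)

MAX_BATCH = 2 # illustrative cap only

# Split requests into per-group, per-action batches of at most +max+ packages.
# Ungrouped requests fall into the :undefined group.
def plan_batches(requests, max = MAX_BATCH)
  batches = []
  requests.group_by { |r| r.group || :undefined }.each do |group, reqs|
    # Assumption: removals run before installs inside one group.
    [:remove, :install].each do |action|
      names = reqs.select { |r| r.action == action }.map(&:name)
      names.each_slice(max) { |slice| batches << [group, action, slice] }
    end
  end
  batches
end

requests = [
  Request.new('httpd',    :install, 'web'),
  Request.new('telnet',   :remove,  nil),
  Request.new('mod_ssl',  :install, 'web'),
  Request.new('php',      :install, 'web'),
  Request.new('sendmail', :remove,  'web')
]

plan = plan_batches(requests)
plan.each { |group, action, names| puts "#{group}: #{action} #{names.join(' ')}" }
```

Each group is finished before the next one starts, and the cap splits the three web installs into two package manager calls instead of one oversized call.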

#38 Updated by Josh Davidson 9 months ago

As many have noted both here and in #1935, Puppet isn’t even capable of managing system packages in RHEL/Fedora. Consider, for example, polkit.

Better hope you don’t let Puppet manage the docs and devel packages…

http://rpm.pbone.net/index.php3/stat/4/idpl/21821175/dir/rawhide/com/polkit-docs-0.111-2.fc20.noarch.rpm.html

http://rpm.pbone.net/index.php3/stat/4/idpl/21829829/dir/rawhide/com/polkit-devel-0.111-2.fc20.x86_64.rpm.html

#39 Updated by Bill Tong 9 months ago

As another example, I have some scientific software in binary form which needs certain libraries to run. Some of these libraries come from a RHEL 6.4 repository, some come from an EPEL repository. I am responsible for the OS-side libraries, the developer is responsible for their compiled binary. The libraries look like this:

aaa  # from the rhel 6.4 repo
bbb  # from the rhel 6.4 repo
ccc  # from the rhel 6.4 repo
ddd-1.2.3  # from the epel repo
eee-2.3.4  # from the epel repo
fff-3.4.5  # from the epel repo

I have to specify the versions for the epel repo, since they contain multiple versions.

Unfortunately for me, when I tell puppet to install “bbb”, it pulls in the dependency on ddd. There are multiple versions of ddd in the epel repo, and it chooses the newest, version 9.9.9.

When puppet gets to the part where it wants to install ddd-1.2.3, it fails with a multilib error, for which the root cause is that ddd is already installed with a newer version.

But I can work around this! You will tell me to split this list of libraries that their software requires, and make them depend on each other in order, manually re-create the chain of dependencies that exists in the rpm packages using the Puppet DSL.

But I can work around this! You will tell me to create an rpm containing the binary software and create dependencies on the libraries within the rpm, listing the specific versions it needs.

Yes I could do both. But both of those are a lot more effort than puppet installing a simple list of rpms for me in a single transaction. And that’s what this bug is about.
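To make the difference concrete, here is a small sketch that only builds the command lines and does not run them. The package names are the placeholders from the list above; `per_package` and `single` are hypothetical variable names, and `yum -y install` is used as the example invocation.

```ruby
require 'shellwords'

packages = %w[aaa bbb ccc ddd-1.2.3 eee-2.3.4 fff-3.4.5]

# One package manager call per package: six separate transactions.
per_package = packages.map { |pkg| ['yum', '-y', 'install', pkg] }

# The behaviour requested in this ticket: the whole list in one call.
single = ['yum', '-y', 'install', *packages]

puts per_package.length       # prints 6
puts Shellwords.join(single)  # prints "yum -y install aaa bbb ccc ddd-1.2.3 eee-2.3.4 fff-3.4.5"
```

With the single call, the package manager sees the pinned versions and the unpinned packages at the same time, which is what the request above is asking for.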

#40 Updated by Andrew Parker 7 months ago

Here is an example of the state of the ready queue as a catalog run progresses. This provides some insight into where there might be batching opportunities.

#41 Updated by Kylo Ginsberg 6 months ago

There was a great discussion on puppet-dev: https://groups.google.com/forum/#!topic/puppet-dev/X7RgakTGnbk

The initial outcome was:

The provider interface:

  • Provider::batchable?(resource1, resource2)
  • Provider::batch_start
  • Provider::batch_end
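As a rough illustration of how these three hooks might fit together: only the names `batchable?`, `batch_start`, and `batch_end` come from the list above. Their semantics here, the `enqueue` helper, the `SketchYumProvider` class, and `run_in_batches` are all assumptions made for this sketch, and commands are recorded rather than executed.

```ruby
# Hypothetical provider; resources are plain hashes in this sketch.
class SketchYumProvider
  VERBS = { installed: 'install', absent: 'remove' }.freeze

  class << self
    # Commands that would be run; recorded instead of executed.
    def commands
      @commands ||= []
    end

    # Assumed meaning: both resources want the same state,
    # so a single yum call can serve both.
    def batchable?(resource1, resource2)
      resource1[:ensure] == resource2[:ensure]
    end

    def batch_start
      @queue = []
    end

    # Hypothetical helper, not part of the proposed interface.
    def enqueue(resource)
      @queue << resource
    end

    def batch_end
      return if @queue.empty?
      verb = VERBS.fetch(@queue.first[:ensure])
      commands << ['yum', '-y', verb, *@queue.map { |r| r[:name] }]
    end
  end
end

# Walk resources in manifest order, cutting a new batch whenever the
# provider says two neighbours cannot share one.
def run_in_batches(resources, provider)
  resources.slice_when { |a, b| !provider.batchable?(a, b) }.each do |batch|
    provider.batch_start
    batch.each { |res| provider.enqueue(res) }
    provider.batch_end
  end
end

resources = [
  { name: 'w3m',    ensure: :installed },
  { name: 'elinks', ensure: :installed },
  { name: 'telnet', ensure: :absent },
  { name: 'lynx',   ensure: :installed }
]
run_in_batches(resources, SketchYumProvider)
SketchYumProvider.commands.each { |cmd| puts cmd.join(' ') }
```

Because batches are cut only between neighbours, manifest order is preserved: lynx is not merged into the first install batch, since the removal of telnet sits between them.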

We defined an initial set of vertical slices to work on, with the yum provider as the initial guinea pig:

  1. Define report schema changes
  2. Every resource is its own batch. The provider executes the batch and these batches appear in the report.
  3. The provider is able to control what resources can go into a batch to allow batches of size > 1. Manifest ordering algorithm.
  4. The yum provider executes all of the items in a batch using a single command. It assumes that everything will succeed and no error reporting is done.
  5. The yum provider handles errors while executing a batch and reports a failure for any item as a failure for all items.
  6. The yum provider is able to report which item of the batch caused the actual error and preserve this information in the report.
  7. Extend/test for Title Hash order and Random Order. Prior to this, the conservative (manifest ordering) algorithm is used per batch.
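Slices 4 and 5 could look something like the following sketch: one command for the whole batch, with a failure of the command reported against every item. `BatchResult`, `apply_batch`, and the `runner` callable (which returns a success flag and the command output) are hypothetical and stand in for whatever the provider and report code would really use.

```ruby
# Hypothetical report entry for one resource in a batch.
BatchResult = Struct.new(:resource, :status, :message)

# Run a single command for the whole batch. +runner+ is a hypothetical
# callable taking the command array and returning [success, output].
# Any failure is reported as a failure of every item (slice 5).
def apply_batch(names, runner)
  ok, output = runner.call(['yum', '-y', 'install', *names])
  status = ok ? :success : :failure
  names.map { |name| BatchResult.new(name, status, ok ? nil : output) }
end

# Fake runners so the sketch can be tried without touching the system.
failing   = ->(_cmd) { [false, 'No package lynx available.'] }
succeeding = ->(_cmd) { [true, ''] }

results = apply_batch(%w[w3m elinks lynx], failing)
results.each { |r| puts "#{r.resource}: #{r.status} #{r.message}" }
```

Slice 6 would refine this by parsing the output to mark only the offending item, which is deliberately not attempted here.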

I suspect we’ll learn some about batch processing (and perhaps yum!) as we go, so these slices aren’t set in stone, but just an initial development plan. I’m hoping we’ll see some of these slices getting developed soon!

#42 Updated by Zachary Stern 4 months ago

  • Keywords changed from communitypatch to communitypatch customer

#44 Updated by Anonymous 4 months ago

Redmine Issue #2198 has been migrated to JIRA:

https://tickets.puppetlabs.com/browse/PUP-1061
