The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #14410

role prefetching needs to be implemented to be more scalable

Added by Dan Bode almost 2 years ago. Updated almost 2 years ago.

Status: Merged - Pending Release
Start date: 05/10/2012
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Spent time: -
Target version: -
Keywords:
Branch:


Description

Role prefetching fetches roles for the Cartesian product of users × tenants, so it quickly becomes very slow as the number of users and tenants increases.

The relevant code is at: https://github.com/puppetlabs/puppetlabs-keystone/blob/9a74a4dbb983544bf17a6ab35dd9188c47178224/lib/puppet/provider/keystone_user_role/keystone.rb#L109-127

The current loop is:

  • get all users
  • get all tenants
  • get the role for the user/tenant

But users are rarely in more than one tenant.

A first optimization would be to:

  • get all tenants
  • get users in the tenant (keystone user-list )
  • get the role for the user/tenant

I will propose this change later, but it might still take a long time with huge counts of users and tenants.
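To make the difference between the two loops concrete, here is a minimal Python sketch that runs both strategies against an in-memory stand-in and counts the lookups each one issues. Everything in it (the `FakeKeystone` class, its method names, the sample data) is hypothetical and only for illustration; it is not the provider's actual Ruby code.

```python
from itertools import product

# Hypothetical sample data: tenant -> {user: [role names]}.
MEMBERSHIP = {
    "tenant_a": {"alice": ["admin"]},
    "tenant_b": {"bob": ["Member"]},
    "tenant_c": {"carol": ["Member"], "alice": ["Member"]},
}


class FakeKeystone:
    """Hypothetical stand-in for keystone that counts every lookup."""

    def __init__(self, membership):
        self.membership = membership
        self.calls = 0

    def list_tenants(self):
        self.calls += 1
        return sorted(self.membership)

    def list_users(self):
        self.calls += 1
        return sorted({u for users in self.membership.values() for u in users})

    def list_users_in_tenant(self, tenant):
        self.calls += 1
        return sorted(self.membership[tenant])

    def list_roles(self, user, tenant):
        self.calls += 1
        return self.membership[tenant].get(user, [])


def prefetch_cartesian(ks):
    """Current loop: one role lookup for every (user, tenant) pair."""
    roles = {}
    for user, tenant in product(ks.list_users(), ks.list_tenants()):
        found = ks.list_roles(user, tenant)
        if found:
            roles[(user, tenant)] = found
    return roles


def prefetch_per_tenant(ks):
    """Proposed loop: only look up users that belong to each tenant."""
    roles = {}
    for tenant in ks.list_tenants():
        for user in ks.list_users_in_tenant(tenant):
            roles[(user, tenant)] = ks.list_roles(user, tenant)
    return roles
```

With this sample data both functions return the same mapping, but the first issues 2 + users × tenants lookups while the second issues 1 + tenants + memberships, which is where the saving comes from when most users sit in a single tenant.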

History

#1 Updated by François Charlier almost 2 years ago

Hello, I’m currently working on a fix for this issue.

#2 Updated by François Charlier almost 2 years ago

Proposed fix in https://github.com/puppetlabs/puppetlabs-keystone/pull/49

On my dev env with:

  • 188 users
  • 173 tenants
  • 1 user per tenant in general, fewer than 10 tenants with two users, 3 tenants with 3-8 users

Before the change, I stopped the puppet agent process after 2 hours, while it was still running.

After the change, the puppet agent takes only 7 minutes to run.

7 minutes is still slow, but it’s better.

The fact is that for each user/tenant/role/endpoint/… requested from keystone, the keystone process is run once and makes two requests to the keystone server (one to authenticate, the second to get the data). It could (IMHO) be improved further by using the REST API directly from Ruby.
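A rough cost model for the paragraph above, as a Python sketch. It assumes one keystone process per listing or role lookup and two HTTP requests per process, as described in this comment; the function names are hypothetical, and the single-authentication variant is only an assumption about what a direct REST client could do by reusing its token.

```python
REQUESTS_PER_CLI_CALL = 2  # one to authenticate, one to get the data


def cli_calls_cartesian(users, tenants):
    """Old loop: list users, list tenants, then one lookup per pair."""
    return 2 + users * tenants


def cli_calls_per_tenant(tenants, memberships):
    """New loop: list tenants, list users per tenant, one lookup per membership."""
    return 1 + tenants + memberships


def http_requests_via_cli(cli_calls):
    """Every keystone process authenticates again before fetching."""
    return cli_calls * REQUESTS_PER_CLI_CALL


def http_requests_with_token_reuse(data_requests):
    """Assumed REST client: authenticate once, then reuse the token."""
    return 1 + data_requests


# Counts from the dev env above: 188 users, 173 tenants.
calls = cli_calls_cartesian(188, 173)
print(calls)                         # 32526
print(http_requests_via_cli(calls))  # 65052
```

The per-tenant count depends on the total number of user/tenant memberships, which is not stated exactly here, so it is left as a parameter.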

I’ll try to crunch some numbers to evaluate if it would be interesting to do so.

#3 Updated by François Charlier almost 2 years ago

To get an idea of how much optimization is left, I wrote two example scripts to list all tenants/users/roles (the scripts: https://gist.github.com/2918442).

The shell script is roughly how the current first optimization of the module behaves. The python script is an idea of what could be achieved using the Keystone REST API directly.

With the same number of tenants/users as said above:

  • The shell script runs in ~ 3m20s on my machine (1m15s on the prod server)

  • The python script runs in ~ 30s on my machine (20s on our prod server), ~ 3.5 to 6.5 times faster.
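For readers who want a feel for the REST approach, here is a minimal Python sketch. It is not the script from the gist. It assumes the Keystone v2.0 admin API paths (`/tokens`, `/tenants`, `/tenants/{id}/users`, `/tenants/{id}/users/{id}/roles`) and a `base_url` ending in `/v2.0`; the function names are hypothetical. The point is that it authenticates once and reuses the token for every later request.

```python
import json
import urllib.request


def authenticate(base_url, username, password, tenant_name):
    """POST credentials once and return the token id (assumed v2.0 format)."""
    body = json.dumps({
        "auth": {
            "passwordCredentials": {"username": username, "password": password},
            "tenantName": tenant_name,
        }
    }).encode("utf-8")
    req = urllib.request.Request(
        base_url + "/tokens",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access"]["token"]["id"]


def make_get_json(base_url, token):
    """Return a GET helper that reuses the same token for every request."""
    def get_json(path):
        req = urllib.request.Request(
            base_url + path, headers={"X-Auth-Token": token}
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return get_json


def list_all_roles(get_json):
    """Walk tenants -> users in tenant -> roles, one GET per step."""
    roles = {}
    for tenant in get_json("/tenants")["tenants"]:
        users = get_json("/tenants/%s/users" % tenant["id"])["users"]
        for user in users:
            path = "/tenants/%s/users/%s/roles" % (tenant["id"], user["id"])
            names = [role["name"] for role in get_json(path)["roles"]]
            roles[(user["name"], tenant["name"])] = names
    return roles
```

Usage would look like `list_all_roles(make_get_json(base_url, authenticate(base_url, user, password, tenant)))`. Because `list_all_roles` only needs a callable that maps a path to parsed JSON, it can also be exercised without a server by passing a stub.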

#4 Updated by Dan Bode almost 2 years ago

  • Status changed from Unreviewed to Merged - Pending Release

I merged the commit with the improvements using the keystone command line tool in commit: 618eaae6c239b
