The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Feature #11666

Add is_ec2 and is_euca facts

Added by James Turnbull almost 3 years ago. Updated 3 months ago.

Status: Needs Decision
Start date: 01/02/2012
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: library
Target version: -
Keywords:
Affected Facter version:
Branch: https://github.com/puppetlabs/facter/pull/138


This ticket is now tracked at: https://tickets.puppetlabs.com/browse/FACT-712


Description

Sometimes you just want to know if a host is EC2 or Eucalyptus. This adds two facts in a similar vein to “is_virtual”.


Related issues

Related to Facter - Feature #11640: Support EC2 facts on OpenStack (Merged - Pending Release, 12/29/2011)

History

#1 Updated by James Turnbull almost 3 years ago

  • Status changed from Unreviewed to In Topic Branch Pending Review
  • Branch set to https://github.com/puppetlabs/facter/pull/138

#2 Updated by Ken Barber almost 3 years ago

  • Target version set to 186

I like the idea.

Wouldn’t it be nicer to have one fact type that returns a string – that people can throw into a case statement? Like ‘cloud_provider’ or somesuch? Otherwise users are forced to create if is_foo elsif is_bar elsif is_baz type structures … etc.
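As a rough illustration of the trade-off described here, the sketch below contrasts the two styles in plain Ruby. It is not Facter code: the `cloud_provider` value, the method names, and the returned symbols are all hypothetical, and a Puppet manifest would express the same thing with its own case statement.

```ruby
# Hypothetical sketch: choosing behaviour from one string-valued fact
# versus a chain of boolean facts. None of these names exist in Facter.

# Style 1: one string fact, usable in a single case statement.
def metadata_source(cloud_provider)
  case cloud_provider
  when 'ec2', 'euca' then :ec2_style_api
  when nil           then :none         # fact absent: provider unknown
  else                    :unsupported  # a provider we have no handling for
  end
end

# Style 2: separate booleans force an if/elsif chain that grows
# by one branch for every provider added.
def metadata_source_from_booleans(is_ec2, is_euca)
  if is_ec2
    :ec2_style_api
  elsif is_euca
    :ec2_style_api
  else
    :none
  end
end
```

With the string form, adding a provider means adding one `when` branch; with booleans, every consumer has to learn about a new fact name.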

#3 Updated by James Turnbull almost 3 years ago

Sure – I am not precious about how it’s done – it was just tripping me up when testing something. :)

#4 Updated by Ken Barber almost 3 years ago

Yeah fair enough. I’ve wanted something like this. This is going to show up the xen bug, as xen instances match the ec2 arp stuff (and we make a subsequent http request to a non-existent api). I suppose it makes little difference as the problem already exists with xen.
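The two-step check mentioned here (an ARP match followed by an HTTP request that may never be answered) could be sketched roughly as below. This is only an assumption about the shape of the logic, not Facter's implementation: `probe_with_timeout` and `ec2_candidate?` are hypothetical names, the 0.2 second default is arbitrary, and the metadata request is passed in as a block so the sketch needs no network access.

```ruby
require 'timeout'

# Hypothetical helper: run the given probe block, returning nil if it
# does not finish within the time limit.
def probe_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  nil
end

# Hypothetical check: only attempt the (possibly slow) metadata probe
# when the cheap ARP test already matched. A Xen guest that is not on
# EC2 passes the ARP test, then pays for the full timeout here.
def ec2_candidate?(arp_match, seconds = 0.2, &metadata_probe)
  return false unless arp_match
  !probe_with_timeout(seconds, &metadata_probe).nil?
end
```

The cost of a false ARP match is the whole timeout on every run, which is why the Xen case is noticeable.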

#5 Updated by James Turnbull almost 3 years ago

Does this conflate with the virtual facts anyway? Hmmm perhaps not.

#6 Updated by Anonymous almost 3 years ago

James Turnbull wrote:

Does this conflate with the virtual facts anyway? Hmmm perhaps not.

I am concerned it will have the same problem as the virtual fact: that you could, eg, be a euca node, and a host for a container stack, or something like that. Specifically, that this is a set of booleans, not a single value, where one string ends up packing it down in ways that it shouldn’t.

#7 Updated by Ken Barber almost 3 years ago

  • Status changed from In Topic Branch Pending Review to Needs Decision

I’m gonna change this to ‘Needs Decision’ for now – I think the need is defined, just the implementation needs a decision.

James Turnbull wrote:

Does this conflate with the virtual facts anyway? Hmmm perhaps not.

Not really – but it’s close. Having said that – the direction of the virtual flag is a funny one. You could have a ‘is_cloud?’ type of thing …

Anonymous wrote:

I am concerned it will have the same problem as the virtual fact: that you could, eg, be a euca node, and a host for a container stack, or something like that. Specifically, that this is a set of booleans, not a single value, where one string ends up packing it down in ways that it shouldn’t.

This is fair enough. You could always provide an array of containers though – comma separated.

So yes – there is a slim chance but you could have an OpenStack node running using just QEMU on an EC2 instance. What is the right answer then I wonder? And since the use-case is primarily the EC2 API – does it still have meaning to the embedded node in that case? If you could somehow emulate the API in such a case it would still work regardless. In the cases where say it was OpenStack inside Linode you would only need OpenStack knowledge to judge whether an API call is needed.
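For the nested case discussed above, a comma-separated value might look something like the following sketch. The method names and provider strings are hypothetical; the point is only to show how a single string fact could still carry more than one provider.

```ruby
# Hypothetical sketch: packing several detected providers into one
# comma-separated fact value, and unpacking it on the consumer side.

def pack_providers(providers)
  return nil if providers.empty?  # report no fact when nothing was detected
  providers.join(',')
end

def unpack_providers(value)
  value.to_s.split(',')           # nil becomes an empty list
end

# An OpenStack node running on an EC2 instance.
value = pack_providers(%w[ec2 openstack])
needs_ec2_api = unpack_providers(value).include?('ec2')
```

A consumer that only cares about the EC2-style API can test for membership rather than comparing the whole string.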

I’m just going to shove down some random notes – sorry – just thinking about this issue a bit lately (see #11566). I believe ultimately to solve the issue James has had – we only really want to worry about: EC2, OpenStack (we don’t support this yet – but there is a ticket for it) and Eucalyptus … and then any other platform that comes along and supports the API.

But stepping back and thinking about ‘cloud identification’ what are the possibilities here?

Privates:

  • OpenStack
  • Eucalyptus
  • OpenNebula
  • VMware vCloud (how does one distinguish this from just a VMware box I wonder?)

Publics:

  • EC2 (this actually has the same problem as vCloud, but for Xen; we work around it horribly with a timeout)
  • Rackspace
  • Linode
  • … others

Implementation aside … what other use-cases are there for this kind of thing?

Two negative points on trying to do this properly:

  • I think identifying a cloud type is really really hard in a lot of cases since they are mainly just sitting on another virt platform (and we all know it’s hard enough just to figure that thing out). OpenNebula & OpenStack at least both nab MAC address prefixes which is much much easier (but I notice they aren’t reserving them with the IEEE I believe).
  • Also – what qualifies as a ‘cloud’ anyway? Look at the vCloud example. The only differentiator is the way VMs are launched.

Trying to do it properly could turn into the virtual fact debacle – without good techniques people will just consider the thing buggy. Since the logic in virtual is to assume ‘physical’ if none of the general forensics match – users assume we are doing something amazing to come to that conclusion. I think falling back to nil (i.e. no fact) will set the expectation properly as this is our de facto ‘unknown’ return (maybe we should consider this for the virtual fact :-). Now that I think about it – this actually puts weight on a single fact (list or singular) as opposed to a set of booleans for this fact as well perhaps.
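The "fall back to nil" idea, combined with the MAC-prefix technique mentioned in the list above, could be sketched roughly like this. Everything here is an assumption for illustration: `cloud_from_mac` is a hypothetical name, and the two prefixes are what are believed to be the out-of-the-box defaults for OpenStack and OpenNebula, which any deployment can override.

```ruby
# Hypothetical sketch: MAC-prefix matching that returns nil ("unknown")
# rather than guessing when nothing matches.

# Assumed default prefixes; deployments can change them, so treat these
# as examples to verify, not as an authoritative list.
MAC_PREFIXES = {
  'fa:16:3e' => 'openstack',
  '02:00'    => 'opennebula'
}.freeze

def cloud_from_mac(mac)
  normalized = mac.to_s.downcase
  match = MAC_PREFIXES.find { |prefix, _| normalized.start_with?(prefix) }
  match && match.last  # nil when no prefix matches, so no fact is reported
end
```

Returning nil instead of a default such as 'physical' or 'none' keeps an unmatched host from being read as a positive result.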

Thoughts?

#8 Updated by Oliver Hookins almost 3 years ago

Every cloud provider will be using a form of virtualisation or containerisation which is available outside of cloud use cases (although I can’t say I’m the authority). Xen DomU could just as easily be my laptop as it could be EC2. This just screams to me that overloading the virtual fact is the wrong way to go about this.

Perhaps figuring out if a node is being used in a cloud is hard – too bad, these providers miss out being classified. At the very least it deserves another fact which should list the cloud provider; if and only if it can be reliably determined.

#9 Updated by Ken Barber almost 3 years ago

Oliver Hookins wrote:

Every cloud provider will be using a form of virtualisation or containerisation which is available outside of cloud use cases (although I can’t say I’m the authority). Xen DomU could just as easily be my laptop as it could be EC2. This just screams to me that overloading the virtual fact is the wrong way to go about this.

Sorry just to be clear, I agree with this. I was just using virtual as an example of a similar implementation.

Perhaps figuring out if a node is being used in a cloud is hard – too bad, these providers miss out being classified. At the very least it deserves another fact which should list the cloud provider; if and only if it can be reliably determined.

I concur – let’s just do it.

So the question pending is choosing implementation:

  • is_euca, is_ec2 style matching. Different clouds get a new is style fact. Perhaps is_cloud to make it meaningful.
  • cloudprovider/cloudtype/cloudsomething so we can shove it in a case statement. If there is for some strange reason a need for 2 or more providers we munge it with a comma (seems unlikely but strange things have happened).

I’m obviously more of a fan of the second type myself … but I’d like to hear what people have to say.

#10 Updated by Anonymous almost 3 years ago

Ken Barber wrote:

Oliver Hookins wrote:

Perhaps figuring out if a node is being used in a cloud is hard – too bad, these providers miss out being classified. At the very least it deserves another fact which should list the cloud provider; if and only if it can be reliably determined.

I concur – let’s just do it. So the question pending is choosing implementation:

  • is_euca, is_ec2 style matching. Different clouds get a new is style fact. Perhaps is_cloud to make it meaningful.

I would much rather we push this sort of aggregate decision to the end user, so they are aware of the limitations. It isn’t like there is much headache doing that in the Puppet DSL, where a fairly simple if-then tree would be sufficient implementation.

Offering only the specific boolean “is it this” facts means that people can’t, eg, assume that we have support for all clouds because we support some clouds.
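A user-side aggregate of the kind argued for here might be sketched as follows, in plain Ruby rather than the Puppet DSL. The fact names come from the ticket title; the helper names are hypothetical, and the assumption that fact values arrive as the strings 'true' and 'false' should be checked against the Facter version in use.

```ruby
# Hypothetical sketch: the end user, not Facter, decides what counts as
# "a cloud" by combining only the boolean facts they know about.
# Assumption: fact values arrive as strings such as 'true' / 'false'.

def truthy?(fact_value)
  fact_value.to_s == 'true'
end

# The list of facts consulted is explicit, so it is obvious which
# providers are covered and which are not.
def on_known_cloud?(facts)
  truthy?(facts['is_ec2']) || truthy?(facts['is_euca'])
end
```

Because the aggregate lives in the user's own code, a false result plainly means "not one of the providers I checked" rather than "not in any cloud".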

#11 Updated by Anonymous almost 3 years ago

  • Target version deleted (186)

#12 Updated by Robin Bowes 3 months ago

Redmine Issue #11666 has been migrated to JIRA:

https://tickets.puppetlabs.com/browse/FACT-712
