Category Archives: Uncategorized

Ansible Tower Dynamic Inventory from CloudForms

In my previous post I showed an example of how as part of the provisioning of a service CloudForms could be integrated with Ansible Tower to provide greater re-usability and portability of stacks across multiple infrastructure/cloud providers. Now I would like to show an example of how Ansible Tower’s Dynamic Inventory feature can be used in conjunction with the inventory in CloudForms to populate an inventory that can have job templates executed on them. Right now CloudForms has hosts and virtual machines in it’s inventory that would be useful to Ansible, but in the next version container support will allow CloudForms to pass Ansible an inventory of containers as hosts as well (that will be really interesting).

For those not familiar, Dynamic Inventory is a feature in Ansible Tower that allows users to maintain an inventory of hosts based on the data in an external system (LDAP, cobbler, CMDBs, EC2, etc) so they can integrate Ansible Tower into there existing environment instead of building a static inventory inside Ansible Tower itself. Since CloudForms can maintain discovery of existing workloads across many providers (vSphere, Hyper-V, RHEV, OpenStack, EC2 to name a few) it seems natural that it would be a great source of providing a dynamic inventory to Ansible for execution of job templates.

I authored the script to allow users to build a dynamic inventory in Ansible Tower that comes from CloudForms virtual machine inventory. This means that any time a user provisions a VM on vSphere, Hyper-V, OpenStack, RHEV, EC2, or other supported platform – CloudForms will automatically discover that VM and Ansible Tower will have it added to an inventory so it can be managed via Ansible Tower.

To use the script simply navigate to Ansible Tower’s setup page and select “Inventory Scripts”.

Screen Shot 2015-11-17 at 8.18.25 PMFrom there select the icon to add a new inventory script. You can now add the inventory script and name it and associate it with your organization.

Screen Shot 2015-11-17 at 8.19.05 PM

You should now see the added inventory script.

Screen Shot 2015-11-17 at 8.18.57 PM

Now within the inventory of your choosing add a new group named “Dynamic_CloudForms” and change the source to “Custom Script” and select your newly added script within “Custom Inventory Script”.

Screen Shot 2015-11-17 at 8.29.03 PM

Your new group should be added to your inventory

Screen Shot 2015-11-17 at 8.29.14 PM

One last thing needed by the script is the cloudforms.ini file. This file holds things like the hostname of your CloudForms instance, the username and password to use to authenticate, and other information. You’ll need to place this on your Ansible Tower server in /opt/rh/cloudforms.ini. I also found I had to install the request service on the Ansible Tower server (`yum install python-pip -y; pip install requests`).

Now you should be able to run the “start sync process” manually from the inventory screen (it’s an icon that looks like two arrows pointing opposite directions). You could also schedule this sync to run on a recurring basis.

Screen Shot 2015-11-17 at 9.44.48 PM

And Voila! Your inventory has been populated with names from CloudForms. The script will take into account some VMs may be powered down or actually be templates and add only those machines with a power_state of “on”.

It should be noted that the inventory script currently adds the VM based on it’s name in CloudForms. This is because my smart state analysis isn’t set up in my appliance and I don’t have any other fields available to me. What should be done eventually (when smart state is working) is changing the script to query for the ip address or hostname field of the VM. One more thing … I skipped a lot of the security checks on certificates (probably not a good thing). It shouldn’t be difficult to alter the script and python requests configuration to point to your certificates for a more secure experience.




Tagged , , , ,

Ansible and CloudForms: Do you want to Deploy More Stacks Faster? Sure, We All Do!

Do you want to deploy more stacks faster? Sure, we all do.

By integrating Ansible Tower, Red Hat CloudForms, and Red Hat Satellite it’s possible to deploy stacks faster, more securely, and manage them after they are deployed. In this post, I’ll give a brief demonstration of what is possible when these systems are integrated. But first …

A Quick Background

For those that haven’t been following, Red Hat recently announced that it has entered an agreement to acquire Ansible. Ansible is a leading open source IT automation project and delivers an enterprise solution for IT automation via Ansible Tower.

CloudForms is a hybrid cloud management platform based on the ManageIQ community (Developed by ManageIQ who was acquired by Red Hat in 2012). CloudForms provides a myriad of functions that include:

  • Monitoring and tracking
  • Capacity management and planning
  • Resource usage and optimization
  • Workload life-cycle management
  • Policies to govern access and usage
  • Self-service portal and catalog
  • Controls to manage requests
  • Quota enforcement and usage
  • Chargeback and cost allocation
  • Automated provisioning

Finally, there is Red Hat Satellite, a platform for managing Red Hat systems. While Satellite provides many capabilities for managing systems, in this demonstration I focus on Satellite’s ability to provide trusted content and for tracking entitlements.

By combining these three powerful platforms it is possible to provide new levels of functionality to users who want to securely automate their IT environments and do it all with open source.

Integration Workflow

The diagram below illustrates one integration now possible that would allow users to combine the power of CloudForms, Red Hat Satellite, and Ansible Tower.


Step 0 – We assume that you have already created a playbook in GitHub and added it via a SCM synchronization job in Ansible Tower. It also assumes you have synchronized trusted content to a Red Hat Satellite.

Step 1 – A user requests a self-service catalog item from CloudForms.

Step 2 – CloudForms connects to the provider and creates the virtual machine(s).

Step 3 – Upon successful creation of virtual machines CloudForms reaches out to Ansible Tower and creates host(s) in the inventory to match the virtual machine(s) created. It also initiates a job on Ansible Tower to execute the appropriate playbook(s).

Step 4 – The virtual machine(s) subscribes to the Satellite and pulls trusted content from it as part of the playbook.

This is a high level overview for a tenant workflow.

A Short Demonstration


The video above illustrates something I cooked up in the lab illustrating the integration workflow I described in the previous section. In this case, a user selects a self-service catalog item in CloudForms for a web server. CloudForms provisions a virtual machine on Red Hat Enterprise Virtualization. CloudForms provisions the virtual machine (from a template) and passes in a ssh key into the machine via cloud-init. Then CloudForms reaches out to Ansible Tower and adds the VM (by IP Address) to an inventory and kicks off a job manually.

The nice thing about this approach is that by using an Ansible playbook to automate the deployment of a web server it would be very easy to create another self-service catalog item on vSphere, OpenStack, or other supported infrastructure provider and recreate the same workload. With CloudForms, Ansible, and Satellite users can deploy via workflow where needed or embrace model driven deployment to increase re-usability across a wide range of infrastructures when possible.

Of course, it would be really nice to integrate identity management into this demonstration so that credentials are not being injected via cloud-init and so credentials in Ansible Tower are centralized into a proper IDM system. Also, integration into a proper IPAM system would be nice (but hey, this is just a demo).


I hope this demonstration provided you with an idea of how Ansible Tower compliments Red Hat CloudForms and Red Hat Satellite to allow for automation of stacks. It should be noted that another key is that the more automation that takes place in playbooks in Ansible, the more portable (and presumably more maintainable) it is for end users.


As is usually always the case with most all things I’ve written … you should have a professional software developer, creative person, or lawyer re-write it as appropriate.

Ansible Playbook

Cloud-init script

CloudForms Automate Method (LabCorp/Infrastructure/VM/Provisioning/StateMachines/Methods/redhat_PostProvision)

Contents of /root/ on the VM template (referenced in Ansible Playbook)

DevOps in a Bi-Modal World

The business environment has never been more competitive and disruptive than it is today. Businesses need to come to terms with three realities:

  1. They need a continuous competitive advantage

Just ask Kodak who has seen the camera business transform from a standalone device to a feature on every mobile phone with new players like Snapfish, Shutterfly, and Chatbooks creating new ways of engaging with markets. If you don’t have a way of continually developing new competitive advantages you will not be relevant for long.

  1. They are a software company

Bank of America is not just a bank, they are a transaction processing company. Exxon Mobil, is not only an oil and gas company, they are a GIS company. With each passing day Walgreens business is more reliant on electronic health records.

  1. Their competition is everywhere

Ten years ago if I asked you who the biggest competitive threats were to Fedex names like UPS, and DHL might come to mind. Increasingly Fedex, UPS, and DHL face threats from Uber, Walmart, Amazon, and others who may enter their market of logistics with new ways of reaching customers.

What do businesses need to do given these three realities?

To quote Mark Zuckerberg, they need to “Move fast and be stable”. Moving fast and being stable can be translated to more quickly developing new services that could be scaled to meet fast growing demand if needed but also with an extremely low cost of failure should they not work. In other words, cheap experiments need to be able to become global successes.

The scientists conducting these cheap experiments are software developers. Lines of business naturally turn towards their development teams to request new services at an increasingly faster rate. The problem is, developers can’t obtain those environments fast enough from operations because traditional processes and non-flexible infrastructure and applications stand in the way. It’s no surprise then, according to a 2012 McKinsey and Company study [1], that software delivery in the enterprises surveyed was 45% over budget, 7% over on time, and has 56% less value than expected when delivered.

This is no secret to businesses and they are looking to new methods and designs to help improve these metrics. In fact:

  • Over 90 percent are running or experimenting with Infrastructure as a Service [2].
  • Greater than 70 percent expect to use Platform as a Service in their organization [3].
  • More than 90 percent expect new investments in DevOps enabling technologies in the next two years [4].
  • Over 70 are using or anticipate using containers for cloud applications.

Businesses are turning to new development and operations processes, new cloud infrastructures, and application methodologies that are conducive to these new processes and infrastructures. Looking at one of the leaders in public cloud, Amazon Web Services, we see they use these same principles and designs to achieve upwards of 10,000 releases per hour (as much as a release every 12 seconds) with a very low outage rate caused by these releases.

At first glance it would appear that enterprises could simply yell “DevOps and Cloud to the Rescue!” and solve their problem of deploying faster on scalable infrastructure, but the reality is far from that. Enterprises have existing assets and investments, and many of these are not going away anytime soon. In fact, the existing systems and processes most likely power the very core of the business and cannot simply be replaced over night nor would they fit the paradigm of moving quickly and experimenting. Gartner coined the term Bi-modal to describe this approach of two modes of delivery for IT – one focused on agility and speed and the other on stability and accuracy.

Gartner has also recognized an approach that enterprises can take that would allow them to maximize the use of their existing assets. In their research “DevOps in the Bimodal Bridge” [7] they suggest an approach where the patterns and practices of DevOps can be applied to existing assets (mode 1) to make it more agile and efficient.

I have observed this trend and I believe most organizations are trying to address four key problems across their emerging bi-modal world.

In mode-1 they are looking to increase relevance and reduce complexity. In order to increase relevance they need to deliver environments for developers in minutes instead of days or weeks. In order to reduce complexity they need to implement policy driven automation to reduce the need for manual tasks.

In mode-2 they are looking to improve agility and increase scalability. In order to improve agility they need to create more agile development and operations processes and embrace new application architectures that allow for greater rates of change through decreased dependencies. In order to increase scalability they need to implement infrastructure that utilizes an asynchronous design and is entirely API driven in order to change the admin to host ratio from a linear to an exponential model in order to increase scalability.

In order to make these examples more concrete, let’s look at each of them in more detail.

Increasing Relevance by Accelerating Service Delivery

Delivering development and test environments to developers in many enterprises generally starts with either a request to a service management system or a tap on the shoulder of a system administrator. This usually depends on the size of the organization and maturity of the IT department. Either way, once requests fall into a service management system there are often many teams that need to perform tasks to deliver the environment to the developer. These might include virtual infrastructure administrators, systems administrators, and security operations. In larger organizations you could expect to see disaster recovery teams, networking teams, and many others involved in this process too. Again, depending on the maturity of the organization how all of this is coordinated could range from taps on shoulders to passing tickets around in a service management system.

At best each team takes minutes or hours to respond and perform some manual tasks and often the person who requests the service must be asked follow up questions (“Are you sure you need 16GB of RAM?”, “What version of Java do you need for this?”). The result is lots of highly skilled people spending lots of time and very slow delivery of this environment to the developer. Multiply this by the number of developers in an organization and the number of requests for environments and you can understand why traditional IT processes and systems are struggling to maintain relevance.

A solution for this problem is to introduce a service designer into the process (you may be familiar with this from ITIL) that can enable self-service consumption of everything developers need. The designer works with all stakeholders including virtual infrastructure administrators, system administrators, and security operations to obtain requirements. Then, the designer builds the necessary configuration management content and couples it with a service catalog item. By invoking this catalog item the environment can be deployed automatically across any number of providers including virtualization providers, private, or public cloud.

The result of this solution is that all the teams responsible for delivering an environment are now free to do more valuable work (like working with development to design operations processes that work as part of development instead of being bolted on after). It also removes human error from the equation, and most importantly, it delivers the environment in significantly less time. We have seen upwards of a 95 percent improvement in delivery times in many of our customers [9].

Reducing Complexity by Optimizing IT

Speeding up delivery of environments to developers or end users is a great way to make IT more relevant, but a lot of what IT is spending their time on is the day-to-day management of those environments. If IT is spending so much time on day-to-day tasks how can we expect them to deploy the next generation of scalable and programmable infrastructure or have time to work with development teams during early stages of development to increase agility?

I have found that many virtual infrastructure administrators spend time on several common tasks that should be largely automated through policy.

First, are policies around workload placement. Often one virtual infrastructure cluster will be running hot while another one is completely cold. This leads to operations teams being inundated by calls from the owners of applications running on the hot cluster asking why response times are poor. Automating this balancing through control policies can alleviate this problem and keep virtual infrastructure administrators free to other things.

Next is the ability to quickly move workloads between different infrastructures. This has become increasingly important as organizations looks to adopt scale-out IaaS clouds. Operations leadership realizes if they can identify workloads that do not need to run on (typically) more expensive virtual infrastructure they could save money by moving those workloads to their IaaS private cloud. This migration is typically a manual process and it’s also difficult to even understand what workloads can be moved. By having a systematic and automated way of identifying and migrating workloads enterprises can save time and move workloads quickly to reduce costs.

Yet another issue is ensuring compliance and governance requirements are met, particularly with workloads running on new infrastructures, like an OpenStack based private cloud. Not knowing what users, groups, data, applications, and packages reside on systems running across a heterogeneous mix of infrastructure presents a large risk and operations teams often have the responsibility and obligation of ensuring this risk is minimized. By being able to introspect workloads across platforms operations teams can gain insight into exactly what users, data, and packages are running on systems and leverage the migration capabilities I mentioned previously to make sure systems are running on appropriate providers.

Finally, since IT has often become a broker of public cloud services it’s important that they can account for costs and place workloads on appropriate regions in the public cloud to control costs while also ensuring service levels for end users are maintained. If developers are based in Singapore then we should leverage public cloud infrastructure in that location instead of deploying to a more expensive and more latent public cloud infrastructure in Tokyo.

By implementing policy based automations our customers have seen large improvements in their resource utilization and a reduction in CapEx and OpEx per workload managed [10].

Improving Agility by Modernizing Development and Operations

With resources now free from handling each and every inbound request for an environment and being confident that those environments are running efficiently and securely on the right providers operations teams can begin to work with development teams to design new processes for their cloud native applications.

These newly designed processes and cross-functional team structure combined with a platforms that supports running the broadest amount of languages and frameworks within microservices based architectures will enable the development and operations teams to achieve higher release frequencies. By utilizing microservices and standardized platforms and configurations these new applications will allow for independent release and scaling of components of the application.

This results in an increased success rate of change, faster cycle time, and the ability to scale specific services independently, making the life of both development and operations teams easier and allowing them to meet the needs of the line of business. We have experience doing this with very large software development organizations [11].

Scalability with Programmable Infrastructure

As agility of development and operations processes is improved and release frequency increased, so to does the demand for more scalable infrastructure to run those releases on. Operations teams face the challenge of delivering infrastructure that will scale to meet the demand of this ever-growing number of applications. The last thing the head of operations would like to have to explain to the management team of a company is why an extremely successful new application was hitting a wall as to the maximum number of users it could support. This simply can’t happen. Unfortunately, the current infrastructure is not scalable, neither from a financial nor technical standpoint.

One option might be to build out a scale-out infrastructure, perhaps based on OpenStack, the leading open source project for infrastructure-as-a-service. However, the operations team doesn’t want to spend it’s time taking open source code and making it consumable and sustainable for the enterprise. It doesn’t have the resources to test and certify that OpenStack will work with each new piece of hardware it brings in. It also can’t afford to maintain the code base for long periods of time with the resources available. Finally, OpenStack is missing key features that operations needs and they don’t want to develop those in house as well.

What operations really needs is a way to minimize cost and increase scale through the use of commodity hardware and a massively scalable distributed architecture coupled with the enterprise management features required to operate that infrastructure and a stable, tested, certified way of consuming the open source projects that make up that infrastructure. By having this, operations can deploy scale-out infrastructure in multiple locations and still aggregate management functions like chargeback, utilization, governance, and workflows into a single logical location. Many of our customers have found this solution beneficial in reducing cost and ensuring stability at scale [12].

Introducing Red Hat Cloud Suite

Red Hat Cloud Suite is a family of suites from Red Hat that brings together all the award winning products from Red Hat in a consistent way to solve specific problems. It allows IT to accelerate service delivery and optimize their existing assets while allowing them to build their next generation infrastructure and application platforms to support massive salability and more agile development and operations processes. In other words, it meets them where they are and lays the foundation for where they want to go.

A Different Approach

It should come as no surprise to you that Red Hat is not the only company solving these problems. Red Hat is, however, one of the few companies that can solve all of these problems because of its broad portfolio of technologies and expertise. Most think of Red Hat as having the largest percentage of the paid Linux market share. That is true, but Red Hat has been adding to its portfolio and has grown acquired expertise and industry leading technology from Software Defined Storage [13][14] to Mobile Development Platforms [15]. These offerings place Red Hat alone with only Microsoft in terms of depth of capability.

An Important Difference

Along with this depth of expertise and capability comes an approach that sets Red Hat apart. Red Hat is the only vendor that uses an open source development model for all of the solutions it delivers. This is important for customers because the world of cloud infrastructure and applications and DevOps is built entirely on open source software. By having a strict open source only mentality customers can have access to the greatest amount of innovation and be ensured that as technologies change they could adopt them more easily because Red Hat can adopt and deliver these technologies. Two great examples of this are how Red Hat adopted the KVM hypervisor [16] and embraced and delivered it’s container platform with support for Docker and Kubernetes [17] – leading open source projects that become popular in a short amount of time. Red Hat is committed to the open source development model, so much so that it even creates communities when it acquires non-open source licensed technologies [18]. Customers should know that when they leverage a solution from Red Hat it is based entirely on open source, leading to greater access to innovation and lower exit costs.

Technical Capabilities are Important Too

While philosophical differences are important for ensuring that the right long term decisions have been made, Red Hat is also at the forefront of innovation in cloud infrastructure, applications, and DevOps tools.

True Hybrid Support

The term hybrid cloud has often been over used and abused, but it is important. Enterprises need to be able to run workloads across the four major deployment models that exist today: physical, virtual, private, and public cloud. Equally as important to the deployment model is the ability to support multiple service models, such infrastructure-as-a-service, platform-as-a-service, or even bare metal, virtual machines running on scale-up virtual infrastructure, and public cloud services. When most vendors claim they support “hybrid” cloud they are typically limited to only managing hybrid deployment models. Red Hat supports both hybrid deployment models and hybrid service models. This is important to both Development and Operations teams. For developers, it means being able to develop on the broadest choice of languages and frameworks. They could use an Oracle database running on bare-metal or virtual machine, JBoss EAP running on virtual machines on OpenStack, combined with Node.js and Ruby running in Containers on OpenShift. They are not constrained to a single service model that doesn’t give them everything they need.

Using Big Data to Optimize IT

Red Hat has been supporting Linux for a long time. In fact, we’ve been supporting Red Hat Enterprise Linux for over 13 years since RHEL AS 2.1’s release in 2002 [16]. There are over 700 Red Hat Certified Engineers in our support organization and they’ve documented over 30,000 solutions while resolving over 1 million technical issues. The Red Hat customer portal has won plenty of awards for helping connect customers searching for resolution to an issue to the right technical solution. With Red Hat Access Insights, Red Hat’s new predictive analytics service, connecting support data to recommendations is going to reach a new level of ease of use. Users can send small amounts of data about their environment back to Red Hat and it will be compared to optimal configurations to find opportunities to improve security, reliability, availability, and performance. This service is already available for Red Hat Enterprise Linux and will soon be available for all the technologies in Red Hat’s portfolio through Red Hat Cloud Suite.

An Easy On-ramp and Consistent Lifecycle

Deploying a private cloud is not an easy task. The list of platforms that need to come together from configuration management, to storage, to infrastructure-as-a-service, to platform-as-a-service is large. Each of these has dependencies on sub-components within each of these platforms. For example, to generate new docker images need secure content and that takes integration between the content management system and the image building services. Literally hundreds of these integrations are needed to build a fully functional private cloud. This usually results in one of two options:

  • Operations requiring lots and lots of time to deliver this private cloud.
  • An army of high priced consultants arriving to deliver and maintain a private cloud.

Neither of these options are an optimal results for IT. Red Hat Cloud Suite provides an easy on-ramp that allows a single person in operations to deploy a private cloud and it provides the path for ongoing management of that private cloud. This allows developers to begin using the private cloud more quickly and helps operations deliver a private cloud more quickly.

A Quick Summary

Here is a quick summary for those that just want the cliff notes.

The World is Changing

  • Businesses need a continuous competitive advantage
  • All businesses are software companies
  • Competition is everywhere

IT Needs

  • To increase relevance and reduce complexity
  • To create more agile processes and build programmable & scalable infrastructure and platforms

Red Hat Helps

  • Accelerate delivery
  • Optimize for efficiency
  • Modernize development and operations
  • Deliver scalable infrastructure

Only Red Hat Delivers

  • Innovation in the form of pure open source solutions
  • Integration with world class testing, support, and certification





[4] DevOps, Open Source, and Business Agility. Lessons Learned from Early Adopters. An IDC InfoBrief, sponsored by Red Hat | June 2015















Red Hat Cloud Suite for Applications

For those following our recent announcement, I put together a short blog post that explains why Red Hat Cloud Suite for Applications is the only on-premise complete and open source solution for accelerating application delivery at scale.

Docker all the OpenStack Services Presentation

Slides from the presentation Brent Holden and I gave at OpenStack Summit can be downloaded here.

A Demonstration of Kolla: Docker and Kubernetes based Deployment of OpenStack Services on Atomic

The Problem

Screen Shot 2014-10-22 at 8.39.48 AM

The Beauty of OpenStack

OpenStack is a thing of beauty, isn’t it? Just look at all those cleanly defined services, perfectly atomic, able to run standalone … it’s simply amazing. What more could developers and operators ask for in a cloud?

Screen Shot 2014-10-22 at 8.42.30 AM

The Reality of OpenStack

Except, that it’s not exactly like that. All those services heavily rely on each other and given the rate of change OpenStack is experiencing the degree of complexity only stands to increase. The problem is that OpenStack has many services that are dependent on one another and managing the lifecycle is difficult and inefficient because of this.

Let’s look at an example of updating the keystone service, OpenStack’s identity management service. It is difficult to know whether or not deploying a new version of Keystone into an existing OpenStack deployment will cause problems because of compatibility with others services. It’s also difficult to move backwards and expensive to roll back a deployment of a new keystone service with today’s tools. Operators don’t want to use extra racks of hardware to test an upgrade of a service if they can avoid it and no lifecycle management tools that try to imperatively deploy and roll back can do so as reliability as we’d like between OpenStack releases.

At this point you might conclude that I have a personal vendetta against OpenStack. Although this could be justified after the many nights I’ve spent installing, configuring, and upgrading OpenStack I can assure you that’s not the case. In truth, OpenStack is not a beautiful and unique snowflake. Lots of different infrastructure platforms face this same problem and so do many application platforms.

The Many Paths to OpenStack Lifecycle Management

Today, there are many ways to manage the lifecycle of OpenStack services, but the two most prevalent can be loosely grouped into two categories: build based and image based deployments.

Build based lifecycle management uses a build service, such as PXE, and is typically coupled together with a bunch of lifecycle management tools and  almost always uses some type of configuration management whether that’s Puppet, Chef, Ansible, or others.

Screen Shot 2014-10-22 at 9.40.38 AM

This approach is generally inefficient because each OpenStack service is placed onto a different physical piece of hardware or at least a different operating system.

Screen Shot 2014-10-22 at 9.45.48 AM

It is possible to combine multiple services on a single operating system, but this can get tricky. How does the lifecycle management tool know that OpenStack Service A in the image above won’t conflict with OpenStack Service B in terms of resources required, ports required, file systems, etc? It takes an awful lot of logic in a lifecycle management tool to know this and given the rate of change experienced in a community like OpenStack, lifecycle management tools have a hard time keeping up and delivering what users would like to deploy. Could virtual machines be used here? Possibly, but virtual machine are heavyweight and also lack rich metadata or require large infrastructures and agents loaded into those virtual machines to get metadata. In other words, VMs are too heavy and they also lack the concept of inheritance.

Screen Shot 2014-10-22 at 10.00.53 AM

Finally, build based deployments can be slow. Copying each package back and forth over the wire is not the most efficient way of deploying at scale.

Image based deployments solve the problem of slow performance that build based systems have by not requiring each package to be installed. Typically an image based system has some sort of image building tool that stores images in a repository and these images are then streamed down to physical hardware.

Screen Shot 2014-10-22 at 10.12.33 AM

However, even while using images, incremental updates can be slow due to the large size of images. Also, the expense of pushing a large image around for small incremental updates doesn’t seem appropriate.

Screen Shot 2014-10-22 at 10.12.42 AM

Even more importantly, image based deployments don’t solve the fundamental problem of complexity that understanding the relationships between OpenStack services presents. This problem is only moved earlier in the process and must be solved when building the images themselves instead of at run-time.

There is one other consideration that should be taken when looking at building a lifecycle management solution for OpenStack and that is that OpenStack doesn’t live alone. The last thing most operators want is yet another way to manage the lifecycle of a new platform. They’d like something that they can use across platforms from bare metal, to IaaS, and possibly even in a PaaS.


What Atomic, Docker, and Kubernetes Bring to the Party

Wouldn’t it be great if there was a solution for managing the lifecycle of Openstack services that was:

  • Isolated, lightweight, Portable, and Separated
  • Easily Described run-time relationships
  • Could run on something thin and easy to update
  • Worked to manage the lifecycle of services beyond OpenStack too

That’s exactly what the combination of Docker, Kubernetes, and Atomic can provide to the existing lifecycle management solutions.

Screen Shot 2014-10-22 at 10.32.57 AM

Docker provides a level of abstraction for Linux Containers through APIs and an “Engine”. It also provides an image format for sharing that supports a base and child image relationship allowing for layering. Finally, Docker provides a registry for sharing docker images. This is important because it allows developers to ship a portable image that operators can deploy on a different platform.

Screen Shot 2014-10-22 at 10.34.25 AM

Kubernetes is an open source container cluster manager. It provides scheduling of Linux Containers using a master/minion construct. It uses a declarative syntax to express desired state. This is important because it allows developers to provide a description of the relationships between different Linux Containers and let’s the cluster manager do the scheduling.

Screen Shot 2014-10-22 at 10.35.39 AM

Atomic provides just enough of an operating system to run containers in a secure, stable, and high performance manner. It includes Kubernetes and Docker and allows for users to update using newly developed update mechanisms such as OSTree. Here is a quick video that shows how easy it is to deploy atomic (in this case on OpenStack) and also how easy it is to upgrade Atomic. Watch OGG


Screen Shot 2014-10-23 at 2.20.49 PM

So when you put these pieces together what you end up with is something that looks (at a high level) like the diagram above. OpenStack developers are free to develop on a broad choice of platforms (Linux/Vagrant/Libvirt pictured) and can publish completed images to a registry. Operators on the other side would pull the kubernetes configurations into their lifecycle management tools and the tools would launch the pods and services. This would trigger Docker running on Atomic to pull the images locally and deploy containers with the OpenStack services. Services are isolated and (we are fairly certain given our experience with our OpenShift PaaS) lots and lots of containers could be run on a single operating system to maximize density of Openstack services. There are LOTS of other benefits including ease of rollback, deployment and update speed, etc, but this alone should be enough for anyone looking at running an OpenStack cloud at scale to be interested.

 Show me the Demo!

Screen Shot 2014-10-23 at 2.21.29 PM

Here are several demonstrations that illustrate the scenario above. These are a demonstration of the OpenStack Kolla project and were produced in 2 weeks time by a group of amazing developers who saw the potential these technologies had.

First there is building the images and pushing them to a registry.  Watch OGG

Second there is deploying a few pods and services manually to see how they connect and what Kubernetes and Docker are actually doing. Watch OGG

Finally, there is an example of deploying all the OpenStack services that were completed in milestone-1 all with a single command.  Watch OGG

After deploying OpenStack countless times I can say that when you see each schema automatically created in MariaDB and endpoints, services, etc automatically created all in under a minute it is an amazing feeling!

“I’m Sold, What’s Next?”

In the end, the combination of Docker, Atomic, and Kubernetes show the promise of alleviating some of the pain OpenStack developers and operators have experienced. There are still a lot of unanswered questions, but we feel that this combination of technologies shows promise and are excited that they have found a home in the TripleO project through Kolla.

If you are interested in learning more or participating please:

If you want to learn more about some of the other projects related to this post please check out the following:


Get every new post delivered to your Inbox.

Join 1,073 other followers