DreamCompute Images: A behind-the-scenes look at our image building process

As a public-facing cloud, we have a duty to provide up-to-date operating systems to boot from. The DreamCompute operations team automates building, testing, and deploying OS images to keep them regularly updated.
At DreamHost, we’ve been spending recent cycles on building new and updated operating system images for use in our public OpenStack cluster. We always pursue processes that are automated and produce the same reliable results; this was no different. If you have installed an operating system before, you probably already have a basic understanding of what it means to “create an image”: setting disk partitioning, selecting a root password, choosing a set of packages to install, etc.
However, creating an image for the cloud is slightly different. Pieces need to be in place to ensure the disk partition is appropriately sized (automatically and dynamically), that the user is able to log in with a supplied ssh key assigned at creation time, and that any custom post-launch scripts are executed (to name just a few). The end result is a blank operating system that can be booted in DreamCompute and will be configured by cloud-init to the user’s specifications.
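To make those three pieces concrete, here is a minimal sketch of the kind of cloud-config user-data that cloud-init can consume on first boot. It is an illustration only and not our actual image configuration: `growpart`, `resize_rootfs`, `ssh_authorized_keys`, and `runcmd` are standard cloud-config keys, while the `render_user_data` helper and the sample values are hypothetical. In DreamCompute the ssh key is assigned at creation time, so listing it in user-data here is purely for demonstration.

```python
import json
from typing import List


def render_user_data(ssh_keys: List[str], commands: List[str]) -> str:
    """Render a cloud-config document as text (hypothetical helper)."""
    lines = [
        "#cloud-config",
        # Grow the root partition to fill the disk, then resize the filesystem.
        "growpart:",
        "  mode: auto",
        '  devices: ["/"]',
        "resize_rootfs: true",
    ]
    if ssh_keys:
        # Public keys to authorize for the default user.
        lines.append("ssh_authorized_keys:")
        # json.dumps yields a double-quoted string, which is also valid YAML.
        lines += [f"  - {json.dumps(key)}" for key in ssh_keys]
    if commands:
        # Post-launch commands, run once near the end of the first boot.
        lines.append("runcmd:")
        lines += [f"  - {json.dumps(cmd)}" for cmd in commands]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_data(
        ["ssh-ed25519 AAAA-placeholder-key user@example"],
        ["echo first-boot >> /var/log/first-boot.log"],
    ))
```

The rendered text would be passed as user-data when the instance is launched.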
After the image is created, it is thoroughly tested before it’s made available to DreamCompute customers. For this we use Jenkins to execute tests automatically. This provides us an automated CI gate so that only images that pass the tests are made publicly available.
In order for an image to make it through the promotion process, it must pass the following tests:
- A VM launched with the image makes it into “ACTIVE” status (as reported by nova)
- A floating IP can be associated with the VM
- An ssh connection can be made to the ‘dhc-user’ user
- User supplied user-data is accessible
- Nameserver in /etc/resolv.conf is properly set
- Root user does not have a password set (for security reasons)
- Root disk partition is appropriately sized
- Console log is not empty
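The checklist above can be sketched as a table of named checks evaluated against facts gathered from the test VM. This is a simplified illustration, not our Jenkins test code: the `VmReport` fields, the check names, and the thresholds (any nameserver counts as "properly set", and the root partition must cover at least 90% of the flavor's disk) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VmReport:
    """Facts collected from one test VM (all field names are hypothetical)."""
    nova_status: str            # status as reported by nova
    floating_ip_attached: bool
    ssh_as_dhc_user_ok: bool    # ssh login as 'dhc-user' succeeded
    user_data_readable: bool
    nameservers: List[str]      # entries parsed from /etc/resolv.conf
    root_password_set: bool
    root_partition_gb: float    # size seen inside the guest
    flavor_disk_gb: float       # disk size the flavor provides
    console_log: str


# One predicate per acceptance test, in the same order as the list above.
CHECKS: Dict[str, Callable[[VmReport], bool]] = {
    "vm_active": lambda r: r.nova_status == "ACTIVE",
    "floating_ip": lambda r: r.floating_ip_attached,
    "ssh_login": lambda r: r.ssh_as_dhc_user_ok,
    "user_data": lambda r: r.user_data_readable,
    # Assumption: "properly set" means at least one nameserver is present.
    "nameserver": lambda r: len(r.nameservers) > 0,
    "root_password_unset": lambda r: not r.root_password_set,
    # Assumption: the root partition should fill at least 90% of the disk.
    "root_partition_size": lambda r: r.root_partition_gb >= 0.9 * r.flavor_disk_gb,
    "console_log": lambda r: r.console_log.strip() != "",
}


def failed_checks(report: VmReport) -> List[str]:
    """Return the names of every check the VM failed (empty list = pass)."""
    return [name for name, check in CHECKS.items() if not check(report)]
```

An image would only be eligible for promotion when `failed_checks` returns an empty list; returning the names of the failures, rather than a single boolean, makes a red build easier to troubleshoot.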
After the image is created, it is uploaded to a private tenant in our staging environment (which we refer to internally as Norse). While still marked as private, the above tests are executed. If all the tests pass, the image is marked as “verified” and “is_public” (which is Glance talk for “everyone can see/use this image”). Then we move on to our production cluster (called Masseffect internally), following the same process (upload private, test/verify, make public). Metadata tags are included with the image that allow us to trace back which Jenkins jobs provided the acceptance test, so we can troubleshoot if something passes incorrectly.
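The per-cluster promotion step (upload private, test/verify, make public) can be sketched as follows. The image is modeled as a plain dictionary instead of a real Glance record, and the `promote` function and the `acceptance_job` tag name are hypothetical; only “verified” and “is_public” come from the process described above.

```python
from typing import Callable, Dict

Image = Dict[str, object]


def promote(image: Image,
            passes_tests: Callable[[Image], bool],
            jenkins_job: str) -> bool:
    """Promote one image within one cluster; return True if it went public."""
    # The image starts out private and unverified.
    image.update(is_public=False, verified=False)
    # Record which job ran the acceptance tests so a bad pass can be traced.
    image["acceptance_job"] = jenkins_job  # hypothetical tag name
    if not passes_tests(image):
        # A failing image stays private and unverified.
        return False
    image.update(verified=True, is_public=True)
    return True


if __name__ == "__main__":
    image = {"name": "Ubuntu 14.04"}
    promote(image, lambda img: True, "example-acceptance-job")
    print(image)
```

Writing the job tag before the tests run means that even a failed image carries a pointer back to the job that rejected it.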
There have been a couple of iterations to improve this system over the last six months, and we are pretty happy with our current state (though there is always room for improvement). One of the first attempts at this workflow did not have very good job control and would upload new images to all of our clusters (staging and production) simultaneously. The images were not shared publicly until the tests passed, but we ended up with a bunch of images uploaded to our production cluster that failed to even pass tests in our staging environment, which was less than ideal. Since Jenkins is driving this workflow for us, we decided to look at plugins to provide us with better job control, which is when we started using the Conditional BuildStep plugin. It also makes for a nice interface to follow along.
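The job control we were after amounts to a gate between clusters: production is only touched once staging has passed. Below is a rough sketch of that ordering in plain Python. It is not Jenkins or Conditional BuildStep configuration, and the `staged_rollout` function and its result strings are hypothetical.

```python
from typing import Callable, Dict, List


def staged_rollout(clusters: List[str],
                   promote_on: Callable[[str], bool]) -> Dict[str, str]:
    """Promote cluster by cluster, stopping at the first failure.

    `promote_on(cluster)` uploads, tests, and publishes the image on one
    cluster and returns True on success.
    """
    results: Dict[str, str] = {}
    blocked = False
    for cluster in clusters:
        if blocked:
            # An earlier cluster failed, so nothing is uploaded here.
            results[cluster] = "skipped"
            continue
        ok = promote_on(cluster)
        results[cluster] = "public" if ok else "failed"
        blocked = not ok
    return results


if __name__ == "__main__":
    # Staging (Norse) fails, so production (Masseffect) is never touched.
    print(staged_rollout(["norse", "masseffect"], lambda c: c != "norse"))
```

Compared with uploading everywhere at once, this keeps images that fail in staging from ever landing in production.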

If you’ve made it to this point in the post, kudos and thank you! Image creation and promotion isn’t the most glamorous part of running a public cloud. However, having a system in place where a simple git commit kicks off a series of events that produces a well tested, public facing change does feel good. If you have any questions about this process or anything else DreamCompute or OpenStack related, feel free to reach out to me directly or come chat with us on #DreamCompute on irc.freenode.net.
7 Comments
Great job on the work with your public cloud system.
Hi David,
Excellent job on the public cloud. However, what is the time limit for keeping the images safe?
We are still trying to define our standards surrounding our images. Our goal is to keep our images up to date with what the distributions offer. One idea we’ve tossed around is doing monthly updates on our images. We do not delete the images at this point, so they are retained indefinitely, but when an updated version of an image is available, the previous version is marked as ‘non-public’.
Thanks David,
Now I know all the work that let me spin up an Ubuntu 14 instance so quickly and easily. I was able to push that to Ubuntu Xenial Xerus so I could install and try out the latest version of icecast2, and it is working quite nicely so far. Thanks for all your work. Is there any way to download or share my image of it with others who would like to try it out?
Sharing an image is fairly easy in OpenStack Icehouse (which is what DHC is running). You can simply run `glance member-create `. Horizon, the web interface for OpenStack, doesn’t have a facility for doing this, so the command line is your best option.
As a side note: we’ve been working really hard on bringing up a new OpenStack cluster on a different version of OpenStack and using a different network architecture (as well as other modern technologies like SSDs instead of spinning disks). Once we get past this large chunk of work, I plan on spending some cycles getting our images into a good place again (ensuring updates are applied, including new OS releases/distros, etc.).
Hi David, great post.
I would like to know how you are updating the existing images. Are you updating them or just creating new ones? If you are deactivating old ones, do you keep the image references in your customers’ existing instances?
For example: I have an Ubuntu 14.04 public image and I would like to deploy an updated version of Ubuntu 14.04 with all the new packages available through apt-get upgrade.
How do you replace the old with the new without losing the image references displayed in the users’ existing instances?
I was trying Kilo, and it looks like putting the image in the new deactivated status works, but my users can still see the deactivated images in the image list inside their tenants, which makes a mess of the list.
Do you have a fancier way to deal with that?
Cheers
There are more details to this than are appropriate for a reply here, but I’ll try to provide a succinct response:
We essentially just hit the “build” button on our Jenkins job, which rebuilds the image. Part of the build process includes an “apt-get update/upgrade/dist-upgrade”, so we know the image is up to date. When we have the updated image, it rolls out as described in this post. One quirk (which is annoyingly noticeable in Horizon) is that any instances previously booted from the “Ubuntu 14.04” image will have a null “image” field in Horizon, as that field was pointing to a UUID, which is now hidden from the user. You still see “Ubuntu 14.04” in the image list, but it’s a different UUID (because it’s an updated image).
So essentially our answer is that we just don’t really deal with it. We accept this as a shortcoming of the setup (and want to try to patch/upstream a fix) and focus our efforts on more pressing issues for us (such as rolling out a new datacenter).