Hard drive failures suck and the IT world has created an incredible number of systems, protocols and products to minimize, mask, and recover from storage failures. Many of these solutions are EXPEN$IVE, complex and proprietary, but we at DreamHost are developing a much cheaper (as in FREE) alternative. The Ceph open-source file system will store and protect petabytes of data at a fraction of the cost of the behemoth systems from the likes of EMC, NetApp and HP, and as a bonus, you can tinker with Ceph now.
It’s easy to manage a few hard drives on your average desktop computer, but it’s much tougher to manage petabyte-scale deployments. So just how big is a petabyte? It’s one thousand terabytes, 1 followed by 15 zeros, or approximately the number of times people have wished death upon Justin Bieber. An average home computer might have one hard drive failure every few years and just a few dozen file requests at a time, but at the petabyte level, hard drives fail almost daily and there will be hundreds of thousands of file operations per second or more.
Traditional file systems rely on some sort of allocation table or central server that is responsible for recording which disk each piece of data is stored on, and for answering any requests to find that data. As a real-life analogy, imagine a library with a single librarian who is in charge of not only organizing the books, but also answering any questions about where the books are stored. This situation works wonderfully if there aren’t many customers and if the books don’t move around too much, but our hapless worker would be completely overloaded if thousands of customers showed up or if the bookshelves kept collapsing and books had to be shuffled around.
Ceph avoids this bottleneck by replacing the allocation table with the awesomely named CRUSH algorithm, which uniformly distributes data across your storage servers (we call them OSDs). CRUSH is a pseudo-random algorithm that calculates the location of data instead of storing it. Going back to our library example, this would resemble a simple set of rules that any customer could follow that would lead them to the right room, row, and bookshelf, provided they know the book’s title. No overworked librarian necessary. A shared CRUSH map describes only the rough layout of the library (which bookshelves/servers are where), allowing any computer to calculate where to find a given book/file.
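To make the "calculate instead of look up" idea concrete, here is a minimal Python sketch. It is NOT the real CRUSH algorithm: it uses rendezvous (highest-random-weight) hashing as a simple stand-in and ignores the room/row/shelf hierarchy that a real CRUSH map describes. The names `score`, `locate`, and `cluster_map`, the OSD labels, and the file name are all made up for illustration.

```python
import hashlib


def score(object_name: str, osd: str) -> int:
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{object_name}:{osd}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def locate(object_name: str, osds: list[str], replicas: int = 2) -> list[str]:
    """Pick the OSDs for an object by ranking every OSD by its score.

    Nothing is looked up in a table: any client that knows the list of
    OSDs (the shared map) computes the same answer on its own.
    """
    ranked = sorted(osds, key=lambda osd: score(object_name, osd), reverse=True)
    return ranked[:replicas]


# Hypothetical four-server cluster; in the library analogy, four bookshelves.
cluster_map = ["osd0", "osd1", "osd2", "osd3"]

# Two independent "customers" with the same map agree on the location,
# even if they happen to list the servers in a different order.
client_a = locate("books/moby-dick.txt", cluster_map)
client_b = locate("books/moby-dick.txt", list(reversed(cluster_map)))
print(client_a, client_b)

# If the first-choice server dies, only the map changes. The surviving
# copy stays where it was and one replacement server is chosen.
survivors = [osd for osd in cluster_map if osd != client_a[0]]
after_failure = locate("books/moby-dick.txt", survivors)
print(after_failure)
```

The point of the sketch is the shape of the solution rather than the hash itself: the only shared state is a small list of servers, so there is no librarian to overload, and a failed server only forces the data that lived on it to move.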
Ceph is still under heavy development and we have a small army of developers at work on the vital pieces, but you can help in two ways. First, you can test the file system, either by compiling the source code or by downloading the Debian/Ubuntu packages. With a few configuration tweaks and a few servers, you’ll be off and running in Ceph La La Land. For additional help or to report bugs, you can subscribe to the Ceph mailing list or visit the IRC channel.
If you REALLY want to get your hands dirty, we are hiring several developers to help with Linux Kernel hacking, C++ programming, and QA/Performance Testing. Please apply for those positions on the DreamHost jobs page.
Want to learn more about Ceph? Nothing could be more informative than hearing from the file system’s creator and DreamHost co-founder Sage Weil at the Southern California Linux Expo (SCALE) this weekend, February 25th to 27th, at the LAX Hilton. Sage will give his ‘Ceph: Petabyte Scale Storage for Large- and Small-scale Deployments’ talk at 4:30 PM on Sunday, February 27th, in the Century AB room, and he will also sit on the Open Source File Systems Panel at 1:30 PM that same Sunday in the La Jolla room.
But Ceph isn’t the only thing DreamHost will be talking about at SCALE… our Robert Rowley, Abuse-meister extraordinaire, will spill the beans about the most common attacks against our web customers. He’ll also tell you how to protect against those attacks, and if you’re good, there might be a top-secret DreamHost Perl script or two thrown in. Robert’s ‘Securing Web Applications for System Administrators’ talk will be on Saturday, February 26th, at 4:30 PM in the La Jolla room.
SCALE registration is $70, but you will receive a 50% discount if you use the ‘DREAM’ promotion code to sign up.