Save Money with Open Source Storage

An Internet.com Storage eBook. © 2009, Internet.com

This content was adapted from Internet.com’s Enterprise Storage Forum and Enterprise Networking Planet Web sites. Contributors: Drew Robb, Deann Corum, and Jennifer Schiff.

Contents

The State of Open Source Storage
Saving Big Money With Open Source Storage
Get Your Free Networked Storage
An Open Source Backup Option
Configure Bacula for Open Source Backups

The State of Open Source Storage
By Drew Robb

Open source storage has come a long way in the last few years. There are good open source offerings on the backup, mirroring, file system, NAS, and storage virtualization side. It is possible to cobble together an awful lot of disks and run them at high performance without the need for state-of-the-art hardware. Even companies known for proprietary offerings, like EMC, are on board.

“EMC most often encounters open source in the form of a Linux-based host connected to our storage products,” said Jay Krone, senior director of storage platforms at EMC. “Customers are purchasing Intel- or AMD-based servers and putting Linux on them to take best advantage of volume pricing on the hardware and minimal-to-no licensing costs on the software.”

Krone said customers tend to add open source applications, like the Apache Web server, or proprietary products like Oracle databases, to those Linux-based servers to address a wide spectrum of business problems. To meet this trend, most EMC storage hardware and software products have been adapted to run in a Linux environment. For example, EMC’s PowerPath family is available in Linux.

Despite the recognition by EMC and other data storage vendors, opinions differ on how far open source storage has come.

“I still wouldn’t say that there were a lot of open source storage apps,” said Jason Williams, CTO at Digitar of Boise, Idaho, a company that makes heavy use of and Sun open source software.

Williams said the leading open source storage offerings are Sun’s ZFS file system, Zmanda and Bacula for backup, and DRBD for network-based disk mirroring.

Greg Schulz, senior analyst and founder of StorageIO Group, is more upbeat about the state of open source storage offerings.

“There is a wide variety of open source storage solutions and applications from different sources, ranging from volume managers, iSCSI and NAS stacks, file systems, clustered file systems, object-based storage solutions, dedupe and compression, among others, not to mention all of the proprietary or commercial solutions that may leverage open source technology embedded into turnkey solutions and products,” said Schulz. “Of traditional server and storage vendors, Sun is probably the most notable and vocal around open source storage, along with many smaller startup vendors.”

Sun’s “Amber Road” project, now known as Unified Storage Systems (UFS) or the Sun Storage 7000 series, is built around preinstalled OpenSolaris and ZFS on x86 hardware.


These units support both file and block data protocols, thin provisioning, replication, mirroring, snapshots, antivirus, and analytics. An HPC version adds Linux to the mix too.

“Amber Road is essentially a NAS system that integrates inexpensive servers with open source software in an easy-to-use appliance,” said David Trachy, a principal engineer at Sun. “The whole point is to get around the premium you have to pay for proprietary disk systems.”

Sun’s Open Storage portfolio also includes its ZFS file system, storage servers, and its Storage J4000 family of JBOD systems. Trachy said Sun is seeing plenty of growth among these products.

ZFS, in particular, is garnering good reviews. Offered free with OpenSolaris, it provides a high level of data integrity, as well as mirroring between sites. According to Trachy, it can be used as the basis for huge data repositories. It is already being picked up by partners like greenBytes and Nexenta Systems to build storage systems.

“Startups are using ZFS and combining it with JBODs to create different products and appliances,” said Trachy. “Missing in Sun’s open source lineup is FC [Fibre Channel] block-level storage and pNFS, but these will be added over time.”

In addition, Trachy notes that ZFS integrates well with solid state drives (SSDs), which are beginning to gain traction in the storage world. Williams, for example, swapped SATA drives inside Sun X4500 servers for ZeusIOPS SSDs from STEC to function as a high capacity (up to 640 GB) memory cache. SATA remains his platform of choice for volume data storage.

Competition for ZFS comes from the likes of Red Hat’s Global File System (GFS), the Linux Logical Volume Manager (LVM) and file systems like ext4 and BTRFS. GFS was first developed at the University of Minnesota as a means of offering high performance and data sharing capabilities for the Linux platform, as well as storage virtualization. While GFS is controlled by Red Hat, LVM comes in a wide range of versions in the open source community.

Open Source Storage Projects
None of the big data storage vendors are as committed to open source as Sun, so it is no wonder the rest of the field is rather dispersed among a wide range of players. In the backup arena, you have outfits like Zmanda Inc. of Sunnyvale, Calif., and Bacula Systems SA of Switzerland.

Amanda, the basis for Zmanda’s backup offering, is billed as the most popular open source backup and recovery software in the world, with more than half a million servers and desktops running various versions of Linux, BSD, Mac OS-X, and Windows worldwide. Zmanda also has the Zmanda Recovery Manager (ZRM) for MySQL.

While Zmanda uses a business model similar to Red Hat, Bacula is the real deal in terms of frontier open source, run by a team of devotees such as Kern Sibbald, who are now starting to offer professional services to Bacula fans. Bacula manages backup and recovery to and from tape or disk. What is endearing about these guys is the smart marketing (a Dracula theme with a catch phrase that will appeal to backup veterans: “It comes by night and sucks the vital essence from your computers”) and blunt honesty. The news page features the startling admission, “We recently found and corrected a serious bug in Bacula...” Oh, for such openness whenever a big IT vendor makes a snafu.

Cleversafe is another storage vendor pursuing an open source-based business model.

FreeNas.org is a free distribution that supports CIFS, NFS, FTP, iSCSI, and provides RAID 0, 1 and 5. Another useful open source tool is DRBD by Linbit HA-Solutions GmbH of Austria. It is designed for mirroring of block-level data in high-availability clustering.

Open Source Storage Barriers
While the number of applications has certainly blossomed, widespread adoption of open source storage still faces many barriers, both real and imagined.

“Open source needs to be seen as more of a turnkey supported solution, even if that is what some vendors already provide, in order to overcome perceptions that open source is only for those looking to avoid costs, have the time and people to integrate, or is just one big computer science project,” said Schulz.

He also believes that the very essence of open source, being free, gets in the way of broad acceptance.

“People tend to think that free means less value than what you might pay for, or less value and stability than for software that you might otherwise buy,” he said. “Likewise, there can be a support concern or misperception that you might add a lot of cost and complexity by having to integrate the solution.”

Others, Schulz said, avoid it because they are in the midst of heavy head-count reductions and have the idea that additional staff will be required to support open source. But the biggest barrier may be more fundamental. Schulz believes a philosophical shift is required for open source storage to make it to the next level: it has to get past the simplistic “it’s open source” value proposition and get more involved with the bigger picture.

“What I want to know is, what is the business, economic, functionality, and support value proposition of open source compared to other solutions,” Schulz said. “What I really want to hear is what the vendor is doing to leverage open source as part of a total solution and its overall value proposition.”

Similarly, Chip Nickolett, owner of Comprehensive Consulting Solutions of Brookfield, Wisc., thinks we have yet to see the best of open source. To his mind, the big hurdle is convincing core storage professionals, who tend to regard storage as being so important from a performance, data integrity, backup, and disaster recovery perspective that they are willing to spend the money on a SAN or other pricey storage hardware. They just aren’t that interested in saving a few pennies on a potentially risky and, to them, unproven open source venture. Until that mindset shifts, he thinks open source will struggle around in the fringes of the storage universe.

“I really haven’t seen much traction on the open source storage side of things,” said Nickolett. “There are backup and disk management tools, and a few low-end NAS and SAN offerings, but nothing yet that has become ‘viral’ from a usage perspective.”

So far, Sun and Zmanda, for example, report strong inroads for open source-based products, with a cost-savings message that’s catching on in a tough spending environment. Even so, such offerings are still just a small part of the enterprise data storage market.

Saving Big Money With Open Source Storage
By Drew Robb

In these times of economic woe, many companies are looking for ways to trim their IT budgets while still getting the job done. That seems to be playing into the hands of a couple of storage trends: the choice of iSCSI rather than Fibre Channel (FC), and the growing interest in open source storage software.

One firm embracing both is Digitar Inc. of Boise, Idaho. In essence, Digitar is an e-mail processing company. Its systems take care of spam, virus, and other malware issues for customers, who are then sent cleansed e-mail. As such, its systems hold upwards of 50TB of data, housed primarily on hardware using free OpenSolaris software and Linux.

“Our entire operation is based on open source, and the financial perspective is the biggest reason,” said Jason Williams, COO and CTO of Digitar. “But there are performance benefits too.”

However, not everything in the operation is open source. In particular, the company’s own e-mail processing software is proprietary. As it encapsulates the company’s primary value, that is understandable.

“It is up to each area to decide if its software should be free or not,” said Williams. “But as we are gaining the value of open source in many ways, we have a responsibility to contribute to the open source community as a whole.”

Digitar has been using Novell’s SUSE Linux software on its HP servers since it opened its doors. However, the company tried the Linux storage subsystem a few years ago with unsatisfactory results.

“At that time, we found the Linux storage subsystem to lack reliability and the Linux Volume Manager (LVM) to be slow,” said Williams. “Back then, I wasn’t familiar with OpenSolaris and I must confess that I was anti-Solaris, as I had found it difficult to use while at college. I preferred the Solaris kernel but believed Linux to be more user-friendly.”

He had what he describes as a “flaky” array at that time, which was spitting SCSI I/O errors. As Linux ignored them, this led to lots of database corruption. The only reason Digitar even considered OpenSolaris as an option was because Sun offered it for free.

“If it hadn’t been free, we would never have looked at it, yet it made our I/O and corruption problems go away,” said Williams. “This gave us the time we needed to get a new LSI array in.”

Performance, Cost Benefits
Tests showed a performance drop of 40 percent when Digitar used LVM compared to a drop of 15 percent using OpenSolaris. On the cost side, Williams sees the irony in his unwillingness to pay around $1,000 per server for Solaris before it was open source. Yet the cost of a Linux server license from one of the Linux vendors would have been more than that, sometimes more than twice the price. With OpenSolaris being free, the savings were significant when multiplied by more than 100 servers, he added.

“An OS has largely become a commodity item,” said Williams. “I’d rather put those funds into software development or hardware.”

OpenSolaris is now used in about 60 percent of the organization’s servers, with Linux running on 40 percent. On the hardware side, Digitar is using Sun X4500 boxes for about half its storage, with Sun X4240 servers and Sun 7000 Unified Storage holding another 20 percent. The rest sits on traditional storage arrays. Over time, Williams said the company will phase in more X4240s.

Commodity Hardware, SATA Drives
When it comes to cost cutting, he’s also a fan of iSCSI over FC, and x86 gear running SATA drives compared to high-end disk arrays. To his mind, it is better to use SATA drives, as you get more IOPS per dollar.

He gave the example of 15K 146GB SAS drives costing $180 and 7.2K 250GB SATA disks from the same vendor costing $55, so you can buy 3.2 SATA disks for every SAS disk. Three of those SATA disks provide a combined IOPS of 240 compared to 175 for the single SAS disk. Thus, higher RPM doesn’t necessarily translate into higher performance per dollar.

Williams also questioned using proprietary hardware from the big storage vendors compared to buying x86 boxes from the likes of Sun. One proprietary array, for instance, was priced at $150,000 compared to $35,000 for the X4500. The latter offered more processing power and 24TB of disk storage, though only about half the memory, while the former didn’t cover any disk trays at all.

A Vote for Sun ZFS
He said OpenSolaris’ ZFS file system is the best and cheapest way to mirror data across disks, enabling good redundancy and reliability on JBODs.

“Since ZFS came out, it has saved our behind more than once,” said Williams. “The combo of OpenSolaris and ZFS is such that I would now be quite willing to pay for what it offers.”

ZFS provides copy-on-write so there is no need to buy additional snapshot functionality, block checksums to detect corruption, an integrated file system/volume manager, write bundling, and dynamic striping. Williams gave the example of a database corruption that had resulted in vendor finger pointing. The checksum capabilities highlighted the fact that the file system was clean. This, he said, was the only way to be sure that the actual error lay in the DB itself.

“Storage is the easiest way to get into using open source software,” said Williams. “OpenSolaris and iSCSI keeps things simple.”

Saving with Solid State
Solid state drives (SSDs) are not something you normally mention in the same breath as cost savings. They cost orders of magnitude more than SATA drives and are still in the very early adopter stage. Yet Williams includes them in his rundown of how to get far more from a storage environment.

Digitar uses ZeusIOPS SSDs by STEC inside three of its four Sun X4500 servers in a hybrid arrangement. Williams swaps out one SATA disk in each server for a STEC drive as a way of getting far more bang for his memory buck: the SSDs are used in place of more RAM, as opposed to being a storage space for data.

While the price may be high, he said he can have up to 640GB of SSD operating as a memory cache, compared to a maximum of 128GB of RAM. According to his numbers, the price works out to $50 per GB for RAM and 25 cents per GB for SSD.

“There is no other way on the market to get 10X performance for $2,000,” said Williams. “We continue to use cheap SATA disks in the remainder of the server for volume storage.”

While those servers use write-optimized SSD, Williams also used read-optimized flash in his Sun 7000 Series arrays for data analysis tasks.

“We are currently using SSD to accelerate disk, not to replace it,” said Williams. “As the price drops, though, I’m sure they will be used more frequently in place of high-end disks.”
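The IOPS-per-dollar comparison in the Commodity Hardware section can be checked with a few lines of arithmetic. The sketch below uses only the prices and IOPS figures Williams quoted; the per-disk SATA figure (80 IOPS) is an assumption derived by dividing his combined figure of 240 by three disks, and the function name is illustrative.

```python
# Figures quoted in the article. The per-disk SATA IOPS value is an
# assumption: the quoted combined 240 IOPS divided evenly across 3 disks.
SAS_PRICE, SAS_IOPS = 180.0, 175.0
SATA_PRICE, SATA_IOPS = 55.0, 240.0 / 3


def iops_per_dollar(iops: float, price: float) -> float:
    """Return the IOPS delivered per dollar spent on a single drive."""
    return iops / price


sas_value = iops_per_dollar(SAS_IOPS, SAS_PRICE)
sata_value = iops_per_dollar(SATA_IOPS, SATA_PRICE)

# How many SATA disks the price of one SAS disk buys.
sata_per_sas = SAS_PRICE / SATA_PRICE

print(f"SAS:  {sas_value:.2f} IOPS per dollar")
print(f"SATA: {sata_value:.2f} IOPS per dollar")
print(f"SATA disks per SAS disk price: {sata_per_sas:.2f}")
```

On these numbers the SATA disk delivers roughly one and a half times the IOPS per dollar of the SAS disk, which is the point Williams was making. The ratio of prices comes out slightly above the article's figure of 3.2 disks.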

Get Your Free Networked Storage
By Jennifer Schiff

Open source software is hardly a new concept, but it has only recently begun to make significant inroads into the world of enterprise data storage, where the big name proprietary vendors have (at least until now) had the advantage.

But as the open source community has grown and code has matured, with Linux taking root in more and more enterprises large and small, storage vendors, including big names like Sun Microsystems, have been developing open source networked storage solutions.

One network storage software vendor, Openfiler, never needed to be convinced of the benefits of offering enterprises an open source network storage operating system.

Openfiler saw open source (the Linux kernel) as a way for enterprises to inexpensively yet efficiently deploy and manage their storage networks years ago. And it developed an open source network storage operating system with a Web-based GUI that worked with any industry standard x86 or x86/64 server, which enterprises could download for free. Several years later, Openfiler boasts more than 1,000 customers and is busy developing new features to serve its growing customer base, and both enterprises and vendors have taken notice.

iSCSI, NAS, FC, RAID...
In addition to the Linux kernel, Openfiler uses open source technologies such as Samba CIFS fileserver and LVM2 block device virtualization to give small and large enterprises the ability to do file-based network attached storage (NAS) and block-based storage area networking (SAN) “in a single cohesive framework.”

For enterprises seeking a file-based storage networking solution, Openfiler provides CIFS and NFS support to ensure cross-platform capability. And for enterprises with virtualization environments such as Citrix XenServer and VMware, Openfiler provides both Fibre Channel and iSCSI (target and initiator) support. Openfiler also supports RAID.

While some storage administrators may be hesitant to try an open source network storage solution, even one with an unbeatable price tag, Rafiu Fakunle, the co-founder and project lead of Openfiler, said they shouldn’t be.

“Open source has been around for a while, and it has built up a level of credibility,” he said. Indeed, Openfiler software has been around for more than six years now and has amassed more than 1,000 customers. “So you don’t have to worry that if the software breaks, there’s no one to fix it, because you have a vendor backing you up.”

Fakunle said that with open source software, you don’t have to wait for the next release or upgrade to get a problem fixed or get the feature you really want. For example, if a user has downloaded Openfiler and wants a specific feature, he can submit a request, “and if it’s not too difficult, we’ll implement it within a few days, or sometimes within hours,” he said.

And because Openfiler is based on the Linux kernel, “it’s compatible with most operating systems out there in terms of hardware,” Fakunle added. “So even if a specific vendor doesn’t sell the driver for their hardware, we have folks in the open source community who can reverse engineer stuff to get it to be compatible with a specific piece of hardware.”

Indeed, open source has taken off to such an extent that even traditional, proprietary storage vendors are taking a serious look, and some are even developing their own open source solutions.

“If you look at some of the guys who are doing storage right now, even the proprietary guys, you will find that the vast majority of them are actually using Linux as the base for their storage offerings,” he said. “And the reason they’re doing that is because there is all this stuff in place, based on the Linux kernel, and these big name vendors (IBM, NEC, Oracle with its BTRFS file system, Sun Microsystems with OpenSolaris) have a vested interest in making sure that their products continue to work in those enterprises. So they’ve completely changed their philosophy.

“They’re, like, if you can’t beat them, then join them.”

The big difference between the big-name vendors’ network storage operating systems and Openfiler is the price. Would you rather spend $30,000 to get the functionality you need, or would you rather spend a tenth as much, or less, with full support? asked Fakunle.

Speaking of support, while it costs users nothing to download Openfiler software, support comes with a price tag, albeit a reasonable one. For small and mid-sized businesses, the per node support subscription fee is approximately $1,100 per year. For enterprise support subscriptions, which include high availability and block replication support, the annual fee is approximately $2,550 per node for a single-node configuration, or $5,950 for a two-node clustered configuration. For additional information on Openfiler support, visit openfiler.com/products/support-comparison.

Coming Soon: Open Source for the Cloud
Recently, in an effort to reach more customers, Openfiler has been working with LINBIT, another open source software vendor that does block-level replication, to develop a cloud computing solution.

According to Fakunle, the new solution will give users the ability “to not only do replication between two nodes within the enterprise, but actually store or export those blocks that are being replicated to a cloud computing environment, such as Amazon or any other cloud vendor who has a Linux offering.”

So if an enterprise’s local storage “happens to get kicked in the proverbial you know what, you’d be able to [quickly and easily] restore your data,” he said.

Openfiler is also “firming up” its iSCSI offering. “Right now, the iSCSI target software within Openfiler allows you to deploy your VMs and what have you and is compatible with earlier versions of Windows clustering on the client side,” he said. “The new release of Openfiler for iSCSI is going to support persistent reservations, which is part of the SCSI-3 specification. And that will allow you to run Windows 2008 clusters with Openfiler. That brings Openfiler up to date with the offerings from the big guys.”

An Open Source Backup Option
By Deann Corum

If you’re looking for a good open-source backup solution, this may be your lucky day. For tape backup and even disk-to-disk backup, Bacula is one of the more popular and well-maintained open source applications out there.

We’re going to provide an overview of how to get started with Bacula, look at different pieces that make up Bacula, and look at setup and configuration basics. This isn’t meant to be a comprehensive guide, but should give an overview for those who haven’t waded through the copious and excellent online documentation for Bacula.

Basic Bacula Components and Services
The three major services Bacula uses are the storage daemon (bacula-sd), the file daemon (bacula-fd), and the director itself (bacula-dir). The storage daemon facilitates the storage and recovery of data and attributes to physical media. The file daemon is the client piece, which is installed on the machine to be backed up, and the Bacula director is the manager and coordinator of all backup job activities. You’ll use the Bacula director (bacula-dir) to configure storage pools and jobs and to automate and schedule backups, and the Bacula console (bconsole) to interface with and control it all.

The repository for all this media, file, and job data is the Bacula catalog. Yes, it requires a database, and that means you will have to install and maintain a MySQL, PostgreSQL or SQLite database to use Bacula.

Preliminaries and SQL Installation Phase I
Depending on your operating system, some preliminary steps may be necessary. If your operating system doesn’t include mtx (most modern ones do) or you intend to use SQLite, the easiest thing to do is download the depkgs, then create a /bacula directory where you will untar the bacula and depkgs source code. Untar depkgs into that directory and run:

    make sqlite
    make mtx

Since MySQL seems to be the most commonly used database with Bacula, we’ll zero in on installation of that particular database for now.

With MySQL, be sure to get the mysql-devel and libz-devel packages to get the SQL header files Bacula requires, and the gzip compression libraries for mysqlclient. If you’re installing MySQL by .rpm, you’ll need the following:



    mysql-.rpm
    mysql-server-.rpm
    mysql-devel-.rpm

If installing from source, untar the files, then from the directory you placed the source code in, you will run:

    ./configure --enable-thread-safe-client --prefix=mysql-directory

where you replace mysql-directory with the directory where you want to install mysql. Normally for system-wide use this is /usr/local/mysql. Run make and make install and finally, run:

    ./scripts/mysql_install_db

This creates the SQL databases for controlling user access.

Bacula Installation
Next, download the bacula source (at least bacula itself and the docs), and untar the archives in the /bacula directory you created earlier. Change to that directory and run

    ./configure

using the options described in bacula’s documentation. Start with some basics. For instance, a pretty generic Red Hat installation might get you started:

    CFLAGS="-g -Wall" ./configure \
      --prefix=/usr \
      --sbindir=/usr/sbin \
      --sysconfdir=/etc/bacula \
      --with-scriptdir=/etc/bacula \
      --enable-smartalloc \
      --enable-bat \
      --with-qwt=$HOME/bacula/depkgs/qwt \
      --with-mysql=mysql-directory \
      --with-working-dir=/var/bacula \
      --with-pid-dir=/var/run \
      --enable-conio

Be sure you include the directory where you installed mysql earlier. If you need to later change any of these options, start over by using make distclean and re-running ./configure with your desired options. Once you’ve done that, run make and make install.

Customize the Bacula configuration files for your needs according to the Bacula manual’s instructions. The configuration of these files will largely depend on your backup media/tape devices, target file locations, and your desired backup schedule. Start with some defaults, then customize these after you get Bacula up and running. The file you will eventually spend the most time configuring is bacula-dir.conf, since this one defines and controls jobs, schedules and pools.

After Bacula is ready to run, a useful tip for testing your configuration files is to start the related service with the -t option:

    ./bacula-dir -t /etc/bacula/bacula-dir.conf

This will either terminate with no comment if the configuration file is OK, or will print an error message indicating where the error is in your configuration file.

An occasionally troublesome aspect of these configuration files is that the randomly generated passwords must agree between them. If they are changed and do not agree, the director service will not start. Another problematic configuration issue can be configuring tape devices with Bacula. Bacula lists support for most modern tape devices. I strongly encourage you to test your tape drive before using it with Bacula.

SQL Installation Phase II
Start MySQL and change to the bacula install directory. There, you will see scripts for creating and manipulating the Bacula database. The scripts you will need to run are: ./grant_mysql_privileges, ./create_mysql_database and ./make_mysql_tables. These scripts will grant privileges for the ‘bacula’ user in MySQL and create the bacula database and tables. You may also need to edit /etc/ld.so.conf and run /sbin/ldconfig if your mysql libraries are anywhere other than /usr/lib or /usr/local/lib.

Finally, Run Bacula
Test each configuration file and service by running it with the -t option as mentioned above. Once you’ve cleared all errors, you are ready to start Bacula with ./bacula start and interface with the program via bconsole.

In the next piece, we’ll deal with auto-starting the daemons, configuring an autochanger, configuring spooling (and why you should), defining what volumes, pools and labels are in Bacula, run-before and after job directives, and the point of the whole shebang, restoring files from Bacula.
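The requirement that passwords agree between the configuration files can be checked before starting any daemon. What follows is a minimal sketch, not part of Bacula: it assumes passwords appear as quoted `Password = "..."` lines, and the function names and the two configuration excerpts are hypothetical. A real check would compare each resource against its specific counterpart rather than simply looking for any shared value.

```python
import re

# Matches lines such as:   Password = "secret"
PASSWORD_RE = re.compile(
    r'^\s*Password\s*=\s*"([^"]*)"', re.MULTILINE | re.IGNORECASE
)


def passwords_in(config_text: str) -> set:
    """Collect every quoted Password value found in a configuration text."""
    return set(PASSWORD_RE.findall(config_text))


def shared_passwords(director_conf: str, daemon_conf: str) -> set:
    """Return passwords present in both texts; an empty set means a mismatch."""
    return passwords_in(director_conf) & passwords_in(daemon_conf)


# Hypothetical excerpts, for illustration only.
dir_conf = """
Storage {
  Name = Autochanger
  Password = "sd-secret"
}
"""
sd_conf_ok = """
Director {
  Name = backup-dir
  Password = "sd-secret"
}
"""
sd_conf_bad = sd_conf_ok.replace("sd-secret", "changed")

print(shared_passwords(dir_conf, sd_conf_ok))
print(shared_passwords(dir_conf, sd_conf_bad))
```

An empty result for a director/daemon pair is a hint that one of the passwords was edited without updating its counterpart.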

Configure Bacula for Open Source Backups
By Deann Corum

Since we covered Bacula installation and a very basic initial run in the previous piece, let’s look at some of the more specific configuration options and capabilities this time. We aren’t trying to provide a comprehensive collection of all Bacula capabilities, which are incredibly numerous. Instead, we want to give you an overview of how Bacula’s basic options and capabilities might be configured.

Device Resources
Device resources specify the details of a storage device that can be used by the Bacula storage daemon. Tape autochangers are among the most common storage devices likely to be used. In Bacula, storage devices are configured in the bacula-sd.conf file as well as in the bacula-dir.conf file. An autochanger configuration in bacula-sd.conf might look similar to the following:

    Autochanger {
      Name = Autochanger
      Device = Exabyte
      Changer Device = /dev/sg0
      Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
    }

    Device {
      Name = Exabyte
      Media Type = LTO
      Archive Device = /dev/nst0
      Autochanger = Yes
      Hardware End of File = no
      Spool Directory = /var/bacula/spool
      Random Access = no
      AutomaticMount = yes
      RemovableMedia = yes
      AlwaysOpen = yes
      Maximum Spool Size = 300G
    }

Device above refers to the archive device, which in this case is a tape drive. Autochanger = Yes in the Device section tells Bacula that this device belongs to an automatic tape changer, which requires the changer device defined in the Autochanger section that comes first. AutomaticMount = yes means that Bacula will automatically access the volume unless the operator explicitly unmounts it in the console. AlwaysOpen = yes tells Bacula to keep the device open unless specifically unmounted, so that the tape drive is always available to Bacula. When using tapes, this should always be set to ‘yes’ to avoid unnecessary operator intervention.

If you’re using a DVD device instead of tapes, the Archive Device above might be specified as /dev/hdc with Random Access set to ‘yes’. If you’re archiving to disk, you would put the full absolute path to the disk directory here instead. In that case, Bacula would write to that directory, and would use the Volume name configured in the catalog. Also, the RemovableMedia parameter would be set to ‘no’. The Media Type would be ‘File’, the Autochanger parameter would be set to ‘no’, and the separate autochanger device configuration would be unnecessary.
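The disk-based variant described above can be sketched as a Device resource in bacula-sd.conf. This is a minimal illustration, not a tested production setup: the resource name FileStorage and the directory /backup/bacula are assumptions chosen for the example, and LabelMedia = yes (which lets Bacula label new file volumes itself) is an extra convenience not discussed in the text.

```
Device {
  Name = FileStorage               # assumed name; must match the Director's Storage resource
  Media Type = File
  Archive Device = /backup/bacula  # assumed path; use the full absolute path to your disk directory
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
  LabelMedia = yes                 # optional: allow Bacula to label unlabeled file volumes
}
```

With a disk device like this, no Autochanger resource is needed, and the matching Storage resource in bacula-dir.conf would use Media Type = File and omit Autochanger = yes.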

In bacula-dir.conf, the storage must also be defined as a Storage resource, and its Device name must match what is configured in bacula-sd.conf above:

Storage {
  Name = Autochanger
  Address = stash.crystle.com
  SDPort = 9103
  Password = "xxxxxxverylongpasswordstringxxxxx"
  Device = Autochanger
  Autochanger = yes
  Media Type = LTO
}

Spooling

The parameters Spool Directory and Maximum Spool Size in bacula-sd.conf are worth noting. If you’re using tapes and running incremental backups, you’ll want to use a disk spool. The reason is that during an incremental backup, data arrives slowly from the file daemon (bacula-fd). If this data is written directly to tape, the tape will start and stop often, causing what’s called shoe-shining of the tape. By writing the data to disk first, then to tape, the tape can be kept in continual motion, reducing wear on the tape. Of course, the larger the spool device, the better. In the above configuration, the Spool Directory is set to /var/bacula/spool (which can be a link to another directory elsewhere on the system), and the Maximum Spool Size is set to 300G.

The reason for setting a Maximum Spool Size is to keep the spooled data from filling up the entire disk. Make the spool large, but cap it: without a limit, Bacula will happily spool data until the disk is full, and it becomes very unhappy when that happens in the middle of a backup.

It’s worth remembering that Bacula writes backup information to the catalog AFTER the data is written to tape. So, once the job is done spooling and the backup data is actually written to tape, Bacula will then write the information about that backup to the catalog. This is to prevent having backup information written to the catalog for backups that haven’t actually been written to backup media yet.


Volumes, Pools and Labels

How Bacula deals with Volumes, Pools and Labels can be confusing for first-time users, so here’s a synopsis:

In Bacula, a Volume is a single tape or file on which Bacula will write your backup data. A Pool is a collection of Volumes, configured such that Bacula does not have to limit a backup to the length of a single Volume (tape or disk). Instead of naming Volumes in Bacula, you define Pools, so that Bacula can simply add data to the next appendable Volume in the Pool. Adding Volumes to a Pool is done using the ‘label’ command from within the Bacula Console (bconsole). Bacula will not write to a Volume until it has been labeled from within Bacula. The Volume label, along with other data such as the first and last write times and the number of files and bytes written to the Volume, is stored in the Bacula catalog (database). The Bacula catalog also contains information about the Pool as well as the Volumes in a Pool. The Pool resource is defined in the Bacula Director configuration.

A simple example of a Pool setup in bacula-dir.conf is below:

Pool {
  Name = full
  Pool Type = Backup
  Recycle = yes                # automatically recycle Volumes
  AutoPrune = yes              # prune expired Volumes
  Volume Retention = 6 months
  Accept Any Volume = yes      # write on any Volume in the Pool
}

Most of the options above are self-explanatory. Much more information about how to configure Pools can be found in Bacula’s online documentation.

Jobs in Bacula consist of a FileSet, a Client, a Schedule, and a Pool. In other words, we have to tell Bacula WHAT to back up, WHERE to back it up FROM, WHEN to run the backup, and WHERE to back it up TO. Typically, one particular Schedule is set up as the default. Below are examples of each piece that constitutes a Job in Bacula, and finally how the Job resource ties them all together:

WHERE to back up TO:

Pool {
  Name = full
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 months
  Accept Any Volume = yes
}

WHEN to back up:

Schedule {
  Name = default
  Run = Level=Full Pool=full mon-sun at 23:00
}
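The Schedule above runs a Full backup every night. Since the spooling discussion assumes incremental backups, here is a hedged sketch of a Schedule that mixes levels, using one Run line per level. The name weekly-cycle and the choice of days are assumptions for illustration only; the Pool name full is reused from the example above.

```
Schedule {
  Name = weekly-cycle                                  # hypothetical name
  Run = Level=Full Pool=full sun at 23:00              # full backup once a week
  Run = Level=Incremental Pool=full mon-sat at 23:00   # incrementals on the other nights
}
```

A Job would pick this up with Schedule = weekly-cycle. Whether incrementals should share a Pool with full backups depends on your retention needs, so treat the single-Pool layout here as a simplification.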

WHERE to back up FROM:

Client {
  Name = stash
  Catalog = catalog
  Address = 10.0.0.63
  Password = xxxxxxverylongpasswordstringxxxxx
  AutoPrune = yes
  File Retention = 6 months
  Job Retention = 6 months
}

WHAT to back up:

# stash-home
FileSet {
  Name = stash-home
  Include {
    File = /sdd1/stash.home
    Options {
      signature=MD5
      aclsupport=yes
    }
  }
}

JOB definition (includes FileSet, Client, Schedule, Pool):

Job {
  Write Bootstrap = /var/bacula/bootstrap/stash-home.bsr
  FileSet = stash-home
  Spool Data = yes
  Spool Attributes = yes
  RunAfterJob = "su - backup -c \"/usr/local/rdb/rdb-driver rdb --trimonly stash.home\""
  Name = stash-home
  Client = stash
  Schedule = Default           # full backup every night at 11:00 p.m.
  Storage = Autochanger
  Messages = Standard
  Priority = 11
  Type = Backup
  Pool = full
}

Job RunBefore/RunAfter Directives

In the example above, you’ll notice a RunAfterJob directive. This particular one tells Bacula to run a script that removes some files on the server (in this case, the machine Bacula is backing the files up from) after those files are backed up to tape. Of course, Bacula has RunBeforeJob capabilities as well. Bacula can run almost any script or command you choose here. Most commonly these directives are used to dump databases to disk, mount and unmount filesystems or devices, or remove files prior to or after backup jobs.
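As a sketch of the database-dump use case mentioned above, the fragment below adds before and after commands to a Job. The script paths are hypothetical and stand in for whatever dump and cleanup commands you already use. Note the distinction between the variants: RunBeforeJob and RunAfterJob execute on the Director’s machine, while ClientRunBeforeJob and ClientRunAfterJob execute on the client being backed up, which is usually where a database dump needs to happen.

```
Job {
  Name = stash-home
  # ... remaining directives as in the Job definition above ...
  ClientRunBeforeJob = "/usr/local/bin/dump-db.sh"      # hypothetical script: dump the database to disk on the client
  ClientRunAfterJob = "/usr/local/bin/cleanup-dump.sh"  # hypothetical script: remove the dump once the backup is done
}
```

The before script should write its dump somewhere inside the FileSet’s File path so the dump is actually included in the backup.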

Restoring Data

Oh yeah: Finally, let’s get to the point of all this: RESTORING data from backup. This is the easiest part once you have Bacula running and doing regular backups for you. In bacula-dir.conf, you will have configured a default restore job. However, the parameters of any restore job can be changed from within bconsole after running the ‘restore’ command.
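For reference, a default restore job in bacula-dir.conf might look roughly like the sketch below. The job name RestoreFiles and the Where directory are assumptions for illustration; the Client, FileSet, Storage and Pool names are reused from the examples earlier in this piece. These values only act as defaults, since they can be overridden in bconsole when the restore is run.

```
Job {
  Name = RestoreFiles            # assumed name for the default restore job
  Type = Restore
  Client = stash
  FileSet = stash-home
  Storage = Autochanger
  Pool = full
  Messages = Standard
  Where = /tmp/bacula-restores   # assumed directory; restored files are placed under this prefix
}
```

Restoring under a separate Where directory rather than over the original location is a cautious default, because it avoids overwriting live files until you have checked what came back.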

After typing ‘restore’ in bconsole, you are presented with several menu-driven options to choose what Jobs (JobIds) you’d like to restore from, which Clients and Pools you’d like to restore from, and what files you’d like restored - right down to an individual file from a certain date. After you’ve made your selections, you will be presented with a list of media that will be required to restore your files, and your chosen restore job parameters. You’ll be allowed to modify those parameters, including where (which client/directory) you would like the file(s) to be restored, and whether to overwrite the files if they already exist in the restore location.

In the few years I’ve used Bacula, I’ve personally been impressed with what a robust backup system it is. I’ve known large-scale users as well as small-business and home users who swear by it. If you’re looking for a robust, well-maintained open-source backup solution, add Bacula to your list and give it a spin.
