
Friday, September 10, 2004


I have a very large picture archive that needs backup "off-site". Week to week this archive sees a lot of action, whether it be picture addition or reorganization. The pictures are organized by directory, and directories are often moved or renamed. Nothing ever gets deleted. I wanted:

- daily backups so I could go back and see the archive on a particular day.
- hard-links to the previous backup so that only "new" files use disk space.
- graceful handling of moved, renamed, and duplicate files without additional storage or transfer bandwidth.
- backup format of a directory tree that can be directly copied when needed.
- remote backups using ssh.

I didn't find exactly this tool, so I created Link-Backup ("lb"). lb keeps track of file stats and content, and can find and hard-link to identical files regardless of filename or path. A side benefit is that it also finds and hard-links identical files within the archive itself. In addition, if a duplicate file has different stats (and so can't be hard-linked), lb makes a copy at the destination instead of transferring the file again. lb requires Python to be installed at both the source and the destination, but lb itself only needs to be installed where it is invoked, since it sends itself to the remote end.
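To make the idea concrete, here is a rough sketch of content-based hard-linking. It is not lb's actual code. The function names, the use of SHA-1 as the content fingerprint, and the particular set of stats compared (mode, owner, group, whole-second mtime) are all my assumptions for illustration. It also runs entirely on one machine, so "transferred" only stands in for the case where lb would have to send the file over ssh.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file's content, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def stat_key(path):
    """Stats that must match before two names may share one inode (assumed set)."""
    st = os.stat(path)
    return (st.st_mode, st.st_uid, st.st_gid, int(st.st_mtime))


def index_tree(root):
    """Map content digest -> list of (stat_key, path) for every file under root."""
    index = {}
    for dirpath, dirs, files in os.walk(root):
        dirs.sort()  # deterministic traversal order
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            index.setdefault(file_digest(path), []).append((stat_key(path), path))
    return index


def backup(src_root, dst_root, prev_root=None):
    """Build dst_root from src_root, reusing content already at the destination.

    Files are matched by content, not by name or path, so moved, renamed,
    and duplicate files are found in prev_root or in dst_root itself.
    """
    index = index_tree(prev_root) if prev_root else {}
    counts = {"linked": 0, "copied_locally": 0, "transferred": 0}
    for dirpath, dirs, files in os.walk(src_root):
        dirs.sort()
        dst_dir = os.path.normpath(
            os.path.join(dst_root, os.path.relpath(dirpath, src_root)))
        os.makedirs(dst_dir, exist_ok=True)
        for name in sorted(files):
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            digest, key = file_digest(src), stat_key(src)
            known = index.get(digest, [])
            same_stats = [path for k, path in known if k == key]
            if same_stats:
                # Same content and same stats: hard-link, no new storage.
                os.link(same_stats[0], dst)
                counts["linked"] += 1
                continue
            if known:
                # Same content, different stats: copy at the destination.
                shutil.copyfile(known[0][1], dst)
                shutil.copystat(src, dst)
                counts["copied_locally"] += 1
            else:
                # Content not seen before: this is the only real transfer.
                shutil.copy2(src, dst)
                counts["transferred"] += 1
            index.setdefault(digest, []).append((key, dst))
    return counts
```

Calling backup(src, day2, prev_root=day1) after renaming a directory in src should hard-link the renamed files to their day1 copies and report only genuinely new files as transferred.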

Sunday, September 05, 2004

Jail vs. UML

One can never be too careful (paranoid?) when exposing outward-facing services to the world. It's been shown time and again that there are many creative ways to breach a server, especially when the hacker can replicate the runtime environment of the target and has the time and incentive. A careful approach and, most importantly, ongoing maintenance are both key to trouble-free success. Even so, there is no perfect system, only countermeasures.

Earlier this summer I set out to evaluate a hosting OS for this server. There is nothing on this machine that makes it an interesting target, but even so I couldn't bring myself to expose a server without some heavy-duty, OS-provided protection mechanism. I ruled out chroot mainly because there are better protection mechanisms available that don't have some of its documented problems, such as the known means of escaping the chroot environment, and the fact that OS services with no bearing on the chrooted environment are not partitioned off from it.

I narrowed the evaluation down to two protection mechanisms: FreeBSD Jails vs. User Mode Linux (UML). As the name implies, the jail feature in FreeBSD allows the creation of a partitioned execution environment in which system services and state outside the jail are not visible inside the jail. UML allows an instance of Linux to run as a user mode process on Linux - a virtualized Linux instance. Both of these make it possible to host an http server in a throw-away runtime environment without exposing the native OS.

I first evaluated UML and took note of these issues:

- need to select / build the root image to be hosted
- need to rebuild hosting kernel with skas patch
- need to rebuild hosted kernel to disallow module loading
- need to configure tun/tap device for tunneling the UML's IP traffic
- need a way to start / stop the UML gracefully on boot / shutdown
- lcalls not virtualized inside UML (show stopper?)

These are addressable; however, my main observations are that UML is not optimized for service hosting and that it has significant administrative complexity. On the other hand, it looks very useful for OS development and testing purposes. I expect that as UML matures and tools flourish these issues will get addressed. A Google search shows "virtualized server" hosting companies offering UML hosts even though the lcall issue exists. Perhaps it is not a big deal - these companies could have either directly resolved the lcall issue in the hosting kernel, or maybe they just don't expect their customers to be bad guys.

FreeBSD Jails, on the other hand, offer benefits that UML does not. There is technically no OS virtualization; instead, the OS recognizes jailed processes and only offers them the services available in that jail. A jailed process cannot "see" outside of the jail (unless there is a bug in the jail implementation - a major caveat). A few immediate benefits of this technique:

- There is no separate kernel image to build and maintain.
- The IP network infrastructure is already in place in the native kernel. The jail only needs to be assigned an IP address.
- There is an easy-to-use existing mechanism in FreeBSD 5.2 to start and stop jails gracefully.

When it comes to service hosting, my observations are that the Jail feature is designed for this use whereas UML isn't specifically, that it is more mature, having been authored in 1999, and that it has lower administrative complexity than UML.