For nearly a year I have been trying to come up with a sane storage solution that is quiet, reliable, expandable, reasonably high-performing, and inexpensive. My photo library is not terribly large (~130GB), but since I acquired an HD video camera, my storage needs have been rapidly getting out of control. One hour of video consumes about 80GB of working space, and about 9GB of space to archive.
After far too many hours surfing the Internet and trying various solutions, I settled on building an OpenSolaris computer, accessed from my Mac via iSCSI. The reason for OpenSolaris vs. Linux, Windows, etc., is the filesystem, the Zettabyte File System (ZFS). Simply put, it is the best file system I have ever used. More on that later.
To be clear, building your own storage box is not like buying a Drobo or similar device. There is nothing plug-and-play about this. But it is fun, and you can get the system exactly the way you want it.
The box I built has two mirrored boot drives, and four data drives in a RAIDz, the ZFS equivalent of RAID-5 (but better). This gives me a little over 2TB of space, plus about 700GB from the boot mirror. The whole thing was a bit over $1000.
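Creating that four-disk RAIDz is genuinely a one-liner. A sketch, with hypothetical device names (yours will differ; the Solaris `format` command lists them):

```shell
# Create a 4-disk RAIDz pool named "tank"
# (c1t1d0..c1t4d0 are placeholder device names):
zpool create tank raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0

# Verify the pool layout and health:
zpool status tank
```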
My iMac connects to the box over a single gigabit Ethernet connection, through an inexpensive Linksys switch. I think I paid $40 for an 8-port version.
The data on the server is exposed via iSCSI to my Mac, so that the storage appears like any other disk. This allows me to back up the iSCSI data via Time Machine. It is just as easy to export the data via Windows file sharing or NFS.
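The iSCSI export itself is just a ZFS volume (a "zvol") with sharing turned on. A sketch using the legacy `shareiscsi` property from OpenSolaris builds of this era (newer builds moved to the COMSTAR framework instead; the pool name and size are placeholders):

```shell
# Create a 500GB ZFS volume to serve as the iSCSI LUN:
zfs create -V 500g tank/maciscsi

# Export it as an iSCSI target (legacy shareiscsi property):
zfs set shareiscsi=on tank/maciscsi

# List the target the system created:
iscsitadm list target
```

On the Mac side, a third-party iSCSI initiator (e.g. globalSAN) connects to the target, and the volume shows up like a local disk ready to format.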
Informal speed tests showed I was getting about 220MB/s on the RAIDz. My technique was simple and error-prone - I copied a 3GB movie from one file system to another, all within the same 4-disk RAIDz set. Considering the file was being read and then written with the same disks, that is pretty good performance.
Testing between the iMac and the storage box was a little more disappointing. I used iozone, an excellent I/O benchmarking tool, and got between 40-50MB/s. Considering the speed of the network and my equipment, I expected around 80-90MB/s (125MB/s is the theoretical max for a single gigabit connection). The same test to another iMac showed speeds around 60MB/s (the limit of the second iMac's hard drive), so something is not configured correctly on the storage server. The switch may also play a part. I have read of people with hardware similar to mine getting into the 100MB/s region with a single network card, and 190MB/s with two cards, so it is certainly possible to go faster than what I am getting. My guess is that somewhere my configuration is not correct.
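For reference, the kind of iozone run I used looks roughly like this (the mount path is a placeholder for wherever the iSCSI volume is mounted):

```shell
# Sequential write (-i 0) and read (-i 1) of a 1GB file
# in 128KB records, against the iSCSI-backed volume:
iozone -i 0 -i 1 -s 1g -r 128k -f /Volumes/tank/iozone.tmp
```

Using a test file larger than RAM matters here; otherwise you end up benchmarking the cache rather than the network and disks.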
But the real glory of the system is ZFS. Wikipedia does a much better job explaining ZFS than I could, but here are the bullets.
* Lightweight file systems
Think of a ZFS file system as a glorified directory. It takes a single command to create one, they can be nested, and each one can have individual policies such as compression, sharing, snapshots, etc. Nested file systems share the same storage pool (thin provisioning), and can have quotas, reservations, etc.
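To give a feel for how lightweight this is, here is a sketch (pool and file system names are placeholders):

```shell
# Create a nested file system and give it its own policies:
zfs create tank/photos
zfs set compression=on tank/photos
zfs set quota=200g tank/photos

# List all file systems and their space usage:
zfs list
```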
Example - before doing a major overhaul of your photo library, take a snapshot. If things just don't work out, revert to the snapshot, and you are back where you started. Heck, take 2 or 10 or more during the process to protect yourself at various stages. Writable snapshots (clones) can be created as well.
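That workflow is a couple of commands (file system names are placeholders):

```shell
# Snapshot before the overhaul:
zfs snapshot tank/photos@before-overhaul

# ...make changes... then, if it all went wrong:
zfs rollback tank/photos@before-overhaul

# Or branch off a writable clone to experiment in,
# without touching the original:
zfs clone tank/photos@before-overhaul tank/photos-experiment
```

Snapshots are nearly instant and initially consume no extra space; they only grow as the live file system diverges from them.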
* Data Integrity
Assuming you are using some sort of redundant RAID (RAIDz, mirror), every read/write is validated via checksums to ensure the data has not silently corrupted. Traditional RAID-5 and mirrors do not do this. Also, consumer disks have an unrecoverable read error rate of about 1 in 10^14 bits, and a TB is roughly 10^13 bits. It is statistically probable that you will have data corruption with a largish traditional RAID-5 system during the rebuild of a failed drive. ZFS is able to silently recover from such errors.
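You can also proactively verify every block in the pool with a scrub, which is worth scheduling periodically ("tank" is a placeholder pool name):

```shell
# Walk every block in the pool and verify its checksum,
# repairing from redundancy where possible:
zpool scrub tank

# Check scrub progress and any errors found:
zpool status -v tank
```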
* Data portability
A ZFS file system can be securely replicated to another system. Read-only ZFS support is also included in Mac OS X, so if your OpenSolaris box blows up, you can mount the disks on your Mac, in USB cases if necessary, and read the data!
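Replication is built on snapshots and the `zfs send`/`zfs receive` pair. A sketch, assuming a hypothetical remote host and pool names:

```shell
# Replicate a snapshot to another machine over ssh:
zfs snapshot tank/photos@backup1
zfs send tank/photos@backup1 | ssh backuphost zfs receive backuppool/photos

# Later, send only the changes since the previous snapshot:
zfs snapshot tank/photos@backup2
zfs send -i tank/photos@backup1 tank/photos@backup2 | \
    ssh backuphost zfs receive backuppool/photos
```

The incremental (`-i`) form only transfers blocks that changed between the two snapshots, so routine backups stay fast.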
* Easy to use
Now, this is relative. Compared to Linux or Windows, it really is easy to create and manipulate ZFS file systems. Compared to plug-and-play Drobo-type devices, it is an incredible pain in the neck. But where is the fun in a Drobo?
Oh boy, where to start. The combination of file systems as easy as directories, really fast disk I/O, high data integrity, snapshots, multiple access methods, etc. etc. etc. makes for a really flexible system. Not to mention everything else you can do with OpenSolaris - virtualization, iTunes servers, desktop applications, etc.
I have been running this server for about eight months, and have had two failures. One was my fault, the other was a crash (OK, I power-cycled the system in a fit of impatience). The first time, I mucked up the OS boot mirror, which left my system unbootable. I booted from the install CD, re-mirrored my disks with one line at the command prompt, and 10 minutes later rebooted with a healthy system. In the other case, I could not get the system to boot, and had to reinstall the OS from scratch. The reinstall took about 20 minutes, and all of my data on the RAIDz was fine. In fact, of the hundreds of computer systems I have managed in the past, this was the first where I felt really confident that the data would be OK after an OS crash.
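The one-liner in question was along these lines (the device names are placeholders; `rpool` is the default OpenSolaris boot pool name):

```shell
# Re-attach the replacement slice to the boot mirror,
# which triggers an automatic resilver:
zpool attach rpool c0t0d0s0 c0t1d0s0

# Watch the resilver progress until the mirror is healthy:
zpool status rpool
```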
Sorry this is so long! I am happy to clarify anything or provide more detail if you wish. But honestly, Wikipedia does a great job explaining the benefits of ZFS.