The GetDPI Photography Forum


Some drive structuring thoughts...

Jack

Sr. Administrator
Staff member
Upon reviewing my typical workflow, I realized that I could perhaps structure my drive arrays more efficiently, and here is what I came up with. I'm not settled that it is an ideal solution, but for right now it is working better than my previous setup.

My question: is striping additional drive arrays worth it for performance, and what is the best way to implement it?

Problem: I/O conflicts from reading and writing large files on the same drive array, as when converting raw files and saving them back to the shoot folder. Ideally, we should read raw files from one disk and write processed files to a separate disk, with the OS residing on its own disk; that way no single drive has to compete with itself for I/O time or contend with OS paging, though paging is probably a minor concern on fast systems with plenty of RAM. From prior discussions, I know several of you already do this, at least with single drives, and have reported notable time savings on large-volume batch processing. I was reading from and writing to a single 2-drive striped 'working image' array and getting results comparable to reading from one single drive and writing to a separate single drive, so I left it alone until now...
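
To make the read-here/write-there idea concrete, here is a rough Python sketch of the pattern; the volume names, the .dng glob, and the convert() stand-in are placeholders for illustration, not my actual converter or folder layout:

    import shutil
    from pathlib import Path

    # Hypothetical mount points: raws are read from the working-image array
    # and output is written to a separate array, so neither side has to
    # seek against the other.
    RAW_DIR = Path("/Volumes/WorkingImages/shoot_folder")
    OUT_DIR = Path("/Volumes/Desktop/converted")

    def convert(raw_path, out_path):
        # Stand-in for the real raw converter: just stream the bytes across,
        # which reproduces the read-from-one-array, write-to-the-other pattern.
        shutil.copyfile(raw_path, out_path)

    OUT_DIR.mkdir(parents=True, exist_ok=True)
    for raw in sorted(RAW_DIR.glob("*.dng")):
        convert(raw, OUT_DIR / raw.name)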

My current implementation uses a pair of 2-drive RAID 0 arrays. The first array holds all my working image files, including the raws from the shoot. The second contains my OS and a large amount of free space for the desktop. Reading the raws from the working-image array and processing to the desktop array, I find batch conversions significantly faster now that I process between these stripes; I see sustained throughput between the arrays of about 150 MB/s, double what I got when one or both sides were on a single drive, or when both were on the same 2-drive stripe. (PS: I do copy the folder of converted files back into the shoot folder with the raws for permanent storage, and I realize that adds to the total time, but I schedule it for when the system is otherwise idle. And of course everything is regularly backed up redundantly to the Drobo.)
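
If anyone wants to sanity-check their own inter-array numbers, a quick-and-dirty way is to time a copy of a big shoot folder from one array to the other; the paths below are placeholders, and a proper disk benchmark will give cleaner figures:

    import shutil, time
    from pathlib import Path

    # Placeholder paths: source on one array, destination on the other.
    SRC = Path("/Volumes/WorkingImages/shoot_folder")
    DST = Path("/Volumes/Desktop/copy_test")

    # Total bytes to be moved, so we can report an effective MB/s figure.
    total_bytes = sum(f.stat().st_size for f in SRC.rglob("*") if f.is_file())

    start = time.time()
    shutil.copytree(SRC, DST)          # reads array A while writing array B
    elapsed = time.time() - start

    print("copied %.1f GB at %.1f MB/s" % (
        total_bytes / 1024**3, total_bytes / 1024**2 / elapsed))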

What I'm not sure of is whether I would have been better off, worse off, or about the same if I had simply run all 4 drives as one still-faster stripe and lived with any I/O bottlenecks. I know I could set it up and try it, but it seems a bother, especially if I end up the same or slower ;)

Any thoughts?
 

cjlacz

Member
I don't think it's worth it. As for performance, it depends a lot on the hardware you are using. The hardware will reach its transfer limit at some point, and adding more drives won't make a difference; two drives is a pretty safe bet. Also, running two arrays from a single controller hits the same limits as one 4-drive array on that controller. The only real way to test this is to keep adding drives and benchmark the results. From benchmarks I've seen, 4 drives on a dedicated 4x PCI-E RAID card often max out the card's bandwidth. The theoretical limit is higher, but I never see it reached.
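
A bare-bones way to do that benchmarking, if you don't have a dedicated tool handy, is just to time a big sequential write and read on the volume under test. The path and size below are assumptions, and the read pass can be flattered by the OS cache, so treat the numbers as rough:

    import os, time

    # Point this at the array under test; path and size are assumptions.
    TEST_FILE = "/Volumes/ArrayUnderTest/throughput_test.bin"
    SIZE_MB = 2048
    CHUNK = b"\0" * (1024 * 1024)   # 1 MiB per write

    start = time.time()
    with open(TEST_FILE, "wb") as f:
        for _ in range(SIZE_MB):
            f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())        # make sure the data actually hit the disks
    print("write: %.0f MB/s" % (SIZE_MB / (time.time() - start)))

    start = time.time()
    with open(TEST_FILE, "rb") as f:
        while f.read(1024 * 1024):
            pass
    print("read:  %.0f MB/s" % (SIZE_MB / (time.time() - start)))

    os.remove(TEST_FILE)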

The big problem with a single drive is that the heads need to seek back and forth to read the file and then write it. If you moved to a 4-drive stripe (RAID 0) you'd get the same problem again, just spread across four drives instead of one. Performance might even be lower, but there's no way to tell for sure unless you benchmark it.

The biggest reason I wouldn't move to a four-disk stripe is that it increases your chance of data loss. With a two-disk stripe you've doubled your chances compared to a single drive (if either drive fails you lose everything). With a four-disk stripe you've roughly doubled it again: if any one of those four drives fails, you lose everything.
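
To put rough numbers on that (assuming, purely for illustration, a 3% annual failure rate per drive and independent failures), the chance of losing a RAID 0 array is 1 minus the chance that every drive survives:

    # Illustrative per-drive annual failure rate; not a real drive spec.
    p = 0.03
    for n in (1, 2, 4):
        loss = 1 - (1 - p) ** n   # any single failure kills a RAID 0 array
        print("%d-drive stripe: %.1f%% chance of total loss per year" % (n, loss * 100))
    # Prints roughly 3.0%, 5.9% and 11.5% -- each doubling of drives roughly
    # doubles the exposure.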

Back when I was in college and had an all-SCSI setup, I had a few disks striped, and that really bit me when a disk failed and I lost a lot of data. I don't know your backup strategy, but I'd caution you against storing client data on a stripe without a copy somewhere else, even for a short time. Consider RAID 0+1, or just use the stripe as a scratch disk or work drive.
 

Jack

Sr. Administrator
Staff member
Thanks for the input Charles, it supports my suspicions.

FWIW, the 4-drive vulnerability had not escaped me, and I am anal enough to worry about it even in its current 2-drive form -- so yes, ALL my critical data is (and would always be) backed up onsite, currently to a Drobo (an automatic RAID 5 device). A third copy is stored offsite on individual drives.

Even my striped OS volume is backed up weekly as a bootable clone (scheduled with Carbon Copy Cloner) to a 5th drive stuffed into the lower optical bay of my Mac Pro, using one of the additional SATA ports on the motherboard -- and yes, I've needed that bootable backup more than once. That drive also has a Time Machine partition on it in case things go really bad :)
 

cjlacz

Member
Haha, wow. Well, I'm glad to hear you have the backups covered!

I've been learning a lot about ZFS while I search for parts here. (Harder than I thought.) Hopefully I'll be able to write more soon, but it certainly seems like the best affordable option for a reliable data server. With 1TB drives at $70, it's pretty affordable to build a solution that will last a while. I like the Drobo's flexibility, but I'm also learning why it may be so slow.
 

Jack

Sr. Administrator
Staff member
Re Drobo speed...

I am fairly certain it is, first of all, firmware-limited to the lowest-common-denominator drive it can take, a 5400 rpm SATA I drive, and it may actually throttle that down even further for reliability.

FWIW, I currently see sustained writes of just over 30 MB/s and sustained reads closer to 50 MB/s on large files, either of which is more than adequate for my needs as a low-hassle, onsite backup.
 