The GetDPI Photography Forum


Some Mac Pro performance tips...

Jack

Sr. Administrator
Staff member
A really long post offered only FWIW and with the disclaimer that YMMV so do plenty of homework before attempting these for yourself! In fact, because of the costs AND risks involved, I almost did not bother posting it, but in the end I figure it may be useful information for some and you are all smart enough to decide if these will be good tips for you to explore further. So without further babble,

“A few tips I’ve gleaned on tweaking my Mac Pro’s performance“

To set the stage, I have a newer machine that is pretty high end, the Mac Pro 8-core 3.2 machine. Next let me clarify I am not any kind of computer guru or technical expert – these are just some tips I came across that made a notable difference in the performance of an already fast machine. Finally, I want to extend a special thanks to two folks who answered many of my questions offline and kept me pointed in the right direction, Bob Fruend and Lloyd Chambers. (Lloyd has an excellent website with lots of reviews and tech tips at: http://www.diglloyd.com/ .)

The first tip is regarding ram. My machine came with 2x1G ram and I immediately added 4x2G sticks for a total of 10G ram. This left 2 ram slots open for another pair should I want to increase ram in the future. Then I found a ram test that made reference to having all the ram bays full improving performance overall, and I was curious how significant it really would be. Here is the link: http://www.barefeats.com/harper3.html. It appears that 1) you want all ram bays full, and 2) the top slot pairs should match the lower slot pairs. So it appears you get better ram throughput with all 8 bays full of 1G sticks than you will with 4 bays full of 2G sticks, though both are 8G ram total. It also appears that 8x1G (every slot full, 8G total) actually gives better throughput than my 10G configuration did. So I decided to order 2 more 1G sticks to fill out the slots, but when I got online I discovered the 2G sticks were actually less expensive per gig than the 1G sticks, so I decided what the heck and ordered four more 2G sticks. (For those curious, Apple ram is expensive, so all my ram has come from OWC.) Okay, so now I have 16G of ram in my machine…

My first test was simply booting CS3. With the old ram configuration, CS3 booted on my machine in just over 4 seconds, which is admittedly pretty darn fast to begin with, so I did not expect any huge gain. But I was in for a surprise… With the new configuration, I was stunned when CS3 booted in under 2 seconds! Now I realize it could be the larger total of ram, or it could be all bays full, but it seems to me that 10G was already overkill for a program launch, so I am putting my money on the all-bays-full scenario actually making a difference. (Important note: CS3 boot times are greatly affected by individual preference settings and by file and plug-in load arrangements, so do not panic if you have longer boot times than I do with your CS3 configuration on a similar machine. I have optimized my CS3 to only load plug-ins and file formats I regularly use.) At any rate I give this a cautionary recommendation, as other configurations may not see any performance difference, but if you are looking at a ram upgrade, I think this is something to consider; it certainly worked for my setup.

~~~

The next tip is more involved: drive striping. Striping is basically a RAID 0 across two or more drives. It increases performance by dividing the reading and writing across the drives for increased data throughput, yet maintains the total storage capacity of all the drives combined. While striping can increase performance without loss of total capacity, it needs to be pointed out that it roughly cuts reliability in half for a 2-drive stripe, and more for higher multi-drive stripes, because losing any one drive loses the whole array. So ANY data stored on a striped array really needs to be redundantly backed up, for example via a mirror (RAID 1), to properly insure against a total data loss. (And let me point out there are MANY different RAID strategies, so it pays to educate yourself on RAID configurations before choosing my relatively simple option below, since your needs may be better served by alternate strategies.)
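
To put a rough number on the reliability point, here is a minimal Python sketch. It assumes drives fail independently and that every drive has the same chance of failing in a year; the 3% figure is purely an illustrative assumption, not a measured rate for any drive mentioned in this thread.

```python
def stripe_failure_probability(per_drive_failure: float, drives: int) -> float:
    """Chance a RAID 0 stripe loses its data: one failed drive is enough."""
    survive_all = (1.0 - per_drive_failure) ** drives
    return 1.0 - survive_all


# Hypothetical 3% annual failure chance per drive (assumption for illustration only).
P_DRIVE = 0.03

for n in (1, 2, 4):
    risk = stripe_failure_probability(P_DRIVE, n)
    print(f"{n}-drive stripe: {risk:.1%} chance of data loss per year")
```

Under those assumptions the risk for a 2-drive stripe comes out at just under double the single-drive figure, which is where the "half the reliability" rule of thumb comes from.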

First note that for best performance your OS swap file, CS scratch partition and image read/save drive should all be on separate physical drives. This is so those operations don’t end up competing for I/O time on the same disk when called up. In other words, it is not a good thing to work on large images stored on, say, your desktop if your OS swap file is located on that same physical drive, even if it is in another partition; you will get better performance if your data drive is separate. Similarly, you probably won’t want your scratch drive as a partition on your image drive or OS drive. Next it should be pointed out that the front portion of a drive generally offers faster performance than the middle and end of a drive, so if possible, you want your regularly accessed files stored on the earlier portion of your drives. With these in mind, I have always stored my images on dedicated image drives, usually with a smaller “fast” partition at the beginning for my current files, and a second “large” partition for older files. It is easy to move files from the fast partition to the large partition as the fast side fills up, and in this way I keep the fastest part of that drive available for current working files.

But I had never bothered with striping those drives, primarily because of the cost involved in the extra drives and back-up drives needed due to the reduced reliability. However, with fast 1TB SATA2 drives now selling almost everywhere for under $200, I figured the cost per gig of storage was finally down to the point where I could justify trying it. So my current working image drive is a striped pair of 1TB SATA2 drives -- I chose Samsung Spinpoints, but know that Seagate 7200.11’s are also excellent -- with the 2TB total partitioned off to a 464G “fast” working partition and the approximate 1.4TB remaining as the “large” storage partition. Both of these partitions are then software mirrored to an external storage array for safety.

So how did this work out? Well suffice it to say it basically cut my image open and save times in half! Again, not that things were “slow” on my system before, but they were taking notably longer since I moved up from 60MB or so DSLR files to a medium format back that generates 240MB image files. So adding this new working array has notably improved the file open and save experience for me.

Again, offered FWIW only, and please keep in mind YMMV!

Cheers,
 

Dale Allyn

New member
Thanks, Jack. Great post. I had forgotten about performance boost gained by loading all RAM slots. I'll factor that into my Mac upgrade.
 

TRSmith

Subscriber Member
Thanks for the info Jack. I really like the RAM idea. I'll have to take a look at my configuration and maybe go ahead and fill all the remaining slots. I'm really happy with the Mac Pro (same as yours except video card I think). Might as well get the most out of it.
 

bradhusick

Active member
Another tip...
I have my scratch disk setup as a RAID 0 striped array on two WD Raptor 10,000 RPM drives in an external eSATA enclosure hooked up to a dedicated eSATA drive controller card. This is about the fastest scratch setup possible and CS3 loves it for larger files.
 

Guy Mancuso

Administrator, Instructor
Well, there is another way to go about gaining some speed, and Jack and I have been looking into this. I ordered two of these from Dell, although you can get them at Newegg and Fry's: Western Digital VelociRaptor WD3000GLFS 300GB 10000 RPM 16MB Cache SATA 3.0Gb/s Hard Drive - OEM

http://www.newegg.com/Product/Product.aspx?Item=N82E16822136260

What we are looking at here are 10K drives, but they will not fit in the sled of the Mac Pro because they're actually 2.5-inch drives in a 3.5-inch heat sink and the connectors will not line up. So you have to remove an optical bay, then install a product like this, which I ordered here: http://www.maxupgrades.com/istore/index.cfm?fuseaction=product.display&product_id=158

So what I will do is install my OS on one of these and use the other for scratch. Then everything else in my box will be storage. But what might give the best processing from C1 and other programs is taking two of my Seagate 500GB drives, the Seagate Barracuda 7200.11 ST3500320AS 500GB 7200 RPM 32MB Cache SATA 3.0Gb/s Hard Drive:
http://www.newegg.com/Product/Product.aspx?Item=N82E16822148288

Then run them in RAID 0, put my raw files there, and process from this striped raw drive.


Now look at the test results of these new 10k drives. http://www.barefeats.com/hard103.html

Now I have an older Mac Pro, actually one of the first ones when they came out, the dual dual-core 2.66 with 6GB of RAM, so it could use all the boost it can get. I am not buying the newest one even though there is a 20 percent gain in speed; I don't think that is enough to warrant it. So I will wait for the next generation, then remove all this stuff from my box and move it into the new one, except the RAM of course since that will change. I have already picked up a lot of speed just using the Seagates I have now, but adding the new 10K drives will probably be the fastest a 2.66 box will go. I will update when they come and are installed.
 

Jack

Sr. Administrator
Staff member
Hi Brad:

Guy and I are investigating the new WD 10K VelociRaptors right now for that purpose. They are fairly new 300G SATA2 versions of the older SATA1 raptor, but relatively expensive and mounted in such a way they are non-standard SATA configuration so won't fit in a typical slide-in drive box. So you need a mount with power and drive cabling. BUT the throughput is incredible and striped they will rock for sure.

HOWEVER! I have been told that with 16G of ram, Leopard will utilize excess as a ram disk and offload the CS scratch command to it first, so it's not clear how often I will actually tag my scratch drive now. Need to do some testing to see... Regardless, the new WD VR looks like a great candidate for a dedicated OS drive and perhaps just a single scratch. I want Guy to test them striped as a dedicated scratch just so we know how much that will help before he loads his OS on one too :thumbs:

Lastly, another strategy involves partitioning off a thin section on the fast part of 4 drives and then striping those. Normally a 4-drive stripe would be too low reliability to consider, but for scratch who cares! Theoretically, this should be even faster than the VR 2-drive stripe.
 

bradhusick

Active member
I generally avoid partitioning because it adds a little overhead to the disk management system. I run 4 drives in the bays and the two raptors in the external drive. At this point, we're probably talking about immeasurable differences. :)

By the way, I printed that M8 12 photo panorama of a village in France - it is 8 inches high and nearly 9 feet long! CS3 did it, no problem, on my Epson 4800.
 

Jack

Sr. Administrator
Staff member
Here's some follow-up on striped arrays for scratch. I am in the process of moving data off my main machine and onto a DROBO storage device, which freed up some bays in my Mac Pro, so I decided to add some drives primarily for performance.

I elected to not go with the WD300G 10K V-raptors as they have a non-conforming SATA2 connector location and need to be jimmied to fit in the Mac Pro -- not to mention they cost $300 each :eek:. Instead, I went with the WD 640 G 7200 drives, which at the time of this writing are rated very well for 7200 RPM drives and only cost $90 each -- IOW I bought 4 of them for just about what a single WD300 10K VR costs :cool:.

First off, a note on drives. I have now used four drives I am very pleased with, all are fast, quiet and run cool. They are the Seagate 7200.11 in 500GB or 1TB, the Samsung Spinpoint 1TB and the WD 640G Caviar Blue. ALL of these drives test well on the performance benchmark sites. I have also used WD "Green" energy efficient 1TB drives which are extremely quiet and run very cool, but do give up a little speed to the above choices. They are still SATA2 fast, but probably not the best choice for ultimate performance, though they are an excellent choice for data storage.

I run an action I wrote that forces CS3 to go to scratch regardless of how much RAM you have in your system, but contains the typical sort of operations we as photographers might do to an image in Photoshop. The action is attached below so you are free to download it and try it for yourself if interested. You can use any image between 1.5 MB and 50 MB in size (tiff preferred) since the action takes it up to over 1GB. Overall times will vary only a few percent depending on the image you select. Of course you want to use the same image for all of your tests to quantify the actual performance gains on your system.

I should mention my machine is pretty fast, a Mac Pro 8-core 3.2 GHz with 16G RAM.

Now for the test results:

With OS on one Seagate 7200.11 and NO dedicated scratch drive the action runs in 1:24

With OS on one Seagate 7200.11 and scratch dedicated to a second Seagate 7200.11, the time drops to 1:03 for a gain of about 25% -- expected and definitely shows it's worth having a dedicated scratch drive.

With OS on one WD 640 and scratch on a second WD 640, the action times drop a bit further to 0:59, or a gain of about 6%. Not too shabby for just a drive swap.

With OS on one WD 640 and scratch on a striped pair (RAID 0) of WD 640's, the action drops to a blazing 0:44, or a 26% performance gain! (I should point out I am using OSX's software RAID to stripe my drives, and no doubt a dedicated hardware RAID would increase performance further.)

Now to Brad's comment above. I've read online too where partitioning a stripe can slow things down because the OS has to poll the partition first. I've also read where the partition speeds things up because of the shortened stroke on the heads. Ostensibly BOTH things happen, but which is worse on the performance of the system? So I decided to test this myself to see which strategy is better. I partitioned a thin 128GB stripe off the WD 640 pair and reran the test. Bingo, partitioning the stripe helps as now the action ran in 0:42; only a 5% gain over not partitioned, but still a gain. The real fact IMO is this is now a net of a 29% overall boost compared to a one-drive scratch, and represents the order of performance increase one might expect when upgrading to a complete new $4000 computer system. Definitely worth considering!
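
For anyone who wants to redo the arithmetic on these timings (or plug in their own), here is a small Python sketch. The times are the ones quoted above; the labels are just shorthand. Since the times are whole seconds, a computed figure can land a point off the rounded percentages in the post.

```python
def to_seconds(stamp: str) -> int:
    """Convert a 'min:sec' time such as '1:24' into whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_gain(before: str, after: str) -> float:
    """Percentage of run time saved going from 'before' to 'after'."""
    b, a = to_seconds(before), to_seconds(after)
    return (b - a) / b * 100.0


# Action run times quoted in the test results above.
runs = [
    ("no dedicated scratch (Seagate OS drive)", "1:24"),
    ("dedicated Seagate scratch", "1:03"),
    ("dedicated WD 640 scratch", "0:59"),
    ("striped WD 640 pair as scratch", "0:44"),
    ("thin 128GB partition on the stripe", "0:42"),
]

# Compare each configuration with the one before it.
for (_, prev), (label, cur) in zip(runs, runs[1:]):
    print(f"{label}: {cur}, {percent_gain(prev, cur):.1f}% faster than the previous step")

# Net gain of the thin stripe over a single dedicated WD 640 scratch drive.
print(f"thin stripe vs one-drive scratch: {percent_gain('0:59', '0:42'):.1f}%")
```

To compare your own system, swap your timings into `runs`, keeping the same test image for every run.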

A FWIW PS: If you research CS performance increases on the web, you will no doubt come across references to Photoshop's VM buffering plug-ins. One forces it always off, the other always on. In standard trim, CS supposedly manages this automatically based on machine configuration. Some folks claim the "Always on" plug-in causes Photoshop to use any free RAM in your machine for scratch before tagging your scratch disk, however I have doubts as to how well it really works. On my machine, the plug-in did NOT add to performance, and actually slowed things down by a few seconds. So clearly it is not accessing unused RAM more efficiently on my machine, as I have several Gigs free at any time.

Cheers,
 

Jan Brittenson

Senior Subscriber Member
I see you can get the 80GB WD WhisperDrive SATA2 drives for $39. Any idea how these perform? I wonder if two of those might make for a reasonable scratch volume.
 

Jack

Sr. Administrator
Staff member
I understand they are pretty slow for a few reasons -- first is they only have 4 or 8 MB buffers (and buffer plays a significant role in heavy I/O operations like scratching), second is the disk density is not as high as larger drives. Supposedly, the reason the WD 640 is so fast is it uses 2 high-density 320 GB platters, instead of the typical 4 160 GB platters. The 80 G uses a single 80 GB platter, which is even less dense, and may even be a 2.5 inch drive platter in a 3.5 inch form factor case. The added platter density speeds up data transfer at the fixed 7200 RPM spindle speeds. The final issue is that drives are fastest at the beginning of the drive, slowing down as you approach the first quarter in, then tend to fall off dramatically after that -- and you'll hit that slower portion a lot sooner on a small-capacity drive.

However, all that said, striping those 80's is probably still going to be a lot faster than any single drive, like having a 15K RPM 160 GB drive. But then you do get an additive benefit on the drive buffer sizes when you stripe, so a pair of drives with 16 MB buffers gives you an effective 32 MB buffer, where a pair of 4 MB buffer drives only gets you 8 MB total buffer...
 

Jan Brittenson

Senior Subscriber Member
How did you set up the 128GB raid stripe? I played around with this a bit using two brand new drives, and the only way I could accomplish it was by first partitioning each drive into two partitions: 64GB and the rest. Then creating a raid set from the two shallow partitions. That worked fine, it mounted and I could use it. But when I tried to create a second set from the bulk of the drives Disk Util was the Fail, big time. It completely messed up the partition tables on both drives rendering the original shallow raid set broken and unusable. When I tried to delete and clean up DU crashed, then after restarting it would just wedge trying to access the drives. A reboot didn't help; I had to resort to pulling one of the drives to get it to stop trying to establish the raid sets on boot. Wiped the partition table of one drive, then the other, then after a third (or fourth, I lost count) power cycle deleted the now orphaned raid sets. My advice to anyone reading this - don't try this at home. (Note: I could use the two partitions as such though, it was just trying to stripe them that failed.)

So I simply striped up the two drives in a single raid set - that works just fine of course, and performance is still really good. I get 228MB/s sustained read OR write from two Seagate 7200.11 drives. To create a scratch volume with reduced seek latency I'm going to try to create a disk image file using Disk Util and mount it on boot. There's plenty of space for contiguous allocation. Not sure if it will show up as an alternative in PS/FCE though... If it works it'll have the secondary advantage of easily being extended with additional scratch volumes as needed. The overhead of a loop mount shouldn't be significant to any operation bottlenecking on disk I/O.
 

Jan Brittenson

Senior Subscriber Member
Hmm, on second thought I guess you ONLY used 64GB of the drives and ignored the rest completely, leaving it unpartitioned?
 

Jack

Sr. Administrator
Staff member
Hi Jan:

Here is what you have to do using OSX software to stripe with a partition. First you need to partition each drive individually using disk utility into two (or more) partitions, equal size partitions on each drive. So for drive 1 I made a 64G partition and labeled it 1FAST, then whatever was left over I labeled as partition 1BIG. On drive 2 I did similar, creating 2FAST with 2BIG beneath it. Then you open the RAID dialog, choose "0", and drag 1FAST and 2FAST into it, then I label that pair "Scratch". I then open a new RAID 0 dialog and drag 1BIG and 2BIG into that and labeled that pair "Working Images". Note that when you select the RAID 0 dialog, there is an "advanced" tab, and this allows you to set sector sizes. I left scratch at the default -- 32K I think, but set my image sector size to 256K since my image files are mostly all larger than that except for some small web jpegs.
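
To illustrate what that block size setting (what I called sector size above) changes, here is a simplified Python model of how a 2-drive RAID 0 lays data out in alternating blocks. This is only a sketch of the general striping idea, assuming a plain round-robin layout; it is not a description of how Apple's software RAID is actually implemented, and the function names are made up for the example.

```python
from typing import List, Tuple


def locate(offset: int, block_size: int, drives: int) -> Tuple[int, int]:
    """Map a byte offset on the stripe to (drive index, byte offset on that drive)."""
    block = offset // block_size   # which stripe block the byte falls in
    drive = block % drives         # blocks alternate across the drives
    row = block // drives          # how many full rows of blocks come before it
    return drive, row * block_size + offset % block_size


def blocks_per_drive(file_size: int, block_size: int, drives: int) -> List[int]:
    """Count how many blocks of a file (starting at offset 0) land on each drive."""
    total_blocks = -(-file_size // block_size)  # ceiling division
    counts = [0] * drives
    for block in range(total_blocks):
        counts[block % drives] += 1
    return counts


KB, MB = 1024, 1024 * 1024
image = 240 * MB  # roughly the size of one medium format file mentioned earlier

for size in (32 * KB, 256 * KB):
    print(f"{size // KB}K blocks: {blocks_per_drive(image, size, 2)} blocks per drive")
```

Either way each drive ends up holding half the file; the larger block size just means far fewer, larger pieces per drive, which is the reasoning behind using it for big image files.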
 

Jan Brittenson

Senior Subscriber Member
I can confirm that using a disk image for scratch makes a huge difference. (At least for my setup.)

Here's my setup:
Dual Quad 2.8, 12GB RAM
2x500GB Seagate (32MB buffer) striped as /Volumes/RacingStripe
320GB boot disk (stock)
Boot disk used for swap (system never swaps though)
10.5 with the latest patches
PS CS3 with the latest patches, set to use "all" the memory (~3GB) available

On the blank stripe, created a 16GB disk image using Disk Util
HFS+ unjournaled

With the regular stripe (2x500GB) as scratch, I got 2:09 (min:sec) and 2:08
With the disk image as scratch, 1:09 and 1:11

The tests were interleaved. Same file. Your action above. Stopped and restarted PS between (after changing scratch).

I can't find a way to check, but I *think* I may have used a large extent or allocation size when creating the scratch volume. Being unjournaled and reasonably small probably helps quite a bit here. It would be interesting to see if you can reproduce this...
 

Jack

Sr. Administrator
Staff member
Jan:

Are you sure the gain was due to using the disk image and not just because you had a smaller partition limiting seek times?

Currently on my machine which is a 3.2 with 16G ram, I get between 40 and 44 seconds on the runs with the thin stripe.

To get the large allocation blocks, you have to check the advanced box on the format, and I am pretty sure you need to do that for each format you perform as it defaults back to 32K sectors. You can go back and see what your sector size is using disk utility --- just click on the top level of the stripe-pair in question and open the advanced tab; the sector size is grayed out but readable.

~~~

Bob, it shouldn't matter too much what size file you start with as long as it's between say 1MB and 50MB --- I've run the action above with both and got virtually the same times. The action first forces the file to up over a gig then pushes it around a bit.
 

Guy Mancuso

Administrator, Instructor
My time when I did this with a slower box (2.6, 8GB of 667MHz RAM) was around 56 seconds with the new Western Digital VelociRaptor WD3000GLFS 300GB 10000 RPM 16MB Cache SATA 3.0Gb/s Hard Drive - OEM

I have one drive for the OS and one drive as scratch, partitioned to 100GB.

I got a pretty significant boost moving to these new drives. I put two of them in an optical bay.
 

Bob

Administrator
Staff member
Well...
I got 53 seconds in one run and 1:10 in a second using a drobo as scratch :eek:
I also noted that PS did not bother waiting for the IO to happen, since although for this benchmark it issued the writes to scratch, it didn't do any reads from scratch as far as I can tell.
Also, the image size operations were all CPU bound on my 4-core machine.
PS is issuing scratch writes in anticipation of a memory over-commit!! What a dweeb!
-bob
 