Slow file handling

Here is a good article on the topic, with some useful cross-links at the bottom too, for anyone interested in SSD drives.

Thanks for the link, Amber
Very good article, especially the part about TRIM. I didn't have it, which is why my drive became inoperable when a few blocks had exhausted their write cycles.

The links at the bottom of the page are also worth reading. I can confirm that not all SSD controllers do a good job on wear leveling. The reason it failed on my drive, however, was also negligence on my part: I didn't have the latest firmware installed. According to the manufacturer, there was a mismatch with the SATA controller of my motherboard. Both need to integrate well for good wear leveling.

I appreciate your research and feedback, Amber
Stefan

Generally speaking, SSDs are not intended for use in general purpose computers.

The big benefits of SSDs are 1) ruggedness and 2) read speed. Ruggedness comes from not having any delicate mechanical parts. Read speed comes from not having any delicate mechanical parts. That's not a typo. It takes time for the mechanics to get the data you're after under the head so it can be read.

SSDs are best in WORM environments. WORM is Write Once, Read Many. Databases work that way; write a record once, then read it a zillion times. General-purpose computers have a more balanced mix of reads and writes.

Sometimes many writes happen without any reads. Programs save things to disk all the time, such as status updates, and Scrivener is one of them. If you move the mouse pointer into a text document and click just once, so that the insertion point is placed there, guess what: you just caused the project.scrivx file to be written to disk, because the TextSelection tag for that file changed and had to be saved.
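If you want to see this for yourself, here is a rough way to watch it happen: a little Python sketch that just polls the modification time of the project's .scrivx file while you click around. The path is only a placeholder for wherever your own project lives, and nothing here reflects how Scrivener does its saving internally.

import os
import time

# Placeholder path; point this at your own project's .scrivx file.
SCRIVX = r"C:\Users\you\Documents\MyNovel.scriv\MyNovel.scrivx"

last = os.path.getmtime(SCRIVX)
while True:                      # Ctrl+C to stop
    time.sleep(1)
    current = os.path.getmtime(SCRIVX)
    if current != last:
        print("scrivx file rewritten at", time.ctime(current))
        last = current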

With a rotating-media hard disk, a file write looks something like the following:
1. The file, let's say it's 1 block (512 bytes), is written.
2. The file's old block address is placed on the free list.
3. The file system saves the file's (new) location and timestamp in the directory.
4. The file's new location is marked in some used-block list.

So that's 4 writes, and the last 3 might be updated in place on an ordinary HD. An SSD can't update in place, so all of those will be writes to new (erased) locations, and the last 3 will also produce something, maybe a journal entry, to mark the old locations free. All this means that a simple file write results in a total of at least 7 writes, so the SSD's endurance takes a hit. Some of those might land on the same cell, but you can't count on that. Each cell, depending on the technology used, can take from 3K to 100K writes before it fails.
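To put some very rough numbers on that (only the 4-step / 7-write figures and the 3K-100K cycle ratings come from the paragraph above; the spare-cell pool size is purely an illustrative assumption):

logical_writes_per_save = 4       # file data + free list + directory + used-block list
physical_writes_per_save = 7      # on an SSD: no in-place updates, plus marking old blocks free

amplification = physical_writes_per_save / logical_writes_per_save
print(f"write amplification: {amplification:.2f}x")              # 1.75x

# Hypothetical pool of blocks that wear leveling can rotate through.
cell_pool = 1_000_000
for cycles in (3_000, 100_000):   # worst-case vs best-case cell endurance
    saves = cell_pool * cycles // physical_writes_per_save
    print(f"{cycles:>7} cycles per cell -> roughly {saves:,} small saves")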

Some SSDs, I don't know about all of them, come with extra capacity to make them last as long as an ordinary HD, which is maybe 10 years. I've seen 64GB SSDs that actually come with 80GB of storage. Even with that, excessive writes will kill the SSD. To extend its life, don't use it for the C: drive, or if you must, don't put the paging file on it and try to keep it no more than half full. More unused space means more cells for wear leveling to spread writes across, so each one wears out more slowly.
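A quick bit of arithmetic on that 64GB/80GB example (the figures are just the ones quoted above, used illustratively):

usable_gb = 64
physical_gb = 80

# Factory over-provisioning: flash the controller can use for wear leveling
# but that never shows up as drive capacity.
overprovision = (physical_gb - usable_gb) / usable_gb
print(f"factory over-provisioning: {overprovision:.0%}")        # 25%

# Keeping the drive no more than half full leaves even more spare area.
used_gb = usable_gb / 2
spare_gb = physical_gb - used_gb
print(f"spare area when half full: {spare_gb:.0f} GB "
      f"({spare_gb / physical_gb:.0%} of the flash)")           # 48 GB, 60%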

Al

So we have hit the … system architect bone in me. Here we go.

Boot OS and program files are fine for SSD. As you state, the WORM method applies to both. Writes only occur in WELL WRITTEN OS AND SOFTWARE install locations during patches and updates. So for Windows folks, c:\windows & c:\program files should be safe on an SSD. For OS X types that would be everything in / that isn't Users, var, and tmp.

But there is that caveat about being “well written”.

Here's the real thing that you have to consider when listening to the arguments about SSD longevity: system designers have been using SSDs for … well … decades. I have a 15-year-old SAN controller with SSD drives and 0% failure. Note I am not talking about the solid-state cache, but the actual system boot module. Consider the use of SSDs in the MacBook Air line. The first generation should be well past the breaking point, but we are not seeing mass failures. We are starting to see failures in the Linux and Windows netbook markets; not the massive failures that one would expect from excessive writes through "normal" usage, but failures only on "power user" systems.

The key seems to be a little trick being played with RAM. Provide enough RAM to reduce swapping. Create virtual disks in RAM for high-write segments (/tmp & c:\temp). Use the shutdown sequence to write non-persistent storage segments to SSD. My favorite is using a massive write cache and only purging the write buffer to disk after the cache journal and FS journal show no change for a significant time (or the buffer size is exceeded).

Anywho… don’t be afraid of SSD if the system passes design criteria (think commercial netbook). If you are home brewing a system with an SSD drive, spend a bit of time making sure you know how to optimize OS buffers for write caching in cooperation with FS journaling.

Good points. All of them. I will try Jaysen’s tips for system optimization. But I doubt that other users will give this as much thought as we do.

What can a Scrivener user gain from our postings?

Something to be aware of. Especially if you live near tramways and busy streets. Or when construction work is going on in the neighborhood.

I type a lot on trains and subways, and I feel uneasy when my disk is spinning during bumpy rides and accelerations, which is why I avoid saving and would rather leave the drive asleep. I stick to editing and proofreading, where an occasional save is good enough.

I suspected this was happening, but couldn’t quite pinpoint it. From what I read in other topics, the autosave behavior in Scrivener will not get changed, so I guess we’ll just have to live with it.

SSD disks are rugged, but as almansur correctly pointed out, there is no overwriting. Old data has to be garbage collected, and wear leveling saves new data into empty blocks. The chunk size is much larger than on conventional drives. How seriously this impacts the disk's endurance I do not know.

I think so, too. Nothing to worry about here. But if a friend asked me for advice, I would recommend he move his temp folder and data folder to another drive.

Most certainly. That was my point. Only those two directories (from a vanilla install) are known to be safe. There may be archive locations for drivers that would be safe, but those will be case by case.

Consider a slightly better option than a separate drive. Use a "persistent RAM disk" or "RAM write cache" for shadowing all other areas. The idea is that the RAM disk buffers the writes until a file handle is closed. Then at system shutdown, Windows forces a flush of the buffer to SSD. If you add an extra 2 GB of memory and dedicate it to the RAM disk, this will allow you to buffer 2 GB of write data while letting reads come straight from the SSD.
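To make the idea concrete, here is a toy Python sketch of that flow: writes are held in RAM and only pushed to the SSD when the file is "closed" or when the program shuts down. A real persistent RAM disk or write cache works at the block or filesystem level, so treat this purely as an illustration; all the helper names are made up.

import atexit

_buffer = {}  # path -> bytes still sitting in RAM, not yet on the SSD

def buffered_write(path, data):
    # Writes accumulate in RAM instead of hitting the SSD.
    _buffer[path] = _buffer.get(path, b"") + data

def flush(path):
    # The one place where data actually reaches the SSD.
    data = _buffer.pop(path, None)
    if data is not None:
        with open(path, "ab") as f:
            f.write(data)

def buffered_close(path):
    # Rule from the post: flush when the file handle is released.
    flush(path)

def flush_all():
    for path in list(_buffer):
        flush(path)

# Rule from the post: at shutdown, force a flush of everything to the SSD.
atexit.register(flush_all)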

Last tip: RAM RAM RAM RAM RAM RAM RAM RAM. Go for 12GB. Disable virtual memory. You don't need it. On our high-performance, IO-intensive, SSD-based servers I go for the following minimum spec:
• 4 Core i7 (mostly need IO not CPU)
• 2x256 GB SSD in mirror
• 12GB RAM: 2 GB write cache. NO virtual mem.

If we know we are going to need a lot of disk IO that is non-OS related (say a forum site that has a SQL backend for posting), we use external storage with a minimum of 64GB R/W cache. Granted, my SAN has a few TB of R/W cache, so I really never have to specify this. Most external drive arrays (US$400 for a 2TB mirror) will provide enough caching to prevent the SSD from being used for any write caching.

Ok, I need to take off my fun hat and put on the pointy-haired boss hat.

BTW, I just realized you were talking about c:\windows\temp. You're right. That needs to go to RAM or a non-SSD drive.

Thanks for clarifying, Jaysen. I should have been more precise. Actually, I was talking about the environment variable %temp%. On most machines it is located at C:\Users\UserName\AppData\Local\Temp.

Your idea of a persistent RAM disk is very convincing. The only thing I worry about is what happens in case of a crash, when the system has no time to flush the buffer to a real drive. Maybe it's a matter of finding the right software. The mounting tools I have do not survive a reboot, unless of course it was a deliberate shutdown.

Ah yes, the panic condition! In that case any unflushed data is lost. So if you are writing a wad of changes, say in Photoshop, and have your write flush set too high, you would lose a tremendous amount of data. You really have to profile the usage patterns and find that nice balance. Let's consider Scrivener as our candidate for write profiling.

We know that we have a 2-second idle-time write. That means that after 2 seconds of you not doing anything (no keypress) Scriv will auto-save. We also know that Scriv auto-writes its XML on file opens (binder clicks). We also know that the auto-save will only impact what is open in the editor panel, and that unless something is open in an editor there is no file handle in use. So we could specify the following logic for the write cache:
• write to disk on close()
• write to disk after 4 seconds of no write()
Basically, if we see a file closed (file handle/descriptor released), we flush our buffer to SSD. If we have gone more than 4 seconds with no write to disk, then we flush that sucker to SSD, as we must be busy making changes. Anything less than 4 seconds between writes means we are staring at the screen not doing anything, so don't flush.
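Here is a toy sketch of those two rules in Python, just to make the logic concrete. The class and the 4-second constant are invented for the example; this is not how Scrivener or any real write-cache driver is implemented.

import time

FLUSH_IDLE_SECS = 4  # rule 2: flush after this long with no new write()

class WriteCache:
    def __init__(self, ssd_path):
        self.ssd_path = ssd_path
        self.pending = b""        # data buffered in RAM, not yet on the SSD
        self.last_write = None    # time of the most recent write()

    def write(self, data):
        self.pending += data
        self.last_write = time.monotonic()

    def close(self):
        # Rule 1: flush whenever the file handle is released.
        self.flush()

    def tick(self):
        # Called periodically; rule 2: flush once writes go quiet for 4 seconds.
        if self.pending and self.last_write is not None:
            if time.monotonic() - self.last_write >= FLUSH_IDLE_SECS:
                self.flush()

    def flush(self):
        if self.pending:
            with open(self.ssd_path, "ab") as f:
                f.write(self.pending)
            self.pending = b""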

With something like Photoshop, writes on a file may have a much larger wait, as computations are done in memory before the next write to disk is ready. Since you would see a larger amount of data being written, you would want to maximize your cache time to ensure you only write when the change is completed.

It’s kind of an art more than a science. If you like to tinker look at system IO and which processes are doing the writes, then tweak your settings to fit.

Fun stuff.

Photoshop is different altogether. Its preferences allow you to adjust its Scratch Disk ad lib.
Seems like you're used to a server environment, protected against power failures, with RAID, backups, etc.

I am concerned with Scrivener on a PC. If it freezes or crashes, I will lose my stuff if the saves did not go to a real drive but only to a RAM disk. If you flush to a real drive every 4 seconds anyway, then why not write directly to the SSD every 4 seconds? I liked your first suggestion better.

Performance-wise your RAM suggestion is great, if only it weren't for the risk of data loss. Personally, I opt for an SSD with the temp and data folders moved to another drive.

I only wish I could deactivate Scriv's autosave. This would make me feel a lot better on train rides and in other unstable environments, where I'd rather decide for myself when to save and when to wait for calmer conditions. It would save battery life too, if my drive were not kept spinning all the time.

You could opt for a 7200 second save interval? 2 hours should effectively disable it for you.

And yes. I am a “big machine” guy. Which makes building home systems tough. Does anyone really need a 10TB RAID 0+1 NAS for their home? At least I don’t actually have a server room complete with full firewall and remote VPN access in the house anymore.

Wish I could. Zero would be ok, too. But autosave is limited to a range from 1 to 300 seconds.

For your “working on the railroad” scenario I suggest you use a USB flash drive for your project(s) directory. Put your Scrivener backup directory there too.

That way you can set your HD to power down when the laptop is on battery power. With minimal apps running you won't see any task swapping that would wake the HD, and all Scrivener traffic will go to the USB drive, except what goes to C:\Documents and Settings\xxx\Local Settings\Application Data\Scrivener\Scrivener; I see custom compile settings in there.

In all cases, to protect against drive failure you must copy your data to other drives, USB or HD, at least 2 others. That way, when (not if) the work drive fails, you'll still have 2 copies and some time to replace the failed drive before the next failure.

This has the benefits of speed, low power, reliability, security and portability.

Al

Same thing as with the autosave above: wish it were that easy. Your suggestion works great with Word and co., but not with Scrivener. Running the project from the stick still keeps the drive busy. I discussed the whole issue in another thread and don't want to rehash it here. Maybe it is related to temp files, maybe even to what we've been talking about in this thread. I've already wasted too much time on this and decided to wait for the next update to see if that changes anything.

You're right about backups. I store them on SD cards. These disappear completely into the notebook, and I never have to remove them when I change trains or subways. Unplugging USB sticks all the time loosens the sockets in the long run, when you type as much on the road as I do, and the notebook doesn't fit into its case with a stick plugged in. SD cards are my preferred backup medium (in addition to external drives, of course).

You can't deactivate it altogether, but you can make the save interval very long. On the other hand, the drive then has to spin up and down for each save, which might not be desirable either.

Katherine

I experienced the delay-when-typing issue and solved it by working off a file saved directly to my hard drive, as someone suggested in an earlier post. While working off a flash drive, the constant backup caused the delay.