Sync with external folder with duplicate files can lose duplicates

Useful and comprehensive info, thanks.

However I’m tagging @AmberV because I have just observed a serious problem with sync folders on startup this morning: one of my files has disappeared from the sync folder AND Scrivener!

UPDATE: Some time later (2.5 hours of puzzlement, panic & perplexity) … TL;DR: I think the problem is that Word is set to automatically create backups, so the Scrivener file identifier in [ ] ends up on two files (see the screenshot of the SyncToy backup below). This has confused Scrivener, though I can’t quite see the sequence of events that causes file deletion. Was it the two consecutive syncs?

Obviously if this is true (can you confirm?), the problem is avoided by setting Word not to automatically create backups, but that’s a handy thing to have… maybe Scrivener should not ignore the rest of the filename? What does Scrivener do with deleted things? Delete them, or send them to the bin? (Alas, I emptied my bin before doing other backup work.)

Original horror story

Here’s a reconstructed sequence of events, actions, etc. leading up to this situation, with illustrations where possible/appropriate.

I have a document “New Plot Ideas Jan 2022 —”; I need to restructure it in a way I can only do in Word, so I duplicated it (I’m paranoid) and added WORD to the document name, synced to the folder and then edited “New Plot Ideas Jan 2022 — WORD”. Based on the timestamp of the Word backup file Backup

Based on the timing of the Scrivener backup, I quit Scrivener at 10/02/2022 15:06 and I then (thank god!) ran “SyncToy” to back up files to an external drive at 15:20. I have trawled through the backup (using Agent Ransack) looking for a unique string added to the Word-edited file and I have not found it, so the file might have disappeared on sync on close rather than sync on open this morning.

The file “New Plot Ideas Jan 2022 — WORD” is confirmed present in the SyncToy backup location (phew!)

This morning, after startup sync, the binder shows this

image

i.e. the Word-created backup file “Backup of New Plot Ideas Jan 2022 — WORD” has been created and the original file “New Plot Ideas Jan 2022 — WORD” has been deleted from Scrivener.

Inspection of the sync folder shows it looks like this

and full searching of the folder shows that the original file “New Plot Ideas Jan 2022 — WORD” no longer exists!

Fortunately (I can hardly believe I took that extra, rare backup at this time!) in the SyncToy location we find…

So the file “New Plot Ideas Jan 2022 — WORD” did exist after Scrivener had been shut down… and now it doesn’t.

Unless this can be explained and the issue avoided I cannot trust sync folders not to completely “lose” work, so I’m turning it off NOW.

I’m forking this to a new thread since this report is a separate issue from what was being reported in the other thread.

Thanks for the report, there are indeed things we can do better here. On the Mac, if duplicate ID numbers are detected, it warns you of the problem and then attempts to handle it the best way it can, using modification dates, and moves the older conflicted copy to the “Trashed Files” subfolder of the sync folder. This gives you ample recovery options in case its attempt to repair the damaged sync folder fails. The Windows version is doing the modification date part, but not the warning or the trashed-file part. Instead, it keeps the conflicted duplicate in the main sync folder, where it is up to you to recover the document from it, but once you sync again it is silently deleted (which is what it sounds like you ran into).
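Until the Windows handling improves, a check like the following could be run by hand before syncing. It is only a rough sketch, not how Scrivener itself does it: it assumes synced files end with the binder ID in square brackets (as in the file names quoted in this thread), and the function name and the `ignore_dirs` parameter are made up for illustration. It reports IDs that appear on more than one file, newest first, and changes nothing on disk.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: a synced file name ends with the binder ID in square
# brackets before the extension, e.g. "New Plot Ideas [231].docx".
ID_PATTERN = re.compile(r"\[(\d+)\]$")


def find_duplicate_ids(sync_folder, ignore_dirs=("Trashed Files",)):
    """Map each ID carried by more than one file to those files, newest first."""
    root = Path(sync_folder)
    by_id = defaultdict(list)
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Skip folders that are expected to hold older copies.
        parent_parts = path.relative_to(root).parts[:-1]
        if any(part in ignore_dirs for part in parent_parts):
            continue
        match = ID_PATTERN.search(path.stem)
        if match:
            by_id[match.group(1)].append(path)
    return {
        binder_id: sorted(paths, key=lambda p: p.stat().st_mtime, reverse=True)
        for binder_id, paths in by_id.items()
        if len(paths) > 1
    }
```

If the returned dictionary is not empty, sort out the listed files by hand (or copy them somewhere safe) before letting Scrivener sync.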

Necessary improvements aside, this problem (and nearly all others) can of course be avoided by good procedure:

  • I always make sure to close files before syncing with Scrivener. Not only does this avoid potential swap file issues like you describe, it avoids space cadet issues of forgetting to save (that’s a problem for me anyway, which is why I take care to do so).
  • Always briefly audit the binder items that land in the “Updated Documents” collection. If you see any issues, you can often use the automatic snapshots the sync feature creates, or in this case, you may find the conflicted file in the sync folder on the disk, before you sync again.
  • And as we always say, in the user manual as well, all forms of sync are risky, and by definition destructive. It is a form of destruction that most often works in our favour, and aligns with human expectation, but by definition we are always destroying an older copy somewhere. Because cold hard logic cannot always predict what we want, syncing should always be coupled with aggressive backups. We do some of that for you with the automatic snapshot feature that is enabled by default, but external backups as you created are also vital for the blind spots (like this one) where snapshots cannot cover the problem.
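As one way of making the "aggressive backups" habit cheap, here is a minimal sketch that copies the whole sync folder into a new timestamped folder before each sync. The function name and both paths are placeholders, not anything Scrivener provides, and the backup location should sit outside the sync folder itself.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot_folder(sync_folder, backup_root):
    """Copy sync_folder into a new timestamped folder under backup_root."""
    source = Path(sync_folder)
    # Microseconds in the stamp keep two quick runs from colliding.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree raises an error rather than overwriting an existing copy.
    shutil.copytree(source, target)
    return target
```

Run just before opening or closing the project, this leaves a copy of every file that a sync might go on to delete.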

We pay for convenience with risk. If that’s not cool with you, then drop the convenience and export/import or copy/paste into and out of external editors by hand.

Thanks Amber; good advice, and I hope that the Windows handling could at least be brought into line with the duplicate ID handling on Mac. It seems strange that the risk should have been assessed and mitigated there, but not on Windows, where the risk is surely not less :wink:

Just one comment:

Ah, but I also had the option to “take snapshots of affected documents before updating” enabled, so apart from this particular edge case I was pretty much covered. And, whilst the nature of sync may be destructive, if you can’t be algorithmically sure of what you do, letting the user take explicit responsibility by giving a preview of sync actions and requiring confirmation would spare you some aggravation, and give the user a chance to save their bacon.

Yup! That’s the idea for mitigating algorithmic risk: to spam duplicates and recovery files and snapshots everywhere as much as possible. This one condition got missed in their implementation of it, however.

If, after running Scrivener sync, you are going to be editing those files in Word directly from the sync folder (which is the most natural way of doing it, and should be the most simple), then you should be careful to:

EITHER:
1) turn off Word’s automatic backup,
OR:
2) be sure to remove the backup file (extension wbk) from the sync folder before running sync again.
AND ALSO:
3) close the document in Word before running sync again; this should ensure that the temporary file Word creates, typically with a file name like “~$ filename [231]” (without quotes), which is a temporary Word bookkeeping file for open docs, is automatically deleted when the docs are closed.

Having either the backup wbk file or the ~$ file in the sync folder will cause problems.
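The checklist above can be backed up with a quick scan of the sync folder before each sync. This is only a sketch: the function name is made up, and it matches on nothing more than the ".wbk" extension and the "~$" file-name prefix described in this post.

```python
from pathlib import Path


def find_word_leftovers(sync_folder):
    """Return Word backup (.wbk) and "~$" temporary files under sync_folder."""
    leftovers = []
    for path in Path(sync_folder).rglob("*"):
        if not path.is_file():
            continue
        # Word backup copies use the .wbk extension; the bookkeeping
        # files for open documents have names starting with "~$".
        if path.suffix.lower() == ".wbk" or path.name.startswith("~$"):
            leftovers.append(path)
    return sorted(leftovers)
```

An empty list means neither kind of file is present. Anything it does list should be moved out of the sync folder (or the document closed in Word) before running sync again.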

But there are other gotchas waiting for you.

If there are items in Scrivener’s Trash folder, syncing will copy them into the sync folder, in a folder named Trash. The next time you sync, these Trashed files will be reproduced into Scrivener’s binder as live files, not in the Trash folder, but elsewhere in the binder, where they can easily (if not likely) share identical names with newer files, and can not easily be identified as ex-Trash duplicates.

This is a nightmare scenario that can worsen each time you run sync. You can easily end up with multiple identically named duplicates of previously trashed files mixed in among your regular binder items.

As a routine matter, I keep my Scriv trash empty, and usually empty it immediately after putting something in it.

IMV, everything described here is a matter of details being allowed to fall through the cracks on the Qt/Windows version for reasons that seem to amount to little more than “because Qt” and “because Windows.” There is nothing that Scrivener does that is beyond the capabilities of a well-designed native Windows application. But that is not what Scrivener is.

I don’t know about that, there is nothing in Qt or Windows itself that manages an external sync folder. All of that logic is coming out of Scrivener, and should be managed correctly, with bugs fixed when they are found.