Problem with .doc import

I have a strange problem when importing Cyrillic text in a .doc file into the Research or Draft folder (drag and drop onto the corkboard/tactical view screen): question marks appear in random places in the text:

This is what happens with the beginning of one of my works. The “?” doesn’t seem to appear in connection with line endings or any other particular symbol.
When I copy and paste, the problem doesn’t appear, but I need to gather many .doc files into my project, so I would like to find another solution.
I’ve tried on my own PC (Win7) and on my friend’s PC (WinXP), and the problem appears in both cases.

If you have Word installed on the computer, I’d try going into Tools > Options in Scrivener and in the Import/Export options, enable the MS Word converters. This looks like an encoding issue, and switching the converters being used might be enough to fix it.

I have Word 2010, but it doesn’t work when I turn on the “use Word or Open Office as converter” checkbox. Scrivener starts to convert the file, but after about 10% is completed, two error messages appear about a Word 2010 crash:

and after a couple of seconds:

And then Scrivener says that the file cannot be converted due to an unknown error.
Sorry for so much Russian text; I don’t know how to get the same messages in English. I hope it’s understandable, as the form of the error messages is standard.

Hum. Would you mind sharing a sample file as an attachment here or by sending it to AT literatureandlatte DOT com (with a link to this forum thread for reference)? It doesn’t need to be your actual document, just one that shows this issue when importing.

Since the other converters aren’t working, try saving the file as .rtf from Word and importing that into Scrivener, so that the conversion happens in Word instead of during the import process.

RTF doesn’t show this problem with this checkbox off (no question marks appear in the text).
I made a sample file (attached to this message). When I tried to import it (with the Word converter option off), I get:

So it seems like “?” appears each time after the first 20 characters. :confused:
I also tried with a sample file in English (sample file2.doc), and it doesn’t show this issue.

But both files cause MS Word to crash when I try to import them with the same checkbox on. So I think there are two different problems: one with my Word 2010 (it doesn’t want to work as a converter) and another with the Scrivener converter and Russian text. :confused:
sample file2.doc (22 KB)
sample file.doc (22.5 KB)

Thanks for sharing the samples. The problem has to do with the encoding, and it appears that using the MS Word converters does resolve it, but obviously that’s not working for you at the moment given that Word is crashing. You might want to try rebooting the computer and doing the import again, with those Word converters enabled, to see if the problem recurs. A lot of junk can build up over time and with a lot of programs running, so occasionally just restarting will be all you need to fix a random problem.

Also, first ensure you’re up to date with version 1.5.7 of Scrivener; you can check this in Help > About Scrivener. If you’re on anything earlier, I’d suggest doing a complete uninstall and then a reinstall from a fresh download to ensure you have a clean slate. Your projects won’t be affected by this (although they won’t appear in the Recent Projects list until you’ve first opened them via Open Existing Projects, File > Open, or by double-clicking the “project” file directly from Windows Explorer). If you wish, you can also save your program settings by going to Tools > Options… and choosing “Save Preferences” from the Manage menu button. After the uninstall/reinstall process, just return there to load the saved preference file.

I have the latest version of Scrivener, downloaded from . I returned all settings to defaults and then rebooted my PC. But the problem still appears.

Do you have the same encoding issue when you try to import my sample file?

Thank you very much for your explanations and attempts to find a solution :slight_smile:

Ah well, it was worth trying the reboot just in case.

The sample file you provided does show the import problem using the standard importers, but it imports correctly when using the MS Word converters. I’ll provide the sample to our developer in hopes that we can do something with the standard converters to clean up the problem from the encoding that’s causing the incorrect characters, but the normal solution would be to switch to using the other libraries for the conversion. Since that’s not working in your case, re-saving to RTF from within Word and then importing that is likely your best option. Copy/paste could work as well, but if you have a lot of files you may be able to do a batch conversion from .doc to .rtf.
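For the batch conversion from .doc to .rtf mentioned above, here is a minimal sketch in Python. It assumes LibreOffice is installed and that its `soffice` executable is on the PATH; nothing in this thread confirms either, and I haven't checked how LibreOffice handles the encoding in these particular files. The function names (`build_convert_command`, `find_doc_files`, `convert_folder`) are made up for this illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(doc_path, out_dir, soffice="soffice"):
    """Return the LibreOffice command line that converts one .doc to .rtf.

    Assumption: 'soffice' resolves to a LibreOffice install. On Windows you
    may need to pass the full path to soffice.exe instead.
    """
    return [
        soffice,
        "--headless",            # run without opening a window
        "--convert-to", "rtf",   # target format
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def find_doc_files(folder):
    """List the .doc files directly inside 'folder', sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".doc"
    )


def convert_folder(folder, out_dir):
    """Convert every .doc in 'folder' to .rtf, writing results to 'out_dir'."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for doc in find_doc_files(folder):
        # check=True raises CalledProcessError if the converter fails
        subprocess.run(build_convert_command(doc, out_dir), check=True)


# Usage: python convert_docs.py <folder with .doc files> <output folder>
if __name__ == "__main__" and len(sys.argv) == 3:
    convert_folder(sys.argv[1], sys.argv[2])
```

It would be worth trying this on one of the Russian sample files first and importing the resulting .rtf into Scrivener before running it over a whole folder.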

Alternatively, if you don’t have any question mark characters in the text otherwise, you could just import everything and then do a find/replace in Scrivener to replace the ? with nothing. As far as I can tell from the sample, everything else is correct, and only that character is being added.
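If the text ever ends up outside Scrivener (exported or copied into a plain text file), the same check-then-strip idea can be sketched in Python. This is only an illustration with made-up helper names. It rests on the same assumption as the find/replace workaround: that the original text has no genuine question marks, since every "?" is removed.

```python
def count_question_marks(text):
    """Count how many '?' characters the text contains."""
    return text.count("?")


def strip_question_marks(text):
    """Remove every '?' from the text.

    Only safe if the original document contains no real question marks.
    """
    return text.replace("?", "")


imported = "Это про?стой тек?ст"   # hypothetical imported text with stray '?'
print(count_question_marks(imported))
print(strip_question_marks(imported))
```

Counting the "?" characters in a clean copy of the text first (for example, one pasted from Word) tells you whether stripping them all is safe.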