Importing from Word – Garbled/Missing Text

Hi all,

I’m trying to get more proficient at using Scrivener for academic work, particularly for revising my PhD thesis and working on papers. Unfortunately, I’ve been having issues when trying to import and split the docx file of my thesis into Scrivener. Most of it seems okay, and the document is properly split, but some parts of the text are simply missing or come out garbled. I’ve tried removing all the formatting and comments in the original document (even though I really need the latter for revising my draft), but I still get the same result – so I’m really unsure what to do.

Of course, I could try copying and pasting manually, but the fact that Scrivener does this makes me feel quite insecure about how reliable the import function is to use over the long term.

Best,
Plamen

(Screenshot of one of the problematic bits of the text)

Not something I’ve run across. My hunch is some flaw in the DOCX file, though that’s a wild guess. By any chance, is it stored in a “cloud” location, with the copy on your local machine incomplete? Or, when you import the whole document into a single Scrivener document, is it uncorrupted, with the corruption starting only when you begin the splitting process?

Otherwise, how exactly are you importing and then splitting?
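
One rough way to test the "flaw in the DOCX file" hunch: a DOCX is a ZIP container, so its integrity can be checked outside Word and Scrivener. The sketch below is only an illustration, not anything Scrivener does; the function name `check_docx` is made up for this example, and a clean result does not prove the document content itself is well formed.

```python
import zipfile

def check_docx(path):
    """Return a list of problems found in the DOCX container (empty if none).

    `path` may be a file path or a file-like object. This checks only the
    ZIP layer, not whether Word's XML inside is sensible.
    """
    if not zipfile.is_zipfile(path):
        return ["not a valid ZIP container"]
    problems = []
    with zipfile.ZipFile(path) as zf:
        # testzip() re-reads every member and returns the first bad name, if any.
        bad = zf.testzip()
        if bad is not None:
            problems.append(f"corrupt member: {bad}")
        # The main body text of a Word document normally lives here.
        if "word/document.xml" not in zf.namelist():
            problems.append("missing word/document.xml")
    return problems
```

Calling `check_docx("thesis.docx")` (a hypothetical file name) and getting an empty list back would suggest the container downloaded or copied intact, which points the finger back at the importer rather than the file transfer.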


Concur with that. I’d try:

  1. Edit → Text Tidying → Zap Gremlins, as there may be control characters inserted by Word. (Maybe duplicate your project first, though I wouldn’t bother.)

  2. Try exporting from Word to RTF then importing that.
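
To see whether stray control characters are even present, the text can be scanned outside Scrivener, for example after pasting a problem passage into a plain-text file. This is a minimal sketch and an assumption about what counts as a "gremlin"; it is not a description of what Zap Gremlins does internally, and `find_control_chars` is a name invented here.

```python
import unicodedata

# Ordinary whitespace that should not be reported.
ALLOWED = {"\n", "\r", "\t"}

def find_control_chars(text):
    """Return (offset, code point) pairs for control and format characters."""
    hits = []
    for offset, ch in enumerate(text):
        if ch in ALLOWED:
            continue
        # Cc = control characters, Cf = invisible format characters
        # such as zero-width spaces.
        if unicodedata.category(ch) in ("Cc", "Cf"):
            hits.append((offset, f"U+{ord(ch):04X}"))
    return hits
```

An empty result would suggest control characters are not the cause, in which case the RTF route is the next thing to try.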

HTH

Mark.


In most cases, if the DOCX importer is giving you trouble, it will be easiest to simply open the file in Word or LibreOffice and save it as an RTF file, which is much easier for Scrivener to read and typically solves all problems.
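
For anyone who would rather script the RTF conversion than use Save As by hand, LibreOffice can do it from the command line in headless mode. The sketch below assumes LibreOffice is installed and that its `soffice` executable is on the PATH (on some systems it needs a full path); the function names are made up for this example, and the result should be checked by eye before importing.

```python
import shutil
import subprocess
from pathlib import Path

def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line for a headless DOCX-to-RTF conversion."""
    return [
        soffice,
        "--headless",            # run without opening a window
        "--convert-to", "rtf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]

def convert_to_rtf(docx_path, out_dir, soffice="soffice"):
    """Run the conversion and return the path where the RTF is expected."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found; is LibreOffice installed?")
    subprocess.run(build_convert_command(docx_path, out_dir, soffice), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(docx_path).stem + ".rtf")
```

Usage would look like `convert_to_rtf("thesis.docx", "converted")`, with both names being placeholders, after which the resulting RTF is imported into Scrivener in the usual way.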

If this file came from Google Docs, we are aware of some issues with its exporter. It produces sloppy formatting in some cases, and has been known to put content in areas that the importer will skip if it isn’t prepared to read them. These issues should be fixed in a future update.


Thanks – and no, the file is stored locally. It has been stored in the cloud at some points previously, but not currently. The problem also occurs when importing the whole document without splitting.

I’ve tried the two import and split options – the first based on the document’s outline, as well as by using separators. The result is the same.

I cannot explain what might be going wrong; it’s not something I’ve heard of or experienced (and I have done a reasonable amount of DOCX importing over the years). That the single imported DOCX is already corrupted before any splitting is a key observation, but why? Dunno.

Best to do as @AmberV and @xiamenese suggest: in Microsoft Word, use “Save As” to save the document in RTF format, import that, and then get on with your dissertation.


Thank you and @AmberV! Converting to RTF seems to have solved the issue.
