Here is a problem that may have resulted in errors in a book that I just published.
Here’s what happens:
1. I compile to DOCX.
2. The editor makes a change to the DOCX, removing italics from a word (e.g. italic “voir dire” becomes roman “voir dire”).
3. I accept the change.
4. I copy the text from the DOCX to Scrivener.
5. The italics have not been removed! They have reappeared!
This is illustrated here. I copied the text from Word to Scrivener. The words are not italicized in Word, but after pasting into Scrivener, they are italicized.
If you need a timely response, open a support ticket.
If you can create a trimmed-down version of a project that has the minimal files necessary to reproduce the problem, as well as clear instructions on what to do with the file, that helps.
Just to confirm, we appreciate all bug reports, whether one takes the time to make a video or not.
So to the issue itself, I’m assuming this is something you have only seen happen when formatting has been changed via Track Changes rather than through any other means? Being a complete ignoramus in all matters MS Word, is there a way of “setting” a document so that it strips out the change tracking layer, once you’ve completed your pass and done all the accepts/rejects? It seems that Word isn’t actually changing the underlying formatting here, but rather applying an alteration dynamically, based on the acceptance of the revision. If there is a way to save a “compatible” copy or something along those lines, that’s where I’d investigate.
Interestingly, it seems we aren’t the only ones. I opened the test material you sent in to support, in LibreOffice, and the text is italic there as well. So I tried Pages—same thing, italic text. So definitely does look like there needs to be some kind of “commit all changes” command that should be run from Word, after a revision pass.
I know you mentioned LibreOffice worked, but I wasn’t copying and pasting (I don’t have Word), I was opening the .docx file directly and letting it convert.
As a double-check, I would make a copy of the troublesome Word document in question and Accept All Changes. Then see if copy and paste of that passage works as expected. Worth a try?
Though this is unlikely to have gotten by you, one way you might have unaccepted changes and not realize it is if the Reviewing mode in Word has been set to show Final (without showing changes). (And, by the by, everything we see in the video is consistent with this hypothesis.)
It is indeed puzzling, though, that your testing showed that the problem only affects de-italicizing. Did you by any chance try reproducing the problem: turn Track Changes off, italicize something, turn Track Changes on, de-italicize that something, then accept that change, then copy and paste into Scrivener?
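For anyone who wants to check for leftover revisions without relying on Word's Reviewing view, here is a minimal Python sketch that counts tracked-change elements in a DOCX. It assumes the file is an ordinary zip package with its main text in word/document.xml. The function names count_revisions and count_revisions_in_docx are made up for this illustration, and the list of tags is not exhaustive.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Elements that record tracked revisions (a partial list, chosen for illustration).
REVISION_TAGS = ("ins", "del", "moveFrom", "moveTo", "rPrChange", "pPrChange")


def count_revisions(document_xml):
    """Count tracked-change elements in the raw bytes of word/document.xml."""
    root = ET.fromstring(document_xml)
    counts = {}
    for tag in REVISION_TAGS:
        n = sum(1 for _ in root.iter(f"{{{W_NS}}}{tag}"))
        if n:
            counts[tag] = n
    return counts


def count_revisions_in_docx(path):
    """Open a .docx (a zip archive) and count revisions in its main document part."""
    with zipfile.ZipFile(path) as package:
        return count_revisions(package.read("word/document.xml"))
```

An empty dictionary from count_revisions_in_docx("book.docx") (the file name is a placeholder) would suggest that no tracked changes remain in the main text, which would point away from the unaccepted-changes hypothesis.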
Yes, I agree with your analysis—there’s some information about the italicized text remaining in that DOCX file.
Note that Rebeca has the files that I’ve used to demonstrate the problem.
Some things I’ve tried:
1. I’ve confirmed that all changes have been accepted. I Accepted All again; the problem remains.
2. Yes, opening it in LibreOffice shows the italics, while copying and pasting to LO does not.
3. Copying and pasting to Pages: italics remain (same problem as with Scrivener).
4. Copy the text, open a new Word document, paste it (no italics), copy from the new doc, paste into Scrivener: italics present (i.e., it doesn’t solve the problem).
5. Turned Track Changes off, italicized a sentence, turned Track Changes on, de-italicized it, Accepted All changes, copied the text, pasted into Scrivener: not italicized (that is, it worked correctly). Possible conclusion: it’s related to someone else making the changes??
6. Save the DOCX file as an RTF file, open that in Word, copy the text, paste to Scrivener: not italicized (that is, it worked correctly).
7. If I open the RTF created in Step 6 with LibreOffice, the italics are present.
Yes, I’ve only seen this problem for italicized text.
I can’t try it on my Macbook because I don’t have Word installed there.
Regarding Video creation: I find that it’s a good way to demonstrate that the problem really exists, that I’m not doing something weird.
I could spend a few hours investigating this (discussing it on Word forum, for example, examining every byte of the DOCX file, etc.), but it will be more efficient for me to manually check the formatting changes involving italics for my next book. I could also use the Step 6 procedure as a workaround.
In the XML, ‘voir dire’ is set to use the font “TimesNewRomanPS-ItalicMT”. I note that on my (somewhat dated) Word this font designation appears verbatim in the toolbar Font field. The regular roman text is set to “TimesNewRomanPSMT”.
Here is a first guess: these font designations don’t mean anything to your Word installation, because you do not have a corresponding font (that Word recognizes as corresponding). What you are seeing when you look at your document is Word using a fallback font to show you your text which it sees as specified to be in fonts it does not recognize.
Test: If you put your insertion point in voir dire and go to your Font menu in Word, you will see that Times New Roman is not checked on that list, meaning Word is not recognizing the font as being in the Times New Roman family.
Test: Do an advanced search for anything in format Times New Roman or Times New Roman Italic and your search will return nothing.
Test: However, if you search for anything with font “TimesNewRomanPS-ItalicMT”, you will turn up “Worst. Costume. Ever.” and “voir dire”.
So it is all going to come down to the fact that the font specifications in the Word document are using the PostScript name for the fonts in question, not the family name. If you have a copy of the Word document which you originally sent to the editor, you might look at that and see if this was true of the document you sent in the first place; I suspect you will find that it was. (And does ‘voir dire’ look italicized in it? I’m guessing not.)
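If you would rather check this from the file itself than through Word's search, a rough Python sketch along these lines lists runs whose font name contains "Italic" but which carry no italic flag. It assumes the font is named in the run's own w:rFonts element (not inherited from a style) and that the text lives in word/document.xml; the function names are hypothetical, invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, in ElementTree's {uri}name form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def runs_with_italic_font_name(document_xml):
    """Return (font, text) for runs with an 'Italic' font name but no active <w:i/>."""
    root = ET.fromstring(document_xml)
    hits = []
    for run in root.iter(W + "r"):
        rpr = run.find(W + "rPr")
        if rpr is None:
            continue
        fonts = rpr.find(W + "rFonts")
        if fonts is None:
            continue
        name = fonts.get(W + "ascii", "")
        italic = rpr.find(W + "i")
        # <w:i/> with no w:val means "on"; an explicit 0/false/off switches it off.
        italic_on = italic is not None and italic.get(W + "val", "true") not in ("0", "false", "off")
        if "italic" in name.lower() and not italic_on:
            text = "".join(t.text or "" for t in run.iter(W + "t"))
            hits.append((name, text))
    return hits


def italic_font_runs_in_docx(path):
    """Same check, reading word/document.xml out of a .docx package."""
    with zipfile.ZipFile(path) as package:
        return runs_with_italic_font_name(package.read("word/document.xml"))
```

If the guess above is right, running this on the original compiled document should turn up “voir dire” paired with “TimesNewRomanPS-ItalicMT”. Runs that get their font from a paragraph or character style would be missed by this sketch.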
If you find out your original Word doc went out like that, I may have the solution for you. An issue came up for me recently where the documents I was compiling to Word were showing up with the PostScript names of fonts instead of their family names, and this was undermining my workflow into InDesign. As I recall, to get things playing nice together (and thanks to this Forum), I needed to change a setting in Scriv Preferences > Sharing > Conversion, namely unchecking Use Enhanced Converters. This sidelines a third-party converter (the culprit) and uses Keith’s own custom converter, which he has been developing in recent times.*
Best,
gr
* The text in that dialog might tell you that unchecking that box will make Scriv use standard macOS converters. Last I heard, the text in the dialog was just outdated.
QUESTION: L&L strongly recommends that I use the enhanced converters, but I’m not clear on what kinds of problems I’ll encounter if I don’t. The manual says,
Does that warning apply if I select “Export as RTF-based doc file”?
COMMENT: It’s unclear which checkbox the text that starts with “if enabled” refers to.
Finally, unless there’s something funny about my setup (which I doubt; it’s very plain vanilla), this is a stealth problem that could hit a lot of users. You might want to find some way to let users avoid it.
AFAIK, this is a problem with the documentation not matching reality, both in the dialog box and in the manual. I believe this was due to the experimental nature of the new in-house Word converter when it was first released. But there’s ample evidence now that the in-house converter is superior to the third-party (Aspose?) converter in many respects. L&L, perhaps it’s time to clean this up.
p.s. I guess having the enhanced converter be the default would have made sense (as the best available) at the time, i.e., when you installed your Scriv 3.
The RTF-based .doc option is specifically for that: .doc alone. I don’t think that old trick of changing the file extension on an RTF file to .doc works with .docx. I could be wrong, but it’s more of an old-school hack for handing over work in RTF format to people who don’t know that RTF is an official Word format and refuse it, stating they only take .doc. Back then, making a real .doc was more difficult, especially for fledgling third-party tools that couldn’t afford expensive conversion engines.
As to the rest, yeah it’s probably time to update the text in the manual and software help text.
I think it still makes sense as a default. The main issue is import rather than export. Export is somewhat of a controlled environment—if we need to support a particular type of formatting because there is a compile checkbox for it, we can add it. We don’t have checkboxes for the entire DOCX specification (yikes) so we don’t have to broadly support everything it can do. Import on the other hand—people try to import all kinds of things.
And of course, the enhanced converters also handle ODT and DOC files—which will indeed fall back to system converters when this is switched off. So turning this off by default would only suit those who use one specific file type, and then maybe not even all of the possible cases one might use it for.
That’s probably also why the help text remains the way it is. It would get a bit messy having to describe the situation with the flag disabled. I’ll update it for the manual though.