046 - Smart quotes disappearing on .doc import

The issue with smart quotes not being imported seems to be back :confused:

My Scrivener options are set to “not use smart quotes”.
My Word files are full of them…

Importing a file did NOT import the quotes (mainly apostrophes, as I’m writing mostly in French).

To see what would happen, I converted this file to .rtf.
This worked fine on importing into the Scrivener project : the quotes are there.

I also noticed that the Word file took the default margins set in the Scrivener project… and that the same file saved as .rtf kept its own margins.

Hope this can be useful :neutral_face:

Thanks Lilith for reporting this.
Can you confirm that converting the Word document to RTF and then importing the RTF to Scrivener worked fine, but importing the Word doc file directly to Scrivener did not import the quotes?

Also, do you have Word installed on your machine?

If the RTF file worked on import to Scrivener, then the Scrivener RTF engine is working fine. So, it’s probably not a Scrivener RTF issue.
If you have Word installed, then it’s a Doc2Any conversion issue when converting the Word doc file to RTF.
If you don’t have Word installed, then it will be a SubSystems issue converting the doc to rtf.

Either way, it would be nice to know which of these elements is not handling the smart quotes. But it looks like we should be recommending RTF as the preferred import option for preserving text, as on the Mac platform.

I can confirm that in the file on which I reported this

  1. the file was in Word
  2. I had saved it from within Word to .rtf
  3. the apostrophes imported fine from .rtf (but not from Word)
  4. all my imported files are made in Word

Word HAS been installed on my machine since Word 95, and I have always upgraded to the newer version (I have missed only one so far). I’m using Word 2010 now, but some of the files are still in their Word 95 format.

Since I reported this issue, I have been working on other files and have reported other issues related to import.
I feel that it is all one and the same problem (only a feeling, though).

Apostrophes / smart quotes :

  • apostrophes almost NEVER import (when they are “smart quotes”)
  • apostrophes also disappear from .rtf files now. I have tried with about 20 different files, all originally in Word (the ones I have been working on range from Word 97 to Word 2010), some converted to .rtf for test purposes.
  • the .rtf files have their apostrophes, but these are “smart quotes” from Word
  • if I completely replace the “smart quotes” with simple apostrophes, then the apostrophes import fine
  • some other special characters do NOT import either, such as «, », or œ. The word “sœur” is written “sur”, which the spelling checker should NOT see as a misspelled word, but it does : the first letter is NOT underlined, and the next two letters are underlined. (The word “soeur”, also spelled “sœur”, is always underlined by the spelling checker, with the other spelling suggested.) To me this means that the spelling checker at least recognises this word as NOT being “sur”
  • copy and paste text from Word to Scrivener works fine, even with the smart quotes but the word “Sœur” appears as “sur”, even when copied and pasted (it crashed the program too :mrgreen: )
  • in places where an apostrophe has been dropped, the spelling checker’s behaviour is improper : in some cases it underlines only a piece of the word, and in most others nothing is underlined at all, even for “new words” that do not exist (e.g. “j’aime” drops the apostrophe and becomes “jaime”, which makes no sense in French, and the spelling checker does not notice it). Is the spelling checker “seeing” what the screen is not displaying ???

Fonts :

  • on import, fonts are changed randomly (Word and .rtf files)
  • I am using “styles” in Word, so that my paragraphs have their own style coming with their level [each style is named, and same name equals same format].
    On importing into Scrivener, the difference normally remains visible, but NOT with this beta : I get fonts which I never use, and they cover the whole file (no difference between paragraphs)
  • on importing one and the same document twice (one .rtf copy, one .doc original), I get two different fonts (neither of which is used in any of my Word styles NOR in my Scrivener settings)

Added spaces (on non-breaking spaces from Word) :

  • import adds (until now) 2 spaces (yesterday only 1).
    I remember that when this happened a few months ago, the issue was incrementing the added spaces every day, faster and faster (I tried this for a joke and it went up to 32 spaces at each “non-breaking space” imported from Word ; it stopped when I stopped playing and counting the spaces :mrgreen: ) ; then you fixed it…

Drag and drop lost randomly :

  • then drag and drop import had several failures where I had the document grabbed but was not allowed to drop it anywhere (neither in the binder, nor the corkboard, nor outliner)
  • wishing to import a quite long file, and work on it without losing pieces, due to the numerous crashes, I decided to switch back to 035, where I was sure it would import properly.
    I noticed that I could no longer drag and drop to the corkboard in beta 035 either and had to use the internal “import” function
  • then I reinstalled 046 and could not drag and drop anywhere
  • then new crash
  • after reopening, drag and drop is available again

All this together (except the font thing) looks like something that we have already experienced, and that you had fixed.

And… 035 was the best version ever for importing from Word :wink:

Some more information on vanishing apostrophes…

They ARE there : I can cursor select them !!!
They are just so tiny that they are not visible.

The same occurs for special characters such as “œ” : The character is there, and as flat as an invisible apostrophe…

May this help you, guys…
As for “touching” such places… well, every time I try, I crash the program…

If you want the text, just let me know.

More news :imp:

  • the ellipsis “…” also disappears
  • touching around the remaining “ghost” of the vanishing characters (adding the missing character beside it) makes Scrivener crash
  • removing it (or selecting and overtyping it) does not (yet) crash the program
  • and, yes, touching left of these invisible signs does not seem to crash the program, whereas on the right (just behind) it seems to be immediate
  • and this is what you get when you copy such words from Scrivener into Word :
    … should be : "Il est d’ailleurs plus que "
    … looks like in Scrivener : "Il est dailleurs plus que "
    … looks like in Word : "Il est d ailleurs plus que "
    … does really come out of the clipboard like this : "Il est d’ailleurs plus que "
    Here the sign is visible, so may this help… :unamused:
  • it finally crashed (after half an hour without any problem), and I was not working around an invisible character : I was reading to search the next one, and pressed control key down… too long maybe… that is all I touched at that moment…
  • another crash : I touched the sign without taking care to be at its left… immediate crash…
  • seems that it is slowly eating up all my memory… I had to reboot my computer, which has never been the case until now :laughing:
  • please let me know where I can send three or four versions of one and the same document (I’m going to prepare them with highlights)

Hi Lilith,

Thanks for the detailed information in your post. Here is a download to a fix compiled late yesterday:

You will need to close Scrivener, rename your existing Scrivener.exe to something like Scrivener046.exe, unzip the download file, and copy its contents (Scrivener.exe) to the folder where your Scrivener046.exe is. Restart Scrivener; the version number should be 047. If it is, then I’m hoping many of the issues you raised will be resolved.

I believe these issues were introduced as a result of fixing another bug regarding support for double-width character code pages. The actual problem was an internal data type that needed to be changed from signed to unsigned, i.e. from char to uchar.

Anyway, I’d appreciate if you could try this.

Hi Lee,

I downloaded the file last night and used it for the first time : no crash.
Today I’m going to test it more deeply.

First results :

  • drag and drop import is working
  • Word and rtf imports on drag and drop are fine, WITH the apostrophes
  • font on import is still not the one of the file : .doc files import as “Poor Richard” and .rtf files as “Helvetica” (both not used in my files)
  • import overrides the different fonts used in the original document
  • import keeps the original font size, alignment and formattings such as bold, italics…
  • apostrophes are imported even when they are smart quotes
  • still no crash

Crash :

  • I managed to crash the program by going back to one of the files imported with 046 : I touched the “missing character” just by clicking where it is
  • em-dash is another character that did not import properly : nice place to crash, when surrounded by spaces (but the same document imports correctly, with all the characters - only the font is not the one expected)

Fonts :

  • one of the files imported with 046 came with a huge line spacing, which I cannot change - I imported this file again with 047 : this line spacing is still there, although not present in the original file ; the font in which it imports is “Cambria Math” (another one I never use), even when imported with 047 - the main change is that apostrophes, smart quotes and special characters do import now
  • although this file is now in “Cambria Math”, when I copy it and paste it into Word, it comes out in “Times New Roman”, and the line spacings are of normal size there
  • copied back to Scrivener, this does NOT override the huge spacing ; copied into a new file, the line spacings are normal (single spacing)
  • another file having this big line spacing happens to take over the default font set in Scrivener (but keeps its own font size)
  • I notice that the same documents, in the same format, import with the same fonts, regardless of how many times I import them (but these fonts have no relationship with those used in the document itself ; this time, “Tahoma” is used on import from Word, and “Arial” on import from rtf)
  • I found that converting the document to the default text settings also resets the line spacing to what is expected

Other stuff related to 047 :

  • changing the label of a file does not affect the icon in the binder directly (no change is visible, unless I pass the mouse over the binder, then, suddenly, the icon colour changes)
  • jumping from one file to the other (with all those different fonts in the documents) does not affect the font in the toolbar unless I click in the document (maybe this is normal and I have simply never noticed it until now, as I’m used to working with only a few fonts)

Various crashes :

  • there are still some crashes on clicking here and there, without being able to reproduce them by doing the same (I just crashed the program by clicking on “format/font/outlined” - impossible to reproduce)

Display seems to lag seriously.
I realise that in some cases it does not update at all until I move the mouse over the place where the change should have appeared immediately.

  • outliner : changing the column width by dragging the header leaves the columns as they were, unless you point the mouse over them
  • scrivening : having looked at the files as a scrivening, I switched to corkboard and moved all the files to another folder, then clicked on the empty folder : the scrivening with all the text was still visible (it should be blank or show me the corkboard, as this is a folder, and my settings are to show folders as corkboard). I clicked on corkboard and saw the empty board, then I clicked on Scrivening, and saw all the text again. Then, slowly, the screen display adapted (very slow motion ; it was as if I could see it thinking whether or not to make that display - not a thing my computer usually does)
  • binder : change the label of a file and see no change unless the mouse hovers over the binder

This is not permanent.
For the moment, icon colour changes in the binder display instantly.
Changing the width of a column in the outliner does not display until the mouse pointer is over it.

As for the scrivening display, I can reproduce this : I moved my files again from one folder to another - this time without even having displayed this folder as a scrivening.
Seeing the empty corkboard, I clicked on scrivening button, and there was all the text… in an empty folder…

BTW if you want me to send these reports on 047 by mail, or if you prefer each subject separately, just let me know.

Memory 047
Opening a project sucks memory and does not give it back once the project is closed.
I encountered problems in opening one project : machine begins to sing and nothing happens.
Impossible to shut down the program to stop this : no response (not with “not responding” written on top).
I called up Task Manager to stop it. That’s where I noticed that memory use was rising from 70 MB up to 205 MB, at which point I shut the program down.
Then I checked with different projects : they all pop up immediately, except the one I cannot open ; and they all require a certain amount of memory, which does not drop when the project is closed (it seems to stop climbing at about 50 MB).
(As for the tricky project, it contains more than 200 files, and I think I’ll simply make it all over)

I had an issue with quotation marks as well that were c/p’d from Word instead of imported, with a few ‘extra’ side effects:

The text in question started out as 12-point Arial, and became 11-point Courier New

None of the quotation marks or formatting transferred over (italics and such)

Tabs went to one inch and stayed that way (no idea how to work around it, the original tabs were a half-inch)

Some lines received an extra Tab, whilst others did not.