Problem with Traditional Chinese Characters

I use Scrivener for my translation work (from English to Chinese or German to Chinese).

The problem is that after I type my work in Chinese and reopen the file some time later, some Chinese characters have been changed to strange codes or nonsense characters.
Please help me solve this problem, or I will have to consider switching to another application.
Thank you.

Have you disabled any fonts since last opening the document? Strange characters would indicate that the correct font for displaying the characters is perhaps not available. Perhaps you could post a screenshot?

What version of the system are you using? Which Chinese font(s)? And what entry system? I use simplified Chinese — though I’m generally reading files in or containing Chinese produced by others in Word — and don’t have trouble, especially since 1.5.4 has solved the “invisible characters” problem.

Mark

As the highlighted text shows, the original characters were changed to nonsense Chinese characters, some even with English letters mixed in.

This problem has actually existed for a long time, from OS X 10.5 to 10.6.2 and from Scrivener 1.5.1 to 1.5.3.

Have you updated to 1.54? Try that first if not. I really have no clue about this, sorry, as I just use the OS X text system. Do these files load correctly in TextEdit? (As I say, if you haven’t already updated to 1.54, try that before answering.)

All the best,
Keith

Keith, I don’t know if this will help in this case, but I’ve just had something from Nisus, as follows:

It has always seemed to me that the fallback font is actually “Hiragino Mincho”, a Japanese Kanji font, not a genuine Chinese font. Just thought it might be of interest.

Mark

I have exactly the same problem. I’m using version 1.54 + OS 10.6.4.

It seems that mixing English and Chinese text causes the weird characters. Chinese text typed somewhere after the English text can be mangled.

Using different fonts doesn’t change the result, and sometimes makes it worse, producing more weird characters.

My guess is that somehow the 1-byte roman characters and the 2-byte Chinese characters are not “treated” separately: each 2-byte character is treated as a combination of two 1-byte characters, which mangles the text.

This is a serious problem for us Chinese writers. I have been forced to stop using Scrivener because of this bug. Your help would be appreciated.
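To make the guess above concrete, here is a minimal Python sketch of what it looks like when 2-byte Chinese characters are read as pairs of 1-byte characters. It is only an illustration of the general effect, not a reproduction of whatever the OS X text system does. The choice of Big5 as the 2-byte encoding, the sample string, and the helper name `split_into_single_bytes` are all assumptions made for the example.

```python
def split_into_single_bytes(text, encoding="big5"):
    """Encode text in a legacy Chinese encoding, then misread every byte
    as its own 1-byte character (Latin-1 maps each byte to one character)."""
    raw = text.encode(encoding)
    return raw, raw.decode("latin-1")


if __name__ == "__main__":
    # One 1-byte roman letter followed by three 2-byte Chinese characters.
    sample = "a中文字"
    raw, mangled = split_into_single_bytes(sample)
    print(len(sample), "characters became", len(raw), "bytes")
    # Each Chinese character now shows up as two unrelated 1-byte characters.
    print(repr(mangled))
    # Reading the same bytes with the right encoding recovers the text.
    print(raw.decode("big5") == sample)
```

The roman letter survives unchanged while every Chinese character turns into two unrelated symbols, which is roughly the kind of output described in this thread.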

I have been using Scrivener from just before the release of v. 1, am now on v 1.54, and have used it on all system versions current during that time, so probably starting with 10.4.x, through 10.5 and now on 10.6.4, and most of my work is in mixed Chinese — albeit simplified — and English, and I have never had this problem once.
So, as I said before, it sounds more like something to do with the font(s) you are using, or perhaps the input system. Can you tell us more, as without that kind of information it is difficult to work out what is going on. I don’t think either Keith or Ioa are Chinese users, but I will help to the best of my ability.
So, for instance, in my preferences, I have my text-editing font set to Baskerville 13pt — I prefer Optima, but I have had problems with that with certain accented glyphs, if I remember rightly, so I changed — which when I switch into Chinese uses 花纹宋体 13 pt as the alternate font. Almost all my texts are sent to Chinese colleagues on Windows boxes, so I generally export using Times Roman 12 point as Word for Windows Chinese version reads that as Times New Roman; the Chinese is still 花纹宋体 which Word reads as 宋.
I have a Revision 1 MacBook Pro and a Revision 1 MacBook Air, both running Scrivener 1.54 and 10.6.4. For my Chinese input system I use IMKQIM which is infinitely superior to the Apple input systems.
As I say, I personally have had no problems whatsoever of the kind you describe, so please tell us more.

HTH

Mark

Dear Mark,

Thank you for helping. I’m using Yahoo! Keykey input method, but I doubt if it has anything to do with the problem since I could recreate this bug simply by copy and paste.

Here is a brief test, with the font set to Heiti TC 20 pt. I made a paragraph with 251 Chinese characters (it seems this bug only appears when a single paragraph contains more than 250 Chinese characters):

Then I added the letter “a” (any 1-byte roman character will do) at the beginning of the paragraph:

I saved it, closed the file, and opened it again; the end of the paragraph had become mangled:

I also noticed that putting the roman character anywhere between the 1st and the 250th character will result in the same mangled text. Using different fonts did not change the result.

This bug also appears while using Simplified Chinese fonts such as Heiti SC.

Again, thank you for noticing this issue.
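For anyone who wants to repeat the test above without counting characters by hand, a small Python sketch like the following can generate the paragraph. The function name `make_test_paragraph`, its parameters, and the seed text are hypothetical choices for illustration; only the numbers (251 Chinese characters, one roman character somewhere in the first 250 positions) come from the test described above.

```python
from itertools import cycle, islice


def make_test_paragraph(n_chinese=251, roman="a", position=0, seed_text="繁體中文測試"):
    """Build one paragraph of n_chinese Chinese characters with a single
    roman character inserted at the given position (0 = very beginning)."""
    if not 0 <= position <= n_chinese:
        raise ValueError("position must lie within the paragraph")
    # Repeat the seed text until exactly n_chinese characters are produced.
    chinese = "".join(islice(cycle(seed_text), n_chinese))
    return chinese[:position] + roman + chinese[position:]


if __name__ == "__main__":
    paragraph = make_test_paragraph()
    print(len(paragraph))  # 251 Chinese characters plus the roman one
    print(paragraph)
```

The printed paragraph can then be pasted into a document, saved, closed, and reopened to see whether the end of it is mangled.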

And here’s what happened using Heiti SC 20 pt:

Thank you, Honeypie, I’ll see if I can reproduce it. I don’t know anything about the Yahoo Keykey input method, so excuse me if I immediately suspect that. The fact that the threshold is around 250 characters makes me suspect that it has a block limit of 256 characters or thereabouts, and that it is putting in some code at the beginning and end of each block which is getting confused with the hex code for the characters.

Questions:
Does this happen with TextEdit?
Does this happen if you switch to the Apple input system with Traditional Chinese? (I’ve never had it with Simplified Chinese, though I’m not sure if I have used the Apple system while I’ve been using Scrivener.)
Does it also happen with Lihei Pro?

There is another possibility that it is to do with your font. I don’t have Heiti TC or Heiti SC on my system (I use 花纹黑体 — it may possibly appear as ST Heiti on your system — if I need a sans-serif equivalent). Is that a font you have bought in separately, or is it one that you have imported from previous Mac-OS 9.x? One of the things that gave me problems with early versions of OS-X was the non-availability of Song, which was standard under Mac-OS 8-9. My understanding is that the problem was that its PostScript was faulty and so had to be dropped as OS-X is dependent on PostScript for its screen and print display, that made it unusable. If Heiti TC is originally a Mac-OS 9 font, the same may be true there.

As I say, I have never had any problem with Scrivener and Chinese, so it seems to me that the problem must lie in either your input system or the fonts you say you are using, as those are the points of difference with my set-up. I guess the only other question is, do you have any other third-party haxies or other system-level extensions running that could be creating this as a curious knock-on effect?

Let me know further as to the above. If you could send me a zipped up small Scrivener project with a paragraph or two of your TC and SC text, that would make it easier for me to try things out. Send me a personal message, and I’ll let you know my email address.

:slight_smile:

Mark

Two other points have come to mind:

  1. Are Heiti TC and Heiti SC fully UTF8 compatible? If not, that might be at issue.
  2. Have you tried turning off typographer’s quotes in ? It seems that with a number of non-Latin font sets or perhaps non-British/US/International keyboards they can cause conflicts. I don’t have problems even with typographer’s quotes turned on, but then I’m using IMKQIM and it’s possible there might be a conflict with Yahoo Keykey.

Mark

Dear Mark,

Thank you for your patience.

Yes, it does happen in TextEdit! I didn’t notice it before. So it seems to be a “system-level” problem. The weird text appears only in RTF format, though, not in plain text.

I’ve also tried it with Pages, and couldn’t recreate the bug.

Yes, I switched to the Apple input system and it’s still the same. So I guess the input method is not to be blamed.

Yes, it also happens with Lihei Pro.

Actually Heiti TC is the default system font in Snow Leopard, which caused quite a fuss among TC users here since it’s obviously less than perfect. But I think UTF8 compatibility should not be an issue here.

Yes, I tried turning typographer’s quotes off, but that’s not the cause either.

I can’t think of any extension which might create such an issue. I tried safe boot mode (holding down the Shift key while booting) to turn off all the extensions, and the problem is still there.

I also used Font Book to disable all the duplicate fonts and removed those with problems, to no avail.

Since it seems to be a system-level issue, I think Scrivener is not to be blamed. But I would certainly appreciate it if you could help me locate the bug. Again, thank you so much for helping.
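One possible reason the weird text shows up only in RTF and not in plain text is that RTF is a 7-bit format, so every Chinese character has to be rewritten as an escape sequence when the file is saved and turned back into a character when it is opened. The Python sketch below shows the `\uN?` style of escape defined in the RTF specification. It is an illustration of the format only: the helper name `rtf_escape` is hypothetical, and I am not claiming this is how the OS X text system actually writes its RTF.

```python
def rtf_escape(text):
    """Escape text for an RTF body using \\uN? Unicode escapes.

    N is the UTF-16 code unit written as a signed 16-bit decimal number,
    and the trailing '?' is the fallback shown by readers without Unicode support.
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            # These three characters are RTF syntax and need a backslash.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Characters above U+FFFF become two UTF-16 code units.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                signed = unit - 0x10000 if unit > 0x7FFF else unit
                out.append(f"\\u{signed}?")
    return "".join(out)


if __name__ == "__main__":
    print(rtf_escape("a中文"))
```

A plain-text file stores the characters directly, so it never goes through this extra encode and decode step, which would fit with the bug being invisible there.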

Update:

I’ve experimented with different fonts, and here is the result so far:

Almost all the Traditional/Simplified Chinese fonts resulted in mangled text, with only two exceptions:

微軟正黑體
王漢宗中黑體

which did not have the problem.

Switching to Roman fonts (using the same Traditional Chinese text) solves the problem, and using Japanese or Korean fonts seems to solve it too. I’ve tried:

Apple Gothic
Apple Myungjo
Kozuka Gothic Pro
Kozuka Mincho Pro
Hiragino Kaku
Hiragino Maru
MS Gothic
MS Mincho
PC Myungjo

and everything’s fine.

OK, here we run into a difficulty. I guess you are running the Chinese version of the operating system, whereas although my computers were bought in Xiamen, I’m running the British English version of the operating system, so your Chinese font names are in Chinese whereas mine are in Roman. I do not have the two fonts you mention, 微軟正黑體 and 王漢宗中黑體 under those names.

Interestingly, I don’t use FontBook, but Linotype FontExplorer X — originally it was a free app, I’m not sure now, as I haven’t upgraded for a long time — and Heiti TC and Heiti SC appear on that as alternate faces of ST Heiti Light and ST Heiti Regular (花纹黑体).

Anyway, I’ve copied a Chinese text — originally imported from Chinese WinWord, admittedly — into Nisus Writer Pro — gives you a character count — changed the font to Heiti TC, made sure I had a “paragraph” 250 characters and then added some English to the beginning … no problem.

I then opened a new document, set it to Chinese text, character Heiti TC, though my input system uses Simplified Chinese, typed a Roman ‘a’, then typed a Chinese string 我是英国人 enough times to take the paragraph count to 256 characters and then put in a paragraph break. No problem at all.

I then went back to the beginning of the paragraph and typed in more roman characters — abracadabra — and no corruption anywhere; leaving those in place, I inserted more Chinese — 好奇怪 — in the middle of the paragraph … no problems at all.

So, I don’t know what can be going wrong with your set-up or Birdy’s, other than possibly corrupt fonts or corrupt font-cache — someone more computer-savvy than me can advise you better how to deal with that.

By the way, Scrivener and Nisus Writer Pro both use the same text engine released by Apple, of which TextEdit is the vanilla showcase; on the other hand, Apple use a proprietary text engine of their own for Pages, which they aren’t releasing to developers. So it’s not surprising if Pages behaves differently to Scriv and TextEdit. See many, many threads which mention both these matters.
:slight_smile:

Mark

Dear Mark,

Actually I’m using the English version of Snow Leopard. I’ve tried FontExplorer X (which is shareware now), removed all the non-system fonts, cleaned the font cache, and even re-installed Snow Leopard, to no avail.

Needless to say I’m very frustrated. I want to reset all my font-related preferences back to the factory settings; obviously what I did was not enough. :cry:

I would certainly be grateful if you could let me know how to “reset” my fonts back to “factory settings”.

Update:

I’ve discussed this issue with other Snow Leopard users, including Japanese users, and they could all recreate this bug under 10.6.4. It seems to be a system bug that appeared after updating to some version of Snow Leopard.

So I guess it’s time for Apple to take over…

Sorry it’s taken time for me to get back to you; I’ve been away for a few days.
I’m afraid I don’t know enough about re-setting fonts to factory setting, other than through doing a clean re-install. I’ll see if I can prod Ioa or Jaysen into advising.
Interesting, I guess I’m a lucky one who updated to a version of 10.6.4 before the bug came in. Either that, or else …
How did you update? Did you all just click “yes” in the Software Update window when it advised you that the update was available, and just let it download and install the easy way? I never do that. I always download the full “combo” version of the update so that I have it in my software installer archive, ready in case I ever have to do a full system re-install and can then take it up to the most recent version without hassle.
I have a vague memory that some other time in the 10.0.0 to 10.6.4 process there was another instance of a bug being introduced following a normal system .x to .y update, for which the cure was to re-update using the combo updater … or something like that.
Anyway, at least you have managed to locate the source of your problem, which has to be considered a step forward, even though the issue is not properly resolved.
Mark