Help with Styles

v. 3.0.2

I have a very long document and would like to fix the formatting within Scrivener. So far, compiling doesn’t fix the problem.
The formatting, brought over from an OCR’d document, is a mess. There are carriage returns in the middle of the page, etc. I thought I could take care of these through compiling, but no go. I updated the “No Style” but it doesn’t update throughout the document. I can’t do a “Select All” because there are other styles throughout the document.
The line which starts with “one two three” is the style I want.


Or, how could I fix this in compile?
When I go to compile, I can’t find an option that lets me change indents and the length of lines. I can’t find a way to change, “how the text appears in the editor.”
[Attachment: Screen Shot 2018-06-17 at 10.07.37.png]

Here is the second graphic, which for some reason didn’t attach properly.

Here’s what I would do:

Split the document so that the OCR-mangled sections are in their own sub-documents. (Documents -> Split.)

Set the formatting you want as the project default. (Project -> Project Settings -> Formatting)

Select the OCR-mangled sub-documents, and use the Documents -> Convert -> Text to Default Formatting command.

Neither this process nor the Compile command will delete carriage returns or tabs, though. You may find the Edit -> Text Tidying -> Remove Empty Lines between Paragraphs and Edit -> Text Tidying -> Strip Leading Tabs commands helpful. I’d also use the View -> Text Editing -> Show Invisibles command, which makes paragraph breaks, tabs, and other whitespace characters visible.

Compile will not change how text appears in the editor. It’s not supposed to. To change how text appears in your output document, see Section 24.2.3 of the Scrivener manual.

Katherine

Thank you.

If you turn on invisibles (View > Text Editing > Show Invisibles), you will be able to see whether the OCR engine has used line (⮐) or paragraph (¶) markers in its attempt to replicate the layout of the source material.

Usually, these markers will come after a space, but not always.

Rather than replacing each marker individually, you can use search and replace to speed things up. Search for the relevant type of marker (using OPTION RETURN or OPTION COMMAND RETURN in the search box) and replace it with a space. If this leads to words being separated by two spaces, you can then search for two spaces and replace them with one. If you usually use two spaces after a full stop, you can then search for full stop [space] and replace with full stop [space] [space]. And so on.
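
If you would rather clean the text outside Scrivener before importing it, the same sequence can be sketched in Python. This is only a rough sketch, not anything Scrivener itself provides: the function name is made up, and it assumes the line marker is the Unicode line separator (U+2028) and the paragraph marker is a plain newline, which may not match what your OCR engine produced.

```python
import re

# Assumptions for illustration only: check what your text really contains.
LINE_BREAK = "\u2028"  # assumed soft line-break character
PARAGRAPH = "\n"       # assumed paragraph character


def replace_marker_with_space(text, marker, double_space_after_stop=False):
    """Replace every marker with a space, then tidy up the spacing."""
    # 1. Swap each marker for a single space.
    text = text.replace(marker, " ")
    # 2. Collapse any run of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    # 3. Optionally restore two spaces after a full stop.
    if double_space_after_stop:
        text = text.replace(". ", ".  ")
    return text


sample = "one \u2028two\u2028three. four"
print(replace_marker_with_space(sample, LINE_BREAK))
```

Note that step 2 collapses whole runs of spaces in one pass, whereas a plain "two spaces to one" replace may need repeating.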

You will still need to proof the text afterwards, but if you have got vast blocks of OCR’d output, running logical search and replace functions on the text can help to speed things up considerably. If needed, you can also use S&R to replace end-of-line mid-word hyphens that are no longer appropriate to the new layout: you sometimes see words such as hiberna-tion where an OCR engine has copied the hyphenation pattern of the source text.
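
As a rough illustration of the hyphen clean-up, here is a small Python sketch. It assumes the text is processed outside Scrivener, that the hyphen still sits directly before a newline (so it has to run before the line ends are joined), and the function name is hypothetical. It only touches a hyphen between two lowercase letters, but it will still wrongly join a genuine compound that happened to break at a line end, so proofing remains necessary.

```python
import re


def rejoin_hyphenated_words(text):
    """Remove a hyphen plus newline that splits a word across two lines."""
    # Match "-\n" only when a lowercase letter sits on both sides,
    # e.g. "hiberna-\ntion" becomes "hibernation".
    return re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)


print(rejoin_hyphenated_words("hiberna-\ntion is well-known"))
```

Hyphens in the middle of a line, such as the one in "well-known", are left alone.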

Before doing anything like this, it can be helpful to duplicate or snapshot the file so that there is an escape route to take if things don’t go to plan and CMD Z doesn’t take you back to Kansas.

[Attachment: OCR.jpg]

With OCR’d text with two carriage returns between paragraphs, and CRs at the end of every line, I use S&R as follows:

  1. Enter ‘Space’ CR in the search field, and just ‘Space’ in the replace field;
  2. Replace All;
    All the newlines within paragraphs should have been replaced with just a space, but the double CRs between paragraphs should remain;
  3. Enter CRCR in the Search field and a single CR in the Replace field;
  4. Replace All.
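
The four steps above can be sketched in Python for anyone cleaning the text before it goes into Scrivener. The function name is made up for the example, and it assumes a CR is a plain newline character and that every line inside a paragraph really does end with a space before the CR.

```python
def unwrap_lines_with_trailing_space(text):
    """Join wrapped lines, keeping one newline between paragraphs."""
    # Steps 1-2: 'Space' CR becomes just 'Space', which joins lines
    # inside a paragraph. Paragraph ends have no trailing space, so the
    # double CRs between paragraphs are untouched.
    text = text.replace(" \n", " ")
    # Steps 3-4: each double CR becomes a single CR.
    text = text.replace("\n\n", "\n")
    return text


ocr_text = "one two \nthree four \nfive.\n\nNext para \nhere.\n"
print(unwrap_lines_with_trailing_space(ocr_text))
```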

If you have to deal with text that doesn’t have the space before the CRs, then:

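  1. Enter CRCR in the Search field and a character that won’t appear in the actual text in the Replace field … I usually use @ but YMMV;
  2. Click Replace All;
  3. Enter CR in the Search field and ‘Space’ in the Replace field, then click Replace All, so that the remaining CRs at the ends of lines become spaces;
  4. Now enter @ (or whatever you’ve used) in the Search field and CR in the Replace field;
  5. Click Replace All.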
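
The placeholder trick can be sketched in Python as well. This is only an illustration under a few assumptions: a CR is a plain newline, the function name is hypothetical, and the remaining single CRs are turned into spaces while the paragraph breaks are parked as the placeholder. The sketch refuses to run if the placeholder already occurs in the text, since that would create false paragraph breaks.

```python
def unwrap_lines_with_placeholder(text, placeholder="@"):
    """Join wrapped lines when there is no space before each line end."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    # Park the paragraph breaks (double CR) as the placeholder.
    text = text.replace("\n\n", placeholder)
    # Any CR left over is a mid-paragraph line end: make it a space.
    text = text.replace("\n", " ")
    # Bring the paragraph breaks back as single CRs.
    text = text.replace(placeholder, "\n")
    return text


ocr_text = "one two\nthree four\nfive.\n\nNext para\nhere."
print(unwrap_lines_with_placeholder(ocr_text))
```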

HTH :)

Mark

EDIT

Oh, select text, then:

Edit > Text Tidying > Replace Multiple Spaces with Single Spaces.