Styles documentation

scrive · March 13, 2022, 5:02am

As a hyper-avid fan of Scrivener Styles, I have come to understand the need to document openly the code that is currently residing in my LaTeX (Memoir Book) > Styles Project Format.

Scrivener provides a powerful, structured way to enter the many options it offers to facilitate the writing experience. With well over a hundred custom Styles that I have created to expand on the utility of using Scrivener+LaTeX, I am finding it cumbersome to create a summary of the code I’ve entered for those Styles.

My initial idea to create an open document to store Style codes and data has now expanded to eventually include ALL Project Format user-input data (as near-impossible as that might seem to me right now).

My first objective was to select a system, database or spreadsheet to store the user-defined data. Strangely enough (or perhaps not) I realized that a Scrivener project could provide the structure and flexibility that I’d need to hold the user-defined data, with an exceptional ability to process and output that data in whatever format I will need.

That brings me to my question (sorry that took so long).

With the help of the Forklift and Oxygen XML apps, I have been able to locate the Scrivener .scrivx file, and eventually open styles.xml that appears to contain most, if not all, of the user-defined Styles data in my project.

For reference, I have essentially zero knowledge or experience with XML.

I’ve been able to peek into the styles.xml file using the Oxygen app. I identified the entries I created in my own project’s Styles, along with the Styles that were included in the original General Non-Fiction (LaTeX) template. With zero XML knowledge, however, that is as far as I can go at the moment.

From this point, if anyone has any suggestion as to how I might extract the Styles user-defined data for eventual output as text, I would greatly appreciate it.

I realize that I will need some sort of an XML definition that lays out how the data is structured, which I do not currently have. It would appear, however, that at least part of that structure is defined on the right side of the above screen as Attributes.

Any tidbits of insight to move me along to the next step would be welcome.

scrive

gr · March 13, 2022, 2:13pm

What you are seeing is chunks of RTF code embedded (CDATA) in XML code.

But I sincerely think this is a rabbit hole you should not go down. There is no benefit to you to try to decode this arcane data in order to catalogue your defined styles.

The only information about defined styles that you could be wanting to record are the very attributes you see when you apply the style. Given this, you should be able to construct your style catalogue this way: Make a fresh document and make a list of the names of your custom paragraph styles – one name per paragraph. Open the styles panel. Use it to turn each successive paragraph into the style whose name it bears. Next, go back to each paragraph and add a description of the styling you see there. Of course, the purpose of each style is important in your set up, so you can as well write that down while you are at it. Now you are done. You can repeat this process with another doc for your character styles.

Note that by proceeding this way all of your labor is directly “on task”, and this is how it will effect a savings over how you are currently proposing to pursue the matter.

AmberV · March 13, 2022, 2:52pm

Firstly, if you aren’t a programmer with a decent level of familiarity with using the XML libraries in your language of choice, I cannot imagine that hunting through raw XML by eye is any more efficient than just copying and pasting values out of the Styles compile format pane. Yeah, it’s a lot of tabbing through text fields and clicking through lists, but that’s exactly what you’d be doing in a raw XML file anyway, and at least in Scrivener there is some specialised interface for interacting with the elements.

Something worth noting is that you start with a screenshot of the Styles pane in the compile format designer highlighted, and speak mainly in a way that would extol the virtues of how styles can be compiled functionally—but then move on to examine the project’s styles, which have nothing to do with the former. Project styles, at least what gets stored of them in the file you’re looking at, are purely aesthetics and a name—not very useful to you I don’t think. On the other hand, a compile Format’s styles will be stored in the respective .scrformat file, within the Settings/Compile Formats folder—and that is where you will find functional information.

In other words, the way you are talking about all of this makes me think you’re looking to build a list that looks a bit like this:

Style name	Prefix	Suffix
Citation	\cite{	}

As previously mentioned, the project’s styles matter not one bit at all to this kind of information—save for the fact that if we have citations and want to use them in our project, we would use this reference to create a style called “Citation” in the project and mark cite refs with it.

Project styles: just ways of marking text’s meaning and signifying to us through the use of visual language, which text is meant to do what. There is no execution here, just abstract semantics.
Format styles: all execution. For plain-text markup based writing, this is where styles go from visual markings to real syntax.

Therefore, if a Format does not list a style, a style in a project does not have any functional use beyond whatever utility we get from having the text marked with it, in the writing space. A Format might provide functions, through its style list, that a particular project does not make use of, but they are there if we need them. This decoupling between marking text and executing functions upon marked text is something that makes Scrivener quite unique, and very powerful for writing. But it does mean you have to pay heed to which does what, if you want to go poking into internal files and extracting configuration details from them.
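The split described above can be illustrated with a minimal sketch. It assumes a compile format style can be modelled as nothing more than a prefix and a suffix; `FormatStyle`, `FORMAT_STYLES`, `apply_style` and the cite key are hypothetical names invented for the example, not Scrivener internals.

```ruby
# Hypothetical model: a compile format style is just a prefix/suffix pair.
FormatStyle = Struct.new(:name, :prefix, :suffix)

# The format's style list; only "Citation" is listed here (sample data).
FORMAT_STYLES = {
  "Citation" => FormatStyle.new("Citation", '\cite{', "}")
}

# Wrap text marked with a project style in the matching format style's
# prefix and suffix. Text whose style the format does not list passes
# through unchanged: the marking has no compile-time function.
def apply_style(style_name, text)
  style = FORMAT_STYLES[style_name]
  return text if style.nil?
  "#{style.prefix}#{text}#{style.suffix}"
end

puts apply_style("Citation", "knuth1984")      # style listed in the format
puts apply_style("Aside", "just marked text")  # style not listed
```

The project style supplies only the name used for the lookup; everything that ends up in the compiled output comes from the format's entry.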

kewms · March 13, 2022, 4:44pm

Once this catalog is created, what do you want to do with it?

I think I may have suggested something like this in another thread, but what I had in mind was a simple self-documenting list of the kind you see for lists of fonts:

This is bold text.
This is italic text.

This is a line of quoted text,

And so on.

scrive · March 14, 2022, 1:19am

Hi kewms, AmberV and gr,

Thank you all for your comments.

My better, rational self agrees with most, if not all, of what you have said. Then there is the other side … a few points:

Scrivener has helped countless writers reach their goal.
I’m cognizant of the adage: A bird in the hand is worth…
Scrivener+LaTeX is a very minor part of the existing Scrivener market.
Development time and coding skills are a very unique and limited resource.
Allocation of such resources is a key challenge for any ‘going concern’.

While there is NO doubt that I am certainly under water here, there are personal factors, such as my slowly declining mental memory state (call it what you will) coupled with a desire, like almost all writers, to communicate a message, that drive me into uncomfortable areas where I have little if any experience or knowledge.

Which brings me to Scrivener Styles.

Writing with Scrivener+LaTeX (S&L) has been so far beyond my expectations. Working with S&L, coupled with the prospect of doing without has revealed to me the immense scope of delivering my message. Without S&L, I would still have been “in the weeds”. The pair are indispensable, more so with Scrivener Styles+LaTeX (SS&L). With my limited resources, if I am to succeed with my goal to communicate, I have few, if any, options.

Recognizing the limitations that exist for Scrivener and myself, I must first realize how central Scrivener Styles are toward reaching that goal. I therefore need to pursue extending my Scrivener Styles toward an open format, as awkward and clumsy as that may be.

It may take me another two years, or more (if my memory lasts that long) to complete such an extension.

I understand that hurdles stand in my way, many of which I am completely oblivious to and ignorant of. Thank you for outlining some of those obstacles. They will likely slow and may effectively halt any progress toward my goal. So be it.

I see many metaphors for the possibilities that exist for Scrivener. To share them would be preaching to the choir. (Actually, it may be more akin to a member of the choir preaching to the Preacher.)

In many ways, I’ve stumbled into having a unique view of a small slice of how powerful Scrivener is and can be. I’ve been on unique journeys for most of my life. Often I’ve failed, sometimes not.

If I can nudge the needle ever so slightly toward my goal, it will ALL have been worth it.

Thank you again for your comments,
scrive

nontroppo · March 14, 2022, 11:28am

I use Visual Studio Code for almost everything at the moment; it has a bunch of XML plugins (I use this one: XML Tools - Visual Studio Marketplace).

But if what you want to do is “extract” information from XML automatically, then you need to think about a scripting language to do so (an XML editor only visualises the code). I’m most familiar with the Ruby language (currently built-in to macOS), which has a nice and simple syntax and a great XML library called nokogiri (for searching nodes in XML, look at something called XPath, which is a way to get content from XML):

The Scrivener XML is fairly simple but if you’ve never written a script before, that in itself is a hurdle to climb (others may recommend Python). Anyway a script will be able to walk the XML tree and be able to extract out the relevant bits, and then spit that out into another document.

By the way, if you want to roughly learn a language, choose one that has the option for step-debugging. Step-debugging allows you to stop at any line of the code and examine the variables and other parameters as you go line by line; I find it an excellent way to learn how a language works. Ruby and Python both have great step-debugging built in to VS Code for example…

nontroppo · March 14, 2022, 11:36am

By the way, you are looking at the wrong file! styles.xml is where the editor styles are defined; this is just what you see in the editor. What you are interested in is the style transformations that are part of the compile format. You need to find the Project.scriv/Settings/Compile Formats/ folder and open the .scrformat file you normally use in the compiler…
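For anyone who would rather list those files from a script than browse for them, here is a minimal sketch. It assumes the compile formats sit directly inside the Settings/Compile Formats folder named above; `compile_formats` is a made-up helper name and the project path is a placeholder.

```ruby
# List the compile format files stored inside a Scrivener project.
# project_path is a placeholder; the "Settings/Compile Formats" location
# is the one named in this thread.
def compile_formats(project_path)
  dir = File.join(project_path, "Settings", "Compile Formats")
  return [] unless Dir.exist?(dir)

  # Keep only entries ending in .scrformat and return their full paths.
  Dir.children(dir)
     .select { |entry| entry.end_with?(".scrformat") }
     .sort
     .map { |entry| File.join(dir, entry) }
end

# Prints nothing if the folder does not exist.
compile_formats("/path/to/Project.scriv").each { |path| puts path }
```

Formats saved globally rather than in the project are stored elsewhere and would not show up in this list.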

Offtopic, but Forklift is a totally awesome file manager for macOS, good taste in Finder replacements!

nontroppo · March 14, 2022, 1:00pm

Here is a starter for you: this Ruby script (extract.rb) will load a Scrivener compile format XML file, find all styles and print out their names:

require 'nokogiri'

doc = Nokogiri::XML(File.open("/Users/ian/Scrivomatic.scrformat"))
styles = doc.xpath('//CompileFormat/Styles/Style')

puts "# Compile Format Summary\n\n"
puts "There are #{styles.length} styles in this compile format\n\n"

styles.each_with_index { |element, index|
	# Interpolation tolerates a missing Name or Type attribute (nil), where string + nil would raise
	puts "* Style #{index} is named: #{element['Name']} of type #{element['Type']}"
}

This generates something like:

$ ruby extract.rb
# Compile Format Summary

There are 24 styles in this compile format

* Style 0 is named: Attribution of type Para
* Style 1 is named: Bibliography of type Para+Char
* Style 2 is named: Block Quote of type Para+Char
* Style 3 is named: Meta-data of type Para+Char
* Style 4 is named: Code Span of type Char
* Style 5 is named: Emphasis of type Char
* Style 6 is named: Strong of type Char
* Style 7 is named: Subscript of type Char
* Style 8 is named: Superscript of type Char
* Style 9 is named: Strong Emphasis of type Char
* Style 10 is named: Caption of type Para+Char
* Style 11 is named: Code Block of type Para+Char
* Style 12 is named: Heading 1 of type Para+Char
* Style 13 is named: Heading 2 of type Para+Char
* Style 14 is named: Heading 3 of type Para+Char
* Style 15 is named: Maths Block of type Para+Char
* Style 16 is named: Maths Inline of type Char
* Style 17 is named: Raw HTML of type Para+Char
* Style 18 is named: Raw LaTeX of type Para+Char
* Style 19 is named: Small Caps of type Char
* Style 20 is named: Ruby Code of type Para+Char
* Style 21 is named: InfoBox of type Para+Char
* Style 22 is named: WarningBox of type Para+Char
* Style 23 is named: Underline of type Char

You’ll need to change the file name in the code. You’ll see this uses xpath //CompileFormat/Styles/Style to find the style nodes and then you have an array of all styles. You can then loop through each item in the list. You can extract out the Prefix and Suffix this way too…

scrive · March 15, 2022, 12:35am

Hi nontroppo,

Thank you for these gems (pun intended).

To everyone else who responded to my post, please be patient as I am thoroughly overwhelmed (and back ‘in the weeds’) at the moment (and likely to be for some time - as is most of the world!).

Thank you,
scrive

nontroppo · March 15, 2022, 9:09am

Sorry if I contributed to your trip to the weeds… Abstract languages like those that can be used to automate our computing needs can appear like magic to the uninitiated. Even for those who know how to code, different paradigms still appear as magic. When I read code that uses the pure functional paradigm (Haskell for example), even though I can code, it makes me feel like learning Klingon would be easier!

Here are a few more lines added to expand the scope of this script:

require 'nokogiri'

doc = Nokogiri::XML(File.open("/Users/ian/Scrivomatic.scrformat"))
styles = doc.xpath('//CompileFormat/Styles/Style')

puts "# Compile Format Style Summary\n\n"
puts "There are #{styles.length} styles in this compile format.\n\n## List of styles:\n\n"

styles.each_with_index { |element, index|
	puts "* Style #{index.to_s}: **_#{element['Name']}_** of type **#{element['Type']}**"
	element.children.each {|child| 
		next if child.name == "text"
		puts "	- Raw Markup: " + child.content if child.name == "RawMarkup"
		puts "	- Prefix: `" + child.content + "`" if child.name == "Prefix" && !child.blank?
		puts "	- Suffix: `" + child.content + "`" if child.name == "Suffix" && !child.blank?
		puts "	- Paragraph Prefix: " + child.content if child.name == "ParaPrefix" && !child.blank?
		puts "	- Paragraph Suffix: " + child.content if child.name == "ParaSuffix" && !child.blank?
	}
}

I run this script through Pandoc using ruby extract.rb | pandoc -o extract.pdf and get this PDF out:

extract.pdf.zip (126.5 KB)

This lists for each style whether it is Raw, along with all prefix and suffix values.
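If a table like the Style name / Prefix / Suffix one earlier in the thread is preferred over a bullet list, the extracted values can also be written out as tab-separated rows that paste into a spreadsheet. This is only a sketch: `style_rows` is a hypothetical helper, and the sample hashes (including the LaTeX in them) are invented data rather than values read from a real format.

```ruby
# Turn extracted style values into tab-separated rows (name, prefix, suffix).
# The input hashes are hypothetical sample data, not read from a real format.
def style_rows(styles)
  header = ["Style name", "Prefix", "Suffix"]
  rows = styles.map do |s|
    [s[:name], s[:prefix], s[:suffix]].map do |value|
      # Show newlines and tabs literally so each style stays on one line;
      # nil (a missing prefix or suffix) becomes an empty cell.
      value.to_s.gsub("\n") { "\\n" }.gsub("\t") { "\\t" }
    end
  end
  ([header] + rows).map { |cols| cols.join("\t") }
end

sample = [
  { name: "Citation", prefix: '\cite{', suffix: "}" },
  { name: "Block Quote", prefix: "\\begin{quote}\n", suffix: "\n\\end{quote}" }
]
puts style_rows(sample)
```

Redirecting the output to a file (for example `ruby rows.rb > styles.tsv`, with rows.rb as a placeholder script name) gives something most spreadsheet apps can open.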

scrive · March 15, 2022, 10:38pm

Hi nontroppo,

You have absolutely NOTHING to apologize for. Just the opposite. If there are any apologies to be made, it is I for making such unusual requests.

I am ‘in the weeds’ because of the appeals I am making to understand enough of XML to be able to extract my user-input data.

More later … and thank you!

scrive

P.S. Still trying to install Nokogiri … experiencing file permissions issues! My apologies that this is going soo slowly …

nontroppo · March 16, 2022, 9:53am

If you are using the Ruby that comes with macOS, then you need to use sudo gem install nokogiri to install it. You may also need to first install the macOS command-line tools using xcode-select --install if you haven’t already done so…

EDIT: there may be a bug in macOS where on Intel machines gems get installed as ARM binaries; at least, that happened on my Intel MBP when I just tried to reinstall nokogiri. This is a known problem since the M1 came out and Apple doesn’t seem to have done much to fix it. If that is the problem you face you can try sudo /usr/bin/gem install nokogiri --platform x86_64-darwin…

A slight tweak that visualises the newlines and also fixes printing of the ` character in the output (which at least for markdown is used for inline code):

require 'nokogiri'

doc = Nokogiri::XML(File.open("/path/to/My.scrformat"))
styles = doc.xpath('//CompileFormat/Styles/Style')

puts "# Compile Format Style Summary\n\n"
puts "There are #{styles.length} styles in this compile format.\n\n## List of styles:\n\n"

styles.each_with_index { |element, index|
	puts "* Style #{index.to_s}: **_#{element['Name']}_** of type **#{element['Type']}**"
	element.children.each { |child| 
		next if child.name == "text" or child.blank?
		content = child.content.gsub("\n","\\n") #make newlines visible
		puts "	- Raw Markup: #{content}"		if child.name == "RawMarkup"
		puts "	- Prefix: `` #{content} ``"		if child.name == "Prefix"
		puts "	- Suffix: `` #{content} ``"		if child.name == "Suffix"
		puts "	- ¶ Prefix: `` #{content} ``"	if child.name == "ParaPrefix"
		puts "	- ¶ Suffix: `` #{content} ``"	if child.name == "ParaSuffix"
	}
}
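If installing nokogiri keeps failing, one possible fallback is REXML, which ships with Ruby (as standard library in older versions such as the macOS system Ruby, and as a bundled gem from Ruby 3.0), so no gem install should be needed. The sketch below swaps REXML in for nokogiri and reads the same CompileFormat/Styles/Style nodes. The inline `SAMPLE` XML is a made-up stand-in that mimics only the elements and attributes mentioned in this thread, and `summarize` is a hypothetical helper name.

```ruby
require 'rexml/document'

# Hypothetical sample mimicking only the parts of a .scrformat file used in
# this thread; a real file contains much more.
SAMPLE = <<~'XML'
  <CompileFormat>
    <Styles>
      <Style Name="Citation" Type="Char">
        <Prefix>\cite{</Prefix>
        <Suffix>}</Suffix>
      </Style>
      <Style Name="Block Quote" Type="Para+Char"/>
    </Styles>
  </CompileFormat>
XML

# Return one hash per Style node; prefix/suffix are nil when absent.
def summarize(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//CompileFormat/Styles/Style").map do |style|
    {
      name:   style.attributes["Name"],
      type:   style.attributes["Type"],
      prefix: style.elements["Prefix"]&.text,
      suffix: style.elements["Suffix"]&.text
    }
  end
end

summarize(SAMPLE).each do |s|
  puts "* #{s[:name]} (#{s[:type]}): prefix=#{s[:prefix].inspect} suffix=#{s[:suffix].inspect}"
end
```

To try it on a real format, pass the file contents instead of the sample, for example `summarize(File.read("/path/to/My.scrformat"))`.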

nontroppo · March 16, 2022, 6:31pm

As thoughtfully suggested by @gr, I would also be very happy to generate the PDF for you from your Scrivener Compile Format file if you are willing to share it. It can be exported from the Compile Designer (see §23.2.5 of the User Manual), and you can send it privately if you need. What info do you want, and how do you want it formatted? That way you don’t need to wrestle with Ruby.

scrive · March 17, 2022, 1:03am

Hi nontroppo,

Thank you for all your help and assistance. There are so many gems you have sent my way.

Your expertise is of great value to me, and I need to be careful with the bandwidth of support you have offered.

I’ll need to spend time focusing on your communications re XML to ensure that I am not doing anything that could result in anyone spinning their wheels.

Please allow me a chance to review all that you have offered, before I get back to you.

Thank you for your help,
scrive

nontroppo · March 17, 2022, 9:27am

No worries, take your time, and remember not all struggles are worth struggling with