After exporting Scrivener projects to EPUB format, I see a lot of tags without attributes: a bit over 700 of them in the EPUB file I’ve just exported. For example:
It would be tedious if, for every dialog box mentioned in the documentation, there were a note reading “…or tapClose(orCancel) to close the dialog without saving changes.” Unless the documentation specifies otherwise, assume that you can always tap Cancel or Close to dismiss a dialog without saving your changes.
In another program, a regular-expression search-and-replace routine removed all such tags. Afterward, FlightCheck declared the EPUB to be valid. The above paragraph then appeared as follows:
It would be tedious if, for every dialog box mentioned in the documentation, there were a note reading “…or tap Close (or Cancel) to close the dialog without saving changes.” Unless the documentation specifies otherwise, assume that you can always tap Cancel or Close to dismiss a dialog without saving your changes.
When I open the EPUB in question in an Android e-reader app, the file opens without error and the above paragraph displays as expected.
What is the purpose of the elements without attributes?
I could write Perl scripts, or create a set of batch search-and-replace routines in Sigil, so that the tags without attributes are removed after every export to EPUB. But before I put that kind of time into such a project: Do e-reader apps, for whatever reason, need all these tags without attributes?
I got a reply from tech support. The answer, paraphrased:
There’s a quirk in the HTML exporter that is producing the elements without attributes, and they’re looking into refining it. The attribute-less elements are harmless. Trying to remove them could cause errors in the EPUB’s formatting.
I agree with the latter point, but when one has to look into the HTML/XHTML file itself, the extraneous tagging makes the file much less “human-readable” than it would otherwise be.
So I will have to make either a search/replace routine in Sigil, or a Perl script, that removes at least a subset of the unwanted elements: those whose contents do not contain other HTML elements. The complete job would require either an XSL transform or a script in which one throws some logic at the problem – something considerably more involved than a simple search/replace routine. A naive search/replace applied to nested elements could strip a closing tag that belongs to a different element and leave the file not well-formed. That would be a cure much worse than the disease.
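As a rough sketch of the "safe subset" idea, here is how it might look in Python rather than Perl. The element name `span` is an assumption (the post does not name the tag), and `strip_leaf_elements` is a hypothetical helper, not anything provided by Scrivener or Sigil. The pattern only matches an opening tag with no attributes whose content contains no `<` at all, so nested markup is never touched and the matching close tag is always the element's own.

```python
import re

# Assumption: the attribute-less element is <span>; adjust TAG if it is not.
TAG = "span"


def strip_leaf_elements(xhtml: str, tag: str = TAG) -> str:
    """Unwrap <tag>text</tag> where the opening tag has no attributes
    and the content holds no other elements (no '<' inside)."""
    name = re.escape(tag)
    # "<span>" must match exactly, so <span class="..."> is left alone.
    # [^<]* refuses to cross any other tag, so nested elements are skipped.
    pattern = re.compile(r"<{0}>([^<]*)</{0}>".format(name))
    return pattern.sub(r"\1", xhtml)


if __name__ == "__main__":
    sample = (
        '<p><span>tap </span><span class="b">Close</span>'
        "<span> (or </span><span><em>Cancel</em></span>)</p>"
    )
    print(strip_leaf_elements(sample))
```

Elements with attributes and elements that wrap other elements are left exactly as they were, which is why this covers only part of the job. Handling the nested cases would need a real XML parser or an XSL transform, as noted above.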