The True And Potent SECRETS of SoFurry Story Formatting

Story by Onyx Tao on SoFurry

The True And Potent Secrets of SoFurry Story Formatting; a METHOD of cleanly converting WORD PROCESSOR DOCUMENTS of All Sorts and Descriptions into Gorgeous and Compatible SoFurry Journals and Text Story Entries. As Practiced By That Celebrated and Renowned SoFurry Author ONYX TAO: Being a Faithful and Honest Rendering Thereof


The True And Potent Secrets of SoFurry Story Formatting

A Method of Cleanly Converting

Word Processor Documents

of All Sorts and Descriptions

Into Gorgeous and COMPATIBLE SoFurry

Journals and Text Story Entries

As Practiced by
That Celebrated and Renowned SoFurry Author

Onyx Tao

Being a Faithful and Honest Rendering Thereof

The Background

SoFurry's story acceptance submission area is a small and fairly simple formatting text tool that does a reasonable job of copy-paste-convert. Having said that, it has some flaws.

A webform timeout can erase all the work and effort put into writing a journal entry or story.

  1. Writing a long story or journal directly in the webform risks losing text & formatting
  2. Spending too long editing within that webform risks losing all editing work

The small text box means a small visual window into the text being edited. Even when the box expands itself (as it does for certain refreshes), the scroll bar is not at all convenient.

For reasons that are perhaps interesting (and perhaps not), I use a traditional word processor to write my stories (LibreOffice Writer, part of the LibreOffice suite available at www.libreoffice.org by direct download and torrent). It is far from an ideal tool, but until I get around to finding or creating a better one, it will do. My stories are composed in LibreOffice files.

I also occasionally compose directly in HTML. Again, for reasons, instead of one of the extremely fine HTML editors, I use the HTML editor within an ePub book tool (Sigil, an open source tool available at sigil-ebook.com). HTML matches, exceptionally well, the way I think and mentally structure content, and CSS allows me to restyle everything the way I want (95% of the time, at least). Using Sigil also allows me to write and rewrite and rerewrite sections of text in scratch files while tracking all the scratch files and keeping them all together in a single file rather than a scratch directory. As a bonus, strict CSS-only styling means everything I do is consistently formatted (and yes, that is important to me).

I know for an absolute fact these things would drive some authors insane, and they could not and cannot work the way I do. In other words, if this composition process doesn't match yours: that is perfectly fine!

Preparing text for publication on SoFurry is a process, and the only critical part to that process is starting off with text formatted in some way not friendly to SoFurry's simple web editing entry box. (Because, if your text is already ready to cut-and-paste ... why bother with any of this?)

Since my method of formatting for SoFurry is doing a cut-and-paste of SoFurry-friendly HTML into that text box (after setting it to HTML mode, of course), I would like to show what the raw text that will go directly into the text box actually looks like. In the Sigil editor, something like this.

<div style="margin: 1em 5% 1em 5%; font-family: monospace"> <p>Preparing text for publication on SoFurry is a process, and the only critical part to that process is starting off with text formatted in some way not friendly to SoFurry’s simple web editing entry box. (Because, if your text is already ready to cut-and-paste … <b>why bother with <i>any</i> of this?</b>)</p> </div>

I say something like this, because the Sigil editor colors HTML tags blue, HTML attributes red, HTML attribute text cyan, and direct HTML entity escapes purple. SoFurry's HTML support is limited, I suspect, to prevent anyone from entering malicious (or obnoxious) HTML.

Colorizing text, background, and specifying font is possible on SoFurry, but it adds significant complexity. Worse, playing games with color does not play nicely with SoFurry's Cozy mode, meaning that it breaks the site. I do not think I should have to point out that breaking the website is NOT a good thing to do! Obviously it is possible to use color. In this very limited case, highlighting a small, specific section of text seemed worth the additional effort. I'll discuss how to override SoFurry's color and font settings below, as well as going more deeply into why and how this can break the site.

I wouldn't try to do that escaped stuff by hand (and I did not do that by hand; I used an HTML escaping tool), because the above looks like this in my editor:

<div style="margin: 1em 5% 1em 5%; font-family: monospace"> <p>&lt;div style="margin: 1em 5% 1em 5%; font-family: monospace"&gt;<br/> &lt;p&gt;Preparing text for publication on SoFurry is a process, and the only critical part to that process is starting off with text formatted in some way not friendly to SoFurry&amp;rsquo;s simple web editing entry box. (Because, if your text is already ready to cut-and-paste &amp;hellip; &lt;b&gt;why bother with &lt;i&gt;any&lt;/i&gt; of this?&lt;/b&gt;)&lt;/p&gt;<br/> &lt;/div&gt;</p> </div>

Doing that by hand would be exceedingly tedious, and error prone, but I did want to show what the raw text looks like, because we will be working with that raw text. Please do not be intimidated by this raw text if you are not a coder, and this looks scary. This is simple, and I am going to go through the process, step by step, as well as explaining the tools I use, and why.
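For the curious, here is a small sketch of what an HTML escaping tool does, written in Python. This is not the tool I used, and the snippet being escaped is a shortened, made-up example; the only real library call is html.escape from Python's standard library.

```python
import html

# A shortened, hypothetical snippet standing in for the real paragraph.
snippet = '<p>Because … <b>why bother with <i>any</i> of this?</b></p>'


def escape_for_display(raw: str) -> str:
    """Escape &, < and > so a browser shows the markup instead of rendering it."""
    # quote=False leaves quotation marks alone; they are harmless in body text.
    return html.escape(raw, quote=False)


escaped = escape_for_display(snippet)
print(escaped)
```

The escaped result is what gets pasted wherever the reader should see the tags themselves rather than their effect.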

The Process

Use a word processor to save the file to RTF (Rich Text Format).

Open the file in Wordpad. Add a space. Remove the space. Save the file, again in RTF Format.

Wordpad will internally reformat and massively simplify the RTF file, removing a large amount of extraneous formatting. My LibreOffice-generated RTF files can shrink anywhere from 40% to 70%, and all of that stripped formatting is formatting that need not be dealt with later.

Open the file in LibreOffice, use File->Export->Directly as Epub, and save it. For purposes of description, I will assume it is saved as MYSTORY.EPUB.

I used to export to HTML, but LibreOffice generated HTML is horrible to deal with manually. Selecting Epub puts all the formatting into a stylesheet, and makes the transformations we have to do a lot easier, especially if using Sigil. With Sigil, the preview window will immediately show you if the HTML is unbalanced or wrong, and it is much simpler to find the error than if you make all the transformations and then try to figure out why or where there is an unbalanced tag. Also, the error message in the Sigil preview window is usually spot-on.

The preview in the Preview window is also helpful to see what the story will look like once it is published.

Close LibreOffice; we will not need this program again.

Open the file MYSTORY.EPUB in Sigil.

Before doing anything else:

Use Tools->Restructure Epub to Sigil Norm to make the internal file layout Sigil-friendly. Supposedly this step is no longer needed, but Sigil is always happier (that is, less crash-prone) when it has the default internal layout.

Use Tools->Reformat HTML->Mend and Prettify All HTML Files to render the text more reader-friendly.

This will make it easier to look at and work with.

  1. Converting the title, copyright byline, author byline, and any special notices is most easily done simply by rewriting them in HTML with the header tags, rather than converting them (they are probably littered with font and size tags). SoFurry's default style sheet makes them left-aligned (that is, they start at the left side of the page). I have a personal preference for them centered.

<div>
<p>&nbsp;<br /></p>
<p>&nbsp;<br /></p>
<p>&nbsp;<br /></p>
<h3 style="text-align: center;">The True And Potent Secrets of SoFurry Story Formatting</h3>
<h4 style="text-align: center;">A Method of Cleanly Converting</h4>
<h1 style="text-align: center; font-variant: small-caps;">Word Processor Documents</h1>
<h2 style="text-align: center; font-variant: small-caps;">of All <i>Sorts</i> and <i>Descriptions</i></h2>
<h3 style="text-align: center;">Into Gorgeous and COMPATIBLE SoFurry</h3>
<h2 style="text-align: center; font-variant: small-caps;"><i>Journals and Text Story Entries</i></h2>
<h5 style="text-align: center; font-variant: small-caps;"><b>As Practiced by</b></h5>
<h5 style="text-align: center; font-variant: small-caps;"><b>That Celebrated and Renowned SoFurry Author</b></h5>
<h1 style="text-align: center; font-variant: small-caps; font-family: serif;">Onyx Tao</h1>
<h5 style="text-align: center; font-variant: small-caps;"><b>Being a Faithful and Honest Rendering Thereof</b></h5>
</div>

I invite you to use the above as a template.

IMPORTANT: There are three forced blank paragraphs at the top. Without these three paragraphs of white space, your title will collide and interact strangely with some of the display boxes in the SoFurry site layout. The simplest way to avoid this is to avoid using the top three lines, and let your text start underneath the problematic section.

Technical Note: HTML can be funny about white space, and interpreting white space, and how white space may be displayed. Browsers can and will collapse white space, and may collapse empty paragraphs. To force those lines to display, I put the Unicode character non-breaking space (U+00A0, which can be written in HTML as &nbsp;) into the paragraph.

Technical Note: Another way in which HTML is funny about white space is line breaks: a new-line in the HTML file itself counts as white space, so an HTML end-tag followed by a newline will have a space after the content in the tag. This means that the HTML code:

(<b>bold text here</b>)

Displays as:

(bold text here)

and

(
<b>bold text here</b>
)

Displays as:

( bold text here )

Note the spaces between the text and the parens to either side. If you are seeing spaces where you do not expect them, this is a good first guess as to why.
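A rough way to see this collapsing in action, outside a browser, is the Python sketch below. It is only an approximation of real browser behavior under simplifying assumptions: it throws the tags away and squeezes every run of white space (newlines included) into one space. The function name is my own invention.

```python
import re


def rendered_text(html_source: str) -> str:
    """Approximate what a browser displays for a short run of inline HTML."""
    # Drop anything that looks like a tag.
    without_tags = re.sub(r"<[^>]+>", "", html_source)
    # A newline counts as white space, and runs of white space collapse to one space.
    return re.sub(r"\s+", " ", without_tags).strip()


tight = "(<b>bold text here</b>)"
loose = "(\n<b>bold text here</b>\n)"

print(rendered_text(tight))
print(rendered_text(loose))
```

The only difference between the two inputs is the newlines around the tag, and that alone is enough to put spaces inside the parentheses.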

Fix the CSS styling.

At this point, the text will be full of <span class=...> and <p class=...> tags. SoFurry does not permit us to embed our own CSS sheets directly into text. Doing that would open huge avenues for abuse of SoFurry and even serve as a vector for attacks on other websites. Therefore, we have to replace all the class=... statements with roughly equivalent style=... statements. I say roughly, because we want to ignore a great deal of the formatting. Our text will inherit a lot of just-fine styling from SoFurry itself. We want to concern ourselves primarily with text alignment (right, centered, or left), italics, and bolding.

Technical Note: What about nice typographical quotes? And other typographical niceties like ellipses, and long dashes, and all the other characters one needs to present text? Alas! SoFurry does not support Unicode characters. It will silently translate them into ASCII-compatible text. I don't know if this is a technical limitation of the software, a security precaution, or even a deliberate decision. All I know is, they will not work. If your story cannot be told without them, you must tell it somewhere else.
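Since the translation happens silently, it can be handy to know ahead of time which characters in a story are candidates for it. The Python sketch below simply counts every character outside plain ASCII; it assumes nothing about what SoFurry actually does with each one, and the function name is hypothetical.

```python
def non_ascii_characters(text: str) -> dict[str, int]:
    """Count each character in text that falls outside the ASCII range."""
    counts: dict[str, int] = {}
    for ch in text:
        if ord(ch) > 127:  # ASCII stops at code point 127
            counts[ch] = counts.get(ch, 0) + 1
    return counts


sample = "“Oh, it’s a demon all right,” White Bull said … confidently."
for ch, n in non_ascii_characters(sample).items():
    print(f"U+{ord(ch):04X} {ch!r} appears {n} time(s)")
```

Anything the report lists is worth checking in the published story, to see whether the substitution reads acceptably.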

Converting From class to style attributes

Here is where Sigil's preview window comes in handy. Here is a sample paragraph to convert: (Text used with the permission of the author)

<p class="para2"><span class="span4">"Oh, it's a demon all right," White Bull said confidently. "I can ... smell it. I smelled </span><span class="span5">demon</span><span class="span4"> when Rocking Hammer came over to ask me for my thoughts on his friend's letter. And now that I'm here, I can say confidently that a demon is trying to break into your mind, consume your soul, and take over your body. It's not having an easy time of it, probably because you don't believe in demons. It's probably showing itself to you, and </span><span class="span5">that</span><span class="span4">, I think, is what wakes you up, rather than seeing </span><span class="span5">it</span><span class="span4">. The demon is probably just as frustrated as you are, I shouldn't wonder."</span></p>

We need not bother with the typographical niceties: SoFurry will convert them to acceptable characters on its own. We need only fix the span tags. By comparing the text in the editor with the text in the preview window (which has the CSS styling applied to it), we can see what the styling ought to be. <span class="span4">, for example, is extraneous. The editor will allow us to strip it from the text. Use CONTROL-H to bring up the edit/replace window. We're using a fairly advanced feature, but we are using it in a simple way. In the Find: box, enter:

<span class="span4">([^<]*)</span>

This is a regex string, and it finds all of the <span class="span4"> along with the matching </span> end tags and the text between them. The ([^<]*) is what matches the internal text. The outer parentheses do not actually match anything; they are a signal to the processor to remember the text that matches the stuff inside to use in a replacement. The square brackets indicate we want to match one of some set of characters. That set is defined by the ^<. The caret ^ is yet another special marker, saying we want to match any characters that are not part of a set. The single less-than character itself < is the set of characters that do not match what we want to capture (that is, a set consisting of that single character).

The asterisk * is likewise important. The square brackets define what to look for; the asterisk tells it to look for zero or more matching characters, which allows the expression to match EVERYTHING in the span.

In the Replace: box, enter:

\1

It is worth double-checking the slash direction: it is a backslash. This is yet another special character, instructing the find-replace processor to insert the text that matched the stuff we called out as interesting (the stuff in the parentheses), and the 1 specifically means the text in the first set of parentheses. In this example, there is only one set, but more complicated expressions ... are utterly out of scope for these instructions.

For the options, make sure that Wrap is checked, and all other options are unchecked. Set the Mode to Regex. In the box that probably says All HTML Files (but might say either Current File or Selected HTML Files), make certain it is set to Current File.

I urge you to REREAD the last few paragraphs, and be absolutely certain you have typed this in the way you think you did.

The exceptionally brave may now click on the Replace All button.

More prudent editors may want to click on Find , to make certain they find the text they expect to find, and then Replace , to make sure what happens is what is expected. Those things being true, clicking on Replace All will then make the change throughout the remainder of the file.

All the span tags for class="span4" and their closing tags should be gone. My sample text should now look like:

<p class="para2">"Oh, it's a demon all right," White Bull said confidently. "I can ... smell it. I smelled <span class="span5">demon</span> when Rocking Hammer came over to ask me for my thoughts on his friend's letter. And now that I'm here, I can say confidently that a demon is trying to break into your mind, consume your soul, and take over your body. It's not having an easy time of it, probably because you don't believe in demons. It's probably showing itself to you, and <span class="span5">that</span>, I think, is what wakes you up, rather than seeing <span class="span5">it</span>. The demon is probably just as frustrated as you are, I shouldn't wonder."</p>
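The same find-and-replace can be tried out safely away from the real file. The sketch below uses Python's re module rather than Sigil's own regex engine, so treat it as an illustration of the pattern and not of Sigil itself; for an expression this simple I would expect the two to agree. The sample is a shortened version of the paragraph above.

```python
import re

# Shortened version of the sample paragraph, with the same two span classes.
sample = (
    '<p class="para2"><span class="span4">I smelled </span>'
    '<span class="span5">demon</span>'
    '<span class="span4"> when Rocking Hammer came over.</span></p>'
)

find = r'<span class="span4">([^<]*)</span>'  # what goes in the Find: box
replace = r'\1'                               # keep only the remembered text

stripped = re.sub(find, replace, sample)
print(stripped)
```

Only the span4 wrappers disappear; the span5 span is untouched, because the class name in the pattern does not match it.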

The case of setting italics and bold is similar. In the sample above,

<span class="span5">demon</span>

should be set in bold. I'll go through the steps, with a little less explanation. It is very similar to what we did earlier.

The Find box should have:

<span class="span5">([^<]*)</span>

The Replace box will be a little different; we need to retain the span, but we need to specify the formatting, and we still need the \1 to put the matched text back inside it. We want this:

<span style="font-weight: bolder;">\1</span>

The result:

<p class="para2">"Oh, it's a demon all right," White Bull said confidently. "I can ... smell it. I smelled <span style="font-weight: bolder;">demon</span> when Rocking Hammer came over to ask me for my thoughts on his friend's letter. And now that I'm here, I can say confidently that a demon is trying to break into your mind, consume your soul, and take over your body. It's not having an easy time of it, probably because you don't believe in demons. It's probably showing itself to you, and <span style="font-weight: bolder;">that</span>, I think, is what wakes you up, rather than seeing <span style="font-weight: bolder;">it</span>. The demon is probably just as frustrated as you are, I shouldn't wonder."</p>

And for italic? Bold? Both? Small caps? Italic and bold small caps? All you need do is put the right CSS formatting instructions into the style attribute:

<span style="font-weight: bolder;">\1</span>
<span style="font-style: italic;">\1</span>
<span style="font-style: italic; font-weight: bolder;">\1</span>
<span style="font-variant: small-caps;">\1</span>
<span style="font-variant: small-caps; font-weight: bolder; font-style: italic;">\1</span>
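For anyone who would rather script this than repeat the replacement by hand, here is a sketch of the same idea in Python: a table from class name to inline style, applied in one pass. The class names span5 and span6 and what they stand for are assumptions for the sake of the example; the real names depend entirely on what the export generated, and the preview window is still the way to find out what each one means.

```python
import re

# Hypothetical mapping; fill it in after checking each class in the preview.
STYLE_FOR_CLASS = {
    "span5": "font-weight: bolder;",
    "span6": "font-style: italic;",
}

SPAN = re.compile(r'<span class="([^"]+)">([^<]*)</span>')


def class_to_style(html_source: str) -> str:
    """Swap class=... for the matching style=... on simple, unnested spans."""

    def swap(match: re.Match) -> str:
        style = STYLE_FOR_CLASS.get(match.group(1))
        if style is None:
            return match.group(0)  # unknown class: leave it for a human
        return f'<span style="{style}">{match.group(2)}</span>'

    return SPAN.sub(swap, html_source)


print(class_to_style('I smelled <span class="span5">demon</span> again.'))
```

Spans with classes missing from the table are left exactly as they were, so nothing is silently lost.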

Finally, let us deal with the <p class="para2"> tag. In the Find: box, we need:

<p class="para2">

For Replace: we just need:

<p style="text-align: left;">

Now, it is simply a matter of repeating this throughout the document to strip out any remaining class attributes.

I take advantage of this to ALSO override SoFurry's default text justification setting. Professional typesetters and typesetting programs can make this look good; I have yet to see a browser's layout engine that can manage what is (I admit) an exceptionally challenging task. A pleasing ratio of white space to black text is ultimately an aesthetic decision. Professional, high-quality fonts are carefully hinted and designed to help layout engines maintain that aesthetic.

Justification, on the other hand, involves inserting white space into a line to keep a clean right edge. Done properly, this can be beautiful and pleasing -- but unless done perfectly, it changes the ratio of white space to text, which spoils the aesthetic of the text, the line, the paragraph, and the page. Professional typesetting packages STILL depend on human intervention to make this happen. Browser layout engines are not yet up to the task. I suspect it will require improvements and ultimately hinting instructions embedded in fonts before they are.

Consequently, I am opinionated about justification in text on the web. For a reader with atypical neurology associated with reading and text display, poorly kerned and poorly justified text can literally hurt to encounter.

Since each span tag is replaced globally, it only needs to be replaced once. Tags that still need replacement can be found by searching for the string class="
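That search can also be turned into a quick to-do list. The Python sketch below tallies every class name still present in a chunk of HTML; the function name is made up, and it assumes the attributes are written with double quotes, as they are in the samples above.

```python
import re
from collections import Counter


def remaining_classes(html_source: str) -> Counter:
    """Tally the class names that still appear in class="..." attributes."""
    return Counter(re.findall(r'class="([^"]*)"', html_source))


page = (
    '<p class="para2">I smelled <span class="span5">demon</span> and '
    '<span class="span5">that</span>.</p><p style="text-align: left;">Done.</p>'
)

for name, count in remaining_classes(page).most_common():
    print(f"{name}: {count}")
```

When the tally comes back empty, there is nothing left to convert.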

Technical Note: I use font-weight: bolder; rather than font-weight: bold; because bold is (supposedly) a specific, predefined weight (700, numerically) while bolder is relative to the weight of the surrounding text. A display system should attempt to render as best it can.

Colorizing Text on SoFurry

Text color is set using the style attribute on an HTML tag. A discussion of HTML tags is out of scope for this article, so I will discuss them in context of the div tag, which is sort of like the p (paragraph) tag, only less semantically meaningful. (A div is simply a division within content, while a p (paragraph) is that, as well as indicating it contains a paragraph of text. HTML gets a lot of its power from being a semantic markup language, so content within an HTML document ought to be marked with the right tags that define what the content is: this makes the text friendlier for tools that present the text in non-visual ways, or that analyze the text for semantic meaning.)

<div style="color: navy; background-color: #daa520; padding: 1px 3px 1px 3px;">

To set color, all you need is the color property itself. However, SoFurry can change the underlying color of the page. I recommend that the colorized text have a specific background color, and you want it to look decent against both the light default and the Cozy version of SoFurry. To my mind, this means darker colors with high contrast. Here, I have used dark blue (navy) and a darker yellow (goldenrod, via the numeric #daa520).

That is not enough. Without the padding property, the background color will come to exactly the edge of the text, and look tight and restrained (aesthetically displeasing):

<div style="color: navy; background-color: #daa520;">

Note how the text at the left edge of the colored area is right up against the page background, rather than being surrounded by the goldenrod background. That is the effect of the padding specification, which adds a few (3, in this case) pixels to the right and left edge, respectively.

Also worth noting is that, writing this section, I discovered that although a named color works for the color specification, attempting to set background-color worked only if I supplied the hexadecimal color number. In other words, this is fragile. It may break with the next site design. It may break because of some other reason.

Why show this with a div tag and not a span tag? For HTML reasons (that are out of scope for this article), padding does not apply to the span tag. You could do this with a p (paragraph) tag; I just think it makes more semantic sense to associate a color and background color with a less-specific sectional tag.
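If you want a number to go with "high contrast", one option is the contrast ratio defined in the WCAG accessibility guidelines. To be clear, that formula is something I am bringing in for illustration; nothing on SoFurry requires it. The Python sketch below computes it for two hexadecimal colors, assuming six-digit values such as the #daa520 used above (navy is #000080 as a number).

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a six-digit hex color such as '#daa520'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


NAVY = "#000080"       # numeric form of the named color navy
GOLDENROD = "#daa520"  # the background color used above

print(round(contrast_ratio(NAVY, GOLDENROD), 2))
```

A higher number means the text stands out more from its own background; how it looks against the rest of the page, in either site theme, still has to be judged by eye.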

Specifying Font on SoFurry

Do not do this unless you have an amazingly excellent and incredible reason to do it. It is both complex and will probably work only for you. There is no way to tell the user's browser where or how to load the font you specify, so if the font you pick is NOT on the user's system -- the user will see the browser's fallback sans-serif or serif font. This means you LOSE any typographic specifications that come from SoFurry itself.

As SoFurry defaults to sans-serif font with an option to switch to serif, specifying a sans-serif or serif font breaks site functionality.

I set a font in this article, but the reason I do so is to set a monospace font for HTML code and snippets (which are neither sans-serif nor serif). Monospace is best set on a div or span tag, like so:

I set a font in this article, but the reason I do so is to set a monospace font for HTML code and snippets (which are neither sans-serif nor serif). Monospace is best set on a <span style="font-family: 'IBM Plex Mono Text', monospace;">div</span> or <span style="font-family: 'IBM Plex Mono Text', monospace;">span</span> tag, like so:

Please note two things. First, I do specify a named font. I do that not because I expect anyone to have that font or see it on SoFurry, but because otherwise the display in MY EDITOR is suboptimal. That setting is for me. The critical part is the default fallback, monospace. Second, even that restrained finagling around font adds a lot of complexity for fairly minor effect. 99% of the time, it is totally not worth the trouble.

In Conclusion

Maybe that was NOT as easy as I made it out to be in the beginning ... having a little understanding of HTML, CSS, and even what regular expressions can do helps a great deal.

I hope you agree that the resulting beautifully presented text is worth the effort!

Cheers, Onyx Tao