Adding AI tools to Scrivener

robyn · January 26, 2024, 10:31pm

Scrivener could add inline A.I. tools like Novel Crafter and have people bring their own API keys.

robyn · January 26, 2024, 10:34pm

I think A.I. tools would be very helpful for overcoming writer’s block and making corrections. Good tools do not supersede the writer but provide aid.

November_Sierra · January 26, 2024, 10:58pm

Reminds me of that day when the garage refused to fix my car because of mechanic’s block.

jcarman · January 31, 2024, 6:45pm

Even worse when they put your car up on blocks.

bbottema · May 30, 2024, 7:31am

Here’s the spell-checked and slightly edited version of your text:

I find this to be a little short-sighted. AI integration and having AI write content for you are two separate things. AI integration is about automating non-creative tasks, converting, brainstorming, discussing things with a buddy, and having language discussions, all in the context of what you’re working on right now. It involves giving descriptions of what you want and having it generate world-building items, summarizing your content in the notes, helping you maintain your world’s integrity, and warning you when you violate an established timeline. It’s about pulling reviews and editing earlier in the workflow.

There is no legal swamp here because we’re still not generating content with AI. You’re just leveraging AI within the context of your work. It’s a snugger version of copying and pasting your stuff to an external AI chat all the time. I think software like LivingWriter proves its merits.

Regarding third-party services being unreliable, that’s nothing new. All software deals with this, and it’s on the developers to ensure the software is robust enough to handle this and to provide clarity about availability. That’s about expectation management and being able to deal with network errors.

Really, the (premature) almost hostile stance against AI integration is a bit disheartening to see. Going forward, AI is going to play a much bigger role, and it’s here to stay. Over the past few years, more and more competitors have emerged and are becoming serious contenders compared to Scrivener. While Scrivener has strong features and community benefits, it’s lagging way behind in the cloud space (still no web client or Android app). If the power of AI-supported authoring continues to be ignored, Scrivener is set to become a boomer’s go-to tool and, like Japan’s pension system, will collapse due to a lack of younger users, even if its core features are stronger than its competitors’.

Let’s stay progressive and adventurous, like your users.

AmberV · May 30, 2024, 11:45am

Here’s the original version of your text, written and proofread by a human:

While Scrivener has strong features and community benefits, it’s lagging way behind in the cloud space (still no web client or Android app).

A subjective point of view. Spreading yourself as thin as a pat of butter over an entire loaf of bread is another kind of lagging behind. Either you spend your days supporting every other operating system you get asked about, and “web app”, and never really reach a level of sophistication as a result, or you do what you do best on the systems you actually enjoy using. So this is a form of taking whatever is important to us and looking for developers that don’t do it, and calling them “lagging behind” for that reason: i.e. subjective.

As for the rest, I haven’t played with “AI” too much, mainly because whenever I see someone posting their ChatGPT tips every other sentence is incorrect and the whole thing sounds like a bland PR consultant, leveraging and synergistically empowering users, left and right. If it gets that much wrong about the stuff I know, then why would I trust it to be a “writing buddy”? And if I can’t take it seriously because it sounds like corporate satire, why would I want it clogging up every single program I use and wasting developers’ time (like web apps waste their time, subjectively). Generated text looks about as appealing to me as generated art: it doesn’t.

Let the tech mature, or die off (hopefully), then maybe it will be worth consideration. But until then it is at best mildly amusing (like when Google suggested that adding 100ml of Elmer’s Glue to the pizza sauce will help keep the cheese from sliding off), but more realistically it’s kind of awful, and is already turning the Web into even more of a muddled layer of mediocre “content” than it was when the SEO experts got well into destroying it.

Is that a hostile take? I don’t think so, but some people do equate candid reactions with hostility, so maybe from some points of view, sure.

bbottema · May 30, 2024, 12:44pm

I’m sorry to say, but that does come across as rather hostile. Your negative opinion is based on anecdotal evidence, which is not a very strong position. It’s pretty clear you haven’t spent much time with AI, which is a pity, because you might actually discover how natural it is to converse with one and how much knowledge there is to be had (forget about the buzzwords). More importantly than just knowledge, it’s a marvelous brainstorming tool: forget about the data or info it gives—which may have been hallucinated to a certain extent, although less and less with each iteration—and again, forget about the buzzwords. Just try once and give it your idea or thought on something and ask it for alternatives or something like that. Just like a sparring partner, it bounces around ideas. You might find it stimulates your creative energy and boosts your motivation to write. I’ve even used it to analyze other novels, which gave me great insights as to how stories can be structured for example, or suggest plotting work flows. If you are even a little bit curious, give it a try at https://chat.openai.com/chat. Even if just to get a better grip on why you don’t like it, because not only is “whenever I see someone posting” not very useful in debate, it’s unfair to yourself. You can do better!

AI may not be for you if you’re set on ignoring it until you can’t anymore (it won’t die off, that much is clear). However, you should realize you are not the only type of user of tools like Scrivener. There are swaths of new users that will grow up with AI-infused software or are used to it already, and Scrivener needs to think of them too, or it will indeed fade away together with the people screaming at the clouds, rather than embracing it.

rms · May 30, 2024, 1:24pm

There are now plenty of ways to use AI products and put the results into the author’s Scrivener manuscript. Scrivener is a tool for enhancing human intelligence.

I’m not keen to encourage Scrivener to do a lot of “integration” as I can see how it will dilute their development efforts, could possibly explode support issues, and perhaps introduce legal burden on copyright and all that (currently not yet tested in courts, AFAIK).

November_Sierra · May 30, 2024, 1:30pm

Scrivener is for writers. There are plenty of other options available for Letsomethingelseforthemwriters.

bbottema · May 30, 2024, 2:04pm

Agreed, but I fear you’re not seeing the bigger picture. I’ve been following this trend since the start, both as a user and as a developer, and I can confidently say, for a variety of reasons, that this type of AI is going to be ubiquitous, not least because it’s becoming very cheap to use and integrate as a service. It will be everywhere, and users—especially the future user base—will be expecting it.

I get your point about development being spread out; at the same time, other developers somehow manage. Also, Scrivener’s feature set appears to have remained largely consistent over the past few years. Broadly speaking, it has the same feature set as it did five years ago. It has matured, and I think it is ready for new adventures.

Regarding legal burden and copyright, again, AI integration doesn’t have to be about generating your content. In fact, I advocate against it. The whole point is to look at it as a tool to help you write more and better.

Personally, I’m sad about the mindset towards AI that I find here because I’m familiar with the synergistically empowering aspect of AI (sorry, Amber), and I miss it. I would have gone to another writing platform, but when I look at all the other competitors, I ultimately seem to be drawn back to Scrivener due to its compatibility with my specific workflow (I’m rather demanding in top-view drill-down navigation and quick in-place editing). Every other online tool seems to be great at a few things but often missing one thing I have on my list of requirements (like switching to the scene edit mode directly from its linked outline or note).

I’ve now examined Campfire Writing, Dabble Writer, Fictionary, LivingWriter, Novel Factory, NovelPad, Novlr, Plottr, Squibler, and WorldAnvil. Dabble and LivingWriter come very close to my needs, but Scrivener has it all—except AI integration and easy access from my Android devices. Now, I can get an iPad for the first, and I can even work with the latter, but it involves completely breaking immersion, having to copy-paste snippets all the time for analysis, discussion, editing, and switching programs. But it goes beyond that: take LivingWriter, for example. The AI integration is such that you can ‘talk’ or query your whole manuscript: “How many times did I mention Jake?”, “Is the timeline realistic?”, “I’m struggling with where to go with chapter 6; can you suggest a few possible routes?”, “Is my tone of voice consistent?”, “Did I resolve my log line?”. That’s just very neat, especially for complex novels, and from a technical point of view not even that difficult (and there’s a huge community around doing these kind of things). Of course, you can do without all that; writers have for ages. The same can be said for people traveling from A to B using horses until Ford had some bright ideas about that.

rms · May 30, 2024, 2:08pm

Feels like mind reading … I try to avoid that.

November_Sierra · May 30, 2024, 2:20pm

Horses are still a thing, though. People enjoy riding them, despite having access to (cheaper, faster, more convenient, etc.) cars. Not once have I met an equestrian wishing for horses to be more like cars.

AmberV · May 30, 2024, 2:43pm

I did not go into detail on the matter because it felt irrelevant, and left it to trust that I wasn’t just outputting language without having any knowledge of what I’m speaking of (ahem).

By saying I haven’t played with it too much, that shouldn’t be confused with my never having once tried it! Rather, I have in fact spent hours looking into it, mainly from the standpoint of what is possible on consumer-grade hardware (because putting privacy-invading portals that pipe every user’s WIP into corporate AI servers in your software is a massive no-no). Thus, no interleaved multiple Nvidia H100 Tensor Core GPUs running through a PyTorch CUDA setup capable of loading a 140B parameter LLaMA 2 model (previous gen, but v3 is limited to 70B last I checked) with GPTQ 32g 4bit quantisation and SuperHOT/RoPE fine-tuning to allow for 32k token contexts, and a Scrivener-specific LoRA trained against it. Whew.

Such would limit our potential user base to those that can afford $32,000 USD per GPU and have an office that looks like a server room. More modestly, an 8–13B model, which can fit into the modern array of Mac low to mid-range hardware, with its universal memory setup, and using the GGUF k-Quant (Q4, which is the threshold in my opinion, where below that perplexity noise is too high) builds to allow for a reasonable token count on the context prompt, whilst still having enough left over for inference that isn’t in the sub-0.1 token/s output range. Of note, I had to boot my MacMini 8GB headless, and log in with a remote shell to avoid the full GUI layer from booting up, otherwise the LLM and so much as a Finder window would be too much for it. So I was basically using the Mac as a server for another machine—an abstract test purely for the sake of seeing what the absolute best for that economy-range hardware looks like.
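For anyone wanting to check the sizing argument above, here is a rough back-of-the-envelope sketch in Python. It only estimates the memory taken by the weights themselves (parameter count times bits per weight). It ignores the context cache and runtime overhead, and real quantised files typically come out somewhat above their nominal bit width, so treat the numbers as a lower bound. The function name and the chosen model sizes are illustrative assumptions, not anything taken from a particular tool.

```python
def weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width. Real
    quantisation formats mix widths and add metadata, so this is a floor.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Compare a nominal 4-bit quantisation with unquantised 16-bit weights.
    for size in (7, 8, 13, 70):
        q4 = weights_gib(size, 4)
        fp16 = weights_gib(size, 16)
        print(f"{size}B: ~{q4:.1f} GiB at 4-bit, ~{fp16:.1f} GiB at 16-bit")
```

On this estimate a 13B model at 4 bits already takes most of an 8GB machine before the operating system or the context gets any memory, which is consistent with the headless setup described above.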

The result? Crap. This is with something hogging almost the entire MacMini, so much so you can’t even log in without dropping the inference rate to 20 to 30 minutes for 200 tokens or so. With a slightly smaller model I could get a decent output rate while logging in and using it as a computer as well, but it really needs to be emphasised how wooden and “confused” a <7B model can be. There are some really cute tiny models, such as the LLaMA 1.1B, but it honestly doesn’t come off much different than Ask Jeeves in 2004. How it gets there is more interesting than the actual result though, because pushing that level of optimisation is how this stuff is going to be feasible.
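For scale, the "20 to 30 minutes for 200 tokens" figure can be converted into tokens per second with a couple of lines of Python. This is plain arithmetic on the numbers quoted above; the function name is just for illustration.

```python
def tokens_per_second(tokens: int, minutes: float) -> float:
    """Convert a token count and a wall-clock duration into tokens/s."""
    return tokens / (minutes * 60)


if __name__ == "__main__":
    for minutes in (20, 30):
        rate = tokens_per_second(200, minutes)
        print(f"200 tokens in {minutes} min = {rate:.3f} tokens/s")
    # prints 0.167 tokens/s for 20 minutes and 0.111 tokens/s for 30 minutes
```

That puts the fully loaded machine only just above the sub-0.1 token/s range mentioned earlier.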

So, I guess not using any jargon at all to discuss this is maybe a constraint I don’t understand, because we might as well talk about it completely and with technical awareness of the limitations and where they arise from, in available hardware and current modelling science. When I say it’s not ready yet, the above is the framework within which I say it. Because again, us delegating your private work to Meta or Google or OpenAI is unethical from our perspective and thus out of the question.

Until local models can provide a reasonable level of simulated discourse, it’s a toy. The only other alternative is us hosting something ourselves, and that isn’t something we have the time or expertise to do (never mind how expensive it would be for this to delegate to thousands of instances). I mean, I could probably set up some iron that does it, I have after all, toyed with a short-term big rig rental just to see what a big model can do in comparison to the small stuff—but just fooling around with tests is far away from saying, “trust us to keep your data private and for this thing to always work and not be silly”.

But enough exposition. Let’s see how the well-regarded (within its constraints) Mistral 7B v0.3 model works!

USER: Hello, could you check the following text for hostile tone, and let me know what can be improved?

LLM: The provided text seems to contain specialized terminology related to artificial intelligence, computing, and technology. While this may not necessarily convey a hostile tone, it could potentially be overwhelming or confusing for readers who are less familiar with these topics. To make it more accessible to a wider audience, consider simplifying some terms, providing explanations when necessary, or using analogies to help illustrate complex concepts. Additionally, breaking longer paragraphs into shorter, easier-to-digest sections can also improve readability.

Commentary: the language generation did not adequately compensate for the notion of it being a targeted response to the original quotation, which implored me to try using an LLM just once. The response was written to demonstrate how I have gone quite a bit further into that, and certainly further than logging into some service. So the advice to remove that intentional level of detail is in contradiction with the intent of the piece, and furthermore is irrelevant to the original query of checking for a hostile tone. The advice it does give is bordering on common sense—I shouldn’t need 95% of my Mac’s resources to come up with that.

Meanwhile, the average paragraph length does not strike me as being problematic. That it produced that critique could be why blog and news articles these days tend to consist of Twitter-length optimised single-sentence “paragraphs”, but this is speculation on my part.

Continuing on, using all of the previous tokens to simulate a conversation…

USER: I am less interested in whether there is technical jargon, as the discussion is about this technology. Simplifying that for a broader audience is not necessary. Please refer to the original question on whether the tone of the writing is hostile.

LLM: Upon reevaluation based on your request, the text does not appear to have a hostile tone. However, it could still benefit from being made more approachable to a wider audience by explaining certain technical terms or providing examples. Breaking up long sentences into shorter, more digestible parts can also enhance readability.

Commentary: it is still hung up on making the text more pleasing to a broader audience, despite my having specifically requested its generation be tuned to avoid that parameter. At this point, it has also shifted to sentence length instead of paragraph length. I’ll admit, I can write some breathless sentences.

USER: All right, please rewrite the sample text with an even less hostile tone, even though you did not detect one, but keep the technical jargon and do not mess with the overall paragraph and sentence structuring.

LLM: Here’s a revised version of the text, keeping the technical details while making it friendlier and more accessible:

bbottema:

Hey there! Let me share some thoughts about AI and writing tools.

I often suggest trying out new ideas and asking for alternatives or similar concepts. Today, let’s focus on AI-powered writing assistants.

While I might not delve deeply into specific aspects, rest assured that I aim to understand the subject matter before discussing it. My apologies if it seemed otherwise.

Recently, I’ve been exploring various AI writing models, focusing on their capabilities given consumer-grade hardware. One example is the LLaMA 2 model, but remember, it requires expensive equipment like multiple Nvidia H100 GPUs and a powerful setup that isn’t practical for most users. A more affordable option would be an 8–13B model, which can run smoothly on modern Mac computers with universal memory and GGUF k-Quant (Q4).

Even with these optimizations, the performance wasn’t impressive. Smaller models, like the LLaMA 1.1B, show potential, but they’re still quite limited compared to larger ones. It’s important to note that until local models can mimic human-like conversation effectively, they’ll remain toys rather than reliable writing partners.

In conclusion, while we appreciate your work and wouldn’t want to jeopardize its privacy by outsourcing it to large corporations, currently, there aren’t many viable options for AI-assisted writing. We’d love to develop our own solution, but we lack both the resources and expertise required to create a robust, private system. For now, let’s continue experimenting and learning together!

Commentary: so, this is you it is quoting, let’s be clear. Not only did it rephrase my words and put them in your mouth, it doesn’t even get the facts straight, to the point of committing word salad at some points. At least it got the quotation syntax right.

But hey, now it is apologetic, chipper, positive, and affirmative… and bland! Note, I specifically did not prompt for it to be more accessible, but at least the “even less hostile” equating to “friendlier” part worked. To speak of the result, I would personally feel put off by a response like this from someone though, as it comes off patronising to me, so whether it is truly friendly is a matter of taste I suspect.

There is nothing useful in that for me to work with, so I’ll stick with the original human generated variant.

bbottema · May 30, 2024, 2:52pm

That’s because you were not there when the revolution happened

However, like you, people at first were not so receptive to cars either. Cars became ubiquitous of course and every horse and wagon company would go bankrupt if they didn’t adapt. Except for, as you say, vendors for the rare horse aficionado or more commonly, the quaint romantic experience of being driven around in a horse carriage. Sports remains ever popular, though.

fto · May 30, 2024, 2:59pm

@bbottema @November_Sierra Could it be that one of you is older than 40 and the other younger?

bbottema · May 30, 2024, 3:07pm

I guess you felt the need to demonstrate your extensive knowledge on the topic as if it would invalidate my points somehow. Your detailed response didn’t go unnoticed; however, it didn’t address any of them. What was the point, really? Was it merely about keeping your data local, which can’t be done with current models? If so, how is that related to Scrivener offering such capability?

Users of tools like Scrivener can decide for themselves if they want to use AI for their benefit. All software developers face this issue; it’s nothing new or unique. Do you think all companies are keen to share their data from Word or OneNote with Microsoft? Of course not, but it’s still available for the rest of its user base. What about mind mapping software like TheBrain? All opt-in, of course, as it should be. I fail to see how this relates to your venture into trying to reword your hubristic comment on language models and GPUs and failing because you use local hardware. Whew.

bbottema · May 30, 2024, 3:09pm

I’m right on the edge

AmberV · May 30, 2024, 3:41pm

I guess you felt the need to demonstrate your extensive knowledge on the topic as if it would invalidate my points somehow.

Yes, the somewhat tongue in cheek response was specifically to address this point:

It’s pretty clear you haven’t spent much time with AI, which is a pity, because you might actually discover how natural it is to converse with one and how much knowledge there is to be had…

To clarify the exercise: we would not be comfortable with sending user data (even if there is informed consent) to a third-party, given the choices available for doing so—other people doing that doesn’t change this. Other people send your usage patterns in the software to third-party analytics services. That doesn’t, in and of itself, mean it’s fine so long as you have an opt-out checkbox buried in settings somewhere. Thus if you are of that inclination, this leads to the examination of privacy-friendly local models, which are as demonstrated, not very insightful or useful, yet.

They almost certainly will be at some point, and then we’d be comfortable looking at it in theory. Questions still arise as to whether this is largely a marketing gimmick or something that really needs to be inside the software itself, though. By that I mean, telling people to Alt+Tab is a perfectly viable approach to using a computer, but telling people you don’t have to Alt+Tab could be seen as a gimmick.

As an aside: as a developer you might be interested to note that Scrivener’s compiler is programmable for plain-text output, which you’d want here, via its Processing compile format pane. It would be trivial to create a compile Format that generates a prompt and interfaces with an API, and in turn updates something Scrivener monitors on the disk, like the Scratch Pad folder, as an output.
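A minimal sketch of what the script side of such a pipeline might look like, in Python. Everything here is an assumption for illustration: how the compiled text reaches the script, the `ai-feedback` file name, the output folder, and above all `placeholder_model`, which is a hypothetical stand-in for whatever API or local model you would actually call. It is not Scrivener's interface or any vendor's client.

```python
import tempfile
from pathlib import Path
from typing import Callable


def build_prompt(manuscript: str, question: str) -> str:
    """Wrap the compiled plain text and one question into a single prompt."""
    return (
        "You are reviewing a manuscript excerpt.\n"
        f"Question: {question}\n"
        "--- MANUSCRIPT START ---\n"
        f"{manuscript.strip()}\n"
        "--- MANUSCRIPT END ---\n"
    )


def placeholder_model(prompt: str) -> str:
    """Hypothetical stand-in for an API or local-model call; swap in a real client."""
    return f"(no model configured; prompt was {len(prompt)} characters)"


def write_note(output_dir: Path, name: str, text: str) -> Path:
    """Write the reply as a plain-text file into a folder the writing app watches."""
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / f"{name}.txt"
    target.write_text(text, encoding="utf-8")
    return target


def run(
    manuscript: str,
    question: str,
    output_dir: Path,
    model: Callable[[str], str] = placeholder_model,
) -> Path:
    """Build the prompt, ask the model, and save the answer as a note."""
    reply = model(build_prompt(manuscript, question))
    return write_note(output_dir, "ai-feedback", reply)


if __name__ == "__main__":
    # Demo with a temporary folder standing in for the monitored folder.
    with tempfile.TemporaryDirectory() as tmp:
        note = run(
            "Jake left on Monday. Jake returned on Sunday.",
            "Is the timeline realistic?",
            Path(tmp),
        )
        print(note.read_text(encoding="utf-8"))
```

Passing the model in as a plain callable keeps the choice between a hosted API and a local model outside the script, so the privacy question discussed above stays with the user.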

kewms · May 30, 2024, 3:47pm

I’m not sure “using more power than entire countries” is part of my definition of “cheap.”

The independent AI vendors are not profitable at this time. The in-house projects at places like Google and Microsoft are not releasing profitability metrics. It is not at all clear how “cheap” these tools will be once the actual cost is being supported by the user (or by ads targeting the user). There are certainly abundant examples – especially online – of services that attempted (successfully or not) to monetize themselves only after attracting significant numbers of users.

fto · May 30, 2024, 3:48pm

@bbottema Mr. … would not have been offended by the way you called him. Of course I don’t know him, but I know him well enough from this forum to know that he’s pretty smart and relaxed.

I fear a generational conflict is raging here. Or let’s say you realize that not everyone is the same. And that also has to do with age. People like me say “please” when they ask the AI something. And then “thank you” when they have received the answer. Completely absurd, I know. But I wouldn’t feel comfortable doing it any other way. So bear with me if I don’t feel like having a “sparring partner” whose name I can barely pronounce.