True, but with LLMs, smaller models come with a real quality tradeoff. I've used a lot of these over the years, and it really is noticeable, so the amount of available RAM you have will limit you in that regard.
I've been playing around a lot with the Qwen3 235B model, and that thing eats 90 GB of system RAM on top of 20 GB of VRAM. Granted, that is overkill for most creative writing exercises, but modern models have gotten quite big, and the smaller ones tend to be quirky at best.
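As a rough back-of-the-envelope for why it eats that much: weight memory scales with parameter count times bits per weight. This is just a sketch using my own guesses for the quant level (~4-bit) and runtime overhead (~10%), not official numbers:

```python
def estimate_memory_gb(params_billions, bits_per_weight, overhead=1.1):
    """Rough estimate of memory (GB) to load a quantized model.

    overhead is a guessed fudge factor for KV cache and runtime buffers.
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 235B model at a ~4-bit quant lands in the same ballpark as the
# ~110 GB total (90 GB RAM + 20 GB VRAM) I saw in practice.
print(round(estimate_memory_gb(235, 4)))
```

Which also shows why dropping to an 8B or 14B model changes the picture so drastically: at the same quant you're down to single-digit gigabytes.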
I gave the abliterated GPT-OSS 20B a try with some zoo content, and it went off the rails after just a few prompts. But perhaps some of the smaller Qwen3 or Mistral models could do well in a more resource-constrained setting; those tend to stay sane for longer in this context.