LLM Writing Leaderboard & Comparison
Here we rank today's LLMs on real writing capability, using the Chatbot Arena creative-writing ranking and each model's specs.
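| Model | Creative Rank | Δ Rank | Context (tokens) | Input $/M | Output $/M | Cache $/M |
|---|---|---|---|---|---|---|
| Gemini 2.5 Pro | #1 | 0 | up to 1M | $1.25 (<200k) | $10 | not listed |
| ChatGPT‑4o | #2 | 0 | 128k | $5 | $20 | not listed |
| Grok 3 | #2 cluster | +1 | ≈ 131k | $3 | $15 | not listed |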
How We Ranked the LLM Models
To spot true writing talent, we combined objective signals with hands‑on tasks.
- Chatbot Arena – Creative board
  We start with LMSYS crowd votes for creative writing. This leaderboard captures real user preferences.
- Delta vs. overall rank
  A big positive gap shows a specialised writing strength worth noting.
- Seven professional writing tasks
  From SEO blog to scientific note, we test structure, style, and accuracy in the wild.
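The delta signal can be sketched in a few lines of Python. This is only an illustration: it assumes Δ is the overall rank minus the creative rank (so a positive value means the model places higher for creative writing than it does overall), and the model names and ranks are hypothetical, not leaderboard data.

```python
def rank_delta(overall_rank: int, creative_rank: int) -> int:
    """Assumed convention: positive means better placed on the creative board."""
    return overall_rank - creative_rank


# Hypothetical ranks for illustration only; not taken from the leaderboard.
models = {
    "model_a": {"overall": 5, "creative": 2},
    "model_b": {"overall": 1, "creative": 1},
    "model_c": {"overall": 3, "creative": 6},
}

deltas = {
    name: rank_delta(r["overall"], r["creative"]) for name, r in models.items()
}

# Models with a positive gap, largest gap first: candidates for "writing specialists".
specialists = sorted(
    (name for name, delta in deltas.items() if delta > 0),
    key=lambda name: -deltas[name],
)

print(deltas)
print(specialists)
```

A model with Δ 0 writes about as well as it does everything else, while a negative Δ suggests its strengths lie outside creative work.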
#1 Gemini 2.5 Pro

- Creative Rank: #1 (Δ 0)
- Context: up to 1 M tokens
- Input: $1.25 / M (<200k)
- Output: $10 / M
Blends climate data with glowing saplings in one breath—equally strong in fiction and research notes.
#2 ChatGPT‑4o

- Creative Rank: #2 (Δ 0)
- Context: 128k tokens
- Input: $5 / M
- Output: $20 / M
SEO king—hit 9/9 keywords yet still reads like a human. Fiction paragraphs flawless.
#3 Grok 3

- Creative Rank: #2 cluster (Δ +1)
- Context: ≈ 131k tokens
- Input: $3 / M
- Output: $15 / M
Raw, edgy voice—great for dystopian fiction and gritty copy. Rhyme still safe.
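To put the per-million prices in perspective, here is a minimal cost sketch in Python. The prices are the ones listed in the model cards above; the workload (a 3,000-token brief producing a 1,500-token draft) and the names `Pricing` and `request_cost` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# Prices as listed in the model cards; Gemini's input price is the <200k tier.
PRICES = {
    "Gemini 2.5 Pro": Pricing(1.25, 10.0),
    "ChatGPT-4o": Pricing(5.0, 20.0),
    "Grok 3": Pricing(3.0, 15.0),
}


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request, ignoring any cache discount."""
    total = input_tokens * pricing.input_per_m + output_tokens * pricing.output_per_m
    return total / 1_000_000


# Hypothetical workload: 3,000 tokens in, 1,500 tokens out.
costs = {name: round(request_cost(p, 3_000, 1_500), 6) for name, p in PRICES.items()}

for name, cost in costs.items():
    print(f"{name}: ${cost:.5f} per draft")
```

Because output tokens cost several times more than input tokens for all three models, long drafts dominate the bill far more than long briefs do.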
Why One‑Shot Prompts Aren't Enough for Quality Writing
LLMs are designed to produce the most likely next words and phrases, so they tend to say what most people expect to hear.
These models draw on a vast, generalized dataset, so without guidance their output rarely matches a distinct style or niche subject matter.
You might also notice that LLM text feels repetitive and robotic. The AI often falls into patterns, using similar phrases and structures across different pieces of content. This makes your content sound monotonous and predictable.
As a result, engagement will suffer if you step out of the writing process. Readers today expect content that delivers unique expertise and style. They want to hear from real experts and distinct characters, and to learn from personal stories and experiences.
How to Prompt LLMs for Quality Writing
- Give it editorial guidelines – share your publication’s voice, must‑hit arguments, and preferred tone.
- Provide a writing sample – paste 2‑3 standout paragraphs so the model can mimic cadence and phrasing.
- Make it build an outline first – ask for structure before prose to lock in logic and flow.
- Infuse it with your ideas – inject anecdotes, data points, and opinions only you can bring.
- Remove repetitions and add variety – run a final prompt to cut clichéd phrases, vary sentence lengths, and diversify openings.
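The steps above can be chained as three separate prompts rather than one. The following Python sketch only builds the prompt strings; it makes no API call, and the function names and prompt wording are hypothetical choices rather than a prescribed template.

```python
def build_outline_prompt(guidelines: str, sample: str, ideas: list[str], topic: str) -> str:
    """Steps 1-4: guidelines, writing sample, your own ideas, outline first."""
    idea_lines = "\n".join(f"- {idea}" for idea in ideas)
    return (
        f"Editorial guidelines:\n{guidelines}\n\n"
        f"Writing sample to match in cadence and phrasing:\n{sample}\n\n"
        f"Ideas that must appear:\n{idea_lines}\n\n"
        f"Task: write an outline only (no prose yet) for an article on {topic}."
    )


def build_draft_prompt(outline: str) -> str:
    """Turn the approved outline into prose."""
    return f"Write the full article following this outline exactly:\n{outline}"


def build_polish_prompt(draft: str) -> str:
    """Step 5: a final pass against repetition."""
    return (
        "Revise the draft below: cut cliched phrases, vary sentence lengths, "
        f"and diversify paragraph openings.\n\n{draft}"
    )


# Hypothetical inputs for illustration.
outline_prompt = build_outline_prompt(
    guidelines="Plain, direct voice. Argue that editing matters more than drafting.",
    sample="We shipped the first version on a Friday. It broke by Monday.",
    ideas=["Our 2023 reader survey", "The rewrite that doubled sign-ups"],
    topic="editing AI drafts",
)
print(outline_prompt)
```

Send each prompt in turn to whichever model you use, reviewing the outline yourself before asking for the draft. That review is the point where you stay in the loop.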

Frequently Asked Questions
Creative rank on Chatbot Arena is crowdsourced—real humans judging story quality, imagery, and flow. It’s a quick proxy for narrative skill.
A longer context window helps up to a point. It reduces truncation when you paste large documents, but it can increase latency and cost.
Some providers discount prompt content that repeats an earlier call made within the last few minutes. If you loop over similar prompts, the repeated part is billed at the cheaper cache rate.
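As a rough illustration of the cache discount, the Python sketch below compares a loop of calls with and without caching. Everything here is an assumption: the rates ($3 / M input, $0.75 / M cached) are hypothetical, the first call is billed at the full rate, every later call hits the cache, and any cache-write surcharge or expiry is ignored. Check your provider's pricing page for the real rules.

```python
def looped_input_cost(
    shared_tokens: int,
    unique_tokens: int,
    calls: int,
    input_per_m: float,
    cache_per_m: float,
) -> float:
    """Input cost in USD of `calls` requests that share the same prompt prefix."""
    if calls < 1:
        raise ValueError("calls must be at least 1")
    # First call: nothing is cached yet, so every token pays the full input rate.
    first = (shared_tokens + unique_tokens) * input_per_m
    # Later calls: the shared prefix pays the cache rate, the rest pays full rate.
    later = (calls - 1) * (shared_tokens * cache_per_m + unique_tokens * input_per_m)
    return (first + later) / 1_000_000


# Hypothetical loop: 20 calls sharing a 10,000-token style guide,
# each with 500 tokens of unique instructions.
with_cache = looped_input_cost(10_000, 500, 20, input_per_m=3.0, cache_per_m=0.75)
# Setting the cache rate equal to the input rate models "no discount".
without_cache = looped_input_cost(10_000, 500, 20, input_per_m=3.0, cache_per_m=3.0)

print(f"with cache: ${with_cache:.4f}, without: ${without_cache:.4f}")
```

The saving grows with the share of each prompt that stays identical, so it pays to put stable material such as guidelines and samples at the start of the prompt.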