How to tell if a newsletter is AI-generated

June 23, 2026 • 9 min read • Newsletrix

TL;DR

You cannot prove a single email was written by AI, and the detectors that promise to are built for essays, not newsletters. The reliable tell is a flatness signature across a sender's archive: sentence-length variance collapses, Flesch reading-ease sticks in a 55 to 65 band, paragraph sentiment goes uniform, and hooks repeat the same openings. Read ten issues, not one. A competitor whose copy suddenly flattens is usually cutting editorial cost, which is the real intel, not a quality verdict.

If you want to tell whether a newsletter is AI-generated, the first instinct is to paste it into a detector and read the percentage. Resist it. That number is theater. The honest answer is harder and more useful: you read the sender's back catalog, you measure how their writing varies, and you decide on the pattern, not a single send. Across the publications we track in the Newsletrix corpus, the AI-drafted ones give themselves away through consistency, not any one phrase. Here is what holds up, and what to ignore.

Why one-shot AI detectors fail on newsletters

GPTZero, Copyleaks, and Grammarly's AI detector all work in broadly the same way. They score a block of continuous prose against a model of what machine text looks like and hand back a confidence figure. That design assumes the input is a student essay or a long article: a few hundred words of uninterrupted writing on one topic. A newsletter is neither.

A typical issue is chopped into short blocks, stitched together with links, padded with product names and UTM-tagged buttons, and then run past a human editor who rewrites the intro and the call to action by hand. Feed that into a detector built for essays and you get noise. We have watched the same Substack issue score 12 percent AI on one tool and 81 percent on another the same afternoon. Both numbers are guesses dressed up as measurements.

The failure runs both ways. A careful human writer who favors clean, plain sentences gets flagged as AI, because plain prose looks probable to the model the detector runs. And a lightly edited AI draft sails through clean, because the human edit on the opening drags the score down. So you get false positives on your most disciplined competitors and false negatives on the lazy ones. Stop pasting emails into detectors and start reading archives.

The five tells that actually hold up

These are structural patterns, not banned words. A word list goes stale the week a new model ships. The shape of the writing is harder to fake and easier to measure, and four of these five you can quantify with the tools at the end of this page.

Sentence-length variance collapses

Human writers swing. A four-word punch, then a 38-word aside, then a fragment for effect. Models smooth that out because the most probable sentence sits near the middle of the distribution they learned. In the AI-drafted sends we have flagged, sentence length hugs a 15 to 25 word range almost the whole way down, with a standard deviation under 6 words. Human-written issues in the same niche routinely swing across a 30-word spread. Paste an issue into a counter and look at the variance, not the average.
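If you would rather measure this than eyeball it, here is a minimal Python sketch. It assumes a naive sentence splitter (a break after ., ! or ? followed by whitespace), so abbreviations will trip it. The function names are ours, for illustration only, and the 6-word threshold is simply the figure quoted above, not a calibrated cutoff.

```python
import re
import statistics

# Threshold taken from the figure quoted in the text; treat it as a rough guide.
FLAT_STDEV_WORDS = 6.0

def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence, using a naive split after ., ! or ?."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(s.split()) for s in sentences if s.split()]

def length_spread(text: str) -> dict[str, float]:
    """Mean, sample standard deviation, and range of sentence lengths."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        raise ValueError("need at least two sentences to measure spread")
    return {
        "mean": statistics.mean(lengths),
        "stdev": statistics.stdev(lengths),
        "range": float(max(lengths) - min(lengths)),
    }

def looks_flat(text: str) -> bool:
    """True when sentence-length standard deviation falls under the threshold."""
    return length_spread(text)["stdev"] < FLAT_STDEV_WORDS
```

Run length_spread on a full issue, not a snippet. A three-sentence excerpt will look flat no matter who wrote it.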

Readability clusters in a narrow band

This is the same effect from a different angle. Flesch reading-ease for AI-drafted newsletters parks in a 55 to 65 band and barely moves issue to issue. A human team writing without a model produces a much wider scatter, because some weeks the writer is sharp and some weeks they ramble. When a sender's last ten issues all land within a few points of 60, that is not discipline, that is a model holding the wheel. Our readability sweet-spot guide covers why 60 is the magnet number.
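For reference, Flesch reading-ease is 206.835 minus 1.015 times the average words per sentence, minus 84.6 times the average syllables per word. A rough Python sketch follows. The syllable counter is a vowel-group heuristic we are assuming for illustration, so expect scores to drift a few points from a dictionary-based calculator. Use it to compare issues against each other, not to quote an absolute number.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, drop a trailing silent 'e'."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    groups = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and groups > 1:
        groups -= 1
    return max(groups, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch reading-ease with a naive sentence split and heuristic syllables."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.split()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text has no sentences or words")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Score each issue in a sender's archive the same way and compare the results. The spread between issues is the signal, not any single score.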

Sentiment goes uniform paragraph to paragraph

Run a sentiment pass across a human-written issue and you see movement: a warm open, a tense problem statement, a hopeful resolution. AI drafts tend to hold one emotional register the whole way through, usually a mild, agreeable positive. There is no dip, no edge, no paragraph that lands harder than the rest. We treat flat per-paragraph sentiment as one of the stronger signals, and it is the one most operators never think to check. The sentiment analysis guide walks through how to read the curve.
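To make "flat per-paragraph sentiment" concrete, here is a toy Python sketch. The word lists are a made-up placeholder, far too small for real use, and every name in it is hypothetical. Swap in a proper sentiment model for the scoring step. The point is the shape of the check: score each paragraph, then look at how much the scores move.

```python
import re
import statistics

# Toy lexicon, purely illustrative. A real pass would use a proper sentiment model.
POSITIVE = {"great", "love", "win", "easy", "happy", "growth", "best"}
NEGATIVE = {"problem", "risk", "fail", "hard", "loss", "worse", "broken"}

def paragraph_sentiment(paragraph: str) -> float:
    """Score in [-1, 1]: (positive - negative) / matched words, 0.0 if none match."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0

def sentiment_curve(issue_text: str) -> list[float]:
    """One score per paragraph, where paragraphs are separated by blank lines."""
    paragraphs = [p for p in re.split(r"\n\s*\n", issue_text) if p.strip()]
    return [paragraph_sentiment(p) for p in paragraphs]

def sentiment_spread(issue_text: str) -> float:
    """Population standard deviation of the curve; near 0.0 means one register."""
    curve = sentiment_curve(issue_text)
    return statistics.pstdev(curve) if len(curve) > 1 else 0.0
```

A human-written issue with a warm open, a tense middle, and a hopeful close should show a visibly larger spread than a draft that stays mildly positive throughout.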

Hooks converge on the same openings

Models love a handful of opening moves. The rhetorical question. The "imagine if" setup. The "in a world where" frame. The single-sentence stat-drop. A human writer reaches for these too, but across a year they mix it up. An AI-run newsletter recycles the same three or four hook shapes on a loop, because the model keeps returning to its highest-probability openers. Line up the first sentence of a sender's last ten issues. If you can sort them into two or three buckets, you have your answer. Our guide to the first 100 words breaks down what a strong, varied hook looks like.
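The bucket sort can be roughed out in Python. The rules below are our own crude assumptions about what each hook shape looks like (a sentence ending in a question mark, an opener starting with "imagine", and so on), not a tested classifier, so treat the labels as a first pass you then check by eye.

```python
import re
from collections import Counter

def first_sentence(issue_text: str) -> str:
    """Text up to the first ., ! or ? that is followed by whitespace or the end."""
    match = re.search(r".+?[.!?](?=\s|$)", issue_text.strip(), flags=re.DOTALL)
    return match.group(0) if match else issue_text.strip()

def hook_shape(sentence: str) -> str:
    """Label an opening sentence with a coarse, rule-based hook shape."""
    s = sentence.strip().lower()
    if s.startswith("imagine"):
        return "imagine-if"
    if s.startswith("in a world"):
        return "in-a-world"
    if s.endswith("?"):
        return "question"
    if re.match(r"^\W*\d", s):
        return "stat-drop"
    return "other"

def hook_buckets(issues: list[str]) -> Counter:
    """Count how many issues open with each hook shape."""
    return Counter(hook_shape(first_sentence(text)) for text in issues)
```

Because "other" is a catch-all, a varied human writer may land mostly there. Read the share of issues that fall into the named shapes, not the raw number of buckets.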

The em-dash habit and other punctuation tics

This is the soft one, so weight it least. Current models overuse the em-dash (the long dash, not the plain hyphen) at a rate that stands out once you notice it. They also lean on tidy lists and the "it's not X, it's Y" construction. None of these prove anything alone, because plenty of humans write that way. But when em-dash density runs three or four times higher than a sender's older archive, alongside the four structural tells above, the picture sharpens.

Measure the flatness yourself

Drop any newsletter's text into the Newsletrix readability calculator to see its Flesch score and sentence-length spread in seconds. Run it across a few issues from the same sender and watch whether the numbers move or sit still.

Read the archive, not the email

Everything above only works in aggregate. This is the part the detector vendors cannot sell you, because there is no single-button product in it. A model can produce a paragraph indistinguishable from a sharp human writer on any given day. What it cannot do is fake variance across twenty sends. Variance is the human fingerprint, and a model has none to give.

So the method is simple to describe and a little tedious to run. Pull the last ten to twenty issues from a sender. You do not need to subscribe to do this; most of it is sitting in public archives and the Wayback Machine, which we cover in how to read competitor newsletters without subscribing. Then score each issue on the four measurable tells and look at the spread, not the per-issue number. A human team scatters. A model clusters. The tighter the cluster, the higher the odds.
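Scoring the spread rather than the per-issue number might look like this in Python. The metric here, average words per sentence, is only a stand-in we picked to keep the sketch self-contained. Pass in whichever per-issue score you are tracking. All names are hypothetical.

```python
import re
import statistics
from typing import Callable

def mean_sentence_length(text: str) -> float:
    """Stand-in metric: average words per sentence (naive split)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.split()]
    if not sentences:
        raise ValueError("text has no sentences")
    return sum(len(s.split()) for s in sentences) / len(sentences)

def archive_spread(issues: list[str], metric: Callable[[str], float]) -> dict[str, float]:
    """Apply one metric to every issue and summarize how far the scores scatter."""
    scores = [metric(text) for text in issues]
    if len(scores) < 2:
        raise ValueError("need at least two issues to measure spread")
    return {
        "min": min(scores),
        "max": max(scores),
        "band": max(scores) - min(scores),
        "stdev": statistics.stdev(scores),
    }
```

With ten to twenty issues loaded as strings, archive_spread(issues, mean_sentence_length) returns the band and standard deviation for that one metric. Repeat it for each tell you can measure and compare the bands side by side.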

This is also where the honest tradeoff lives. The archive method is slower than pasting one email into a tool, and asks you to gather ten sends first. If you need an answer in thirty seconds, you will not get a trustworthy one. The speed you give up buys you a conclusion you can defend, instead of a percentage you will quietly distrust. We think that trade is worth it every time, but it is a real cost and worth naming.

What AI writing signals about a competitor

Here is the reframe that makes this worth your time. Catching a competitor using AI is not gossip, it is operational intelligence. When a publication's copy flattens across a clear date, something changed on their side. Usually it is budget. A writer left and was not replaced. Someone decided the newsletter was a cost center and handed it to a model.

That shift tells you where they are vulnerable. A newsletter on autopilot stops responding to its audience. The cadence might hold, but the spark goes, and engagement decays on a lag of a few months. If you spot a rival flatten in March, their open and reply rates are usually softening by summer even if their subscriber count looks fine. That is the window. You can read the same change in their content mix and timing, which ties into the content gap analysis approach and the tooling on the Panoramata comparison page.

Now the opinion a generic answer will not give you. An AI-written newsletter is not automatically a worse newsletter. We have seen one-person operators draft with Claude or GPT, edit hard, and ship something with more voice than a five-person content team phoning it in. Authorship is not the verdict. Uniformity is. The competitor who should worry you is the one using AI and editing it into something with a pulse. The one you can take share from is the one who pastes the model output straight in and hits send. The tells on this page separate the two.

How to run the check without subscribing or guessing

Pull the archive, then score it on the tells you can measure. The readability calculator gives you the Flesch band and sentence spread. The hook tester lets you compare openers across issues to see whether they repeat. Run both across ten sends and lay the numbers side by side. If readability sits inside a five-point band and the hooks sort into three buckets or fewer, you are most likely looking at machine output. If both scatter, a human is probably still writing it. If the two signals disagree, treat the result as inconclusive and pull more issues.
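As a sketch, the side-by-side read comes down to a few lines of Python. The two thresholds are the five-point band and three buckets named above. The function name, the labels it returns, and the "inconclusive" middle case are our own assumptions, and the output is a lean, not a proof.

```python
FLESCH_BAND_POINTS = 5.0  # the five-point band from the text
MAX_HOOK_BUCKETS = 3      # the three buckets from the text

def archive_read(flesch_scores: list[float], hook_shapes: list[str]) -> str:
    """Combine the Flesch band and hook variety across one sender's archive."""
    if len(flesch_scores) < 2 or len(hook_shapes) < 2:
        raise ValueError("score at least two issues before reading the spread")
    tight_band = max(flesch_scores) - min(flesch_scores) <= FLESCH_BAND_POINTS
    few_hooks = len(set(hook_shapes)) <= MAX_HOOK_BUCKETS
    if tight_band and few_hooks:
        return "likely machine-drafted"
    if not tight_band and not few_hooks:
        return "likely human-written"
    return "inconclusive"
```

Feed it one Flesch score and one hook label per issue. With fewer than about ten sends the hook count is nearly meaningless, since a handful of issues cannot fill more than a handful of buckets.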

For the operational layer (which ESP they use, and whether their sending behavior changed alongside the copy), the ESP detector reads the platform straight from a recovered HTML file. And if you want the whole archive scored at once instead of issue by issue, the Newsletrix AI analysis engine runs readability, sentiment, and hook scoring across every send it ingests. That is the per-archive view the manual method builds by hand. Background on how that scoring works lives in understanding AI newsletter analysis.

Whichever route you take, hold the same discipline: never call it from one email. The detector industry trained everyone to want a single-send verdict, and that is the one thing the data cannot honestly give. Read the archive, measure the variance, and let the spread tell you. A model cannot hide in a year of sends, no matter how good any one of them looks.

Frequently asked questions

Can you prove a newsletter was written by AI?

No. There is no test that proves a single email was drafted by a model, and anyone selling you a percentage score is selling false confidence. What you can establish is a probability across a sender's archive. When ten or more issues from the same publication share a flat readability band, uniform paragraph sentiment, and the same opening formula, the odds tilt hard toward AI drafting. One email proves nothing. A back catalog with no human variance is the real signal.

What are the signs a newsletter is AI-generated?

The signals that hold up are structural, not vocabulary based. Sentence-length variance collapses so almost every sentence runs 15 to 25 words. Flesch reading-ease clusters in a narrow 55 to 65 band issue after issue. Sentiment stays uniform from paragraph to paragraph with no emotional swing. Hooks converge on a handful of formulaic openings. And em-dash density runs far above what a human marketer types. Any one of these can show up in edited human copy, so weight the combination across an archive, not a single send.

Do AI detectors work on emails?

Not reliably. Tools like GPTZero, Copyleaks, and Grammarly's detector were trained on student essays and long-form articles, where the input is hundreds of words of continuous prose. A newsletter is short, chopped into blocks, full of links and product names, and often heavily edited by a human after the model drafts it. That mix produces both false positives on real human copy and false negatives on lightly edited AI copy. Pasting one email into a detector is the least reliable method on this page.

Why does AI writing have flat readability?

Language models are trained to predict the next token, and at typical sampling settings they favor high-probability continuations. The most probable sentence length, clause structure, and reading level sit near the middle of the distribution they were trained on. The result is prose that lands in a tight readability band and stays there. Human writers vary wildly: a punchy four-word line, then a 40-word aside, then a fragment. That swing is what AI drafting smooths out, and it is the easiest pattern to measure across an archive.

Is AI-written content bad for a newsletter?

Not on its own. A skilled operator who drafts with a model and edits hard can ship a genuinely good newsletter, and authorship is not a quality verdict. The problem is uniformity. When every issue reads at the same pace with the same emotional flatness, engagement decays because there is nothing to react to. So the question is not whether a competitor used AI, it is whether they edited it into something with a pulse. Most do not, and that is the opening.