Thoughts on AI checkers

EmilyMiller

Vindicated Vixen
Joined
Aug 13, 2022
Posts
16,143
I posted this in How To… in response to a question. It might be of use to others. Though, as I say, I’m not an expert in this area:



The site used AI detection software, which is highly likely to generate false positives and false negatives […]

I’m not an expert. But I believe the software looks at two things. Do the words chosen in the text vary from the norm? Is the sentence structure and length variable or consistent?

AI written text tends to have rote word choices and unvarying sentence structures and length.

So:

The cat sat on the mat.

The feline arranged itself on the shag rug.


The second probably scores a lower probability of AI.

Or:

Emily woke up in the morning. She went and brushed her teeth. Then she got dressed for work. The bus was late this morning.

Emily woke. It was still early. She brushed her teeth, tried to figure out something appropriate for work, and got dressed. Waiting at the bus stop, it was raining; where the fuck was the bus this morning?


The second probably scores a lower probability of AI.
 
Last edited:
An AI is trained on massive amounts of data so that it knows the most probable next words given a half-completed text. But it chooses the next word from a list of 20 or 50 or 100 candidates, depending on how much freedom you give it (set it higher and you start to get more and more nonsense). One AI looking at another AI's output can say that each word the other AI chose was within the probable candidates and so *could* have been generated by AI.
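That "pick from a shortlist" idea is essentially top-k sampling. Here's a minimal sketch; the candidate words and probabilities are invented for illustration (real models work over vocabularies of tens of thousands of tokens):

```python
import random

def sample_top_k(next_word_probs, k, temperature=1.0):
    """Pick the next word from only the k most probable candidates.

    next_word_probs: dict mapping candidate words to model probabilities.
    Larger k (or temperature) gives the model more freedom -- and more
    chances to wander into nonsense, as described above.
    """
    # Keep only the k most probable candidates
    top = sorted(next_word_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    words = [w for w, _ in top]
    # Re-weight the surviving probabilities (higher temperature flattens them)
    weights = [p ** (1.0 / temperature) for _, p in top]
    return random.choices(words, weights=weights)[0]

# Toy distribution for the context "The cat sat on the ..."
probs = {"mat": 0.55, "floor": 0.20, "sofa": 0.15, "rug": 0.06, "moon": 0.04}

print(sample_top_k(probs, k=1))  # always "mat" -- no freedom at all
print(sample_top_k(probs, k=5))  # occasionally "moon" -- more variety
```

A detector reasoning the way the post describes would check whether each word in a text sits inside that plausible shortlist.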

I suspect that both of your example paragraphs are well within the most likely strings of text and thus are just about equally likely to be AI - you just haven't put anything unique into the second example.

I've no real idea how it would be programmed, but generative AI works off tokenization of words, not letters. Therefore misspelled words really are a genuine tell that something was human-created. The other way round isn't necessarily true.
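A word-level sketch of why a misspelling could be a tell. The miniature vocabulary is hypothetical, and note this is a simplification: real LLMs mostly use subword tokenization, which splits an unknown word into smaller pieces rather than mapping it to a single unknown token:

```python
def tokenize(text, vocab):
    """Map each word to its numeric ID. Unknown (e.g. misspelled) words
    fall back to an <unk> token -- the model never 'sees' the misspelling
    itself, just an out-of-vocabulary marker, so it is unlikely to
    produce one on its own."""
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

# Hypothetical miniature vocabulary
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}

print(tokenize("the cat sat on the mat", vocab))  # [1, 2, 3, 4, 1, 5]
print(tokenize("the cat zat on the mat", vocab))  # [1, 2, 0, 4, 1, 5]
```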
 
Prompt:

Write a short story about ______ in a way that is indistinguishable from human writing enough that it would fool an AI detector.



Prompt:

Check this piece of writing to detect if it was written by AI.


 
Prompt:

Write a short story about ______ in a way that would fool an AI detector.



Prompt:

Check this piece of writing to detect if it was written by AI.
You can ask an LLM to tell you how to fool an AI detector, but its answer will be as reliable as any other LLM answer in an area that is not fully settled.
 
I understand it's become a huge problem for many here lately. Although to be honest I'm having a tough time understanding why.

I've had a grand total of one story rejected for AI. I contacted Laurel, asked for it to be re-examined and it was published almost immediately.

That said, helpful hints to avoid being rejected in the first place can be, well... helpful.

Although I honestly don't know if any of the multiple suggestions/tips I've read through the multiple posts on the topic actually make a difference, especially since we don't know EXACTLY what's triggering the red flags.

Although I'd say if anyone is using AI to edit their work, that's gonna be an obvious problem and increase the odds of rejection for sure.
 
You can ask an LLM to tell you how to fool an AI detector, but its answer will be as reliable as any other LLM answer in an area that is not fully settled.

Fully settled?

If the AI that does the writing knows the criteria of an AI checker, it just needs to avoid those criteria while writing.
 
There's also the issue of burstiness. AI checkers aren't just looking at word choices, they are looking at sentence length and structure.
This is something of an oversimplification, but because AI is picking the average word choice, you get average-length sentences (10-20 words) pretty consistently.
If you are using varying sentence lengths consistently in your writing, which you should be anyway, you are less likely to get flagged as AI.
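The "burstiness" signal described here boils down to variance in sentence length, which is simple to sketch. Using the two example paragraphs from earlier in the thread (the exact splitting and scoring below are an illustration, not any real detector's method):

```python
import re
from statistics import pstdev

def burstiness(text):
    """Rough burstiness score: how much sentence lengths vary.

    Splits on ., ! and ? and returns the standard deviation of
    sentence lengths in words. Low variation reads as 'robotic'.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return pstdev(lengths) if len(lengths) > 1 else 0.0

flat = ("Emily woke up in the morning. She went and brushed her teeth. "
        "Then she got dressed for work. The bus was late this morning.")
varied = ("Emily woke. It was still early. She brushed her teeth, tried to "
          "figure out something appropriate for work, and got dressed.")

print(burstiness(flat))    # 0.0: every sentence is exactly six words
print(burstiness(varied))  # ~5.7: lengths swing from 2 words to 15
```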
 
I understand it's become a huge problem for many here lately. Although to be honest I'm having a tough time understanding why.

I've had a grand total of one story rejected for AI. I contacted Laurel, asked for it to be re-examined and it was published almost immediately.

That said, helpful hints to avoid being rejected in the first place can be, well... helpful.

Although I honestly don't know if any of the multiple suggestions/tips I've read through the multiple posts on the topic actually make a difference, especially since we don't know EXACTLY what's triggering the red flags.

Although I'd say if anyone is using AI to edit their work, that's gonna be an obvious problem and increase the odds of rejection for sure.
It’s documented how AI checkers work. They look at perplexity and burstiness (the two things I mention above). And they typically employ machine learning to do this.
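Perplexity, roughly, measures how predictable a text is to a reference language model: uniformly predictable text scores low and reads as machine-like. A minimal sketch, assuming we already have the per-token probabilities the reference model assigned (the numbers below are toy values, not real model output):

```python
import math

def perplexity(token_probs):
    """Perplexity from the probability a model assigned to each token.

    Uniformly high probabilities (predictable text) give LOW perplexity,
    which is one signal detectors treat as AI-like.
    """
    n = len(token_probs)
    avg_neg_log = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_neg_log)

# Toy numbers: a 'predictable' sentence vs. a 'surprising' one
predictable = [0.9, 0.8, 0.85, 0.9]  # model saw every word coming
surprising = [0.3, 0.05, 0.2, 0.1]   # odd word choices throughout

print(perplexity(predictable))  # ~1.16: low, reads as machine-like
print(perplexity(surprising))   # ~7.6: high, reads as human
```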

How LLMs themselves work in principle (as opposed to their operation being fully explicable) is also documented. There was an interesting piece of research recently which claimed to show that they retain all the ingested text (not in a straightforward way) and can regurgitate novels word for word by iteratively feeding their output back into them. This blows away the lie that they are not infringing copyright but ‘discarding’ training material.



Again, I’m not an expert.
 
Last edited:
If you are using varying sentence lengths consistently in your writing, which you should be anyway, you are less likely to get flagged as AI.

This.

"Don't write robotically" is good advice whether you're a robot or not. I think the concern is that it won't be long before AIs are smart enough to start taking that advice themselves.
 
Fully settled?

If the AI that does the writing knows the criteria of an AI checker, it just needs to avoid those criteria while writing.
You are thinking of the AI as a sentient, rational being instead of something regurgitating a statistically probable set of tokenized words. Admittedly, I think the human brain does something similar at some stages of thinking and communicating, but the idea that it is figuring the problem out this way is... just not where AI is at the moment.
 
If the AI that does the writing knows the criteria of an AI checker, it just needs to avoid those criteria while writing.
Interesting (OK maybe that’s debatable) idea. Prompt an LLM to write a 20,000 word story which is guaranteed to not be detected by AI checkers. Then submit it to Non-Erotic and see what happens.

Aside from this wasting the site’s time, it might be illuminating. Asking an LLM to do something and it actually doing it are two different things.

At work recently, one of my colleagues got advice as to the legality of a certain practice (from a data privacy and protection POV) and Copilot (which is basically ChatGPT) came up with an opinion that was immediately dismissed as wholly erroneous by two legal counsels.
 
regurgitating a statistically probable set of tokenized words.

While it may be doing this, it is doing it within the criteria of its instructions.

The LLM can potentially use hundreds of thousands of words in its instructions to determine which words it should use.

AI detection is like an arms race.
 
Words aren't words to AI. It doesn't actually understand anything it writes. AI works with numbers, and each word is associated with a number. When you give AI a prompt, it will find words that, according to all its data, relate to each other in a way that is typical of how humans write. Except that isn't how we write, honestly. AI is stiff. AI is soulless. AI is slop.

But it's not about your sentence length at all. It's merely about how close you are to arranging your numbers (words) like AI would. This is what sets off the AI detectors. They read the numbers in the words and see how closely those words relate in a way that would have been generated by AI.

These AI checkers aren't always correct. People with autism (like myself) tend to set off AI checkers from time to time, because I relate words to other words the way an AI might, just because my brain works differently than a neurotypical person's.

This whole topic is... interesting. But at the end of the day, your sentence structure matters. That is all. Just how you talk. Be unique with your voice. Something so genuine and so human that AI cannot ever hope to replicate it.
 
It doesn't actually understand anything it writes. AI works with numbers, and each word is associated with a number. When you give AI a prompt, it will find words that, according to all its data, relate to each other in a way that is typical to how humans write.
Yes, this ☝️

It’s a statistical inference machine that iteratively works out the most likely next word, word element, or whatever based on what millions of humans have written. It has no expertise beyond aggregated human expertise. It doesn’t know anything.

But, replying 🤷‍♀️ to a prompt annoys users, so it’s programmed to fabricate when it can’t find a response with a high statistical match.

AI can be useful for creating bland shit that doesn’t matter - my go-to example is annual objectives at work. It can faithfully (mostly) summarize and regurgitate stuff that is commonly agreed knowledge. But it has no expertise. Relying upon it for anything important is foolhardy in the extreme. As my colleague found out.
 
Yes, this ☝️

It’s a statistical inference machine that iteratively works out the most likely next word, word element, or whatever based on what millions of humans have written. It has no expertise beyond aggregated human expertise. It doesn’t know anything.

But, replying 🤷‍♀️ to a prompt annoys users, so it’s programmed to fabricate when it can’t find a response with a high statistical match.

AI can be useful for creating bland shit that doesn’t matter - my go-to example is annual objectives at work. It can faithfully (mostly) summarize and regurgitate stuff that is commonly agreed knowledge. But it has no expertise. Relying upon it for anything important is foolhardy in the extreme. As my colleague found out.

Correct. That's exactly how it works, and knowing how it works helps to combat AI checkers. If you find your own unique voice, develop your own writing style and treat writing like the art it is, you won't get a false positive. Be so unapologetically human that there is no way AI could have ever come up with it. Be bold with your diction. Choose the strange word. The obscure metaphor. Push yourself to treat your writing like oil paintings, and give it your layers, give it your color. AI can never replicate this.
 
Correct. That's exactly how it works, and knowing how it works helps to combat AI checkers. If you find your own unique voice, develop your own writing style and treat writing like the art it is, you won't get a false positive. Be so unapologetically human that there is no way AI could have ever come up with it. Be bold with your diction. Choose the strange word. The obscure metaphor. Push yourself to treat your writing like oil paintings, and give it your layers, give it your color. AI can never replicate this.
Agreed 👍
 
The site used AI detection software, which is highly likely to generate false positives and false negatives

Take this with a grain of salt. We forum dwellers/Lit authors don't have access to behind-the-scenes information like this. Not only do we not know what combination of software and/or human skill they're using to detect AI, we don't even know how many of these claims of 'false positives' are true, because if anyone has read through enough of these claims here in AH, you'll notice some sketchy patterns.

1. Many of the people claiming a false positive have AI avatars (which doesn't mean their story is for sure AI, it just means they like using AI)
2. Often, people posting about their false positives here eventually admit to using AI software like Grammarly or a translator (which is a genuine mistake on their part, but means it wasn't a false positive)
3. The term 'highly likely' in regards to false positives/negatives is speculative at best, misleading at worst (because we have no way of knowing how many false positives/negatives there actually are)

I think there's obviously going to be false positives/negatives, but I think it's very possible that lazy writers are trying to use AI to help increase their quality and productivity because they find the process of writing to be daunting or boring, so they want to skip to the 'publish my story and gain followers' phase. (Equally as speculative as OP's claim. Just my personal suspicions.)

It's just that, with thousands of stories being published without false positives, it seems like a leap to claim false positives are 'highly likely' or anything above 'rare occurrences'.
 
My own experience has been with Grammarly Pro. So I have good spelling, grammar, and punctuation--except when I choose to make an intentional error in dialogue, to fit the character conception of the speaker.

Its suggestions for extensive rewrites I rarely accept; instead, I rewrite in my own words.

Some of Grammarly's suggestions are clearly wrong. Its absolute preference for the active voice is annoying at times. I use the passive voice for a reason when I use it, though I generally avoid it in straight narration.

I have never been flagged for AI usage.
 
My own experience has been with Grammarly Pro. So I have good spelling, grammar, and punctuation--except when I choose to make an intentional error in dialogue, to fit the character conception of the speaker.

Its suggestions for extensive rewrites I rarely accept; instead, I rewrite in my own words.

Some of Grammarly's suggestions are clearly wrong. Its absolute preference for the active voice is annoying at times. I use the passive voice for a reason when I use it, though I generally avoid it in straight narration.

I have never been flagged for AI usage.
Many people have said that they use Grammarly and have never had an AI rejection. I think when people use it in the way you describe, there is no problem, as borne out by multiple writers’ experiences.

It appears - and this is based on site convos, albeit a while ago - that the issue is solely when you accept Grammarly’s suggested text, which - as pointed out by others - leverages LLMs (because you can’t sell a product without saying ‘Now with added AI’ nowadays 🙄).
 