Statistical Analysis of LitE Stories

I'm suspicious of the Flesch-Kincaid scores. Novels and Novellas had one of the highest grade scores, so I went there and checked the last three days of stories. That's not a lot of stories.

Using the "style" tool on my desktop, the scores ranged between 3 and 7. Nothing approached the average of 10.8. I put some samples through on-line tests and came up with similar, but -- at least for Flesch-Kincaid -- slightly higher scores. Still, nothing approached 10.8.

Flesch-Kincaid uses the number of syllables/word and the number of words/sentence. Different automated tools are going to handle those variables differently.

I doubt that any of them actually count syllables/word. It's common to assume a representative number of letters/syllable and estimate syllables/word from the average length of the word.
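
To make the two approaches concrete, here is a minimal sketch. It is not a description of what "style" or any on-line tool really does; both function names are hypothetical, and the 3 letters/syllable figure is an assumed value chosen only for illustration.

```python
import re

def syllables_by_vowel_groups(word: str) -> int:
    """Count runs of consecutive vowels (a, e, i, o, u, y) as syllables."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def syllables_by_length(word: str, letters_per_syllable: float = 3.0) -> float:
    """Estimate syllables from word length alone.

    The 3.0 letters/syllable default is an assumed, illustrative value.
    """
    return max(1.0, len(word) / letters_per_syllable)

# Compare the two estimates on ordinary words and on a drawn-out one.
for word in ["reading", "table", "nooooooo"]:
    print(word, syllables_by_vowel_groups(word), round(syllables_by_length(word), 2))
```

The two methods roughly agree on ordinary words, but a drawn-out word counts as a single syllable by vowel groups and as more than two by length, so tools built on different shortcuts can drift apart on informal text.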

The measure of words/sentence is probably more of a problem. If the software assumes that sentences end with a full stop, then sentences ending with a question mark or exclamation point will be lumped into adjoining sentences, giving a higher score. If the software uses question marks and exclamation points to end a sentence, then sentences including dialog are splintered, leading to lower scores.
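
A rough sketch of that effect, using a made-up snippet of dialog (not taken from any of the stories discussed). The two splitter functions are hypothetical stand-ins for the two behaviours described above, not the logic of any real tool.

```python
import re

# Hypothetical sample with dialog, invented for this illustration.
text = 'He stopped. "Really?" she asked. "Yes!" he said. They left.'

def split_on_full_stops(s: str) -> list[str]:
    """Treat only a full stop as the end of a sentence."""
    return [p for p in re.split(r"\.\s*", s) if p.strip()]

def split_on_any_terminator(s: str) -> list[str]:
    """Treat '.', '?' and '!' (plus an optional closing quote) as sentence ends."""
    return [p for p in re.split(r'[.?!]+"?\s*', s) if p.strip()]

word_count = len(text.split())
for splitter in (split_on_full_stops, split_on_any_terminator):
    sentences = splitter(text)
    print(splitter.__name__, len(sentences), round(word_count / len(sentences), 2))
```

On this sample the full-stop rule finds four sentences and the any-terminator rule finds six, so the same ten words yield noticeably different words/sentence figures.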

Generally, dialog in the analyzed text should result in a lower score, because dialog is usually a set of short sentences.

Flesch-Kincaid is commonly used in the US, but the rating was based on Navy technical manuals, and it was not intended for use with fiction. You can expect variable results.
I know very little about F-K scores. Here are some 1-page EC stories with very different F-K scores:
Following Your Lead (3.23)
Campsite Tease (10.01)
Karen Takes The Lead (18.01)
 
I know very little about F-K scores. Here are some 1-page EC stories with very different F-K scores:
Following Your Lead (3.23)

It scores a reading level of 1.6 on Flesch-Kincaid using "style." I looked at a few paragraphs. It's first person, present tense and the writer uses very short sentences to give it a very immediate feeling. It earned its score.


Campsite Tease (10.01)

This scored a Flesch-Kincaid grade level of 4.6 using "style" -- similar to what I usually get. I skimmed the story for sentence length and word length and in that sense it appears similar to my writing. I don't think anyone needs a 10th-grade reading level to follow the story.

Karen Takes The Lead (18.01)

"Style" gave it a Flesch-Kincaid grade level of 6.5. The story isn't all narrative, but there is quite a bit, and the sentences tend to be long. Word length isn't a problem, so the higher grade level must come entirely from the sentence length. Do you need a high school diploma to read the story? No.

For what it's worth, the formula for Flesch-Kincaid is (according to the "style" manual):

Kincaid = 11.8*syllables/wds+0.39*wds/sentences-15.59

"style" says the last story had 1.23 syllables/word (many one-syllable words, despite things like "ohhhhhhhhhhh fuuuuuuuuuuuuuccccckkkkkkkkk?" How do you draw out a consonant like that?) and 19.3 words/sentence. The formula gives a grade level of 6.45.

There is at least a correlation between the scores you got and the scores I got.

EDIT:

I did a manual check on the first paragraph of "Karen Takes the Lead." "Style" gave it an automated grade level score of 14, based on 1.43 syllables/word and 32.5 words/sentence.

My manual check agreed completely on the number of words and sentences. The author could have used an editor. The two sentences in the paragraph were comma-spliced, and could have been much shorter. I counted 94 syllables for 1.45 syllables/word. The manual check gave a grade level score of 14 -- an exact agreement.
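
For anyone who wants to repeat these checks, here is a small sketch of the formula quoted from the "style" manual. The function name is made up, and the 65-word count is inferred from 2 sentences at 32.5 words/sentence rather than stated outright above.

```python
def kincaid_grade(syllables_per_word: float, words_per_sentence: float) -> float:
    """Flesch-Kincaid grade level, using the constants quoted from the "style" manual."""
    return 11.8 * syllables_per_word + 0.39 * words_per_sentence - 15.59

# Whole story, figures reported by "style".
whole_story = kincaid_grade(1.23, 19.3)

# First paragraph, figures reported by "style".
first_paragraph_auto = kincaid_grade(1.43, 32.5)

# First paragraph, manual count: 94 syllables, 2 sentences,
# and 65 words (inferred from 32.5 words/sentence, an assumption).
first_paragraph_manual = kincaid_grade(94 / 65, 65 / 2)

print(round(whole_story, 2))
print(round(first_paragraph_auto), round(first_paragraph_manual))
```

Both first-paragraph results round to grade 14, which matches the agreement described above.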
 
Jim Morrison’s quote “There are things known and things unknown and in between are the doors” probably sums it up better. 🌹Kant
Strictly speaking, that's Aldous Huxley from The Doors of Perception - which is where Jim took the band name from.

Trust me, this is from the man who has only ever owned one band T-shirt, with Jim's golden lion image on the front :).
 
There is at least a correlation between the scores you got and the scores I got.
Again, I don't know much about F-K, but the numbers I'm getting seem to have statistical meaning.

Looking at this story, the software I use says that it has 4236 words, 261 sentences, and 5780 syllables. It gives it an F-K grade level of 6.84. 11.8*(5780/4236)+0.39*(4236/261)-15.59 is 6.84. Word says it has 4258 words.
 
My contention isn't that no one will read a chapter, if they haven't read the previous one. The point is that consumption of multi-part pieces of most kinds of media tends to decline as you progress from one part to the next. This makes simple, logical sense, and if you take large samples, you'll find that the trend generally fits.

We agree on this. With very long series there is a long-term trend of decreasing views from one story to the next, which is evidence that readers tend to drop off as the series goes along, but that the rate of dropoff declines over time, and generally is highest from chapter 1 to chapter 2.

My point is that you have to be cautious about estimates of how many people read chapter X based upon how many people viewed chapter X+1. Another reason one might want to be cautious is that some potential new readers, seeing the story series for the first time on the hub page, may click on Chapter X+1 as a way of getting at Chapter X. We really have very little idea. We know that the number of original readers is somewhere between views and votes, and my guess is that for nearly all stories it's not too near either one of them, but that's not much more than a guess.
 
Unless I'm missing something, I just realized for the first time that Literotica's search function no longer allows you to search for stories that have a red H. That makes it impossible to determine easily for long periods of time what percentage of stories have red Hs. That's too bad. I don't know why the site removed that function. I think it's always better to make more, not less, information available. It would be interesting, for instance, to look at a year's worth of data to compare it with 8Letters' more recent information.
 
but that the rate of dropoff declines over time, and generally is highest from chapter 1 to chapter 2.
I reckon "settling in to read this thing" should be seen by chapter 3 - it's certainly obvious in my four year old shaggy dog story, and my latest shows almost the same curve (surprisingly the same). On that basis alone, I reckon a stand-alone story will be the same as a chapter one: reads = approx one fifth the number of views.

The thing I find odd is that the vote per view ratio remains quite consistent across the whole chapter set. How come all those folk who voted for chapter one, and presumably read at least chapter two, didn't keep voting in the same numbers on subsequent chapters? That's counter-intuitive to me. It's as if the 1:50 ratio or 1:100 ratio sets itself on chapter one, and that ratio then governs the remainder. Which is strange.
 
Again, I don't know much about F-K, but the numbers I'm getting seem to have statistical meaning.

Looking at this story, the software I use says that it has 4236 words, 261 sentences, and 5780 syllables. It gives it an F-K grade level of 6.84. 11.8*(5780/4236)+0.39*(4236/261)-15.59 is 6.84. Word says it has 4258 words.

"Style" tells me there are 398 sentences and 4223 words, and that's probably the biggest difference. It also gives me 1.22 syllables/word instead of 1.36.

11.8*1.22 + 0.39*(4223/398)-15.59=2.94

I pulled a section from the middle of the story (Starting "He was beyond mortified"). "Style" says it contains 184 words, 1.16 syllables/word, and 15 sentences, with a reading level of 2.9 -- same as the full text.

I count 193 words (which agrees with my word processor), 15 sentences and 253 syllables to give 1.31 syllables/word. So, the manual check gives

11.8*1.31 + 0.39*(193/15)-15.59=4.88

That's not a very good agreement. I can see how the selected text could cause some problems for automated grade-level calculations. I checked it sentence-by-sentence and found the only difference to be in the second-to-last sentence where the author uses hyphens in the place of em-dashes. That causes "style" to count fewer words than are there.
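
To see where the full-text gap between the two tools (6.84 vs 2.94) comes from, the formula can be split into its syllable term and its sentence term. This is only a sketch using the counts quoted above; the function and variable names are made up.

```python
def kincaid_terms(syllables_per_word: float, words_per_sentence: float) -> tuple[float, float]:
    """Return the syllable term and the sentence term of the Kincaid formula."""
    return 11.8 * syllables_per_word, 0.39 * words_per_sentence

# First tool: 4236 words, 261 sentences, 5780 syllables.
first_syl, first_sent = kincaid_terms(5780 / 4236, 4236 / 261)

# "style": 4223 words, 398 sentences, 1.22 syllables/word.
style_syl, style_sent = kincaid_terms(1.22, 4223 / 398)

syllable_gap = first_syl - style_syl    # about 1.7 grade levels
sentence_gap = first_sent - style_sent  # about 2.2 grade levels

print(round(syllable_gap, 1), round(sentence_gap, 1))
```

On these numbers the sentence count accounts for a bit more than half of the difference, with the syllable estimate making up the rest.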
 
I reckon "settling in to read this thing" should be seen by chapter 3 - it's certainly obvious in my four year old shaggy dog story, and my latest shows almost the same curve (surprisingly the same). On that basis alone, I reckon a stand-alone story will be the same as a chapter one: reads = approx one fifth the number of views.

The thing I find odd is that the vote per view ratio remains quite consistent across the whole chapter set. How come all those folk who voted for chapter one, and presumably read at least chapter two, didn't keep voting in the same numbers on subsequent chapters? That's counter-intuitive to me. It's as if the 1:50 ratio or 1:100 ratio sets itself on chapter one, and that ratio then governs the remainder. Which is strange.

I have two series. For each, the first chapter has the highest view:vote ratio, and after that the ratio tends to bounce back and forth rather than showing a clear trend one way or another.

One possible explanation is that Literotica has a very high attrition rate, and a constant influx of new readers. So unless your entire series is published over a very short period there's going to be a big dropoff of readers even though they might have been inclined to read the later chapters had they stuck around. New readers, not familiar with earlier chapters, might click on the later one to see what it's about or to use it to navigate to chapter 1.

I've noticed that over time the view to vote ratios rise slowly, which seems counterintuitive unless it's evidence of stories being read multiple times.
 
Unless I'm missing something, I just realized for the first time that Literotica's search function no longer allows you to search for stories that have a red H.

You are, in fact, missing something. In advanced options, on the bottom line, choose 'most popular' on the right.
 
A curiosity

Since this topic has come up from a number of angles, I’d like to throw a curious case study into the mix. Bear in mind that case studies are microcosms, and should not be treated the same as large scale surveys. On 4/14/18, I posted a story called The Favor, and the following day (4/15/18) I posted another story called Derelict 0006.

Both stories are lesbian stories
Both stories are resubmissions of stories I’d had up on Lit and taken down
Both stories were listed at the very top (first) slot of the “New Submissions” on the Lesbian hub on their day of posting
Both stories have approximately the same word count (9700 vs 9100, or three Lit pages)
Both stories have approximately the same readability
Both stories have similarly obscure titles and descriptions (i.e., not “How I banged the busty neighbor”)
Both stories have hard (non-HEA) endings
Both stories feature non-white protagonists and white love interests

By 4/30, the two had gotten wildly different reactions:

Favor 18,753 views 4.52 rating* 164 votes
Derelict 3,058 views 4.69 rating 35 votes

*A sweep in June greatly increased this number

In between 4/30 and 10/15 (at 9 am this morning) I collected hundreds of data points, and the general trend has been very similar. Currently:

Favor 22,929 views (+4,176) 4.61 rating 174 votes
Derelict 4,006 views (+948) 4.67 rating 42 votes

The Favor generates 3-5 times as many views as Derelict does over any given length of time. By most metrics, these two stories are otherwise identical until you click on them and start reading (which would generate ‘a view’), but somehow readers are finding one and not the other.
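
For anyone who wants to check the arithmetic, here is a quick sketch using only the four view counts and two vote counts quoted above. The variable names are made up, and this covers a single interval, not the hundreds of data points in the spreadsheets.

```python
# View counts as of 4/30 and 10/15, copied from the figures above.
favor_views = {"4/30": 18753, "10/15": 22929}
derelict_views = {"4/30": 3058, "10/15": 4006}

# Views gained over the interval.
favor_gain = favor_views["10/15"] - favor_views["4/30"]
derelict_gain = derelict_views["10/15"] - derelict_views["4/30"]

# How many Favor views arrive for each Derelict view in that interval.
gain_ratio = favor_gain / derelict_gain

# Derelict's total views as a share of The Favor's total views.
derelict_share = derelict_views["10/15"] / favor_views["10/15"]

# Views per vote as of 10/15 (174 and 42 votes respectively).
favor_views_per_vote = favor_views["10/15"] / 174
derelict_views_per_vote = derelict_views["10/15"] / 42

print(favor_gain, derelict_gain, round(gain_ratio, 1))
print(round(derelict_share * 100), round(favor_views_per_vote), round(derelict_views_per_vote))
```

Over this one interval the ratio comes out a little above 4, inside the 3-5 range.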

Derelict 0006 was nominated for one of the year end awards for 2015 (Most Literary-Genre Transcending). I don’t mean to say that it’s great, or even that it was necessarily one of the best stories of 2015, but I think it can be reasonably assumed that it doesn’t suck. It certainly isn’t 17% as good as The Favor (which was not nominated for anything by anyone).

The following is conjecture.

We are not privy to search results on Lit, but I believe that the tags of The Favor are what has driven (and continues to drive) the difference in attention between these two stories.

Derelict: Lesbian - Girl on Girl - Loss - Sci fi - Fingering - Love
The Favor: Lesbian - Girl on Girl - Fingering - 69 - first time - straight girl - married - amnesia - office

If I was guessing, I would say the two highlighted tags are the most likely to be drawing extra, curious attention.

There is a very obvious, perpetual trend in the visibility level of these two stories. The Favor continues to outpace Derelict by a factor of four long after their initial periods of high visibility were over. Their tags (what makes them searchable) seem to be extremely important, and I don’t know that I had ever realized how important it could be until recently. I wouldn’t change the composition or makeup of a story to play to a readership, but the difference here is so great that I thought it worth mentioning.
 
Derelict: Lesbian - Girl on Girl - Loss - Sci fi - Fingering - Love
The Favor: Lesbian - Girl on Girl - Fingering - 69 - first time - straight girl - married - amnesia - office

If I was guessing, I would say the two highlighted tags are the most likely to be drawing extra, curious attention.
I'd agree if the intake to each story was from a tag search alone, but you're saying the significantly different take-off for each story was there from the start, when they were on the category front page (where tag searches would be far less of a factor).

I'd say your story titles were more significant to get those first views - "Derelict" is negative, a downer, while "The Favor" is positive and promising.

What is uncanny, though, is how the "shape" of a story's profile is made quite early on, in the first days, and then seems to follow the same trend over the story's life. It really is as if stories get lives of their own and somehow continue to get the same interaction from readers over time - the same chapter getting marked down, the same chapter significantly higher, the same chapter getting read twice, that kind of thing.
 
The most reliable statistic I can put to this is that this probably helps explain why those obsessed with how their stories are doing at Literotica have fewer than a hundred stories posted here and I have, in combined accounts, well over a thousand. When I'm done writing one, I just move on to writing the next one. The relative reception of stories on Literotica is a mostly meaningless and manipulatable crap shoot.
 
I definitely obsessively track my views. I take cursory data on ratings and votes, but views are tracked every day (since 7/1/2015).

The spreadsheets where I keep all of this are enormous, full of charts, ratings, and extrapolations of interest to no one but myself.
 
People

Not sure how this might relate to the statistics above --

I think there are many people who do not linger at the earlier part of the story, the background, the setup, etc. They want to get to the 'good stuff' pretty quickly. And thus skip forward to later chapters to do so.

At the other end of things, I think there are people who do want the background, the buildup, the 'literary foreplay' so that the 'good stuff' has a deeper and more satisfying impact. They will read every chapter to do so.
 
The most reliable statistic I can put to this is that this probably helps explain why those obsessed with how their stories are doing at Literotica have fewer than a hundred stories posted here and I have, in combined accounts, well over a thousand. When I'm done writing one, I just move on to writing the next one. The relative reception of stories on Literotica is a mostly meaningless and manipulatable crap shoot.

I bet SamuelX doesn't either. What does that tell you?
 
You implied that your experience has elevated you beyond the petty concerns I spend my time on or, alternately, that my (and others') inexperience lends itself to silly and childish dithering. I implied that you mass produce garbage.

How is one any more nasty than the other?
 
For the record, though, I respect SamuelX. He produces what he wants regardless of what anyone thinks or says, and he does it with as little fan interaction as I've ever seen. He's a machine.
 
I definitely obsessively track my views. I take cursory data on ratings and votes, but views are tracked every day (since 7/1/2015).

The spreadsheets where I keep all of this are enormous, full of charts, ratings, and extrapolations of interest to no one but myself.

I think I agree with EB. Tags may make a difference, but it seems likely that something else is at work if the difference between the two stories appeared from the beginning. One story has a more positive and attractive title than the other. The Favor also has a more appealing tagline. I think that has an impact, too.

Literotica readers have many options, and many have little time to exercise them. Small differences in the attractiveness of title, taglines, and tags can make big differences in views and reads. The more stories I publish the more I see this. What's interesting is how little correlation there is between scores and views. My best-read story in the Exhibitionism category is my lowest-rated story in that category, but I think its title grabs readers more effectively than the higher scores of other stories.

I track view data, too, along with comments, favorites, votes, scores, etc., on spreadsheets. I may someday give up doing so when the number of stories gets too large, but I'm not there yet.
 
I bet SamuelX doesn't either. What does that tell you?

Anytime I see Sam's name mentioned I always think...good for that guy! Not in the sense of the insane number of stories he's posted, but...the guy does not have one H (last I looked anyway) in his list...not one!

His writing seems almost universally hated here (it's easy to see why; it's extremely racist and sexist). He gets bombed 24/7 and trolled, but he plugs along. This is really a guy writing for himself. The polar opposite of what you see in discussions like this.
 
For the record, though, I respect SamuelX. He produces what he wants regardless of what anyone thinks or says, and he does it with as little fan interaction as I've ever seen. He's a machine.

For the record, you were being intentionally nasty. Any time you'd like to compare our writing talent with an impartial jury, I'm game. I just never have been snide about your writing. I haven't read any of it, which I'm sure is no different from you having read mine and having the actual ability to compare it to yours. I'm game because I have no hangups about my writing. It is what it is and it makes money.
 