Regression To The Mean

Is it based on device, though? Or your login? Have you been testing anonymous voting?
Both log-in and device. Anonymous votes leave traces of the device used. When I first joined Lit I ran a basic set of tests to figure out how voting worked, from different entry points to the system.
 
I don't think you're implying it, but I want to be clear anyway.

I have never deliberately tried to abuse the voting system. I may have voted more than once on a story, but only because the option was presented to me organically.
I didn't. I merely used the opportunity to jump in and point out that literally everyone is more or less wrong in their assumptions about the way voting works. The funny part is that those who've been here the longest seem to have the least clue about how it works.
 
Voting more than once, hum, you sound like the problem, not the solution.
Sweeps are a thing of the past. And even if they weren't, they do the job poorly. I'm saying this from the standpoint of someone who tested casting multiple votes on the same story and then observed the effect of the sweep (back when there were still sweeps, as rare as that was).
 
For the sake of argument, let's assume that for every 10 votes, you get one 4. That means the highest possible score you can have is a 4.9.

Let's build on that assumption and assume that for every 50 votes, you're going to get one 2, from someone who genuinely doesn't like your story but doesn't want to bomb it with a 1, along with 45 5s and four 4s. Not a single 1. In that case the highest possible score you can get is a 4.86.
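For anyone who wants to check the arithmetic behind those two ceilings, here is a minimal sketch. The vote mixes are the hypothetical ones assumed above, not measured data, and `mean_score` is just an illustrative helper name.

```python
def mean_score(counts):
    """Average rating from a {star_value: number_of_votes} mapping."""
    total_votes = sum(counts.values())
    return sum(star * n for star, n in counts.items()) / total_votes

# Assumed mix 1: one 4 in every 10 votes, the rest 5s.
one_four_in_ten = mean_score({5: 9, 4: 1})

# Assumed mix 2: per 50 votes, forty-five 5s, four 4s, one 2.
with_one_two = mean_score({5: 45, 4: 4, 2: 1})

print(round(one_four_in_ten, 2), round(with_one_two, 2))
```

Under those assumptions the two averages come out at 4.9 and 4.86, matching the figures quoted.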

I don't know if these assumptions are true, but they seem on their face reasonable. I know of no reason to assume they are unreasonable. If that's the case, it's not hard to see how stories in long-term all-time scoring lists with many, many votes tend to top out no higher than the mid 4.8s. And you can get to that without any assumptions about people lying in wait ready to downvote stories that get above a certain level to hold them down. It may happen (and I believe it does happen in some cases), but there's no particular reason to believe that it is a substantial explanation of the long-term score patterns we see.

The Incest category, which has the highest number of views and votes, appears to bear this out. The highest rated story on the all time list has a score of 4.86. It has 17,575 votes. That's a robust figure. That would indicate that with enough votes 4.86 is an approximate ceiling on how high a score one can expect to get.

Here's the Novels + Novellas toplist as it stood on July 16, 2024:
https://web.archive.org/web/20240716140905/https://www.literotica.com/top/Novels-and-Novellas-33

The top three stories had ratings of 4.92; the next five had 4.91. One of those 4.91s has 4766 votes, and there's a 4.90 with 8718 votes. The top 50 stories are all on 4.89 or higher.

Clearly 4.86 was not a hard maximum at that time. Yes, it's hard to score higher than that because it's impossible to please everybody, but evidently some stories were able to do so. The laws of mathematics haven't changed in the last 17 months, and I doubt the behaviour of the typical reader has changed enough to explain the difference. So what do you suggest has changed?

Supposing that your assumptions were correct: for a really good story, the best it can hope for as a long-term pattern is something like 2% 2s, 8% 4s, and 90% 5s. Let's call that set of averages a "perfect story". That does indeed make for a long-term average of 4.86, but for a finite number of votes, it's not going to score exactly that average, in the same way that flipping a coin 100 times won't often give you exactly 50 heads.

In that scenario, you'd expect to see "noise" (standard deviation) in the scores of about 0.5/sqrt(n votes). So a perfect story, one that expects to average 4.86 in the very long term, will probably be somewhere between 4.81 and 4.91 after its first 100 votes; at this point, about 2.5% of "perfect" stories would be scoring at least two standard deviations above the mean (4.96 or higher)*.
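The 0.5/sqrt(n) figure can be checked directly from the assumed vote mix. This is a rough sketch that treats every vote as an independent draw from the hypothetical "perfect story" distribution (2% 2s, 8% 4s, 90% 5s); the function names are mine, not anything the site uses.

```python
import math

# Hypothetical "perfect story" vote mix from the post above.
PERFECT_MIX = {2: 0.02, 4: 0.08, 5: 0.90}

def mean_and_sd(mix):
    """Mean and standard deviation of a single vote drawn from the mix."""
    mean = sum(star * p for star, p in mix.items())
    variance = sum(p * (star - mean) ** 2 for star, p in mix.items())
    return mean, math.sqrt(variance)

def standard_error(mix, n_votes):
    """Expected noise in the average score after n_votes independent votes."""
    _, sd = mean_and_sd(mix)
    return sd / math.sqrt(n_votes)

mean, sd = mean_and_sd(PERFECT_MIX)
se_100 = standard_error(PERFECT_MIX, 100)
print(round(mean, 2), round(sd, 2), round(se_100, 3))
```

The per-vote standard deviation works out to about 0.49, so roughly 0.05 of noise at 100 votes. The 2.5% tail figure additionally assumes the average is approximately normally distributed, which is only a rough fit for a mix this lopsided.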

Because that noise decreases with increasing vote counts, the scores for low-voted stories should be more volatile. If we look at the very top scores (page 1 of the toplist), we would expect to find quite a few stories that are only a little bit over 100 votes, because that volatility makes it easier for them to luck out and score well above their long-term average. These lower-vote-count stories have a natural advantage in getting to the toplist.
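A quick simulation makes the volatility point concrete. This is a sketch under the same assumed vote mix, with independent voters and a fixed random seed; the story counts and vote counts are arbitrary choices for illustration.

```python
import random
import statistics

STARS = [2, 4, 5]
WEIGHTS = [0.02, 0.08, 0.90]  # assumed "perfect story" mix

def simulate_scores(n_votes, n_stories, rng):
    """Average score of each simulated story after n_votes random votes."""
    scores = []
    for _ in range(n_stories):
        votes = rng.choices(STARS, weights=WEIGHTS, k=n_votes)
        scores.append(sum(votes) / n_votes)
    return scores

rng = random.Random(0)
low = simulate_scores(100, 500, rng)    # stories just past 100 votes
high = simulate_scores(2000, 500, rng)  # stories with 2000 votes

spread_low = statistics.pstdev(low)
spread_high = statistics.pstdev(high)

# Share of identical-quality stories that happen to sit at 4.90 or better.
share_low_490 = sum(s >= 4.90 for s in low) / len(low)
share_high_490 = sum(s >= 4.90 for s in high) / len(high)
print(spread_low, spread_high, share_low_490, share_high_490)
```

Every simulated story here has the same long-term average, so any difference in how many reach 4.90 comes purely from the vote count.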

And historically, that's what we do see. Looking back to October 11 2021, most of the top-10 stories have fewer than 1000 votes; the top-ranked story has 4.93 off 490 votes.

But looking at N+N today, every single story on the front page of the top list has well over a thousand votes. That in itself is deeply suspicious. If these scores were coming from individuals voting their own opinions, we should still be seeing a strong presence from low-vote stories in the charts, but we're not. I'm struggling to think of a legitimate explanation for that absence.

It is consistent with what we'd expect if somebody were, say, downvoting every story that scored above some threshold. The stories with lower vote counts will be more vulnerable to this, for all the reasons we've already discussed. There may be other scenarios that could cause this, but I'm struggling to think of them.
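To put a number on "more vulnerable", here is a small sketch of what a single 1-star vote does to a story's average at different vote counts. The 4.92 starting score and the 100 and 5000 vote counts are illustrative assumptions, not figures for any particular story.

```python
def score_after_one_star(n_votes, current_mean):
    """Average after one additional 1-star vote is added."""
    return (n_votes * current_mean + 1) / (n_votes + 1)

start = 4.92  # assumed starting score for both stories

drop_small = start - score_after_one_star(100, start)
drop_large = start - score_after_one_star(5000, start)
print(round(drop_small, 4), round(drop_large, 4))
```

The drop is (score - 1)/(n + 1), so the same single vote costs the 100-vote story roughly fifty times as much as the 5000-vote one.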

Do you have a non-manipulation theory for why these patterns would have changed so much in less than two years?

(Unfortunately there's no archive for the Incest/Taboo list so I can't see whether it follows that same pattern.)
a large vote base is precisely what helps insulate the score from manipulation

Indeed. So the fact that only stories with a large vote base are now making it into the top ranks of N+N - when by all rights, stories with a smaller vote base should have an easier time doing so - seems suggestive.
 
I don't have a good alternative explanation for that. Give me some time and I might think of something, but my brain is starting to hurt!
 
Romance all-time toplist: 136 stories at 4.86 or better on April 6, 2015. 201 on February 23, 2017. The links for 2020-2024 are basically broken, but on May 11 2024 there are 75 stories at or above 4.90; going to page 3 breaks everything. But it's fair to say that there were a lot of stories above 4.85 at that time. ALSO EDIT: the link for August 31 2022 works but only shows you the first two pages. There are 22 stories at 4.90 or higher, and the bottom of page 2 (story #100) is 4.89 with 105 ratings.

Edit: What's the general explanation for scores decreasing over time, anyway? As discussed elsewhere, there's more stuff being submitted now than ever. Almost nothing is ever deleted. Over time you'd expect to see a general rise in the scores on the toplists, as you see in romance from 2015-2017, where the number of 4.86+ stories increased by 50%. More submissions means more good submissions that should earn a place on the toplist. But instead there's been a massive pullback in the category.

Third edit: Sci-fi/fantasy: the #50 story in 2017 is rated 4.90. The #50 story is rated 4.92 in 2024 (these are the only links I can get to work). The #50 story today is rated 4.85, and so in fact is every story all the way on up to #11. Five of the stories in the top 11 just broke 100 ratings, so we'll see if they're still up there in the next week or so.
 
One last one, I think: the #250 all-time on Feb 23, 2017 had 161 ratings at 4.89. May 17, 2021, #250 is at 4.89 and 934 ratings. Today, that would get you #5 all-time.
 
Desire & Duende Chapter 2 was published in July 2024 and took ages to get to 100 votes, but eventually got there in February 2025 with a score of 4.92, which it had held pretty consistently over that time. It went in at number 1.

It immediately started falling the same day. Bear in mind it had taken 9 months to get 100 votes, suddenly it was getting 4 each day, all cast at around the same time it seemed (between 6am and 8am CET). . . . It eventually dropped as low as 4.56, by which point it had 126 votes.
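Those two data points are enough to back out roughly what the 26 new votes averaged. This is a sketch only: the displayed scores are rounded to two decimals, so the result is approximate, and `implied_new_vote_mean` is an illustrative helper name.

```python
def implied_new_vote_mean(n_before, mean_before, n_after, mean_after):
    """Average of the votes cast between two (vote count, score) snapshots."""
    new_votes = n_after - n_before
    if new_votes <= 0:
        raise ValueError("n_after must be larger than n_before")
    return (n_after * mean_after - n_before * mean_before) / new_votes

# 4.92 at 100 votes, then 4.56 at 126 votes, as reported above.
avg_new = implied_new_vote_mean(100, 4.92, 126, 4.56)
print(round(avg_new, 2))
```

That comes out at roughly 3.2 per vote across the 26 new votes, against 4.92 for the first 100.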

This same kind of behavior has also been documented by (German language author) TiefImWesten in this thread. He found that beginning on one particular date (12/23/22), the scores of all of his highly scoring stories began to drop consistently by 0.01 points or more per day.

I became intrigued and tracked the contents of the German language last-12-months top list from January to April 2023. I found that up until 03/12/23 all the stories on the list had fairly stable scores. But starting on that date, the scores of almost every story that had more than 100 votes (more than 50 stories in all) began to steadily decline. The scores of the stories that had fewer than 100 votes did not decline. Three stories crossed the 100-vote threshold after 3/12, and their scores began to drop off as soon as they did. These findings are shown here.

The abrupt and seemingly coordinated nature of the downturns in the ratings strongly suggests that they were the result of a deliberate downvoting attack rather than normal voting behavior. Presumably the motivation was to block the stories from entering the all-time top lists.
 
The effort seems way too big for ordinary fanatics. The drop is happening in every category. I suspect it's bots casting votes, although the reason eludes me in that case.
I don't see what is there to be gained by this downvoting of every category.

But again, it's the easy-to-abuse system, along with the complacency of those who should be protecting it, that's truly responsible.
 
Five of the stories in the top 11 just broke 100 ratings, so we'll see if they're still up there in the next week or so.
One day later none of the five stories in the top eleven of the Sci-Fi/Fantasy toplist that had between 100 and 150 ratings remain in the top eleven.
 
(Unfortunately there's no archive for the Incest/Taboo list so I can't see whether it follows that same pattern.)
There's a little bit of an archive! The earliest data point is from aaaaaall the way back on July 10, 2025, when there were 19 stories at 4.86 or higher and a high score of 4.87. By August 10, there were only three at 4.86 and none at 4.87. Today there's only one 4.86 story. I don't know how many 1-stars it takes to move a story down even by 0.01 when that story has 37,000 ratings, but it's a lot.
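A back-of-envelope answer to the "how many 1-stars" question, as a sketch. It assumes the story's true average sits exactly at 4.87 and needs to reach exactly 4.86; since the site only shows two decimals, the real number could be noticeably higher or lower.

```python
import math

def one_star_votes_needed(n_votes, current_mean, target_mean):
    """Smallest k such that (n*current + k*1) / (n + k) <= target."""
    if not 1 < target_mean < current_mean:
        raise ValueError("need 1 < target_mean < current_mean")
    k = n_votes * (current_mean - target_mean) / (target_mean - 1)
    return math.ceil(k)

# 37,000 ratings is the figure quoted above; 4.87 -> 4.86 is an assumed example.
needed = one_star_votes_needed(37_000, 4.87, 4.86)
print(needed)
```

Under those assumptions it takes on the order of a hundred 1-star votes to shift the average by 0.01, and that is if no new 5s arrive in the meantime.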
 
I scanned most of the category toplists, and I have to admit it IS weird to see how 4.85-4.87 seems to be the top for nearly all of them. So, I have to walk back my comments from before a bit. I think regression to the mean probably is the main factor that accounts for where MOST stories end up near the top, but the uniformity of the top limit is suspicious, especially because it seems to be a more recent phenomenon. I'll have to leave it to the more technologically competent to explain what is happening.
 
My highest rated story is at 4.87, but it is recent. Highest rated older story is 4.85. I don't know when things started to change, but I had probably a dozen stories hold scores higher than that for years.
 