Counting your votes using the TLAR method

Duleigh

Just an old dog
Joined
Dec 12, 2004
Posts
6,796
We don't get access to any data other than the story stats that are handed out to us. How many 5-star votes? How many 1-bombs? We may never know unless we sit there refreshing the screen every 5 minutes. I took this issue to a numbers kind of guy, who said the same thing I said: you're not going to be able to figure out the vote distribution unless you actually see the votes come in. He said, "Fortunately there's a system that's used by Amazon's rating analytics, Goodreads' Bayesian smoothing (whatever that is) and YouTube's rating estimators. It's called the TLAR system."

It produces realistic, stable, human-like ratings distributions. It's not your exact ratings, but it's a realistic spread. It starts from a baseline distribution typical for fiction:

5⭐ 70%
4⭐ 20%
3⭐ 7%
2⭐ 2%
1⭐ 1%

We just force Excel to fit these numbers to your actual score, and this is what it looks like:

Screenshot 2025-12-22 162434.jpg

Are these votes right? Probably not, but they're close enough to add up, so who's to say they're wrong? It takes ten calculations per line to get the first row right, but after that it's all copy and paste once the worksheet is set up.

Cell A2 Story Title
Cell B2 Number of votes cast
Cell C2 Story score
Cell D2 =0.70 + (C2 - 4.55)*0.40 ***(This generates 5 star percentage)
Cell E2 =0.20 - (C2 - 4.55)*0.30 ***(This generates 4 star percentage)
Cell F2 =0.07 - (C2 - 4.55)*0.07 ***(This generates 3 star percentage)
Cell G2 =0.02 - (C2 - 4.55)*0.02 ***(This generates 2 star percentage)
Cell H2 =0.01 - (C2 - 4.55)*0.01 ***(This generates 1 star percentage)
Cell I2 =ROUND(D2 * B2, 0) ***(This converts the data from Cell D2 to a count of 5 stars)
Cell J2 =ROUND(E2 * B2, 0) ***(This converts the data from Cell E2 to a count of 4 stars)
Cell K2 =ROUND(F2 * B2, 0) ***(This converts the data from Cell F2 to a count of 3 stars)
Cell L2 =ROUND(G2 * B2, 0) ***(This converts the data from Cell G2 to a count of 2 stars)
Cell M2 =ROUND(H2 * B2, 0) ***(This converts the data from Cell H2 to a count of 1 stars)
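
For anyone who would rather script it than build the worksheet, here is a minimal Python sketch of the same ten cells. The function names `tlar_estimate` and `excel_round` are made up for illustration. The coefficients are copied from cells D2 through H2, and the rounding helper assumes Excel's ROUND rounds halves away from zero.

```python
import math


def excel_round(x):
    """Round to the nearest integer, halves away from zero (assumed Excel ROUND behaviour)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def tlar_estimate(votes, score):
    """Return (shares, counts) keyed by star value, mirroring cells D2:H2 and I2:M2."""
    delta = score - 4.55  # distance from the 4.55 reference score used in the worksheet
    shares = {
        5: 0.70 + delta * 0.40,  # cell D2
        4: 0.20 - delta * 0.30,  # cell E2
        3: 0.07 - delta * 0.07,  # cell F2
        2: 0.02 - delta * 0.02,  # cell G2
        1: 0.01 - delta * 0.01,  # cell H2
    }
    # Cells I2:M2: turn each share into a whole number of votes.
    counts = {star: excel_round(share * votes) for star, share in shares.items()}
    return shares, counts


if __name__ == "__main__":
    shares, counts = tlar_estimate(139, 4.63)
    for star in (5, 4, 3, 2, 1):
        print(f"{star} stars: share {shares[star]:.4f}, estimated votes {counts[star]}")
```

Calling `tlar_estimate(votes, score)` with the values from columns B and C returns the five shares and the five rounded counts for that row.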

You get 5 cells of ugly numbers followed by 5 cells of what your vote count probably is. I hide Columns D through H so it's not so ugly.

And that's it, the TLAR method.

TLAR stands for That Looks About Right and that's pretty much what analytics is.
 

Significance of the 4.55???
 
I'm not claiming (or attempting) to understand the maths, but does the analysis depend on this distribution? Because I don't think it's typical for Lit. And even then, there are probably differences between categories.
The 1% of 1-stars definitely seems low, especially in certain categories.
 
I'm not claiming (or attempting) to understand the maths, but does the analysis depend on this distribution? Because I don't think it's typical for Lit. And even then, there are probably differences between categories.
This is for an average score of 4.55, but it works on lower scores too. I ran a couple of my LW scores through this and they looked as bad as it felt when I first saw the scores.

Remember, this doesn't reflect what you have, it reflects something close.
 
Literotica has a hugbox culture of rating driven by the fact that the cutoff for Red-Hs is 4.5 and it's an outlet for amateur writers helping each other masturbate. Literally. So the expectation is that people simply hit the five star rating for everything they like. Even if it has flaws. Even if they think it could be better, or even should be better.

Because the expectation is that people hit the five star button for everything they like, that is the like button. Which in turn means that the site has a lot of nuance in how much you don't like something, and no nuance at all for how much you do like something.

In practice this means that if there are three stories that came out and you liked them, they all get fives. Obviously you liked one of those the most and one of those the least, but they all get the same rating, because it's the only rating available for works that you liked. On the flip side, if you did not like something, you're free to give it a 4, a 3, a 2, or even a 1. All four of those ratings will reduce the story's average rating and make it less likely that it retains its Red H, but one of those ratings counts four times as much against it as another does. So if there are three stories you didn't like, you can hurt one a little and hurt another a lot.
 
Literotica votes (and I post enough in low-traffic categories to be able to count votes as they come in quite often) follow a J curve, not the distribution you cite.
 
Something looks wrong.

Your first row should have a score of 4.63 with 139 votes, but the distribution gives 137 votes and a score of 4.60
The second row should have a score of 4.81 with 331 votes, but the distribution gives 317 votes and a score of 4.69
 
Literotica votes (and I post enough in low-traffic categories to be able to count votes as they come in quite often) follow a J curve, not the distribution you cite.
This isn't my experience. My stories and the distribution published by @8letters a few years ago follow a geometric distribution very closely. The distribution for individual stories can be modified by other factors, including small vote counts, punitive voting, and deliberate attacks. 8letters' composite distributions typically have a bump at 1* that is not consistent with a geometric distribution, but it's statistically minor.

The typical distribution for fiction in Duleigh's spreadsheet is very close to a geometric distribution.
 
Something looks wrong.

Your first row should have a score of 4.63 with 139 votes, but the distribution gives 137 votes and a score of 4.60
The second row should have a score of 4.81 with 331 votes, but the distribution gives 317 votes and a score of 4.69
Rounding within acceptable parameters. This is an estimate not a hard valuation.

What I do think is interesting is that if a site were to use an algorithm like this to meter excessive votes on either end, the effect would be to flatten the peaks and raise the valleys. Curious...
 
This isn't my experience. My stories and the distribution published by @8letters a few years ago follow a geometric distribution very closely. The distribution for individual stories can be modified by other factors, including small vote counts, punitive voting, and deliberate attacks. 8letters' composite distributions typically have a bump at 1* that is not consistent with a geometric distribution, but it's statistically minor.

The typical distribution for fiction in Duleigh's spreadsheet is very close to a geometric distribution.
YMMV - I can give you dozens of examples. Then it may be a low traffic artifact.
 
We may never know unless we sit there refreshing the screen every 5 minutes.
I thought the same, but recently I came up with a way that gives exact results for all my stories, even for my most frequently rated one (339 votes).

It is not perfect, and will not tell you what the 27th vote was, but it is pretty simple to set up and maintain.

The spreadsheet looks like this:

Screenshot 2025-12-22 at 23.16.57.png

Start with filling cells from column J onwards with 5 until the count in column I is correct, then work backwards changing 5 to a lower value until the column H is correct.
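
The same bookkeeping can be sketched in Python. This is only an illustration under stated assumptions: the site shows the score rounded to two decimals, the votes already recorded are correct, and votes are only added, never removed, between updates. The function name `candidate_new_votes` is hypothetical.

```python
from itertools import combinations_with_replacement


def candidate_new_votes(prev_votes, new_count, new_score, tolerance=0.005):
    """List every set of new 1-5 votes that reproduces the displayed count and score.

    prev_votes: votes already recorded for the story (the filled-in cells).
    new_count, new_score: the stats currently shown for the story.
    tolerance: half of the last displayed digit, assuming a two-decimal score.
    """
    added = new_count - len(prev_votes)
    if added < 0:
        raise ValueError("vote count went down; some recorded votes were removed")
    if new_count == 0:
        return [()]
    prev_sum = sum(prev_votes)
    matches = []
    # Start from all 5s and work down, like changing 5s to lower values by hand.
    for combo in combinations_with_replacement(range(5, 0, -1), added):
        mean = (prev_sum + sum(combo)) / new_count
        if abs(mean - new_score) <= tolerance + 1e-9:
            matches.append(combo)
    return matches


if __name__ == "__main__":
    # Four recorded votes averaging 4.75, then the page shows 5 votes at 4.80.
    print(candidate_new_votes([5, 5, 5, 4], 5, 4.80))
```

When only one or two votes arrive between updates there is usually a single match. The more votes pile up between checks, the more combinations fit the same score, which is why updating often matters.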

I update it most days, but even doing it weekly is not a major effort.
 
Literotica votes (and I post in enough in low traffic categories to be able to count votes as they come in quite often) follow a J curve, not the distribution you cite.
I would think a J curve would be a historical perspective, tracking votes over time where TLAR would be a snapshot at any given point along the J curve. Not sure one excludes the other.
 

Unless I've typo'ed somewhere, these predictions don't seem consistent with the input numbers.

e.g.: for a story with a perfect 5.00, this predicts a breakdown of 88% 5s, 6.5% 4s, 3.85% 3s, 1.1% 2s, 0.55% 1s, which would actually have an average of 4.80.

...and from there, every change of 0.1 to the input only changes the output by 0.054. For a story with a score of 4.00, this will give a breakdown that comes out at 4.26, and so on.
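
A quick way to rerun this check, assuming the formulas are taken exactly as posted for cells D2 through H2: compute the five shares for a given score, then take the star-weighted average of those shares. The helper names here are hypothetical.

```python
def tlar_shares(score):
    """Shares of 5* to 1* votes from the posted D2:H2 formulas."""
    delta = score - 4.55
    return {
        5: 0.70 + delta * 0.40,
        4: 0.20 - delta * 0.30,
        3: 0.07 - delta * 0.07,
        2: 0.02 - delta * 0.02,
        1: 0.01 - delta * 0.01,
    }


def implied_mean(score):
    """Average rating implied by the predicted shares (before rounding to vote counts)."""
    return sum(star * share for star, share in tlar_shares(score).items())


if __name__ == "__main__":
    for score in (5.00, 4.55, 4.00):
        print(f"input {score:.2f} -> implied average {implied_mean(score):.2f}")
```

With the posted coefficients the implied average moves 0.54 for every 1.00 change in the input, so input and output only agree near a score of about 4.57.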
 
I would think a J curve would be a historical perspective, tracking votes over time where TLAR would be a snapshot at any given point along the J curve. Not sure one excludes the other.
Not really, no.

Since a 5 is an upvote and a 4, 3, 2, and 1 are all downvotes, there really is a huge bump at 5 (all the people who want to upvote your story will give you this exact rating), and there's a significant increase as you move down the scale as well. After all, if you did not like something, why would you send two downvotes by hitting 3* when you could send FOUR downvotes by hitting the 1*. It takes a very tempered mind to vote against something but to consciously decide to spike the rating by half as much as you could.
 
Not really, no.

Since a 5 is an upvote and a 4, 3, 2, and 1 are all downvotes, there really is a huge bump at 5 (all the people who want to upvote your story will give you this exact rating), and there's a significant increase as you move down the scale as well. After all, if you did not like something, why would you send two downvotes by hitting 3* when you could send FOUR downvotes by hitting the 1*. It takes a very tempered mind to vote against something but to consciously decide to spike the rating by half as much as you could.
I'm looking at this in relation to what I see with my own stories. I have some loyal followers, so my stories usually start off very well. This is offset by a severe drop as our infamous uno-bombers and other miscreants arrive on the scene, followed by a slow, gradual rise to somewhere around the average for my stories. That's a traditional J curve. The TLAR metric is a snapshot at any given point on that historical path. From that perspective, I don't see them as relaying the same information at all; they are actually complementary.
 
YMMV - I can give you dozens of examples. Then it may be a low traffic artifact.
Maybe.

This is the distribution for all the votes that @8letters recorded a few years ago over the period of a month. These data underweight busy categories like T/I because votes came in too fast to figure out.

all_votes.png
The line in the graph is a best-fit geometric distribution for votes of 2-5. The 1* value on the line is whatever it takes for the total number of votes to work out. In 8letters' data, the 1* votes are about 150 higher than the 2* votes, but given that there are more than 24000 votes in the set, that difference is insignificant. The (r squared) correlation between the actual data and the calculated distribution is 0.9989, meaning that the curve accounts for 99+% of the variation in the data.

My stories follow a similar distribution, but (maybe because of sweeps), the 1* votes on my stories are often lower than predicted by the curve.
meaningless_sex.png

If you want, you can calculate a distribution that might describe your votes, as long as you don't have a low number of votes or any of the other things that cause patterns to deviate. Using the data on @Duleigh's stories, it looks like this:

spreadsheet.png

The five values on the first row are the vote. They're just for illustration. The five values on the second row are the "trial." A 5* vote is the first trial, 4* is the second trial, and so on. The trial is used in calculating the distribution. The numbers under "Votes" and "Ratings" are the story's stats.

The long number to their right is the probability used to calculate the distribution. It is calculated as 1/(6 - Rating). The remaining values give the distribution. They are calculated by rounding the results of Votes*probability*(1-probability)^(trial-1).

The mean of the distribution will equal the story's rating, and the sum of the estimated votes will equal the story's votes within +/- one due to round off. It works pretty well for stories with ratings of 4.3 or more. On lower-rated stories there could be votes for trials greater than 5 that would be neglected, giving a slightly low vote count and high average.
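
The calculation described above can be sketched in Python as follows. It assumes a 1 to 5 rating scale, that trial k corresponds to a vote of 6 - k stars, and that trials beyond the fifth are simply dropped, as noted above. The function names `geometric_votes` and `mean_rating` are hypothetical.

```python
def geometric_votes(votes, rating):
    """Estimate 5* to 1* vote counts from a geometric distribution truncated at 5 trials."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    p = 1.0 / (6.0 - rating)  # probability used to calculate the distribution
    counts = {}
    for trial in range(1, 6):  # trial 1 is a 5* vote, trial 5 is a 1* vote
        star = 6 - trial
        # Python's round() sends exact halves to the nearest even integer,
        # which can differ from a spreadsheet's ROUND on rare .5 cases.
        counts[star] = round(votes * p * (1.0 - p) ** (trial - 1))
    return counts


def mean_rating(counts):
    """Average rating of an estimated vote breakdown."""
    total = sum(counts.values())
    return sum(star * n for star, n in counts.items()) / total


if __name__ == "__main__":
    estimate = geometric_votes(139, 4.63)
    print(estimate, round(mean_rating(estimate), 2))
```

Comparing `mean_rating(estimate)` and `sum(estimate.values())` against the story's stats shows how much the rounding and the dropped trials cost for a given story.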
 