Star Rating Breakdowns

LoquiSordidaAdMe

Is there a way to see how many of each star-rating a story gets? Like how many 1-star votes, how many 2-star votes, etc. for a particular story?

I've been mildly curious about this for a while, and every so often I'll look around to see if I can find this kind of data, but if it's out there, I can't find it.

I know I shouldn't obsess about star-ratings, or favorite-numbers, or things like that and I should just write stories that make me happy. But I'm a bit of a data nerd, and I'd get a kick out of tracking my stories in a spreadsheet and comparing them.

So does anyone know how I can get that data, or if it's even available?
 
The only way that I'm aware of is to track scores and vote counts and then deduce the ratings from them: (new score)*(new # votes) - (old score)*(old # votes) = total stars from the new votes.
 
And watch your ratings constantly. With a new story that can mean every five minutes. :(
 
And of course (New # votes)-(Old # votes)=number of readers voting. If that number isn't 1 then the results may be ambiguous.

If the roundoff is handled correctly then the results are good up to 50 votes cast. Above that, the potential for error increases and above 170 votes cast the chance that your result is wrong exceeds the chance that it's right. I usually find that the upper limit of reliability is a little under 100 votes cast.

Even with all the limitations, you can see the patterns.
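
For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the back-calculation described above. The function name and the example numbers (10 votes at 4.50, then 11 votes at 4.55) are made up for illustration. Scores are passed as strings so that Decimal keeps them exact.

```python
from decimal import Decimal

def stars_from_new_votes(old_score, old_votes, new_score, new_votes):
    """Estimate the stars added between two observations of a story.

    Returns (estimated stars added, number of new votes).
    The estimate is built from rounded scores, so it is not exact.
    """
    old_total = Decimal(old_score) * old_votes
    new_total = Decimal(new_score) * new_votes
    return new_total - old_total, new_votes - old_votes

# Hypothetical observations: 4.50 from 10 votes, then 4.55 from 11 votes.
stars, voters = stars_from_new_votes("4.50", 10, "4.55", 11)
print(stars, voters)  # 5.05 stars from 1 voter, so most likely a 5* vote
```

If more than one vote arrives between observations, the same call only gives the combined star total for those votes, not the individual ratings.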
 
And then the Web site sweeps come along and delete votes of unknown rating.
 
And usually, after the first couple of days, the rating won't move much, so if you want to track it, check every few minutes for those first couple of days. It's a very manual process. Been there, done that.
 
While you can't know the exact composition of the votes your story has received, you can figure out, for a particular score over 4.0, what the minimum number of 5s is that you received. See this article by Carlus Magnus: https://www.literotica.com/s/how-to-analyze-your-scores.

That was a great article! Thanks SD! I can have lots of fun building a spreadsheet with those formulas. It looks like I can't have the data I want, but at least I can calculate a uniform comparison between stories with very different vote numbers without having to check a new story every five minutes.

Thanks to everyone who offered advice and ideas.
 
And of course (New # votes)-(Old # votes)=number of readers voting. If that number isn't 1 then the results may be ambiguous.

If the roundoff is handled correctly then the results are good up to 50 votes cast. Above that, the potential for error increases and above 170 votes cast the chance that your result is wrong exceeds the chance that it's right. I usually find that the upper limit of reliability is a little under 100 votes cast.

50 votes? You should always be able to get the exact total up to 100 votes cast, and sometimes for higher counts up to 199 votes. At 100 votes, each star makes a difference of exactly 0.01 on your rating, which is too large to get lost in rounding.

If anybody wants to build a spreadsheet: for a story with N votes and a rounded score of S, the minimum possible number of stars is roundup((S-0.005)*N) and the maximum is rounddown((S+0.00499999)*N).
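
The same min/max formulas can be sketched in Python. This version assumes the site rounds the average half-up to two decimals (an assumption, not something confirmed in this thread) and uses integer arithmetic so that floating-point error cannot shift a bound. The names star_bounds and ceil_div are illustrative.

```python
from decimal import Decimal

def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)

def star_bounds(votes, rounded_score):
    """Smallest and largest star totals that round to `rounded_score`.

    `rounded_score` is a string such as "4.63". Assumes half-up rounding
    to two decimals, i.e. S - 0.005 <= total/votes < S + 0.005.
    """
    cents = int(Decimal(rounded_score) * 100)        # 4.63 -> 463
    lowest = ceil_div((2 * cents - 1) * votes, 200)   # roundup((S - 0.005) * N)
    highest = ceil_div((2 * cents + 1) * votes, 200) - 1  # largest total below (S + 0.005) * N
    # A vote is between 1 and 5 stars, so clamp to the possible range.
    return max(lowest, votes), min(highest, 5 * votes)

print(star_bounds(80, "4.63"))   # (370, 370): exact
print(star_bounds(199, "4.60"))  # (915, 916): two candidates
```

When both bounds agree, the star total is known exactly; when they differ, the rounded score alone cannot tell the candidates apart.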

And then the Web site sweeps come along and delete votes of unknown rating.

Indeed, though sometimes you can use the same arithmetic to figure out what got swept.
 
Is there a way to see how many of each star-rating a story gets? Like how many 1-star votes, how many 2-star votes, etc. for a particular story?

I've been mildly curious about this for a while, and every so often I'll look around to see if I can find this kind of data, but if it's out there, I can't find it.

I know I shouldn't obsess about star-ratings, or favorite-numbers, or things like that and I should just write stories that make me happy. But I'm a bit of a data nerd, and I'd get a kick out of tracking my stories in a spreadsheet and comparing them.

So does anyone know how I can get that data, or if it's even available?

I'd love to have that!

A simple bar chart.
 
50 votes? You should always be able to get the exact total up to 100 votes cast, and sometimes for higher counts up to 199 votes. At 100 votes, each star makes a difference of exactly 0.01 on your rating, which is too large to get lost in rounding.

If anybody wants to build a spreadsheet: for a story with N votes and a rounded score of S, the minimum possible number of stars is roundup((S-0.005)*N) and the maximum is rounddown((S+0.00499999)*N).



Indeed, though sometimes you can use the same arithmetic to figure out what got swept.

The uncertainty in a single value rounded to 1/100th is +/- 0.005. When you perform an operation using two such rounded numbers (as you do when you calculate the change in the number of stars voted) then the uncertainties are convolved and the possible range in the result is doubled to +/- 0.01. The error distribution also changes from a rectangular probability function to a triangular probability function.

I can give you a graph of the probability of getting a correct result. I've run numerical experiments that show it to be true. The results of back-calculating the scores are only exact to 50 votes, but for most purposes the uncertainties are acceptable to somewhere near 90 votes.

There's also a problem that when you get a vote that doesn't change the score then the only result you can back-calculate correctly is the rounded-off version of the score, which may or may not be the right result.
 
I wish this site allowed one to search for stories within a certain score range or above a certain score. As it is, the only score parameter for search purposes is 4.5, which is around the 75th percentile for most categories but somewhat above that for Loving Wives. It would be interesting to know what the median score is and what the 90th percentile score is without actually having to go through a hundred or so stories and tabulating their scores.
 
I'm not sure I get your point, but if you have 100 votes, say 99 * 5 and 1 * 4, you get a total of 499 "stars". On average, that is 4.99. Hence, you can calculate back what was the number of 4 and 5 stars.

I think that's how it works: the total number of stars is counted and divided by the number of votes, both integers. So you don't have any errors there.

Let's say that you have 86 votes and a total of 347 *'s (rounded-off score of 4.03), then you receive one 5* vote. You now have 87 votes, a total of 352 *'s, and an average score of 4.05.

Now back-calculate that vote. You had 86*4.03 stars (346.58) and now you have 87*4.05 stars (352.35). That's a change of 5.77 stars. It looks like you just got a 6* vote. You didn't.
 
If you go to 'search', then select 'advanced search', you can choose a category and sort on score. Then, without entering any word to filter for, click on the magnifying glass, and you'll get all stories from that category, sorted on score.

Sure. But that doesn't give one the information I'm talking about. You'll get a gazillion stories in a given category ranked from highest score to lowest but it's not helpful in terms of easily providing the information I'm talking about.

What would be helpful would be to be able to do a search for stories within a given score range. One could then easily determine the percentile of the stories in that range. That's not possible with the available search tools.
 
I disagree. If you had a score of 4.03 with 86 votes, you can calculate that it was 347 stars (you need to round). If you get a score of 4.05 with 87 votes, you end up with 352 stars (rounded).

In this case it gives you the right result. More generally, using round-off at intermediate steps in the calculations increases the number of errors. That is not only true in this case, it is a general computational rule; you destroy the precision in your results if you round off at intermediate steps in a calculation.

I've built my spreadsheets both ways. If you round off the estimated number of *'s then you will see errors in the result even below fifty votes.

Getting a 6* result doesn't necessarily mean that the right result is a 5* vote. It could be (for instance) a 3* vote and 4* vote added and a 1* vote removed by the site. The net result looks like an error; it's one vote and 6*'s.
 
I am sort of following the dialogue between NotWise and RubenR -- barely -- but my question is, why would one care about total stars? What's interesting is the distribution of scores, and unless you have followed the scoring of your story very closely from the beginning you cannot discern that from the information this site provides. If you have 100 votes and a score of 4.6, you don't know how the scores are distributed. You can figure out the minimum number and the maximum number of 5s that you have, and the maximum and the minimum number of 1s that you have, but that's it.
 
Tracking the votes is the only way I know of to get the distribution of votes. I think that will remain true unless Manu can provide that information.

Having tracked votes for almost two years, I think most people would find the results uninteresting. One thing that does pop out is that 1* bombs on high-rated stories show up like a sore thumb.

A simple model for the vote distributions is that the # of votes in a category is approximately a constant times the votes in the next higher category. If the average score is above three then the constant is less than one. If the average is three then the constant is 1. If the average is below three then the constant is greater than 1.

Trolling breaks this simple pattern and it seems like the sweeps tend to restore it.

If you were to do a bar graph, then stories with a score over three would have a graph with regularly increasing heights toward 5* votes. A story with a 3* score would have a completely flat graph. A story with a score under three would have a graph with regularly increasing heights toward 1* votes.

I don't have any stories at three or below, so that's a feature of that simple model that I can't vouch for.
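
As a rough illustration of the simple model above, the sketch below builds the five vote fractions from a single ratio and computes the resulting average. The function names and the example ratios are hypothetical; this only shows the shape of the model, not fitted data from any story.

```python
def geometric_distribution(ratio):
    """Fractions of 1* to 5* votes when each category holds `ratio`
    times the votes of the next higher category."""
    # Weight of the 5* bar is 1, 4* is ratio, 3* is ratio**2, and so on.
    weights = [ratio ** (5 - stars) for stars in range(1, 6)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_score(fractions):
    """Average score for a list of fractions ordered from 1* to 5*."""
    return sum(stars * f for stars, f in zip(range(1, 6), fractions))

for ratio in (0.5, 1.0, 2.0):
    bars = geometric_distribution(ratio)
    print(ratio, [round(b, 3) for b in bars], round(mean_score(bars), 2))
```

A ratio below 1 gives bars that climb toward 5* and an average above three, a ratio of exactly 1 gives a flat chart averaging three, and a ratio above 1 gives the mirror image.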
 
I did pull together a quick spreadsheet using the formulas from the CarlusMagnus article mentioned above. The results were kind of interesting once I started calculating percentages so that I could compare a story with under 70 votes to a story with over 700.

But I have to agree with SD above. What I really want to know is if the story got a 4.5 score because half the readers loved it (5*) and half thought it was pretty good (4*), or if it got a 4.5 because almost everybody loved it (5*), but a few thought it was shit (1*).

I don't know that it would make any real difference, but it would satisfy my curiosity. Guess I'll have to get used to disappointment.
 
Odd voting patterns are possible. I haven't posted in LW but I would expect some quirks.

The first story I counted votes on was mis-posted to Romance (my mistake, not the site's). The Romance readers hated it, and the general population seemed to like it. It got about an equal number of 1* votes and 5* votes and not much in between. Then it fell off the front page of the New list while it was still up in the Romance hub and the 1* votes won.

Laurel let me pull that story out of Romance and put it somewhere else. It saved the Romance readers from spraining their mouse finger by clicking on 1* so many times.
 
It's really kind of disturbing how passionate some readers are about the purity of their favorite categories. But that's a discussion for another thread.
 
The uncertainty in a single value rounded to 1/100th is +/- 0.005. When you perform an operation using two such rounded numbers (as you do when you calculate the change in the number of stars voted) then the uncertainties are convolved and the possible range in the result is doubled to +/- 0.01. The error distribution also changes from a rectangular probability function to a triangular probability function.

Not quite.

If we were talking about a situation where the only information you had was the change in the rounded score, then yes, the uncertainty in that change would be +/- 0.01, for the reasons you state. If you see the rounded scores change from 4.63 to 4.60, and that's all the information you have, then the exact change in the score might be anything between (4.634999... - 4.595 = 0.0400...) and (4.625 - 4.6049999... = 0.0200...), i.e. 0.03 +/- 0.01. Just like you say.

And, yeah, IF story scores were uniformly-distributed random variables, then the distribution of a single rounding error would indeed have a rectangular distribution, and the distribution of the sum of two rounding errors would indeed have a triangular distribution.

But that's not the only information we have. We know the number of votes before and after, and that puts constraints on which pre-rounded values are even possible. And story scores aren't uniformly distributed; they're restricted to a discrete set of values, because they have to be integer multiples of (1/votes).

For example, suppose my story has 80 votes. The exact score has to be an integer multiple of 1/80, somewhere between 80/80 (= 1) and 400/80 (= 5).

369/80 = 4.6125, which would round to 4.61. Too small.
370/80 = 4.6250, which rounds to 4.63. Just right.
371/80 = 4.6375, which rounds to 4.64. Too large.

So if my rounded score is 4.63, from 80 votes, I can tell exactly how many stars it had: 370. Any other possibility will end up being too high or too low to round to 4.63. I even know exactly what the rounding error was: 0.005.

Suppose I then get one more vote, and my score changes to 4.60. This time, I know my score's going to be an integer multiple of 1/81.

372/81 = 4.59259... which rounds to 4.59. Too small.
374/81 = 4.61728... which rounds to 4.62. Too large.
373/81 = 4.60493... which rounds to 4.60. Just right.

So I know my story now has a total of 373 stars, and since I know it had 370 before that last vote, I know that the last vote was exactly 3 stars.

Up to 100 votes, you can always do this to get the exact number of stars at any time you have the rounded score and the number of votes. Between 100 and 200, sometimes, depending on the number.
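
The worked example above can be checked by brute force. This is a sketch that assumes the site rounds half-up to two decimals (an assumption); scores are handled as whole hundredths (4.63 becomes 463) to avoid floating-point trouble, and the function names are illustrative.

```python
def rounded_cents(stars, votes):
    """Average score in hundredths, rounded half-up: floor(100*stars/votes + 0.5)."""
    return (200 * stars + votes) // (2 * votes)

def possible_totals(votes, score_cents):
    """Every star total between 1* and 5* per vote that rounds to the given score."""
    return [t for t in range(votes, 5 * votes + 1)
            if rounded_cents(t, votes) == score_cents]

before = possible_totals(80, 463)   # score 4.63 from 80 votes
after = possible_totals(81, 460)    # score 4.60 from 81 votes
print(before, after)                # [370] [373]
print(after[0] - before[0])         # 3, so the last vote was a 3*

# With 100 votes or fewer, no two totals ever share a rounded score.
for n in range(1, 101):
    scores = [rounded_cents(t, n) for t in range(n, 5 * n + 1)]
    assert len(scores) == len(set(scores))
```

Above 100 votes, possible_totals can return two candidates for some scores, which is the "sometimes" case for counts between 100 and 200.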

I can give you a graph of the probability of getting a correct result.

I've run numerical experiments that show it to be true. The results of back-calculating the scores are only exact to 50 votes,

Sorry, but there's an error somewhere in your experiments - whether the issue I mentioned above, or something else.

If I'm incorrect, it should be easy to prove it. You just need to give one example of a case with 100 votes or less, where it's impossible to calculate the exact number of stars from the rounded score. (And, obviously, where that score is in fact possible with the given number of votes - e.g. if a story has exactly ten votes, it's impossible to have an average of 4.18.)

But I guarantee that no such case exists.

There's also a problem that when you get a vote that doesn't change the score then the only result you can back-calculate correctly is the rounded-off version of the score, which may or may not be the right result.

Sorry, I don't understand this. If my story has 4.85 off 99 votes, and then gets another vote but stays at 4.85, I can tell you with certainty that the last vote was a 5. There is no other combination of numbers that can produce this result.
 
We're actually talking about different things.

I am saying that the back calculation of votes has a significant probability of error with more than 50 votes. This graph shows the probability that your back-calculated vote will be correct:

[Attachment: graphs.png]

The jagged blue line shows the probability resulting from numerical experiments in the very simplified case where you have received 1 vote. The smooth dark red line shows the theoretical probability. In either case, the errors start at 50 votes.
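
A small experiment in the same spirit can be sketched as follows. It is not the code behind the attached graph: it assumes half-up rounding, handles a single incoming vote, and weights every possible star total and every vote value equally, which real stories will not do. The function names are hypothetical.

```python
def rounded_cents(stars, votes):
    """Average score in hundredths, rounded half-up."""
    return (200 * stars + votes) // (2 * votes)

def naive_vote(old_cents, old_votes, new_cents):
    """Back-calculate one new vote from two rounded scores (in hundredths)."""
    diff = new_cents * (old_votes + 1) - old_cents * old_votes
    return (diff + 50) // 100  # round to the nearest whole star

def naive_accuracy(old_votes):
    """Share of (star total, vote) cases where the back-calculation is right."""
    correct = total = 0
    for old_stars in range(old_votes, 5 * old_votes + 1):
        old_cents = rounded_cents(old_stars, old_votes)
        for vote in range(1, 6):
            new_cents = rounded_cents(old_stars + vote, old_votes + 1)
            total += 1
            correct += naive_vote(old_cents, old_votes, new_cents) == vote
    return correct / total

for n in (25, 49, 75, 100, 150):
    print(n, round(naive_accuracy(n), 3))

# The 86-vote example from earlier in the thread: 4.03 -> 4.05 looks like a 6*.
print(naive_vote(403, 86, 405))  # 6
```

Under these assumptions the back-calculation is always right while the story has fewer than 50 votes, because the combined rounding error cannot reach half a star; beyond that, wrong answers start to appear.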

What you have said is that you can use forward calculations to determine the exact number of votes. I won't dispute that, but I might suggest that doing so isn't very practical--especially if you don't start by doing the back-calculation first.

This is a screenshot of voting results on one of my stories on the morning it posted.

[Attachment: Screenshot from 2017-12-01 10-56-30.png]

The columns from left to right are the # of votes, score, estimated # of *s, the change in *s, the change in votes, then the estimated # of votes from one to five. If the number of votes cast (5th column) is one, then the change in the *s (4th column) is the estimated vote.

The screenshot shows obvious errors at 64 votes and 72 votes. The errors are only obvious because you can't get 6 *s from one vote. Any of the other estimated votes could also be in error. And yes, I can do a forward calculation to ascertain that the correct vote was probably a 5 in each case. Sweeps hadn't started yet, otherwise the 6 * result could come from a combination of votes and sweeps.

Each of us decides when the level of effort goes beyond the point of diminishing returns. Given the ambiguity that comes from getting multiple votes between observations and from combining incoming votes with sweeps, I'm rarely willing to do a series of forward calculations to check the results. It doesn't add much to what I already know.
 

Attachments: graphs.png; Screenshot from 2017-12-01 10-56-30.png