Percentile ranking

HectorBidon

The intent of the red H is to indicate how well a story is perceived across the entire readership. If everyone voted honestly, then a story scoring > 4.5 would mean that more than half of the raters "loved" it and found it to be "one of the best!" That would indeed be worthy of acclaim—if everyone voted honestly. The problem is that people's votes are distorted by the knowledge that their vote will affect whether the story gets its H or not. People are less inclined to give 4s to stories that really deserve them and more inclined to give 5s to stories that don't.

Another way to measure a story's popularity is by its ranking within its category. Alongside a story's score, the site could show its percentile rank. For example, a 4% rank would mean that the story is in the top 4% of its category.

To see how this might play out, I figured out the percentile ranks for some of my stories. (You can do this on the Search Stories page. You can sort all the stories in a given category by score and then find where your story ranks. It's a bit tedious sifting through all the pages to find a particular story, but it can be done.)


My highest rated story is a 4.77 in First Time. It turns out to be the 136th highest rated story out of 7982 in the category, which means it's in the top 2%. Zowie!

My second highest rated story is a 4.70 in Romance. It ranks 4478 out of 18608, giving it a percentile rank of 24%. Not spectacular, but not bad.

I have another story hovering at 4.48 in Romance. It ranks 10111 out of 18608 for a percentile rank of 54%. Hmmm.

My worst-rated story is a 3.7 in E&V. It ranks 21161 out of 22626, for a percentile rank of 93% (i.e., the bottom 8%). Yikes!
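For anyone who wants to check the arithmetic, here is a minimal sketch of it. The function name `percentile_rank` is just made up for illustration; the rank and total figures are the ones quoted above.

```python
def percentile_rank(rank: int, total: int) -> float:
    """Percentage of the category that sits at or above this rank."""
    return 100.0 * rank / total


# (rank, total) pairs as quoted in the post above
examples = {
    "First Time, 4.77": (136, 7982),
    "Romance, 4.70": (4478, 18608),
    "Romance, 4.48": (10111, 18608),
    "E&V, 3.7": (21161, 22626),
}

for label, (rank, total) in examples.items():
    print(f"{label}: top {percentile_rank(rank, total):.1f}%")
```

A smaller number is better here: rank 1 in a category of 100 stories comes out as the top 1%.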

To get a broader view, I figured out the distributions for some of the categories. This table shows the scores that correspond to the indicated percentile ranks in different categories. For example, the top 1% of stories in Sci-Fi Fantasy have scores at or above 4.91, and the top 50% have scores at or above 4.58. (The precise cutoffs would require more than two decimal places, which are not available on the search page. Also, note that this analysis lumps stand-alone and chapter stories together.)

                                     Percentile
Category        Stories     75     50     25     10      5      1
Sci-Fi Fantasy    20440   4.33   4.58   4.74   4.83   4.86   4.91
Romance           18608   4.24   4.51   4.69   4.79   4.83   4.89
Mature            13397   4.23   4.43   4.58   4.69   4.75   4.83
Incest/Taboo      50363   4.21   4.42   4.57   4.69   4.74   4.81
Exhib & Voyeur    22626   4.12   4.36   4.54   4.68   4.74   4.82
Group             24918   4.09   4.35   4.54   4.69   4.75   4.82
E Coupling        60181   4.07   4.33   4.52   4.67   4.73   4.83
BDSM              37203   4.00   4.27   4.47   4.62   4.70   4.82
First Time         7928   3.98   4.27   4.48   4.65   4.72   4.80
NC/Reluctance     26695   4.00   4.25   4.45   4.60   4.68   4.80
Loving Wives      34244   3.81   4.11   4.32   4.44   4.52   4.69



What this shows is that most stories on the site are rated fairly high. More than half the stories in every category score above 4 stars. In most categories, more than three quarters do. In Romance and Sci-Fi Fantasy, more than half the stories get a red H. In many other categories more than a quarter do. Only in Loving Wives is the red H given to only the top 5% or so.
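Those red H observations can be read straight off the table. The sketch below is only an illustration: the cutoffs are copied from the table, the 4.5 threshold is the "> 4.5" from the opening post, and the helper name `widest_band_with_h` is invented.

```python
RED_H = 4.5  # threshold as described in the opening post ("> 4.5")

# category -> {top-N percent: cutoff score}, copied from the table above
cutoffs = {
    "Sci-Fi Fantasy": {75: 4.33, 50: 4.58, 25: 4.74, 10: 4.83, 5: 4.86, 1: 4.91},
    "Romance":        {75: 4.24, 50: 4.51, 25: 4.69, 10: 4.79, 5: 4.83, 1: 4.89},
    "Mature":         {75: 4.23, 50: 4.43, 25: 4.58, 10: 4.69, 5: 4.75, 1: 4.83},
    "Incest/Taboo":   {75: 4.21, 50: 4.42, 25: 4.57, 10: 4.69, 5: 4.74, 1: 4.81},
    "Exhib & Voyeur": {75: 4.12, 50: 4.36, 25: 4.54, 10: 4.68, 5: 4.74, 1: 4.82},
    "Group":          {75: 4.09, 50: 4.35, 25: 4.54, 10: 4.69, 5: 4.75, 1: 4.82},
    "E Coupling":     {75: 4.07, 50: 4.33, 25: 4.52, 10: 4.67, 5: 4.73, 1: 4.83},
    "BDSM":           {75: 4.00, 50: 4.27, 25: 4.47, 10: 4.62, 5: 4.70, 1: 4.82},
    "First Time":     {75: 3.98, 50: 4.27, 25: 4.48, 10: 4.65, 5: 4.72, 1: 4.80},
    "NC/Reluctance":  {75: 4.00, 50: 4.25, 25: 4.45, 10: 4.60, 5: 4.68, 1: 4.80},
    "Loving Wives":   {75: 3.81, 50: 4.11, 25: 4.32, 10: 4.44, 5: 4.52, 1: 4.69},
}


def widest_band_with_h(row):
    """Largest listed top-N% band whose cutoff score is above RED_H, or None."""
    bands = [pct for pct, score in row.items() if score > RED_H]
    return max(bands) if bands else None


for category, row in cutoffs.items():
    print(f"{category}: at least the top {widest_band_with_h(row)}% have a red H")
```

The table only lists six cutoffs per category, so this gives a lower bound on the share of H stories, not the exact share.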

Would I like the site to show percentile ranks? Maybe. As an author it's neat to see that one of my stories is a top 2%er. On the other hand, it's embarrassing to have everyone see that another of my stories is a bottom 8%er. Mostly, I guess, it's sobering to see that most of my stories are pretty much just in the middle of the pack.

As a reader? Would it be easier to judge a story's hotness from its percentile ranking than from its score alone? Hard to say. I would still probably mostly choose stories based on title and blurb, but the percentile ranking would supply marginally useful information.
 
How did you figure this out? Did you take a sampling of each category or calculate based on all stories within categories?
 
That's real interesting. Giving the percentile rank could be a little like "grading on the curve."
 
Sci fi fantasy is heavily skewed towards high scores because of the endless chapter series, many of which themselves are endless.

It should not be included in any type of scoring analogy because it's a joke...until the words The End are typed in the final chapter it's not even a complete story.

A couple of other takeaways: it proves how soft the voting is here. The average score is over four, but the forums are littered with whining about bombs and trolls.

The other is that the only thing I find surprising is the low average for First Time. I never thought of it as a tougher voting category.
 
Good data

Interesting data, thank you for sharing.

Frankly, I'm surprised the 75% cutoff score for LW is that high, with all the "cuck bombing" that goes on. 18/25 new stories currently on the front page are below 3.81.

I guess scores normalize over time as people find them later and upvote them, or bombs get swept, maybe.
 
Honestly, it would provide more information to readers, as the scores are so grouped above four that they blend together. Get rid of the red H, and it's a sea of almost pointless numbers. The percentage range gives a better idea of where it stands amidst its peers.

It would also be much more difficult to manipulate for boosters and trolls. They wouldn't have an exact target to aim for, and every vote would have less impact. If enough people are fucking around, they could even accidentally cancel each other out.

However, it would involve changing the code on a buttload of pages, and probably a lot more server ticks to calculate. That means it wouldn't update as often as things do now. That's another point in frustrating trolls, but a severe negative toward implementation.

It's also something that no author could calculate for themselves. Without the ability to see how their "score" is determined, many are going to start screaming witchcraft and discrimination.
 

It's also entirely possible for your "score" to change without having a vote gained or lost, because it's determined by relation to the other stories in the category, and their changes affect your percentile ranking as well. That smacks even more of witchcraft to anyone who doesn't understand ( or care to learn ) what's going on.
 
It's also something that no author could calculate for themselves. Without the ability to see how their "score" is determined, many are going to start screaming witchcraft and discrimination.

It's not that hard to calculate. I just did it for my one-and-only First Time story and found it just outside the fifth percentile -- consistent with the table in the OP.

One thing I really don't know is how ties are broken. There are quite a few stories with the same score. They may be ranked by votes (as in the top lists), but votes aren't shown, so I can't be sure.

You're probably right, though, that it would be a lot of computational ticks to do it for every story site-wide.
 
It's also entirely possible for your "score" to change without having a vote gained or lost, because it's determined by relation to the other stories in the category, and their changes affect your percentile ranking as well. That smacks even more of witchcraft to anyone who doesn't understand ( or care to learn ) what's going on.

Come again on how the raw rating score of a story can change purely on the basis of the raw rating score of other stories in the category. That's the first I've heard of that at Literotica and it doesn't sound the least bit legitimate. You're just talking about story ranking, not its rating score, right?
 
One thing I really don't know is how ties are broken. There are quite a few stories with the same score. They may be ranked by votes (as in the top lists), but votes aren't shown, so I can't be sure.

Number of relative votes, I think. The site knows the vote score.
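If ties really are broken by vote count, the ordering would look something like the sketch below. That rule is only a guess from the posts above, not anything the site has confirmed, and the `Story` type and sample data are invented for illustration.

```python
from typing import NamedTuple


class Story(NamedTuple):
    title: str
    score: float
    votes: int


def rank_stories(stories):
    """Highest score first; among equal scores, more votes first (assumed rule)."""
    return sorted(stories, key=lambda s: (-s.score, -s.votes))


# Made-up sample data: A and B are tied on score
stories = [
    Story("A", 4.70, 120),
    Story("B", 4.70, 950),
    Story("C", 4.82, 40),
]

for position, story in enumerate(rank_stories(stories), start=1):
    print(position, story.title, story.score, story.votes)
```

With this rule B lands ahead of A because it has more votes at the same score.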
 
I love data. I'm kind of a geek about it. But the more I look at this data the more I gravitate toward the view of those who think the Site should just get rid of the red H. It seems clear to me from looking at the data that scores are in part the product of gaming about the red H. Nice voters won't give anything under a 5 because they don't want to deny the author the red H. Nasty voters don't care and 1-bomb freely. The result? A lot of votes that would be somewhere in the middle never happen. Scores are skewed by the weird and unpredictable differences in reader behavior from one category to another. A red H in one category means something completely different from a red H in another category.

I'm not sure it would be useful for the Site to give us percentile ranking data, either. My top-rated story rates at 1% in the BDSM category. My lowest-rated story scores around the bottom 20% in the Loving Wives category. Are they that different in terms of quality? No. It's all a matter of meeting category expectation and reader-bombing, which is only loosely connected to story quality. It might be better if the Site gave us raw scores but left it to us to do with them whatever we wanted rather than frosting those scores with red Hs as though they meant something.
 
I'm not sure it would be useful for the Site to give us percentile ranking data, either. My top-rated story rates at 1% in the BDSM category. My lowest-rated story scores around the bottom 20% in the Loving Wives category. Are they that different in terms of quality? No. It's all a matter of meeting category expectation and reader-bombing, which is only loosely connected to story quality. It might be better if the Site gave us raw scores but left it to us to do with them whatever we wanted rather than frosting those scores with red Hs as though they meant something.

The ratings don't measure story quality, they just measure popularity. If you want a system to measure story quality, then the ratings would have to be given by a panel of trained judges.
 
Come again on how the raw rating score of a story can change purely on the basis of the raw rating score of other stories in the category. That's the first I've heard of that at Literotica and it doesn't sound the least bit legitimate. You're just talking about story ranking, not its rating score, right?

Not as things are now. Under this hypothetical percentile scoring system. Your story could be stagnant, but high votes coming in on other stories would cause your stagnant story to slip in the percentile rankings, because others improved and moved the curve. Your raw score wouldn't change, but since the displayed score would be based upon the rankings of the entire category, that would.

It could also go the other way, and a bunch of low scores could cause a stagnant story to rise in the percentile rankings, but that would more or less be ignored except as an example of the witchcraft being used to cheat them on anything that was going down.

You see the same thing now when someone has the same number of votes swept as new votes cast, leaving them with the same vote total and a different score than the last time they looked.

It would be worse in this hypothetical, because your raw score could actually go up, and as long as the overall score went up more your percentile ranking would still slip. ( Though not necessarily in a visible way. It could be in the invisible percentages that aren't displayed. Possibly decimals, but more likely something like 5% jumps )

That's assuming your raw score is still available to see. Odds are that if it was, it wouldn't be for long for exactly this reason. SOL uses a convoluted mathematical weighting system that also partially bases your displayed score upon the scores of other stories within a certain timeframe. For a while, you could see your raw score, but the cries of witchcraft based upon the differences caused them to remove the raw score from view. Now you can only see what the formula comes up with even in your private control panel.
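The slipping described above is easy to show with a toy category. This is a rough sketch with invented scores, and `percentile_of` is a made-up helper that counts the share of stories scoring at or above a given score.

```python
def percentile_of(score, category_scores):
    """Percent of stories in the category scoring at or above `score`."""
    at_or_above = sum(1 for s in category_scores if s >= score)
    return 100.0 * at_or_above / len(category_scores)


mine = 4.50  # this story gets no new votes at all

# Invented ten-story category, including the 4.50 story itself
before = [4.80, 4.60, 4.50, 4.40, 4.30, 4.20, 4.10, 4.00, 3.90, 3.80]

# Later: the 4.40 and 4.30 stories pick up high votes and pass it
after = [4.80, 4.60, 4.50, 4.55, 4.52, 4.20, 4.10, 4.00, 3.90, 3.80]

print(f"before: top {percentile_of(mine, before):.0f}%")
print(f"after:  top {percentile_of(mine, after):.0f}%")
```

The raw score stays at 4.50 in both lists; only the position relative to the rest of the category moves. In a category with thousands of stories the shift from a couple of other stories moving would be far smaller than in this ten-story toy.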
 
Umm, OK. Of course hypothetical discussions on changes at Literotica aren't worth shit.
 
Umm, OK. Of course hypothetical discussions on changes at Literotica aren't worth shit.

So, since the whole discussion is based upon a hypothetical change, your reason for participation here is..?
A) Not actually reading the thread and just responding to what you think you see.
B) An excuse to rag on me
C) An excuse to rag on the site
D) All of the above :D
 
Even though it wouldn't be all that hard to code... er... implement. New tables, new algorithms, new update procedures, etc. The cost of updates in cpu and disk read/write time might be a little prohibitive.
 
Sci fi fantasy is heavily skewed towards high scores because of the endless chapter series, many of which themselves are endless.

...

The other is that the only thing I find surprising is the low average for First Time. I never thought of it as a tougher voting category.

Wonder if that's the other side of the same coin. I would expect chapter series to be scarce in First Time for obvious reasons.

One option would be to calculate the percentiles separately by chapter number, so a Chapter 2 is rated against the other Chapter 2s in category rather than the whole category, etc. etc. Though in practice you'd probably want to lump chapters 4+ together; by that point readership has probably levelled off.

I don't think it's only the chapter effect in SFF, though. My highest-rating story is in SFF, and that's a one-shot. Or maybe that really is the best I've written.
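Ranking by chapter number with chapters 4+ lumped together could be sketched like this. Everything here is hypothetical: the tuple layout, the function names, the sample titles, and the choice to treat a one-shot as chapter 1.

```python
from collections import defaultdict


def chapter_bucket(chapter: int) -> str:
    """Chapters 1-3 each get their own bucket; 4 and up share one."""
    return str(chapter) if chapter < 4 else "4+"


def percentiles_by_chapter(stories):
    """stories: list of (title, chapter, score) within one category.

    Returns {title: percentile rank within its chapter bucket}.
    """
    buckets = defaultdict(list)
    for title, chapter, score in stories:
        buckets[chapter_bucket(chapter)].append((title, score))

    result = {}
    for members in buckets.values():
        ordered = sorted(members, key=lambda m: -m[1])  # best score first
        for position, (title, _score) in enumerate(ordered, start=1):
            result[title] = 100.0 * position / len(ordered)
    return result


# Invented sample data; the one-shot is treated as chapter 1 (an assumption)
stories = [
    ("Saga Ch. 02", 2, 4.80),
    ("Other Ch. 02", 2, 4.60),
    ("Saga Ch. 05", 5, 4.85),
    ("Epic Ch. 09", 9, 4.90),
    ("One-shot", 1, 4.70),
]

for title, pct in percentiles_by_chapter(stories).items():
    print(f"{title}: top {pct:.0f}% of its bucket")
```

Here the Chapter 5 and Chapter 9 entries compete only with each other, so a late chapter with a small loyal readership is no longer measured against stand-alone stories.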
 
Wonder if that's the other side of the same coin. I would expect chapter series to be scarce in First Time for obvious reasons.

One option would be to calculate the percentiles separately by chapter number, so a Chapter 2 is rated against the other Chapter 2s in category rather than the whole category, etc. etc. Though in practice you'd probably want to lump chapters 4+ together; by that point readership has probably levelled off.

I don't think it's only the chapter effect in SFF, though. My highest-rating story is in SFF, and that's a one-shot. Or maybe that really is the best I've written.

I don't get it, but there might be more chapter stories in First Time than you think. Right now, eleven of the twenty five stories on the hub are chapter stories.

Like LW, I think SF&F is a world of its own. It's just that SF&F is a friendly world, and LW isn't. When I post a new story in Romance, I/T, or EC (for example) I can get cross-over reads to a lot of other categories -- except SF&F. My stories there languish. I think people who commonly read other hubs rarely cross into SF&F
 
So, since the whole discussion is based upon a hypothetical change, your reason for participation here is..?
A) Not actually reading the thread and just responding to what you think you see.
B) An excuse to rag on me
C) An excuse to rag on the site
D) All of the above :D

Oh, just stuff that line of defensive attack.

I provide a whole hell of a lot more product here than you do.
 
Oh, just stuff that line of defensive attack.

I provide a whole hell of a lot more product here than you do.

Just thought I'd quote this before you decide to edit it. Otherwise, folks can see for themselves what happened here :D

I don't get it, but there might be more chapter stories in First Time than you think. Right now, eleven of the twenty five stories on the hub are chapter stories.

Like LW, I think SF&F is a world of its own. It's just that SF&F is a friendly world, and LW isn't. When I post a new story in Romance, I/T, or EC (for example) I can get cross-over reads to a lot of other categories -- except SF&F. My stories there languish. I think people who commonly read other hubs rarely cross into SF&F

+ Non-Human and to a lesser extent, Horror. Those three categories are a community all their own that cross over with each other, but not other categories.

It isn't just the chapters that pump up the scores there. The community is overall more tolerant ( especially when warned ) of varied content, resulting in fewer bomb votes. There are numerous one-shots that are only .02 off from the top, despite the well documented attrition of every other stat while the score rises, and a dominance of multi-chapter stories. There are also many shorter ( 5 chapter or so ) stories that have stood the test of time to maintain their place over many years like the work of Mack the Knife, Evil Alpaca, and more.

The trade-off is that there aren't many readers in that little community. You get a little more on SOL where the trend toward longer stories is even greater than here, and thus lends itself to world-building, and significantly less on Lush, where poetry and flash/microfiction outperform it.

Of course, that doesn't really matter much in the grand scheme of this proposal, because it started with the premise that the ranking would be within each category and wouldn't have the categories competing against each other. So Sci-Fi's friendly readership wouldn't disadvantage Incest or LW or any other category. A story that's top 1% in Incest is still top 1% even if the score is .10 less than the top 1% story in Sci-Fi & Fantasy.

If an overall toplist was maintained, that's a whole other kettle of fish that would have to be sorted out under the proposed percentile ranking system.
 
Let me take it a step even further, RR. What you posted to me is a downright disgusting line to take with folks providing this Web site's product for free. That isn't the Web site's most endearing position toward the suppliers of its profit. And you can take that and stick it where the sun don't shine.
 
I don't get it, but there might be more chapter stories in First Time than you think. Right now, eleven of the twenty five stories on the hub are chapter stories.

Yeah, there are some (I guess slow-burn pieces?) but I think 11/25 is still low relative to most other categories. For Lesbian it's 21/25, and one of the other four is a sequel. Celeb/Fanfic is 24/25.

Like LW, I think SF&F is a world of its own. It's just that SF&F is a friendly world, and LW isn't. When I post a new story in Romance, I/T, or EC (for example) I can get cross-over reads to a lot of other categories -- except SF&F. My stories there languish. I think people who commonly read other hubs rarely cross into SF&F

Yeah, that's consistent with what I've seen in my view counts. My SF&F piece is a lesbian romance, and people who enjoy my stuff in LS would probably enjoy that one too, but I really have to point them at it to see any crossover readership.
 
How did you figure this out? Did you take a sampling of each category or calculate based on all stories within categories?

Based on all stories. For example, Search Stories showed 18608 stories in Romance. The top 10% would be the top 1861 stories. So I looked up the 1861st to get its score. And so on.
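In code form, the lookup positions work out as below. This is just a sketch of the arithmetic; rounding up to the next whole story is an assumption that happens to reproduce the 1861 figure, and `cutoff_rank` is an invented name.

```python
import math


def cutoff_rank(total: int, top_percent: int) -> int:
    """Rank of the last story still inside the top `top_percent` (rounded up)."""
    return math.ceil(total * top_percent / 100)


total_romance = 18608  # story count shown by Search Stories for Romance

for pct in (75, 50, 25, 10, 5, 1):
    print(f"top {pct}% -> look up story #{cutoff_rank(total_romance, pct)}")
```

The score of the story at each of those positions is what goes into the table.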

One thing I really don't know is how ties are broken. There are quite a few stories with the same score.

I just assumed that the scores are stored with more than 2 decimal places in the database, but who knows?

One option would be to calculate the percentiles separately by chapter number, so a Chapter 2 is rated against the other Chapter 2s in category rather than the whole category, etc. etc.

Or rank chapter stories separately from stand alones, or something. I heartily agree. This can't be done easily on the Search Stories page, but presumably it could be done with direct access to the database.

It's also entirely possible for your "score" to change without having a vote gained or lost... witchcraft

Your ranking could change without a score change in the same way that a story can be bumped out of a toplist if a better one comes in. But this would be mild and infrequent. 36% to 37% type of thing.

The percentile rank isn't intended to replace the score, just to supplement it.

However, it would involve changing the code on a buttload of pages, and probably a lot more server ticks to calculate.

The calculations wouldn't be onerous. One more field in the database and one sort per category, which could easily be done once per day. The hard part would be modifying the code to display the rankings on the various pages.
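As a rough sketch of what that once-a-day job might look like: group by category, sort each group once, and write one extra field per story. The dict layout and the `percentile` field name are assumptions for illustration, not the site's actual schema.

```python
import math
from collections import defaultdict


def refresh_percentiles(stories):
    """stories: list of dicts with 'category' and 'score' keys.

    Adds or overwrites a 'percentile' key on each story, in place.
    """
    by_category = defaultdict(list)
    for story in stories:
        by_category[story["category"]].append(story)

    for members in by_category.values():
        # One sort per category, best score first
        members.sort(key=lambda s: s["score"], reverse=True)
        for position, story in enumerate(members, start=1):
            # Whole-number percentile, rounded up so it is never 0
            story["percentile"] = math.ceil(100 * position / len(members))


# Invented sample data
stories = [
    {"title": "R1", "category": "Romance", "score": 4.80},
    {"title": "R2", "category": "Romance", "score": 4.20},
    {"title": "R3", "category": "Romance", "score": 4.60},
    {"title": "R4", "category": "Romance", "score": 3.90},
    {"title": "F1", "category": "First Time", "score": 4.77},
]

refresh_percentiles(stories)
for story in stories:
    print(story["title"], story["category"], f"top {story['percentile']}%")
```

The cost is dominated by the sorts, so the work grows roughly with the number of stories per category, which fits the once-per-day idea.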

It might be better if the Site gave us raw scores but left it to us to do with them whatever we wanted rather than frosting those scores with red Hs as though they meant something.

This seems to be a growing consensus. The percentile rankings would be intended to help us interpret the raw scores in different categories.

The ratings don't measure story quality, they just measure popularity.

Amen.

Umm, OK. Of course hypothetical discussions on changes at Literotica aren't worth shit.

Granted.

[Addressing another topic in another thread. . .]
But it's still a good topic, hell, at least it's not about scores and H's and votes.

Different strokes.
 
Your ranking could change without a score change in the same way that a story can be bumped out of a toplist if a better one comes in. But this would be mild and infrequent. 36% to 37% type of thing.

The percentile rank isn't intended to replace the score, just to supplement it.


The calculations wouldn't be onerous. One more field in the database and one sort per category, which could easily be done once per day. The hard part would be modifying the code to display the rankings on the various pages.

An improved replacement for the H rather than a replacement for score is a different animal indeed.

Not so sure about your assumptions concerning the database, though. You're assuming that they're storing an averaged score for simple display purposes, and only recalculating and replacing the stored value when the # of votes change. It's sensible, but the website debuted a long, long time ago, before sensible was commonplace. LOL

There's also already a lot of daily maintenance going on that slows the site down to a crawl in the wee hours of the morning for an hour or so. Adding another full query and calculation set to that...

You still get the same cries of witchcraft and discrimination when the raw average updates in real time as it does now, and the percentile only moves once per day. Less of an issue than if it was the only displayed statistic, but conspiracy theorists gonna theorize.

One question that comes up is "What about new stories?" If you're only updating once per day, that leaves them outside of the ranking throughout day 1. Not necessarily a bad thing, but if readers key into it as a primary selection criterion like they do the H, they could find that irritating.

Also adds more traps to every place where the statistic is displayed to replace it with a default none-yet, where the H is an image that can be displayed or not without causing huge display oddities. That adds up quickly when you start looking at things like the hubs, where new stories could potentially appear in up to 4 sections.
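The "none-yet" default could be handled in one small helper that every page calls, something like the sketch below. The function name and the placeholder wording are invented for illustration.

```python
from typing import Optional


def percentile_label(percentile: Optional[int]) -> str:
    """Text to show next to a score; unranked stories get a placeholder."""
    if percentile is None:
        return "not yet ranked"
    return f"top {percentile}%"


print(percentile_label(4))     # a story already covered by the daily run
print(percentile_label(None))  # a story posted since the last run
```

Keeping the check in one place would at least mean each hub section does not need its own special case.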

That reduces it mostly to implementation and possible server strain as far as major pitfalls. Unfortunately, implementation is a huge one in the midst of a complete live-site update that's already underway.

I think it would be beneficial to readers by spreading out the bunched up scores and not concentrating their attention on a small percentage with easily manipulated bling like the H. If they make use of it as a selection criterion, the benefits of a more difficult to manipulate primary statistic come into play for author angst reduction.

I think just removing the H is the better option, though. Easier to implement, and redirects attention to the title and description that are otherwise prominent compared to the statistics without that bright red bling drawing the eye.
 