8L Stats: Effect of dialogue occurrence on story rating

I see this as being an issue with trying to do this analysis on anything I've written, because I was taught that new dialogue means a new paragraph. So
"Yes?"
"Yes, what?" His voice was low and commanding.
"Yes, sir."
"That's better."

Is now 4 paragraphs, with 100% dialogue per paragraph. But any other equally long string of words would probably be 1 paragraph.
But I think this largely matches the reader's perception looking at the page.
 
I liked writing my story with no dialogue. It was fun, it was fast paced. It needed a bit of polishing but I still think overall it stands well. Which was the point. Fast paced narrative; something I needed to work on so that was the result.
 
What's the point of these things?

Are people going to change how they write over things like this?

Going to look at how long their WIP is and try to pad it to be longer, see how much dialogue and try to add or cut it over a graph?

I know for some it's just 'interesting' to see it, but for others, things like this are detrimental to actual writing.

But as time goes on I've begun to question more and more how serious a lot of people are about their actual writing as opposed to stats and how popular they are in the forum.
Honestly, I appreciate things like this to see where my writing *should* fall to be considered average for the site. I'd never change my writing based on this data. But mostly when I check these things out, I find that my stories tend to be an outlier in the "Statistically, this shouldn't have performed well/poorly because of x and y, but it did better/worse than the averages according to these numbers. I wonder why that is."
 
Trying to suss out independence of variables is going to be hard. Maybe impossible with the scarce data we have in many buckets.

Does Romance give higher scores, which it does, because that is the nature of the audience (we love our HEA, including good scores) or because it has longer stories on average? How much do the higher scores in Romance (and N&N), where stories are longer, drive the higher ratings for longer stories?

Are romance writers hustlers better writers, so get better scores? Someone had already pointed out that better writers may be more likely to write longer stories.

This is beyond my statistical analysis wizardry to figure out. I might ask my son. Or even better, @TheRedLantern?
I think another important variable, especially where dialogue is concerned, is the POV of the story. I know that I tend to have far less dialogue in a story written in first-person versus third-person.
 
I think another important variable, especially where dialogue is concerned, is the POV of the story. I know that I tend to have far less dialogue in a story written in first-person versus third-person.
Interesting. Most everything I've written is in first person, but by 8letters' methodology all my stories should come in at 75% dialogue or above. But I like writing dialogue and it tends to be a crutch for me; when I don't know what should happen, just have two characters talk to each other.
 
What's the point of these things?

Are people going to change how they write over things like this?
lovecraft's question is excellent and it's a common question to ask when these types of analyses are done.

The simple answer is that we have the data and we're seeing what it tells us when we ask very, very, very specific questions. That's really it.

Someone who changes how they write because of these insights is following a guru except this guru speaks with theta, mu, and sigma, and usually speaks nonsense.

Someone who asks themselves if they should change how they write in a specific way because of these insights, whatever they decide, is hearing the message.
 
Honestly, I appreciate things like this to see where my writing *should* fall to be considered average for the site. I'd never change my writing based on this data. But mostly when I check these things out, I find that my stories tend to be an outlier in the "Statistically, this shouldn't have performed well/poorly because of x and y, but it did better/worse than the averages according to these numbers. I wonder why that is."
Agree with this. Stats are amusing, in an odd way, at best. My focus is on the best story I can do, and what that means to me is variable for each tale.

Just for the record, one of my 'top three' stories (by Lit metrics of score, views, etc.) would have zero dialogue if measured by this test. The characters 'talk' but all indirectly, no direct quotes. It bothered one commenter but was a stylistic choice I made which allowed a unique narration. And I'd do it again if the story would benefit.
 
I think maybe the most interesting point to me is that the average story* with >25% dialogue longer than two pages is rated highly enough to be comfortably in Hot.

*Average bucket, anyway. Without knowing how many items are in each bucket we can't know what the average story looks like from this data. But it's still true if you look just at the 25-75%, 3-5 pages sample.
 
Quantity versus quality.

I see that a few others have postulated in this thread about whether or not those who write longer stories have a tendency to be more skilled. I don't know, because I find it more challenging and less rewarding to write shorter stories myself, so kudos to those who do it well.

My stories that are less than five Lit pages are all my lowest rated. Some of that can be attributed to these stories deviating from what my followers expect from me, and other factors beyond the quality of the writing, but I know that I didn't get as excited about them as I have with longer and more involved works.
 
I see this as being an issue with trying to do this analysis on anything I've written, because I was taught that new dialogue means a new paragraph. So
"Yes?"
"Yes, what?" His voice was low and commanding.
"Yes, sir."
"That's better."

Is now 4 paragraphs, with 100% dialogue per paragraph. But any other equally long string of words would probably be 1 paragraph.
Extended dialog in that form gets very fuzzy trying to keep track of who is saying what. Add in a third or fourth character and it gets near impossible without a lot of Bob said, Carol added, Ted replied, Alice joked ...
 
Interesting. The variations at high % are large enough to make me think the sample sizes there were pretty small and the result not really meaningful.
I've added to my post the story counts. As I said in that add, 35%-45% has the most stories with the extremes having much smaller counts.
 
I went through 32,229 stories from 8/30/2023 to 3/26/24 and looked at how many paragraphs in the story contained dialogue. I defined "contained dialogue" as having a double quote ("). Yes, I know some writers use single quotes instead of double quotes to enclose dialogue, but hopefully that is a very small percentage of the stories.
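
For anyone who wants to try this definition on their own text: the actual tooling behind these numbers hasn't been shared, so the following is only a rough sketch of the rule as described. It assumes the story is already available as a plain string, that each non-empty line is one paragraph, and that dialogue uses straight double quotes. The function name `dialogue_paragraph_share` is made up for illustration.

```python
def dialogue_paragraph_share(story_text: str) -> float:
    """Return the fraction of paragraphs that contain a double quote (")."""
    # Assumption: each non-empty line of the text is one paragraph.
    paragraphs = [line.strip() for line in story_text.splitlines() if line.strip()]
    if not paragraphs:
        return 0.0
    # A paragraph "contains dialogue" if a straight double quote appears anywhere in it.
    with_dialogue = sum(1 for p in paragraphs if '"' in p)
    return with_dialogue / len(paragraphs)


sample = '''She opened the door.
"Yes?"
"Yes, what?" His voice was low and commanding.
He waited.'''

print(dialogue_paragraph_share(sample))  # 2 of 4 paragraphs -> 0.5
```

Curly quotes and single-quoted dialogue would be missed by this sketch, which is the same limitation noted above.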

As usual, page length is the biggest driver of rating. Here's the average rating by page length and percentage of paragraphs that contain dialog:
View attachment 2578369
30% to 70% seems to be the sweet spot, with a drop off when the story is out of that range. The longer the story, the less significant the effect.

Edit: Adding story counts
View attachment 2578502
35-45% dialog is the peak of the distribution, with things falling off on either side.

As a data nerd, HOLY SHIT! THIS GOT ME HARDER THAN ANYTHING NEW I'VE READ IN A WHILE!
 
I'm not sure exactly how to go about it, but it might be interesting to differentiate between stories where all dialogue is its own paragraph versus stories where dialogue is interspersed within the narrative paragraphs. The difference in the total number of paragraphs in otherwise identical stories will throw off grouping by percentage.
Here is the data for paragraphs that are only dialogue, i.e. starts with a double quote, ends with a double quote, and no double quotes in the middle:
1763482051956.png
1763482079417.png
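
The stricter "only dialogue" rule described above can be sketched the same way. This is not the software actually used for the tables, just one possible reading of the rule: the paragraph starts with a double quote, ends with a double quote, and has none in between. The name `is_dialogue_only` is hypothetical.

```python
def is_dialogue_only(paragraph: str) -> bool:
    """True if the paragraph is a single quoted span and nothing else."""
    p = paragraph.strip()
    # Need at least an opening and a closing quote.
    if len(p) < 2:
        return False
    # Starts and ends with a double quote, with no double quotes in the middle.
    return p.startswith('"') and p.endswith('"') and '"' not in p[1:-1]


print(is_dialogue_only('"Yes, sir."'))                                     # True
print(is_dialogue_only('"Yes, what?" His voice was low and commanding.'))  # False
print(is_dialogue_only('"Fine," he said. "Go."'))                          # False
```

Under this reading, any paragraph with a speech tag or a second quoted span is excluded.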
 
When I'm actually writing a story, I couldn't care less about any of it. Most of my stories I'm experimenting with something anyway, so whatever that is, is the focus. It's all practice for the next one. However, it is hard once I release a story to watch it tank, no matter how hard I try not to care. It takes a long time for me to get over it even if I do tell myself it doesn't matter; I just can't quite convince myself of it yet.

I also agree that things like this are splitting hairs and border on pointless, especially for new writers. It's noise and it's confusing.

I have a very different pov. I've been in the publishing industry, and while this might feel like noise to authors, stuff like this is INVALUABLE to publishers and editors, particularly when it comes to the 'shape' of stories.

While yes, chasing trends etc = bad, but there is a very real human insight in there and just dismissing it out of hand is doing the reader an injustice. Now if you are thinking of the story in terms of 'art' you could ignore all of this, but if you tend more towards the 'craft' side of things, this can help sharpen your thinking.
 
I can redo the table at any level of granularity. What would you like to see? 2%? 3%?
I guess I should have phrased it more open-endedly. I was curious as to how much of the source data you kept and whether what you kept was summarized or raw. It sounds like the answers were "everything" and "raw". Were there more columns/features/attributes to the data? Or were "rating", "pages", and "percent dialogue" the only facts you were collecting?

This is more in the vein of "what else might this be useful for" than "I think there's more insight to be gleaned on the question of dialogue and rating".
 
@8letters I am curious to know if you had a tool to do that or if you were doing it by Mark 1 Eyeball?
If you have a tool for that, I'd love to know how you did it. I'd be curious to run the numbers on my own work.
I've made the decision to not discuss how I do this. It may be a stupid decision, but I'm sticking to it.

That being said, yes, I use software to get the data.

If you'd like, I think I can get the data for just your stories.
 
Here is the data for paragraphs that are only dialogue, i.e. starts with a double quote, ends with a double quote, and no double quotes in the middle:
Seems like a nitpick, but I think I'd also consider paragraphs that go:
"Blah blah," he said.
"Blah blah blah," she replied.
"Blah blah blah!" he insisted.
"Blah blah blah?"

as dialogue-only. To me, that's a 100% dialogue sequence, but in a double-quote-only model it's 25%. Not sure how to pick that up.
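
One possible way to pick that up, offered purely as a sketch and not as anything used for the tables in this thread: strip out the quoted spans and count how many words are left. If the paragraph opens with a quote and only a short tag remains, treat it as dialogue-only. The function name `is_dialogue_with_tag` and the three-word cutoff are assumptions chosen for illustration.

```python
import re

# Matches one quoted span: a double quote, any non-quote characters, a double quote.
QUOTED = re.compile(r'"[^"]*"')


def is_dialogue_with_tag(paragraph: str, max_tag_words: int = 3) -> bool:
    """True if the paragraph opens with speech and has at most a short tag besides."""
    p = paragraph.strip()
    # Must open with a quote and contain at least one complete quoted span.
    if not p.startswith('"') or not QUOTED.search(p):
        return False
    # Remove every quoted span and count the words that remain (the speech tag).
    leftover = QUOTED.sub(" ", p)
    return len(leftover.split()) <= max_tag_words


lines = [
    '"Blah blah," he said.',
    '"Blah blah blah," she replied.',
    '"Blah blah blah!" he insisted.',
    '"Blah blah blah?"',
]
print(sum(is_dialogue_with_tag(line) for line in lines) / len(lines))  # 1.0
```

With this cutoff, a line like `"Yes, what?" His voice was low and commanding.` still counts as mixed, since six words of narration follow the speech. Where to draw that line is a judgment call.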
 
Hey 8, can your data scraping give us a number of 'active' authors?

There is some speculation in this thread: https://forum.literotica.com/threads/lit-is-on-auto-pilot.1644641/

Consider 'active' to mean at least one story published in the last two years. Cut it to a year if you wish.
I don't see how to do something like that. If there was a search page where I could specify a date that I want the stories for, I could do something like that. But I don't know of any such search page. Otherwise, you'd have to pull the data every day for two years. Pulling the data every day for a month gets old.
 
The impact of dialogue is less significant than I would have guessed. For me, as a reader, it is important. I often find long passages without dialogue tedious.
I agree. When I thought of doing the work to get the data, I thought percentage of dialogue would have a big impact on the rating.
 