Word usage

weftandwarp

I wonder if there is a site or app that breaks down the word content of what one writes, like a personal dictionary, as has been done with Shakespeare. Every word he used has been noted, and presumably the number of times he used it. Along with the breakdown, it could be compared with the average word usage of the whole population. I don't know if it would be useful, but it would be interesting.
 
I know there's a tool like that out there because I used it on a few of my earliest stories. I just can't recall what it's called at the moment. If no one beats me to it, I'll see if I can find it the next time I have a real keyboard in front of me.

It might not be as useful as you think though. All of my top words were articles and pronouns - the, an, I, she, etc. Most of the actually interesting words were way at the bottom, occurring only once or twice per story. I suppose it could be useful to see how you use the same words across dozens or hundreds of stories, but I'm not there yet.
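For anyone who would rather try this at home than hunt for a site, here is a minimal sketch of a word counter that can also skip the articles and pronouns that crowd the top of the list. The function name word_counts and the short FUNCTION_WORDS set are made up for illustration; a real list of function words would be much longer.

```python
import re
from collections import Counter

# A small, hand-picked set of function words (an assumption for this sketch;
# a real list would be much longer).
FUNCTION_WORDS = {"the", "a", "an", "i", "she", "he", "it", "and", "of", "to", "in", "was"}

def word_counts(text, skip_function_words=False):
    """Count lowercase words, optionally dropping common function words."""
    words = re.findall(r"[a-z']+", text.lower())
    if skip_function_words:
        words = [w for w in words if w not in FUNCTION_WORDS]
    return Counter(words)

sample = "The cat sat on the mat and the dog watched the cat."
print(word_counts(sample).most_common(3))
print(word_counts(sample, skip_function_words=True).most_common(3))
```

With the filter on, the "interesting" words float to the top instead of sitting at the bottom of the list.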
 
Word will give you a "readability" report and an assessment of active and passive voice - which is vaguely what you're talking about. But I'm not on a PC at the moment, so I can't tell you how to do it. Somebody will know.

There's any number of text analysers online that do the same or similar. Just Google "text analysis" - you'll find something.
 
http://www.writewords.org.uk/word_count.asp found by googling on "word frequency document".

Frequency information for English generally: https://www.wordfrequency.info/

As suggested on that site, word frequencies will vary according to what sort of documents you're looking at.
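Comparing your own usage against general English could be sketched roughly like this. It assumes you have already obtained reference frequencies from somewhere and stored them as each word's share of all words; the numbers in the example are placeholders, not real statistics, and the names share_of_text and overused are hypothetical.

```python
import re
from collections import Counter

def share_of_text(text):
    """Return each word's share of all words in the text (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()}

def overused(text, reference, ratio=2.0):
    """List words used at least `ratio` times more often than in `reference`."""
    mine = share_of_text(text)
    return sorted(w for w, share in mine.items()
                  if w in reference and share >= ratio * reference[w])

# Placeholder reference shares; NOT real statistics.
reference = {"the": 0.25, "suddenly": 0.01}
story = "Suddenly the door opened. Suddenly the lights went out."
print(overused(story, reference))
```

Because frequencies vary with the kind of document, the reference would ideally come from writing of the same sort, such as fiction compared against fiction.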
 
One analyser I found (cannot recall which one) also assessed a body of text for a "Masculine" or "Feminine" voice, but did so by saying, "let's assume the body of text is fiction - here is the result; now let's assume it's non-fiction - here's another result." Curiously, a block from my latest story was deemed to have a feminine voice if it was fiction (which it was), and a "can't really say" voice if it was deemed non-fiction (or business writing). Go figure!
 
I remember something like that from a thread here a couple of years back, yeah. I score all over the place on those :)
 
When I first heard of the analysis of Shakespeare it had all been done "by hand". It had taken a very long time. Now we only need to shove it into a computer and all those marvelous little people inside do it so fast. In a way it is difficult to know how word usage analysis would be able to identify a good writer, because I think good authors have a point of difference in their writing which makes it interesting. It would be interesting to know how they figure the work being analysed is by a female or male. I had no idea any of it was possible. Thank you so much for sharing your superior knowledge. It truly is amazing.
 
Here's the one I used previously.

http://www.writewords.org.uk/word_count.asp

I'm not quite sure how to make use of the data.

I'm going to have to see if I can find the masculine/feminine analyser that EB mentioned. Most of my stories are first person, and I write from both male and female points of view. I'm curious to know what such a tool thinks about my efforts.
 
I think I found it.

http://www.hackerfactor.com/GenderGuesser.php
 

I tried a few pieces. It seemed like it was consistently right when I fed it normal, narrative-dominated text. When I gave it narrative by a female narrator or text that was mostly dialogue with a female character then it usually identified the gender as female. It was a "weak" inclination in most cases, and one case came out as "unknown."

The formal/informal breakdown is a little confusing. It seems like dialogue should be treated as informal and narrative (usually) as formal, but it's hard to separate the two.
 
I've looked for something like this in the past, too. When editing, among all the other things, I have a pet peeve about using the same word more than once, too close to other uses of the same word. I looked for software that would look for that.

So in the above para, it would be nice to instantly see overused words so they can be changed as required.
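A rough sketch of what such a checker might look like, assuming a simple rule: flag any word of four or more letters that reappears within 15 words of its previous use. Both thresholds and the function name close_repeats are arbitrary choices for illustration.

```python
import re

def close_repeats(text, window=15, min_length=4):
    """Return (word, earlier_position, later_position) for each word that
    is repeated within `window` words of its previous use."""
    words = re.findall(r"[a-z']+", text.lower())
    last_seen = {}
    hits = []
    for pos, word in enumerate(words):
        if len(word) >= min_length:
            if word in last_seen and pos - last_seen[word] <= window:
                hits.append((word, last_seen[word], pos))
            last_seen[word] = pos
    return hits

para = ("I have a pet peeve about using the same word more than once, "
        "too close to other uses of the same word.")
for word, first, second in close_repeats(para):
    print(f"{word!r} repeats at words {first} and {second}")
```

The minimum length is there so that short words like "the" and "to" don't swamp the results.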
 
When painting the new sign for The Pig And Whistle pub, the sign writer was careful to balance the space between pig and and and and and whistle. :)
 
It would be interesting to know how they figure the work being analysed is by a female or male. I had no idea any of it was possible.

The one I saw wasn't very sophisticated:

- take a large sample of stuff written by male authors and calculate the frequency of common words
- ditto, for female authors
- use that data to identify words that are used more commonly by male authors than by female, and vice versa
- develop a scoring system based on word choices.

So, if women tend to write about women more often than men do, you'll find "she" turning up more often in the female corpus, so every "she" in a story nudges the score towards "female author".
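The steps above could be sketched along these lines. This is a guess at the general approach, not how any particular tool actually works: it assumes a log-ratio weight per word with add-one smoothing, and the two tiny corpora are invented purely for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())

def word_weights(male_texts, female_texts):
    """Log-ratio weight per word: positive leans female, negative leans male."""
    male = Counter(w for t in male_texts for w in tokenize(t))
    female = Counter(w for t in female_texts for w in tokenize(t))
    male_total, female_total = sum(male.values()), sum(female.values())
    vocab = set(male) | set(female)
    weights = {}
    for word in vocab:
        # Add-one smoothing so a word missing from one corpus
        # doesn't cause a division by zero.
        m = (male[word] + 1) / (male_total + len(vocab))
        f = (female[word] + 1) / (female_total + len(vocab))
        weights[word] = math.log(f / m)
    return weights

def score(text, weights):
    """Sum the weights of every known word; the sign is the guess."""
    return sum(weights.get(w, 0.0) for w in tokenize(text))

# Toy corpora invented for illustration only.
male_corpus = ["he walked to the station", "he read the paper"]
female_corpus = ["she walked to the station", "she read the paper"]
weights = word_weights(male_corpus, female_corpus)
print(score("she said she would call", weights))
```

Here every "she" adds to the score and every "he" subtracts from it, which is exactly the nudge described above. It also shows why a story with a female narrator can tip the result regardless of who wrote it.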
 
I was just experimenting with a little freebie, a so-called 'robotic copy editor' called 'editMinion', and found it useful. It color-highlights all your adverbs, weak words, homonyms, passive words/phrases, cliches, and substitutes for "said".

I'm doing my final polish before submitting the final chapter of a near 100K word novel, and this program found a number of things that I decided required fixing. Seems pretty quick & easy...!

Your mileage may vary...! ;)
 
I tested it out - it doesn't flag all adverbs, but then you wouldn't want it to. Adverbs are often essential to meaning, e.g. "not", "never".

The passive-voice detection is a bit over-zealous. It flags things like "I am running this show", which ain't passive.

Overall, I can see it being a useful tool, but as with any grammar-checker not infallible. Machine interpretation of English is hard.
 
The Unix/GNU "style" program is similarly aggressive about passive voice. It looks for a verb preceded by a form of "to be," and there are quite a few instances of active voice caught that way. If you're concerned about passive voice, then style is happy to print out all the sentences it identified and you can filter them yourself.
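To see why a rule like that over-flags, here is a toy version. It is not the actual logic of style or editMinion, just a sketch of the "form of to be followed by a verb" idea, with a crude word-ending test standing in for real verb detection.

```python
import re

BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}
VERB_ENDINGS = ("ed", "en", "ing")  # crude stand-in for real verb detection

def flag_possible_passive(text):
    """Return sentences where a form of 'to be' is directly followed by
    a word that looks like a verb."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[a-z']+", sentence.lower())
        for first, second in zip(words, words[1:]):
            if first in BE_FORMS and second.endswith(VERB_ENDINGS):
                flagged.append(sentence)
                break
    return flagged

text = "The show was cancelled by the network. I am running this show. She sings well."
for sentence in flag_possible_passive(text):
    print(sentence)
```

Both of the first two sentences get flagged, although only the first is passive. Telling "was cancelled" apart from "am running" requires knowing the verb form, which a simple pattern match can't do.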
 