tech tribble, er I mean trivia

Hypoxia

I was going to start a thread asking authors how they felt about their statistics, the quantity of views vs votes vs comments, etc. But those numbers can be difficult to parse for those of us with many submissions. So, my question:

Does anyone have a script to harvest numbers from the VIEW SUBMISSIONS page so the numbers can be crunched?
 
No, I don't.
Frankly, I don't understand
a] the need for statistics
b] the frantic interest some folk have about it.
:)
 
For Hypoxia's question: I wish.

For Handley's b]: it's a temperament thing. Either you're obsessively interested in readers' opinions of your work or you're not. Honestly I think it's a brain chemistry thing. If anyone knows a pill I can take to get over it, I'll take it.
 
I thought about it, and then got distracted while looking up how to deal with the authentication process for the members page :p

There are also no easy keys in the source code of your private submissions page to lock onto with a scraping script. You're going to have to do everything by the table tags + # of characters, and that's going to be a pain in the ass.

You could bypass the need for authentication by viewing the source, copying it, and then scraping your copy.
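
For anyone who does want to script it, here is a minimal Python sketch of the save-a-copy approach, using only the standard library. It assumes the numbers sit in ordinary table rows; the sample markup, the column order, and the names TableRows and rows_from_html are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def rows_from_html(html):
    """Return a list of rows, each row a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample markup; the real page's columns will differ.
SAMPLE = """
<table>
  <tr><td>Story One</td><td>4.50</td><td>120</td><td>9000</td></tr>
  <tr><td>Story Two</td><td>4.10</td><td>45</td><td>3100</td></tr>
</table>
"""

for row in rows_from_html(SAMPLE):
    print(row)
```

To run it on a saved copy, read the file into a string first (the filename is whatever you saved it as) and pass that string to rows_from_html.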

In the end, I decided to manually enter all my numbers every quarter.

The number of story favorites also isn't available on your private submissions page. The only way to get those is to check the bottom of every single submission.

There are at least some good keys in the source code to latch onto for that. That's the one thing I probably will script eventually, because gathering the favorites is the most time consuming part of the entire process.

Of course, Lit has upgraded the site security in recent days, and it may not even be possible to scrape a page with a script now (at least not easily).
 
Do I obsess over the numbers? I hope not.
Do I pay attention to the numbers? Somewhat?
Am I interested in the numbers? Yes, for valid reasons.

With numbers harvested and crunched, I can get a sense of such stuff as:

* Which categories tend toward what ratios of views vs votes vs comments
* Which approaches to stories generate what kinds of reactions.
* What types of titles impact views within a category.

Am I a quant? Somewhat.
Have I ever practiced numerology? Ha.
Do I expect to pin down anything specific? Nope.
Am I obsessed with spreadsheets and databases? Fuck, yes.

One need not be a greed-head to find data-mining useful. Of course, if I felt like doing any engineering now, I'd view the page and see from where data is grabbed, then go to the source. But I'm lazy. That's why I'm looking for a script.
 
Thanks -- I can try crunching these to get some flavor of reality. How are they generated?

Sweat, blood, and tears *laugh*

Every three months, I slowly scroll through the lists of three pen names and update the spreadsheet. Then I click each and every bloody story, scroll to the end of page one, and update the favorites number.

You should be able to copy/paste directly from the webpage into a spreadsheet. It works for me in Excel, anyway.
 
A stat missing from the author's View page, which I think is a more significant gauge than the easily manipulated rating, is the "favorited story" stat. An author has to check this for individual stories on the public-viewed list.
 
I was afraid of that. I'm not quite that anal. I hope.

That's a pretty volatile index. I'd find that interesting if the vintage of each fave could be considered. Seems like many story faves are readers' bookmarks; a story gets unfaved, not because it's unliked, but just because it's been read all the way through. So, older faves trump the newer.
 
Hypoxia, not sure exactly what you are looking for, but if you are interested in a rough breakdown of individual votes for a story here is something by Carlus Magnus that will give a rough approximation.

http://www.literotica.com/s/how-to-analyze-your-scores

I think Hypoxia's looking for an automated way to collect the vote/views information, e.g. to put it into a spreadsheet.

Yup. JustaSCOUNDREL, that looks like some of the analysis I'd like to do. And Bramblethorn is right. I currently have almost 90 submissions (57 stories etc., 22 songs-poems, and more coming) and harvesting all those numbers manually is a nontrivial process (i.e. bloody tedious work). I *could* write a script that would parse the VIEW SUBMISSIONS page and load the results into a data file. I'd just rather not, myself. Lazy, yeah.
 
Update:

Well, it ain't a script, but I've mostly got this solved. I can just select and copy all data from the VIEW SUBMISSIONS page, then paste it into an OpenOffice.org Calc spreadsheet. Most of the data is then pretty easy to reference and analyze. I found some surprises (to me).

I've only hit one glitch. Does anyone here have experience in OpenOffice Calc or the equivalent, doing COUNTS with multiple criteria? I've been googling for hours and can only find 'solutions' that don't work. COUNTIF() only allows one criterion. SUMPRODUCT() supposedly works with multiple criteria... but not with mine!

One criterion: category. Finding everything in Group or Incest or Fetish etc is simple. The other criterion: has a story earned a Red H? I can count by category, no problem. I can count the total Red H's. But counting them per category? Duh...

This is supposed to work:
Code:
=SUMPRODUCT(D1:D500="Group.*";G1:G500="H.*")
But it don't.

If anyone can help, here is a DropBox link to the spreadsheet. Mucho thanks in advance.
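
Outside a spreadsheet, the same two-criteria count is only a few lines. A rough Python sketch, assuming each submission has been reduced to a (category, flag) pair where a flag starting with "H" marks a Red H; the rows and the function names below are invented for illustration.

```python
from collections import Counter

# Hypothetical rows: (category, rating flag). The values are invented;
# a flag starting with "H" stands for a Red H.
rows = [
    ("Group Sex", "H"),
    ("Group Sex", ""),
    ("Incest", "H"),
    ("Group Sex", "H"),
    ("Loving Wives", ""),
]


def count_matching(rows, category_prefix, flag_prefix):
    """Count rows where BOTH the category and the flag start with the given text."""
    return sum(
        1
        for category, flag in rows
        if category.startswith(category_prefix) and flag.startswith(flag_prefix)
    )


def red_h_by_category(rows):
    """Tally Red H rows for every category in one pass."""
    return Counter(category for category, flag in rows if flag.startswith("H"))


print(count_matching(rows, "Group", "H"))
print(red_h_by_category(rows))
```

count_matching mirrors the one-category-at-a-time formula, while red_h_by_category gives the whole table of counts at once.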
 
I'm not sure if I'm interpreting your question correctly, but: you're trying to count the number of records that meet some combination of criteria based on data contained in multiple cells? e.g. something like counting "females over 30" where gender and age are stored in separate columns. (Edit: Yes, you are, for some reason I didn't see the formula example you gave.)

In Excel, COUNTIFS will do the trick:

=COUNTIFS(A2:A13,"f",B2:B13,">30")

where A2:A13 contain sex and B2:B13 contain age.

It's been a while since I used OO Calc. As far as I can tell, it doesn't have COUNTIFS, and googling for the nearest equivalent brings up this advice to use SUMPRODUCT, which you might have already seen. There are a few different options presented there, so see if any of them work. I'm having trouble getting the "Excel-friendly" version to work in Excel, though...

If not, can you do it by creating a new column that evaluates the condition you're trying to check, and then summing that?

For instance, if I want to count women who are under 18 or over 60, I can start with this calculation for the first entry:

=(A2="f")*OR((B2<18),B2>60)

This will return 1 if A2 is "f" AND B2 is either under 18 or over 60. Filling down will give me a column of 0s and 1s, and then I can sum that column. Gets a bit messy if you have lots of counts going, though.
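
The helper-column trick translates directly to code: compute a 0/1 flag per record, then sum the flags. A small Python sketch with made-up records (the data and the name flag_record are hypothetical):

```python
# Hypothetical records: (sex, age). The data is made up for illustration.
people = [("f", 17), ("m", 70), ("f", 45), ("f", 61), ("m", 12), ("f", 60)]


def flag_record(sex, age):
    """Return 1 if the record is a woman under 18 or over 60, else 0."""
    return int(sex == "f" and (age < 18 or age > 60))


# The "helper column": one 0/1 value per record, in the same order.
helper_column = [flag_record(sex, age) for sex, age in people]

# Summing the helper column gives the count.
total = sum(helper_column)

print(helper_column)
print(total)
```

As with the spreadsheet version, each extra count needs its own flag function, so it gets cluttered if there are many.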
 
I was afraid of that. I'm not quite that anal. I hope.


That's a pretty volatile index. I'd find that interesting if the vintage of each fave could be considered. Seems like many story faves are readers' bookmarks; a story gets unfaved, not because it's unliked, but just because it's been read all the way through. So, older faves trump the newer.

If it helps, I can confirm this.

It is how I work through stories I read; if I like the story, then I fav the author.
 
...can you do it by creating a new column that evaluates the condition you're trying to check, and then summing that?

For instance, if I want to count women who are under 18 or over 60, I can start with this calculation for the first entry:

=(A2="f")*OR((B2<18),B2>60)

This will return 1 if A2 is "f" AND B2 is either under 18 or over 60. Filling down will give me a column of 0s and 1s, and then I can sum that column. Gets a bit messy if you have lots of counts going, though.

I'm adding one column per category (I think I'll limit myself to 16 categories for now. ;)) and it works! Thanks!
 
Well, it ain't a script, but I've mostly got this solved. I can just select and copy all data from the VIEW SUBMISSIONS page, then paste it into an OpenOffice.org Calc spreadsheet. Most of the data is then pretty easy to reference and analyze. I found some surprises (to me).

Can you say something more about this select/copy/paste process? Because when I try it, I just get a mess. I must be doing something different from what you're doing.
 
Can't grab the whole VIEW SUBMISSIONS page, no. After the legend box (which lays out what data is presented per submission) is the line "Story Submissions" followed by the story data. That's what I want to grab.

My first title is 'A Fall of Stardust'. I put the cursor immediately before that 'A' and drag down to the end of stories, just before "Poetry Submissions". (Yeah, I write poems & songs too. :devil:) My last title is 'What Is Cheating?'. Its data ends with the field "//www.literotica.com/s/what-is-cheating [change]". I grab all the way to the end of that field.

Once selected, I hit CTRL-C to copy it all. Then I switch to my OO spreadsheet, put the cursor in A1, and hit CTRL-V to paste it all. You can see how it appears by downloading from Dropbox.
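
If the copied text ever needs to go somewhere other than a spreadsheet, it can be split in a script instead. This Python sketch assumes the browser puts table cells on the clipboard as tab-separated text, which is common but not guaranteed; the sample text and the name parse_pasted are made up.

```python
import csv
import io


def parse_pasted(text):
    """Split pasted text into rows of cells, assuming tab-separated columns."""
    # QUOTE_NONE keeps quotation marks inside titles as ordinary characters.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        cells = [cell.strip() for cell in row]
        if any(cells):  # skip blank lines between entries
            rows.append(cells)
    return rows


# Invented sample; the real paste will have more columns.
PASTED = "Story One\t4.50\t120\t9000\n\nStory Two\t4.10\t45\t3100\n"

for row in parse_pasted(PASTED):
    print(row)
```

If different rows come back with different numbers of cells, the paste was probably not cleanly tab-separated and the split rule needs adjusting.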

EDIT: And now that I've got mine working, I can go back to writing the stories I've been neglecting, poor babies...
 
Hmm. I looked at your spreadsheet, and it's lovely. I followed the method you described and got a big mess. E.g. for the first submission the "average score" takes up cols A-B, for the second A-H, for the third A-N, and so on, till it's way off the screen.

But I have Excel too, and when I paste into that it looks fine. I'll have to see if I can do anything useful with it.
 
Pretty cool. I'm no spreadsheet jockey, but I could fairly quickly calculate that 304,860 people have read my stories (average views per story 9426), that the average of vote averages is 4.38, and that the average vote is 3.72 (leave out my two Loving Wives stories and those numbers jump to 4.47 and 4.01).
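
The gap between those two averages is what you'd expect if the second one weights each story by its vote count. A Python sketch of the difference, assuming that is how the numbers were calculated; the scores and vote counts are invented, not taken from any real story:

```python
# Hypothetical (average score, number of votes) per story; invented numbers.
stories = [(4.8, 20), (4.6, 30), (3.0, 450)]


def mean_of_averages(stories):
    """Every story counts equally, however many votes it received."""
    return sum(avg for avg, _ in stories) / len(stories)


def overall_average_vote(stories):
    """Every vote counts equally, so heavily voted stories dominate."""
    total_votes = sum(votes for _, votes in stories)
    return sum(avg * votes for avg, votes in stories) / total_votes


print(round(mean_of_averages(stories), 2))
print(round(overall_average_vote(stories), 2))
```

One story with many votes and a low score drags the per-vote average well below the per-story average, which would fit the jump seen when the Loving Wives stories are left out.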
 
Yes, the pasted area looks messy in OO Calc, but 1) I adjusted the cell widths and font size so the longest strings are at least readable, and 2) I only look at that area for troubleshooting.

BTW I just much-revised the sheet - here is the Dropbox link to the current version.

I found that my 20 Incest stories each get an average of 135 votes and almost 25k views; for Loving Wives, it's 245 votes and 20k views; for the 22 in Group Sex, it's only 30 votes and 6700 views -- but Group gets the highest average score and the most Red H's. BTW, 42% of my stories have Red H's. Not bad, eh? [/me tries to look modest; fails]

My sheet is a work in progress. To-do:

* Store imported data on a separate sheet
* Save the analysis table on a weekly basis.
* Save each weekly iteration in its own sheet.
* Include poetry submissions; another data sheet?
* Count stories awaiting approval, and rejected.

I'll probably add that last in a few minutes. The others will have to wait till the neighbor shuts off his noisy fucking gas-powered wood chipper. :mad:

EDIT: I too am no spreadsheet expert; I trained on COBOL, JCL, Ada, C++, and various weird languages. I do spreadsheets only on an ad-hoc basis. I've had to build some fairly extensive ones to track inventories and accounts, but I don't know all the neat tricks. I haven't seen (nor really looked for) tutorials on OO Calc -- which differs in significant ways from the warez I've used in the past: SuperCalc, VisiCalc and Visi-On, Quattro Pro, Lotus 1-2-3, MicroCalc, TK!Solver, Works, Excel, and probably a few others I've forgotten about.

For my next trick: I have a flat OO Calc spreadsheet (a collectibles inventory) that I'm trying to make into an OO Base database. Yeah, I could use a tutorial there too. Long long ago I trained as a DBA, and I've written DB systems and access methods (in hideous languages, back before SQL was available). But OO Base involves a new mindset for me. Gotta wrap my old brain around that...
 