automatically downloading story stats

Any way I could get that C file? Just the backend fetching and calcs, I'll convert it to C++, do my own (web) UI and DB.
I'll think about it. The fetching is done through a small library of functions that invoke "curl" through the shell. Username and password are included there without encryption. Not sure about the calculations.
 
Do you hit anything other than the csv download? If so, I'm mostly interested in the "API" you've deduced and the token refresh and stuff. I have C++ wrappers for curl and a web server/DB. So even a list of the calls you make to Lit and the way they want auth, the POST and response bodies, stuff like that, sans any code, would be really helpful.

Also interested in any math you do on the results.
 
Eee...

I do a lot of math on the results. Back-calculating votes is pretty straightforward until you get somewhere over 100 votes total, or two or more votes together. For the second case, the program estimates the likely distribution of votes by a) calculating the possible permutations of votes that could produce the change and b) using a hypergeometric distribution to pick the most likely distribution. The probabilities for the hypergeometric distribution are based on either an equal probability for all votes or on a model distribution of probabilities.
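For anyone who wants to try the back-calculation, here is a rough Python sketch. It is not the actual program. It assumes votes on a 1-5 scale and a rating displayed rounded to two decimals, and it ranks the candidates with a plain multinomial probability instead of the hypergeometric step described above. The names `candidate_votes` and `rank_candidates` are made up for the example.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import ceil, factorial, floor

SCALE = (1, 2, 3, 4, 5)  # assumed vote scale


def possible_totals(avg, n, decimals=2):
    """Integer vote totals that would display as `avg` after rounding."""
    half = 0.5 * 10 ** -decimals  # largest rounding error in a displayed average
    lo = ceil((avg - half) * n - 1e-9)
    hi = floor((avg + half) * n + 1e-9)
    return range(lo, hi + 1)


def candidate_votes(old_avg, old_n, new_avg, new_n, decimals=2):
    """Every multiset of new votes consistent with the two rounded averages."""
    k = new_n - old_n
    if k <= 0:
        return []
    sums = {t - s
            for s in possible_totals(old_avg, old_n, decimals)
            for t in possible_totals(new_avg, new_n, decimals)}
    return [c for c in combinations_with_replacement(SCALE, k) if sum(c) in sums]


def rank_candidates(candidates, probs=None):
    """Sort candidates from most to least likely under a multinomial model."""
    if probs is None:
        probs = {v: 1 / len(SCALE) for v in SCALE}  # equal probability for all votes

    def likelihood(votes):
        p = factorial(len(votes))
        for value, count in Counter(votes).items():
            p = p / factorial(count) * probs[value] ** count
        return p

    return sorted(candidates, key=likelihood, reverse=True)
```

With a story at 4.50 over 10 votes that moves to 4.42 over 12 votes, `candidate_votes(4.50, 10, 4.42, 12)` leaves two possibilities, a 3 plus a 5 or two 4s, and `rank_candidates` puts the 3-and-5 pair first under equal probabilities. Under these assumptions `possible_totals` starts returning more than one total once the count passes 100, which is where the single-vote case stops being unambiguous.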

The more advanced math comes up in fitting the viewer model to the curves. For that, there's the GNU Scientific Library, and the trick is to define a good model. Here's how it matches the record on one of my older stories:

views_model.jpg

It typically looks better with more data and matches the long-term record pretty well. All the data compressed into the beginning of the record hides some inaccuracies. What it says is that the new list produced 31,348 views, and half of those were in the first 0.72 days. In this case views from the new list and from the category hub (I/T) can't be told apart. It was a contest story, and the contest produced 25,441 views with half of those in the first 23.16 days. Secondary reads (I think that's people who bookmarked the story and read it later) account for another 19,234 views. "At Large" viewers picking it out of my catalog, searching for it, finding it from a side-panel link, etc., might eventually produce 71,208 views, and half of those were within the first 2231.98 days.
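To make the "size plus half-time" description concrete, here is a small Python sketch. The functional form is a guess: it assumes each population saturates exponentially, so that half of its eventual views have arrived at its half-time. The post does not give the actual model, and the secondary reads are left out because no half-time is quoted for them. The function names are made up.

```python
def population_views(size, half_time_days, t_days):
    """Views from one population after t_days, assuming exponential saturation."""
    return size * (1.0 - 0.5 ** (t_days / half_time_days))


# (eventual size, half-time in days) pairs quoted in the post above
POPULATIONS = {
    "new list": (31348, 0.72),
    "contest": (25441, 23.16),
    "at large": (71208, 2231.98),
}


def total_views(t_days, populations=POPULATIONS):
    """Sum of the assumed per-population curves at time t_days."""
    return sum(population_views(size, half, t_days)
               for size, half in populations.values())
```

Under this assumed form, `population_views(31348, 0.72, 0.72)` is half of the new-list total, and a curve fitter would adjust the sizes and half-times until `total_views` tracks the logged view counts.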
 
Back-calculating votes is pretty straightforward until you get somewhere over 100 votes
I'll probably keep a time series. The db I use is really good with that. Should be able to track individual votes pretty precisely until the rounding starts to obscure it.
 
Should totally github this if you're willing to share, though obviously without doxing yourself. Or they should just build this into the backend of lit. Would save me a lot of farming around with csvs! Nice work.
 
While we are dreaming of unicorns, historical data would be super awesome.
 
It's been running on my desktop for a couple years now, and it looks like this:

View attachment 2327967

The picture is Louise Brooks (one of my favorite photo subjects), but there are about a dozen other photos in a library that I can flip through to get a different view. Mostly those photos are there to fill the right side of the window. I haven't bothered to label the columns, but they're title, logged time, rating, views and votes. Clicking "Recent events" gives you the event log. Mousing over the hourglass icon gives you the time to the next update. The buttons are almost self-explanatory. You can reset the monitoring interval, pause the process, check now or exit.

The whole thing started years ago with spreadsheets and a program that tried to back-calculate missing votes. I got tired of that, so I decided to automate the process as much as I could. I started with a shell script that downloaded the csv file at predetermined intervals. That original script used "curl" to take care of all the http details, and the more developed app still does.
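A minimal Python sketch of the "let curl handle the http details" approach, for illustration only. The URL and the cookie file are placeholders, and the login step is left out entirely because the thread doesn't describe the requests the site expects. It runs curl directly as a child process instead of going through a shell. The curl options used are standard: `--silent`, `--show-error`, `--fail`, `--location`, `--cookie` and `--output`.

```python
import subprocess


def build_curl_command(url, out_path, cookie_file):
    """Build the argument list for one curl download (placeholder auth via a cookie file)."""
    return [
        "curl",
        "--silent", "--show-error",  # quiet, but still print errors
        "--fail",                    # non-zero exit status on HTTP errors
        "--location",                # follow redirects
        "--cookie", cookie_file,     # hypothetical session cookies saved by a login step
        "--output", out_path,        # write the CSV to this file
        url,
    ]


def download_csv(url, out_path, cookie_file, timeout_s=60):
    """Run curl once; raises CalledProcessError if curl reports a failure."""
    subprocess.run(build_curl_command(url, out_path, cookie_file),
                   check=True, timeout=timeout_s)
```

Passing the arguments as a list avoids shell quoting problems, which matters if a password or token ever ends up on the command line.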

From there, I developed a ********** code that did the whole thing on a web page. That had too many requirements, didn't give me enough control, and couldn't be expanded to do some things I wanted, so I rewrote it in C with GTK3 for the GUI. As the last step, I expanded it and updated it to use GTK4. The style settings are mostly defaults for GTK4 because I don't grok their style sheets.

For most people, Python might be a better choice than C. There are other choices.

The monitor front-ends a lot of tasks that run in the background. The downloaded stats are stored in an SQLite database, and missing votes are back-calculated and/or estimated with a more-developed version of the code that I used with the spreadsheets. The story entries on the scrolling list link to a pop-up menu that lets me open the database table for the story or produce graphs of the story's results: views vs elapsed time, votes vs elapsed time, votes vs views.
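For anyone building something similar, here is a small Python sketch of the storage side using the bundled `sqlite3` module. The table layout is an assumption: one row per story per download, with the same five columns as the monitor's list (title, logged time, rating, views, votes). The table name `snapshots` and the function names are made up.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) snapshots table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               title     TEXT NOT NULL,
               logged_at TEXT NOT NULL,  -- ISO 8601 text, so it sorts chronologically
               rating    REAL,
               views     INTEGER,
               votes     INTEGER,
               PRIMARY KEY (title, logged_at))"""
    )
    return db


def record(db, title, logged_at, rating, views, votes):
    """Store one downloaded row for one story."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT OR REPLACE INTO snapshots VALUES (?, ?, ?, ?, ?)",
                   (title, logged_at, rating, views, votes))


def latest_change(db, title):
    """Return (new views, new votes) between the two most recent snapshots, or None."""
    rows = db.execute(
        "SELECT views, votes FROM snapshots WHERE title = ? "
        "ORDER BY logged_at DESC LIMIT 2", (title,)).fetchall()
    if len(rows) < 2:
        return None
    (new_views, new_votes), (old_views, old_votes) = rows
    return new_views - old_views, new_votes - old_votes
```

The vote difference from `latest_change` is the kind of number that decides whether a single vote can be back-calculated directly or whether several votes landed in the same interval.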

The graphs contain a line that fits a model to the data and tries to characterize the viewing rate and population sizes for different viewer populations ("new list" readers, "contest" readers, "at large" readers). I use GLE to produce the graphics, and ImageMagick to display them. The curve-fitting program is complicated, but that's all embedded, and I don't have to think about it.

I haven't tried to integrate it with Ubuntu's desktop system, so it's all self-contained. It and all of its moving parts reside in one folder.

The monitor downloads the csv file every four hours. If any one story on the list gets two or more votes in the interval, it cuts the interval in half, down to a minimum of 8 minutes. If no story gets more than one vote in an interval, then it doubles the interval, back up to four hours.
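The interval rule is simple enough to sketch in a few lines of Python. This is one reading of the description above, not the monitor's actual code: halve on two or more votes for any single story, otherwise double, and clamp to the 8-minute and four-hour limits. The function and constant names are made up.

```python
MAX_INTERVAL_MIN = 240  # four hours
MIN_INTERVAL_MIN = 8


def next_interval(current_min, max_new_votes):
    """Next polling interval in minutes.

    max_new_votes is the largest number of new votes any single story
    received during the interval that just ended.
    """
    if max_new_votes >= 2:
        return max(MIN_INTERVAL_MIN, current_min / 2)
    return min(MAX_INTERVAL_MIN, current_min * 2)
```

Starting from 240 minutes with a busy story, the interval steps through 120, 60, 30, 15 and then 8 minutes (7.5 is clamped to the minimum), and it climbs back toward 240 once the voting quiets down.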

Its behavior with new stories has been problematic, but it's been running without significant revision for almost two years, with downtime for power outages and system updates. It's stable. I figure that producing the CSV file is a lot smaller load on Lit than producing the "works" page, and checking at intervals results in a lot less traffic than updating the works page every time I think about it, which was sometimes very frequently.

edit: Apparently the forum doesn't want you to name the world's most commonly-used programming language--usually for browser-based applications.
Seriously! What 'language' are you using? I read this thread and maybe understood five words: I, a, line, moving, and votes. :nana:

I agree with Milly; y'all, all y'all, need to get out into the sunshine more often.😂😜

At my age and with my very limited knowledge, I manage to download the CSV file manually and manually import it into Excel for some comparison to the previous week's stats using a MacBook Pro.

I do admire your skill sets, though!
 
The auth token lasts until either:

* The next time you log in, when it gets revoked. (A new login session, that is.)

* 30 days have passed.
I was wondering why we had this difference, aside from the fact that auth tokens from different sources have different expiry. I forgot that my code logs in and out each time it downloads the file. That might be the difference.
 
Mine was written in Python, and dumps the stats into a SQL database, because Python comes bundled with SQLite, and Tk for a GUI... And I also hate JS. Sorry.
Only took me an afternoon to write the thing, though. It's fast enough that I've never felt the need to rewrite it in C (though I'd probably go Lua, in that case. Fast, and can be integrated into C easily - but it comes with decent tables, and C doesn't even come with decent strings.)
 