Docs pages analytics
INCOMPLETE, see WebPlatform GitHub operations issue tracker, at webplatform/ops#149
How to get data from MediaWiki.
Tools found
- The Statistics Special page, or its command-line equivalent, the showSiteStats maintenance script, e.g.
php mediawiki/maintenance/showSiteStats.php
- Refresh site stats with initSiteStats.php, e.g.
php mediawiki/maintenance/initSiteStats.php
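To keep these numbers over time, the output of showSiteStats can be captured from a script. Below is a minimal sketch, assuming the script prints one "Label : number" pair per line. That output format is an assumption to check against your MediaWiki version, and the function names and the sample text are hypothetical.

```python
import subprocess


def parse_site_stats(text):
    """Parse 'Label : number' lines into a dict of integers.

    Assumes one statistic per line, with the value after the last colon.
    """
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.rpartition(":")
        value = value.strip()
        if not sep or not value.isdigit():
            continue  # skip blank lines and anything that is not "label: number"
        stats[label.strip()] = int(value)
    return stats


def read_site_stats(script="mediawiki/maintenance/showSiteStats.php"):
    """Run the maintenance script with php and return its parsed output."""
    result = subprocess.run(
        ["php", script], capture_output=True, text=True, check=True
    )
    return parse_site_stats(result.stdout)


# Hypothetical sample output, for illustration only.
sample = """\
Total edits       : 1200
Number of articles:  340
Active users      :   12
"""
print(parse_site_stats(sample))
```

Calling read_site_stats() from a cron job and appending the result to a file would give a crude history of the site totals, without any of the tools below.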
StatMediaWiki (outdated)
Outcome: Couldn’t make it work in a reasonable time. The code was written against an earlier version of MW and isn’t maintained.
TL;DR. It’s a Python project that can read from a database snapshot, crunch some data, and generate reports. Unfortunately, it has next to no docs, although I could find some information in this wiki.
It also seems the Python code queries MediaWiki database tables that have changed since StatMediaWiki was written, so it didn’t complete execution.
See also the StatMediaWiki Talk page
Usage Statistics Extension
Outcome: Couldn’t make it work at first because some of its calls use deprecated methods. After some more tests, it turns out it gives usage data per user, which doesn’t match what we want to get.
Limn (to be phased out by WMF)
Outcome: After consulting the #wikimedia-analytics folks, it turns out Limn (repo) is a visualization tool that has to be fed with reports. To use it, we need already-crunched data, see limn datasources. To get that data, the Wikimedia Foundation uses Wikimetrics and Quarry.
Also
- https://meta.wikimedia.org/wiki/WikiXRay
- https://www.mediawiki.org/wiki/Analytics/Wikistats https://stats.wikimedia.org/
What Wikimedia does
The Wikimedia Foundation has an Analytics team (ref) and separates its work into @@ sub-projects.
- Research and Data
- To help make research-informed decisions. See #Research-and-Data in Wikimedia Phabricator
Wikimetrics - A web-based tool designed to simplify measurement.
Quarry
Run database queries from the web browser.
Quarry (repo) is a tool written in Python that lets you run database queries from a web browser and publish the results.
Here are a few interesting ones:
- Catalan Wikipedia users who published more than one article
- Files uploaded to two projects
- Most edited pages in Portuguese Wikipedia
- New active users in Latvian Wikipedia
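To give an idea of what such a query looks like, here is a minimal sketch of a "most edited pages" report. It runs against an in-memory SQLite database with a heavily simplified copy of the MediaWiki page and revision tables, so it can be tried without a wiki. The page titles, the edit counts, and the most_edited_pages helper are hypothetical, and a real MediaWiki database has many more columns.

```python
import sqlite3

# Toy stand-in for a MediaWiki database: only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY,
    page_namespace INTEGER,
    page_title TEXT
);
CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY,
    rev_page INTEGER
);
-- Hypothetical sample rows, for illustration only.
INSERT INTO page VALUES
    (1, 0, 'css/properties/color'),
    (2, 0, 'html/elements/a'),
    (3, 1, 'Talk_page');
INSERT INTO revision (rev_page) VALUES (1), (1), (1), (2), (2), (3);
""")

# Count revisions per page, main namespace (0) only, most edited first.
MOST_EDITED = """
SELECT page_title, COUNT(rev_id) AS edits
FROM page
JOIN revision ON rev_page = page_id
WHERE page_namespace = 0
GROUP BY page_id, page_title
ORDER BY edits DESC, page_title
LIMIT ?
"""


def most_edited_pages(connection, limit=10):
    """Return (title, edit count) tuples for the most edited pages."""
    return connection.execute(MOST_EDITED, (limit,)).fetchall()


print(most_edited_pages(conn))
```

The SELECT statement is the part that would carry over to a real database snapshot. The SQLite setup is only there to make the sketch self-contained.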
WikiMetrics
Sample database query
WikiMetrics is a sandbox into which we can import data and run scripts to get usage statistics.
TL;DR … a way to get wiki activity reports and statistics. It has to run inside a sandboxed MediaWiki installation into which we import a database snapshot from production so it can crunch metrics.
Wikimetrics also helps generate database queries that can be run to produce reports.
- Freenode IRC channel: #wikimedia-analytics
- project description page, see also former project description page "User Metrics"
- Wikimedia Phabricator Kanban board
- code mirror on GitHub