From Fedora Project Wiki


Latest revision as of 16:00, 26 April 2010

Fedora Statistics 2.0 — Now You See It!

We're working on rebuilding the way we produce and present statistics.

People involved

Add yourself if you want to get involved :)

Person         Role
Ian Weller     People wrangler
Luke Macken    Infrastructure + Fedora Community integrator
Jef Spaleta
Max Spevack

What we currently have

https://admin.fedoraproject.org/community/statistics

What we want to analyze

Community Activity

  • Define "activity" as a boolean based on wiki edits, translations, mailing list posts, and SCM commits (CVS, git, etc.), then graph how many accounts are active over time
  • Classify each type of "activity" as "talk" or "action", and place active members on a sliding scale between "talk" and "action"
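
One way to read the two items above, as a rough sketch rather than a settled design: treat every contribution as an event, call an account "active" in a month if it has at least one event, and score each account by the share of its events that count as "action". The event tuple layout, the kind names, and the TALK/ACTION split below are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event kinds; which kinds are "talk" and which are "action"
# is exactly the open question, so this split is only an assumption.
TALK = {"mailing_list_post", "wiki_talk_edit"}
ACTION = {"wiki_edit", "translation", "commit"}


def active_accounts_per_month(events):
    """Map (year, month) -> number of accounts with at least one event."""
    seen = defaultdict(set)
    for account, day, _kind in events:
        seen[(day.year, day.month)].add(account)
    return {month: len(accounts) for month, accounts in sorted(seen.items())}


def action_share(events):
    """Map account -> fraction of its events that are "action".

    0.0 means all talk, 1.0 means all action. Events of unknown kind
    are ignored.
    """
    talk = defaultdict(int)
    action = defaultdict(int)
    for account, _day, kind in events:
        if kind in ACTION:
            action[account] += 1
        elif kind in TALK:
            talk[account] += 1
    accounts = set(talk) | set(action)
    return {a: action[a] / (action[a] + talk[a]) for a in accounts}


# Made-up sample data: (account, day, kind)
events = [
    ("alice", date(2010, 3, 2), "commit"),
    ("alice", date(2010, 3, 9), "mailing_list_post"),
    ("bob", date(2010, 3, 15), "mailing_list_post"),
    ("alice", date(2010, 4, 1), "wiki_edit"),
]

print(active_accounts_per_month(events))
print(action_share(events))
```

The monthly counts can be fed straight into a graph; the per-account share gives the sliding scale, and a histogram of those shares would show where the community as a whole sits.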

Fedora Accounts System

  • History over time of account registrations and signed CLAs
  • History over time of number of members/sponsors/admins in each group
  • History over time of involvement of people from $COMPANY (overall, in each group, as a sponsor, etc.)

Packaging

package use

  • Parse mirror logs: which packages are downloaded the most?
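
A minimal sketch of the log parsing, assuming the mirrors write Apache-style access logs and that packages follow the usual name-version-release.arch.rpm file naming. The real log format may differ, and the sample lines (paths, versions, addresses) are made up.

```python
import re
from collections import Counter

# Assumption: Apache-style access log lines. Only GET requests for
# .rpm files are matched; the status code is captured for filtering.
REQUEST_RE = re.compile(
    r'"GET (?P<path>\S+\.rpm) HTTP/[\d.]+" (?P<status>\d{3})'
)


def package_name(path):
    """Extract the package name from a name-version-release.arch.rpm path."""
    filename = path.rsplit("/", 1)[-1]
    nvr = filename[: -len(".rpm")].rsplit(".", 1)[0]  # drop ".rpm", then arch
    name, _version, _release = nvr.rsplit("-", 2)
    return name


def download_counts(lines):
    """Count successful (HTTP 200) downloads per package name."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match or match.group("status") != "200":
            continue
        try:
            counts[package_name(match.group("path"))] += 1
        except ValueError:
            continue  # file name does not look like name-version-release
    return counts


# Made-up sample lines for illustration only.
sample = [
    '192.0.2.1 - - [26/Apr/2010:16:00:00 +0000] "GET /pub/example/bash-4.0.33-1.fc12.i686.rpm HTTP/1.1" 200 1000 "-" "yum"',
    '192.0.2.2 - - [26/Apr/2010:16:00:05 +0000] "GET /pub/example/bash-4.0.33-1.fc12.i686.rpm HTTP/1.1" 200 1000 "-" "yum"',
    '192.0.2.3 - - [26/Apr/2010:16:00:09 +0000] "GET /pub/example/yum-utils-1.1.26-1.fc12.noarch.rpm HTTP/1.1" 200 500 "-" "yum"',
    '192.0.2.4 - - [26/Apr/2010:16:00:12 +0000] "GET /pub/example/missing-1.0-1.fc12.noarch.rpm HTTP/1.1" 404 0 "-" "yum"',
]

for name, count in download_counts(sample).most_common():
    print(name, count)
```

Counting only 200 responses leaves out failed requests; partial downloads and repeated fetches by the same client would need their own handling.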

pkgdb

  • Number of packages over time
  • Package to packager ratio over time
  • Number of people with X packages (histogram)
  • Number of packages with X people (histogram)
  • Percentage of packages with EPEL, OLPC branches
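
The ratio and the two histograms can all be derived from one mapping of package to owners. The sketch below assumes such a mapping has already been exported from pkgdb somehow; the dictionary shape and the function name are hypothetical, not part of any pkgdb API.

```python
from collections import Counter


def ownership_stats(owners_by_package):
    """Return (package-to-packager ratio, people histogram, package histogram).

    owners_by_package: dict mapping package name -> set of packager names.
    people histogram:  X packages -> number of people with X packages.
    package histogram: X people   -> number of packages with X people.
    """
    packages_per_person = Counter()
    for owners in owners_by_package.values():
        packages_per_person.update(owners)  # one count per owner

    people_hist = Counter(packages_per_person.values())
    package_hist = Counter(len(owners) for owners in owners_by_package.values())

    if packages_per_person:
        ratio = len(owners_by_package) / len(packages_per_person)
    else:
        ratio = 0.0
    return ratio, people_hist, package_hist


# Made-up sample data.
sample = {
    "bash": {"alice"},
    "vim": {"alice", "bob"},
    "zsh": {"carol"},
}

ratio, people_hist, package_hist = ownership_stats(sample)
print(ratio)
print(dict(people_hist))
print(dict(package_hist))
```

Running this on a snapshot per date would give the "over time" series for the ratio.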

bodhi

  • Number of updates over time
  • Update submitters
  • Feedback submitters
  • Most updated packages
  • Broken deps

rawhide

  • Number of updated packages over time
  • Most updated packages in a release cycle
  • Broken deps

Actual package contents (repoquery)

  • Percentage of packages with a common suffix (-devel, -doc, -data, -common)
  • Percentage of subpackages that aren't noarch but could be (Features/NoarchSubpackages)
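
For the suffix percentages, a sketch could be as simple as the following, assuming a flat list of package names has already been collected (for example from repoquery output). The suffix list and the sample names are illustrative.

```python
# Suffixes of interest; "-common" is assumed to be what the list above means.
SUFFIXES = ("-devel", "-doc", "-data", "-common")


def suffix_percentages(names):
    """Map each suffix -> percentage of package names ending with it."""
    total = len(names)
    if total == 0:
        return {}
    return {
        suffix: 100.0 * sum(name.endswith(suffix) for name in names) / total
        for suffix in SUFFIXES
    }


# Made-up sample names.
names = ["glibc", "glibc-devel", "glibc-common", "python-docs"]
print(suffix_percentages(names))
```

Note that the match is exact, so a name ending in -docs is not counted under -doc; whether such variants should be folded together is a judgment call.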

Mailing lists

  • List activity
  • Popular threads
  • Most active posters
  • Number of subscriptions/unsubs over time

Wiki

  • Wiki edits and other actions (page moves, etc.)
  • People who actually use edit summaries

Fedora Hosted

  • Commits and committers

Non-fedorahosted.org SCMs

Red Hat Bugzilla

  • Bugs opened
  • Bugs closed
  • Bugs in the rugs

Mirrormanager

IRC meetings

Nagios, Zabbix, and other fun infrastructure things

Website logs