cpants: kwalitative website and its tools

Post on 11-Jan-2015

DESCRIPTION

CPANTS talk at YAPC::EU 2012

TRANSCRIPT

CPANTS

Kwalitative website and its tools

Kenichi Ishigaki (charsbar)

@YAPC::EU 2012 August 22, 2012

Kenichi Ishigaki (charsbar)

From Shibuya.pm, Tokyo, Japan.

Freelancer

- Perl programmer
- Writer/Translator

Around 40 CPAN distributions

DBD::SQLite

Acme::CPANAuthors

We have been enjoying the CPANTS game since 2005.

Shine! The All-Japan Strongest CPAN Author Championship

by Koichi Taniguchi

http://blog.livedoor.jp/nipotan/archives/16108466.html

He picked out the Japanese authors by eye.

Our names are easy to find.

There were not so many authors.

- Total: ~4000

- Japanese: ~50

YAPC::Asia increased the number of Japanese authors.

YAPC::Asia / Japanese authors

2006 (Mar)   98
2007 (Apr)  154
2008 (May)  191
2009 (Sep)  228
2010 (Oct)  255
2011 (Oct)  270

I needed something to pick out Japanese authors more easily.

That's why I created a list of Japanese authors and a script to maintain it.

I've been reporting the Japanese top 10 authors since 2008.

I've been adding something new every year.

2008: sum of the kwalitee scores per author

2009: authors who released most in the year

2010: authors/population ratio

2011: launched a website (finally)

acme.cpanauthors.org

It had one big problem.

No data.

The official CPANTS site had been down for some time.

I needed to set up mine.

I created a private repository and put everything into it.

Merged recent commits from domm's repository.

Added a few columns.

Tweaked Catalyst/DBIC stuff.

It worked.

Warnings were left.

I needed to find some tuits to remove them.

Perl QA Hackathon

Warnings were removed.

Ported some of the changes I had made locally to daxim's repository.

Showed a new acme.cpanauthors.org featuring CPANTS info.

Unfortunately, the porting took too much time.

I didn't merge the changes back to my repository.

OSDC.TW

I finally merged the changes.

Got several reports that CPANTS was broken.

What broke CPANTS was a small change.

"modules" : [
  {
    "file" : "lib/Path/Extended.pm",
    "in_basedir" : 0,
    "in_lib" : 1,
    "module" : "Path::Extended",
    "uses" : {
      "Sub::Install" : 1,
      "strict" : 1,
      "warnings" : 1
    }
  }
]

I don't think this change is bad.

Module::CPANTS::ProcessCPAN shouldn't have died because of this.

It should have had tests.

It should have run faster.

It should have been easier to fix the analysis.

Enough issues for a summer.

What should we do?

- We need tests.
- We need to find test cases.
- We need to do it many times.

Making it run faster is the first priority.

I wrote a bare-bones script to store data in parallel.

JSON

create table if not exists analysis (
  id integer primary key autoincrement,
  path text unique,
  distv text,
  author text,
  json text,
  duration integer
);

Raw SQL statements
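As an illustration of the raw-SQL approach, here is a minimal sketch that stores one analysis result as a JSON string in the analysis table shown above. The helper name store_analysis and the shape of its arguments are assumptions made for this example, not code from the actual script. It uses plain DBI with DBD::SQLite and the core JSON::PP module.

```perl
use strict;
use warnings;
use DBI;
use JSON::PP;

# Hypothetical helper: serialize the analysis hash to JSON and store it
# with a raw SQL statement. "path" is unique in the table, so analyzing
# the same tarball again replaces the earlier row.
sub store_analysis {
    my ($dbh, %row) = @_;

    # canonical() sorts the keys, which keeps the stored JSON stable.
    my $json = JSON::PP->new->canonical->encode($row{analysis});

    $dbh->do(
        'INSERT OR REPLACE INTO analysis (path, distv, author, json, duration)
         VALUES (?, ?, ?, ?, ?)',
        undef,
        @row{qw(path distv author)}, $json, $row{duration},
    );
}
```

Keeping the whole result in one JSON column means the table does not need a schema change whenever the analysis gains a new field.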

Parallel::ForkManager
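A rough sketch of how Parallel::ForkManager can spread the analysis over several worker processes. The function name analyze_in_parallel and the $analyze callback are hypothetical stand-ins for whatever does the real work. The sketch only shows the fork/collect pattern, under the assumption that each result is small enough to pass back to the parent.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical helper: run $analyze->($path) for every path, using at
# most $max_workers child processes, and collect the results by path.
sub analyze_in_parallel {
    my ($max_workers, $analyze, @paths) = @_;
    my %result;

    my $pm = Parallel::ForkManager->new($max_workers);

    # Runs in the parent each time a child exits; $data is the
    # reference the child handed to finish().
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident, $signal, $core_dump, $data) = @_;
        $result{$ident} = $data ? $$data : undef;
    });

    for my $path (@paths) {
        $pm->start($path) and next;   # parent: move on to the next path
        my $res = $analyze->($path);  # child: do the actual work
        $pm->finish(0, \$res);        # child: exit and pass the result back
    }
    $pm->wait_all_children;

    return \%result;
}
```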

SQLite queue

Beware a race condition

my ($id) = $dbh->selectrow_array("
  SELECT id FROM queue WHERE status = 0 LIMIT 1
");
# Another worker can select the same id before this UPDATE runs.
$dbh->do("
  UPDATE queue SET status = 1 WHERE id = ?
", undef, $id);

sqlite_update_hook

my $id;
$dbh->sqlite_update_hook(sub {
  # The hook receives (action, database, table, rowid).
  (undef, undef, undef, $id) = @_;
});

$dbh->do("
  UPDATE queue SET status = 1
  WHERE id IN (
    SELECT id FROM queue WHERE status = 0 LIMIT 1
  )
");
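Put together, the two pieces above might look like the following sketch: a single UPDATE claims one pending row, and the update hook reports which row it was, so no other worker can slip in between a SELECT and an UPDATE. The make_claimer name and the exact queue table layout (an integer primary key id and a status column) are assumptions for this example.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: returns a closure that claims one pending row
# from the queue table and returns its id, or undef if none is left.
sub make_claimer {
    my ($dbh) = @_;
    my $last_rowid;

    # DBD::SQLite calls the hook with (action, database, table, rowid)
    # for every row that is inserted, updated or deleted.
    $dbh->sqlite_update_hook(sub {
        my ($action, $database, $table, $rowid) = @_;
        $last_rowid = $rowid if $table eq 'queue';
    });

    return sub {
        undef $last_rowid;
        # Selecting and marking the row happen in one statement.
        $dbh->do(q{
            UPDATE queue SET status = 1
            WHERE id IN (
                SELECT id FROM queue WHERE status = 0 LIMIT 1
            )
        });
        return $last_rowid;
    };
}
```

Because id is an integer primary key, the rowid passed to the hook is the same value as the id column.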

Archive::Any::Lite

Archive::Any::Plugin::Bzip2

WorePAN

- Bundling is bad
- We need a specific version
- Derived from OrePAN

use WorePAN;

my $worepan = WorePAN->new(
  root => 'path/to/a/directory/',
  files => [qw(
    I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
  )],
  use_backpan => 1,
  no_network => 0,
  cleanup => 1,
);

use WorePAN;

my $worepan = WorePAN->new(
  root => 'path/to/a/directory/',
  files => [qw(
    I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
  )],
  local_mirror => '/home/ishigaki/minicpan/',
  no_network => 1,
  cleanup => 1,
);

use WorePAN;

my $worepan = WorePAN->new(
  root => 'path/to/a/directory/',
  dists => {
    'Catalyst-Runtime' => 5.9,
    'DBIx-Class' => 0,
  },
  cleanup => 1,
);

Bonus features

my $worepan = WorePAN->new(
  root => 'path/to/a/CPAN/mirror/',
  cleanup => 0,
);

my $authors = $worepan->authors;
my $modules = $worepan->modules;
my $file = $worepan->files;
my $dists = $worepan->latest_distributions;

$worepan->add_files(qw{
  /path/to/a/local/distribution-0.01.tar.gz
});
$worepan->update_indices;

Now we have enough tools.

Processing time has decreased significantly.

What's next?

::Site refactoring

I'm preparing the data now.

Creating more databases/tables.

Merging information from external sources.

- CPAN indices
- CPAN uploads database

Calculating scores on prerequisite modules.

That will be this year's new addition to my annual report.

And then, I'll move on to fixing the metrics.

Some of them are badly broken.

"versions" : {
  "lib/Data/Phrasebook.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Debug.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Generic.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Loader.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Loader/Base.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Loader/Text.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Plain.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/SQL.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/SQL/Query.pm" : "use vars qw($VERSION);\n"
},
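The dump above records the line that declares $VERSION rather than the version itself. For comparison, here is a small sketch of one way to get at the actual value, using the core Module::Metadata module. This is only an illustration, not necessarily how CPANTS does or will do it, and the helper name declared_version is made up for this example.

```perl
use strict;
use warnings;
use Module::Metadata;

# Hypothetical helper: return the version declared for $package in
# $file as a string, or undef if none can be determined.
sub declared_version {
    my ($file, $package) = @_;

    # Module::Metadata scans the file statically instead of loading it.
    my $meta = Module::Metadata->new_from_file($file) or return;

    my $version = $meta->version($package);
    return defined $version ? "$version" : undef;
}
```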

Error is not a stash.

"error" : {
  "easily_repackageable" : "easily_repackageable_by_fedora",
  "easily_repackageable_by_fedora" : "fits_fedora_license",
  "metayml_conforms_spec_current" : [
    "1.4",
    "Expected a map structure from data string or file. [Validation: 1.4]"
  ],
  "metayml_conforms_to_known_spec" : [
    "1.0",
    "Expected a map structure from data string or file. [Validation: 1.0]"
  ],
  "no_pod_errors" : " home cpants tmp analyze 11442 8001be43fb65..."
}

Should have initialize/finalize phases.

Module::CPANTS::Kwalitee::Distros

doesn't clean up after mirrored Debian CPANTS file

https://rt.cpan.org/Ticket/Display.html?id=51514

There is much more to do.

- JSON API for metacpan.org and so on
- Email reporting like CPAN Testers
- Evaluate new Kwalitee indicators
- New metrics like portable filename
- Blog about recent trends
- More comprehensive tests
- Analysis per perl version/architecture
- Cover Perl::Critic, CPAN::Critic::Module::Abstract
- 35 RT tickets and several GitHub issues

Resources

github.com/charsbar/www-cpants

github.com/charsbar/worepan

github.com/daxim/Module-CPANTS-Analyse

Questions?

Thank you
