piwik elasticsearch kibana at osc tokyo 2016 spring

Piwik fluentd YAMAMOTO Takashi [email protected] @yamachan5593 Piwik Japan Team Feb 27th, 2016 at Open Source Conference Tokyo

Upload: takashi-yamamoto

Post on 15-Apr-2017



• OpenSolaris: https://osdn.jp/projects/jposug/

• Piwikjapan / OSC: https://osdn.jp/projects/piwik-fluentd/

• A Piwik tracker request as recorded in the Piwik server's access log:

125.54.155.180 - - [21/Feb/2016:08:46:13 +0900] "GET
/piwik.php?action_name=example.com%2F%E5%A0%B1%E5%91
&idsite=1&rec=1&r=047899&h=23&m=46&s=16
&url=http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_ja
&_id=4e5ded8520370239&_idts=1435710334&_idvc=387
&_idn=0&_refts=0&_viewts=1455979574&send_image=0
&pdf=1&qt=0&realp=1&wma=1&dir=1&fla=1&java=1&gears=0
&ag=1&cookie=1&res=1366x768 HTTP/1.1" 204 -
"http://jpvlad.com/index.php?topic=eventresult_ja"
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/28.0.1500.63 Safari/537.36"

(The entry is a single line in the log; it is wrapped here for display.)

• These entries are visualized with elasticsearch and kibana.

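What the Piwik Tracker sends to Piwik

• Recorded by the web server: host IP address, user agent, referer

• Sent by the Piwik Tracker as query parameters:
  • idsite: ID of the web site registered in Piwik
  • action_name: title of the web page
  • _id: visitor ID
  • res: screen resolution of the PC
  • pdf: can the web browser display PDF?
  • java: is Java available?
  • fla: is Flash available?
  • cookie: are cookies enabled?
  • _viewts: timestamp of the visitor's previous visit

• See "Supported Query Parameters" [1]

[1] http://developer.piwik.org/api-reference/tracking-api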
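To make the parameter list concrete, here is a minimal Python sketch that splits a tracker request into its query parameters. The request path is copied from the access-log example in these slides, except that the `action_name` value is shortened because the slide shows it cut off. The helper name `tracker_params` is hypothetical and is not part of Piwik or fluentd.

```python
from urllib.parse import urlsplit, parse_qs

# Request path from the access-log example (action_name shortened, see above).
REQUEST_PATH = (
    "/piwik.php?action_name=example.com&idsite=1&rec=1&r=047899&h=23&m=46&s=16"
    "&url=http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_ja"
    "&_id=4e5ded8520370239&_idts=1435710334&_idvc=387"
    "&_idn=0&_refts=0&_viewts=1455979574&send_image=0"
    "&pdf=1&qt=0&realp=1&wma=1&dir=1&fla=1&java=1&gears=0"
    "&ag=1&cookie=1&res=1366x768"
)


def tracker_params(path):
    """Return the tracker's query parameters as a flat dict of decoded strings."""
    query = urlsplit(path).query
    # parse_qs URL-decodes each value and returns a list per key; keep the first.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = tracker_params(REQUEST_PATH)
print(params["idsite"], params["_id"], params["res"], params["url"])
```

The same splitting is done later in the slides inside fluentd; this sketch only shows what the individual values look like once separated.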

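1. Installing Piwik, fluentd, elasticsearch, and kibana

2. How the Piwik Tracker reports to Piwik
   • Piwik is a PHP application
   • The tracker sends its data as a GET request

3. Sending the Piwik access log to elasticsearch with fluentd
   • Storing the data in elasticsearch
   • URL decoding with fluentd

4. Viewing the elasticsearch data with kibana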


• RedHat7 stands for CentOS 7 and Scientific Linux 7
• RedHat6 stands for CentOS 6 and Scientific Linux 6; steps that apply only there are marked RedHat6

• Piwik: see the Piwik Japan web site [2]

• fluentd, elasticsearch, kibana

• Piwik

[2] http://www.piwikjapan.org/ /3985

fluentd ∼ 1

• fluentd is installed as td-agent

• td-agent comes in 2.x and 1.x
  • ruby is bundled in the RPM

• fluentd and ruby versions
  • RedHat6: ruby 1.9.3
  • RedHat7: ruby 2.0
  • td-agent 2.x: ruby 2.2

• fluentd plugins are built into the fluentd RPM
  • elasticsearch

fluentd ∼ 2

• Building ruby 2.2.4

1. Build a ruby RPM for RedHat
   • CentOS, Scientific Linux
   • 6 and 7

2. It is needed to build the td-agent RPM

3. Install the tools for building an RPM from an SRPM

$ sudo yum groupinstall "Development tools"

4. Get ruby223.spec from "CentOS 6 ruby RPM" [3]

5. Start an RPM build and stop it with Ctrl+C; this creates ~/rpmbuild

$ rpmbuild -bp ruby223.spec    (then press Ctrl+C)

$ mv ruby223.spec rpmbuild/SPECS/ruby224.spec    (223 becomes 224)

[3] http://www.torutk.com/projects/swe/wiki/CentOS 6 ruby RPM

fluentd ∼ 3

• Building ruby 2.2.4

1. Edit ~/rpmbuild/SPECS/ruby224.spec

%define rubyver 2.2.4

2. Download ruby-2.2.4.tar.bz2 from "Ruby 2.2.4" [4]

3. Put ruby-2.2.4.tar.bz2 in ~/rpmbuild/SOURCES

4. Build and install the RPM

$ cd ~/rpmbuild/SPECS
$ rpmbuild -ba ruby224.spec
$ sudo rpm -ivh \
    ~/rpmbuild/RPMS/x86_64/ruby-2.2.4-1.el7.x86_64.rpm

On RedHat6 the file name contains el6.

$ ruby -v
ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]

[4] https://www.ruby-lang.org/ja/news/2015/12/16/ruby-2-2-4-released/

fluentd ∼ 4

1. Enable the epel repository

$ sudo yum install \
    http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/7/x86_64/e/epel-release-7-5.noarch.rpm

• On RedHat6:

$ sudo yum install \
    http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

2. Install the required packages

$ sudo yum install gecode gecode-devel fakeroot

fluentd ∼ 5

1. On RedHat6, build and install git

$ wget http://dl.marmotte.net/rpms/redhat/el6/x86_64/git-1.8.3.1-3.el6/git-1.8.3.1-3.el6.src.rpm
$ cp git-1.8.3.1-3.el6.src.rpm ~/rpmbuild/SRPMS/
$ rpmbuild --rebuild \
    ~/rpmbuild/SRPMS/git-1.8.3.1-3.el6.src.rpm
$ sudo yum install perl-TermReadKey
$ sudo rpm -ivh \
    ~/rpmbuild/RPMS/x86_64/git-1.8.3.1-3.el6.x86_64.rpm

• git 1.8: the "-c" option
• git 1.8
• epel

fluentd ∼ 6

• Building fluentd with ruby

1. Install bundler

$ sudo gem install bundler

2. Clone from GitHub

$ cd ~
$ git clone \
    git@github.com:treasure-data/omnibus-td-agent.git
$ cd ~/omnibus-td-agent

3. treasure-data/omnibus-td-agent [5]: change the Gemfile because of multipart-post

[5] https://github.com/treasure-data/omnibus-td-agent

fluentd ∼ 7

• multipart-post
• ~/omnibus-td-agent/Gemfile: the gem 'pedump' ... line [6]

source 'https://rubygems.org'

# Use Berkshelf for resolving cookbook dependencies
gem 'berkshelf', '~> 3.0'

gem 'pedump', git: 'https://github.com/ksubrama/pedump',
  branch: 'patch-1' #

# Install omnibus software
#gem 'omnibus', '~> 5.0'

[6] https://github.com/piwikjapan/omnibus-td-agent/blob/master/Gemfile

fluentd ∼ 8

• The elasticsearch, record-reformer, and norikra plugins are included in the RPM

• norikra

• ~/omnibus-td-agent/plugin_gems.rb

download "fluent-plugin-norikra", "0.2.2"
download "fluent-plugin-elasticsearch", "1.3.0"
download "fluent-plugin-record-reformer", "0.8.0"

fluentd ∼ 9

• norikra
• norikra
• norikra-client needs msgpack-rpc-over-http, which needs rack; use rack 1.6.4 rather than 2.x

• ~/omnibus-td-agent/core_gems.rb

download "rack", "1.6.4"
download "norikra-client", "1.3.1"

fluentd ∼ 10

• Prepare the build directories [7]

$ sudo mkdir -p /opt/td-agent /var/cache/omnibus
$ sudo chown yamachan:yamachan /opt/td-agent
$ sudo chown yamachan:yamachan /var/cache/omnibus

• Replace yamachan:yamachan with your own user and group (check with id)

[7] https://github.com/treasure-data/omnibus-td-agent

fluentd ∼ 11:

1. Build [8]

$ cd ~/omnibus-td-agent
$ bundle install --binstubs

sudo

$ bin/gem_downloader core_gems.rb
$ bin/gem_downloader plugin_gems.rb
$ bin/omnibus build td-agent2

[8] https://github.com/treasure-data/omnibus-td-agent

fluentd ∼

1. Install the RPM created under pkg

$ cd ~/omnibus-td-agent/pkg
$ sudo yum install td-agent-2.3.1-0.el7.x86_64.rpm

2. On RedHat6 the file is td-agent-2.3.1-0.el6.x86_64.rpm

elasticsearch

1. RedHat7, RedHat6

$ sudo yum install \

https://download.elasticsearch.org/elasticsearch/\

release/org/elasticsearch/distribution/\

rpm/elasticsearch/2.2.0/elasticsearch-2.2.0.rpm

2. Install the kuromoji analysis plugin

$ sudo /usr/share/elasticsearch/bin/plugin \
    install analysis-kuromoji

kibana

1. Build the RPM

$ cd ~
$ git clone git@github.com:piwikjapan/kibana-rpm-packaging.git
$ cd kibana-rpm-packaging
$ cp kibana.sysconfig kibana.service ~/rpmbuild/SOURCES
$ cp kibana.spec ~/rpmbuild/SPECS
$ wget -P ~/rpmbuild/SOURCES \
    https://download.elastic.co/kibana/kibana/kibana-4.4.1-linux-x64.tar.gz
$ rpmbuild -ba ~/rpmbuild/SPECS/kibana.spec

2. Install the RPM

$ sudo rpm -ivh ~/rpmbuild/RPMS/x86_64/kibana-4.4.1-1.x86_64.rpm

RedHat6 kibana

• See "kibana4" [9]

[9] http://qiita.com/nagomu1985/items/82e699dde4f99b2ce417

1. Open the ports, for example norikra 26578/tcp

$ sudo firewall-cmd --zone=public \
    --add-port=26578/tcp --permanent # norikra web
$ sudo firewall-cmd --zone=public \
    --add-port=5601/tcp --permanent # kibana web
$ sudo firewall-cmd --zone=public \
    --add-port=24224/udp --permanent # fluentd heartbeat
$ sudo firewall-cmd --zone=public \
    --add-port=24224/tcp --permanent # fluentd data

RedHat6

1. Open the ports, for example norikra 26578/tcp

2. In /etc/sysconfig/iptables, add the two rules below after this line:

-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

-A INPUT -p tcp -m multiport --dports 26578,5601,24224 -j ACCEPT
-A INPUT -p udp -m udp --dport 24224 -j ACCEPT

3. Reload the rules

$ sudo service iptables reload

td-agent

• Connecting Piwik to elasticsearch and kibana

1. Piwik server and elasticsearch server
2. The Piwik server forwards to the elasticsearch server

td-agent ∼ Piwik 1

• Settings for sending Piwik data to elasticsearch
• td-agent configuration file: /etc/td-agent/td-agent.conf

• See "Piwik elasticsearch" [10]

[10] https://osdn.jp/projects/piwik-fluentd/wiki/FrontPage

td-agent ∼ Piwik 2

• Read the Piwik access log
• tag: piwiktracker.apache.access

<source>

type tail

format apache

time_format %d/%b/%Y:%H:%M:%S %z

pos_file /var/log/td-agent/access_log.pos

path /var/log/httpd/access_log

tag piwiktracker.apache.access

</source>
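As a rough picture of what `format apache` and `time_format` yield, the Python sketch below splits one access-log line into named fields, including the `path` field that the later filter and match blocks inspect. The regular expression is an approximation written for this example and is an assumption, not the pattern fluentd itself uses; `parse_line` is a hypothetical helper.

```python
import re
from datetime import datetime

# Approximation of an Apache combined-log line (assumption, not fluentd's pattern).
APACHE_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\S+) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

# Same format string as time_format in the <source> block.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_line(line):
    """Return a dict of named fields, or None when the line does not match."""
    match = APACHE_LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], TIME_FORMAT)
    return record


# Shortened version of the log entry shown earlier in the slides.
line = (
    '125.54.155.180 - - [21/Feb/2016:08:46:13 +0900] '
    '"GET /piwik.php?action_name=top&idsite=1&rec=1 HTTP/1.1" 204 - '
    '"http://jpvlad.com/index.php?topic=eventresult_ja" "Mozilla/5.0"'
)
record = parse_line(line)
print(record["host"], record["path"], record["code"])
```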

td-agent ∼ Piwik 3

• Forward the records from the Piwik server
• Set host to your elasticsearch server

<match piwiktracker.apache.access>
  type forward
  send_timeout 60s
  recover_wait 300s
  heartbeat_interval 1s
  phi_threshold 16
  hard_timeout 60s
  <server>
    name fluentd
    # address of your elasticsearch server, e.g. 10.x.x.x
    host your_elasticsearch_server
    port 24224
    weight 100
  </server>
</match>

td-agent ∼ Piwik 4

• Send only Tracker requests to elasticsearch

1. Piwik
2. Piwik API
3. filter, then match piwiktracker.apache.access

<filter piwiktracker.apache.access>
  type grep
  regexp1 path /piwik\.php\?action_name=.*\&idsite=\d+
</filter>
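The effect of the grep filter can be tried outside fluentd. The Python sketch below uses the filter's pattern, written with `action_name`, and keeps only paths that look like tracker requests. The function name and the sample paths are made up for illustration.

```python
import re

# Pattern from the grep filter, written with action_name.
TRACKER_REQUEST = re.compile(r"/piwik\.php\?action_name=.*&idsite=\d+")


def is_tracker_request(path):
    """True when the request path looks like a Piwik tracker call."""
    return TRACKER_REQUEST.search(path) is not None


# Hypothetical sample paths: one tracker call, one other request.
paths = [
    "/piwik.php?action_name=top&idsite=1&rec=1",
    "/index.php?module=CoreHome&action=index",
]
kept = [p for p in paths if is_tracker_request(p)]
print(kept)
```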

<match piwiktracker.apache.access>

type record_reformer

tag piwiktracker.apache.access.urldecode

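td-agent ∼ Piwik 5

• Split the request into fields for elasticsearch
• fluentd does the splitting

• Field names follow "Supported Query Parameters" [11]
• The tracker's "_id" is stored as piwikid
• New tag: piwiktracker.apache.access.urldecode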

<match piwiktracker.apache.access>

type record_reformer

tag piwiktracker.apache.access.urldecode

29 3

  # site ID
  idsite ${path[/piwik\.php\?action_name=.*\&idsite=(\d+)/,1]}
  # visitor ID
  piwikid ${path[/piwik\.php\?action_name=.*\&_id=([a-z\d]+)/,1]}
  # is flash available?
  fla ${path[/piwik\.php\?action_name=.*\&fla=(\d+)/,1] == "1" ? "true" : "false"}
</match>

[11] http://developer.piwik.org/api-reference/tracking-api
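The three expressions in the record_reformer block each take one capture group out of `path`. A minimal Python sketch of the same extraction follows, assuming the parameter names `action_name` and `_id` as they appear in the access-log example. `capture` and `reform` are hypothetical helpers, and a missing parameter yields `None`, in the same way Ruby's `String#[]` yields `nil`.

```python
import re


def capture(pattern, path):
    """Return the first capture group of pattern in path, or None."""
    match = re.search(pattern, path)
    return match.group(1) if match else None


def reform(path):
    """Mirror the idsite, piwikid, and fla fields of the record_reformer block."""
    prefix = r"/piwik\.php\?action_name=.*&"
    fla = capture(prefix + r"fla=(\d+)", path)
    return {
        "idsite": capture(prefix + r"idsite=(\d+)", path),
        "piwikid": capture(prefix + r"_id=([a-z\d]+)", path),
        # The config turns "1" into the string "true", anything else into "false".
        "fla": "true" if fla == "1" else "false",
    }


path = "/piwik.php?action_name=top&idsite=1&rec=1&_id=4e5ded8520370239&_idts=1435710334&fla=1&java=1"
print(reform(path))
```

Note that `&_id=` cannot match `&_idts=` because the pattern requires the equals sign directly after `_id`.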

td-agent ∼ Piwik 6

• Values on their way to elasticsearch are still URL-encoded
• fluentd URL-decodes them
• New tag: piwiktracker.apache.access.store

<match piwiktracker.apache.access.urldecode>

type uri_decode

tag piwiktracker.apache.access.store

key_names action_name,ref,url,urlref

</match>
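What the decoding step does to a record can be sketched in Python with the standard library. The key list matches `key_names` in the block above; the sample record is hypothetical, with its `url` value taken from the access-log example in these slides.

```python
from urllib.parse import unquote

# Same keys as key_names in the uri_decode block.
KEY_NAMES = ("action_name", "ref", "url", "urlref")


def uri_decode(record, key_names=KEY_NAMES):
    """Return a copy of record with the listed keys URL-decoded."""
    decoded = dict(record)
    for key in key_names:
        if key in decoded:
            decoded[key] = unquote(decoded[key])
    return decoded


record = {
    "idsite": "1",
    "url": "http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_ja",
}
print(uri_decode(record)["url"])
```

Keys that are not listed, such as `idsite` here, pass through unchanged.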

td-agent ∼ Piwik 7:

• Send the records to elasticsearch
• The store section writes to elasticsearch

<match piwiktracker.apache.access.store>

type copy

<store>

type elasticsearch

type_name access_log

host 127.0.0.1

port 9200

logstash_format true

logstash_prefix apache-log

logstash_dateformat %Y%m%d

include_tag_key true

tag_key @log_name

flush_interval 10s

</store>

</match>

td-agent ∼ Piwik 1

• Settings for sending Piwik data to elasticsearch
• td-agent configuration file: /etc/td-agent/td-agent.conf

• See "Piwik elasticsearch" [12]

[12] https://osdn.jp/projects/piwik-fluentd/wiki/FrontPage

td-agent ∼ Piwik 2:

• Flow of tags from Piwik to elasticsearch

• Records arrive from the Piwik server by forward

<source>

tag piwiktracker.apache.access

</source>

<match piwiktracker.apache.access>

tag piwiktracker.apache.access.urldecode

</match>

<match piwiktracker.apache.access.urldecode>

tag piwiktracker.apache.access.store

</match>

<match piwiktracker.apache.access.store>

</match>

elasticsearch 1

• Data sent from fluentd to elasticsearch

• Fields are stored as string

elasticsearch 2 ∼

• Elasticsearch supports the following simple field types [13]:
  • String: string
  • Whole number: byte, short, integer, long
  • Floating-point: float, double
  • Boolean: boolean
  • Date: date

[13] https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html

elasticsearch 3 ∼

• The mapping is written in Json [14] [15]

• See "elasticsearch mapping" [16]

[14] MySQL, elasticsearch
[15]
[16] https://osdn.jp/projects/piwik-fluentd/wiki/elasticsearch#h2-elasticsearch.20.E3.81.AE.20mapping.20.E8.A8.AD.E5.AE.9A

elasticsearch 4 ∼ Json

• "template": "apache-log-*", [17]
  The mapping applies to indices named after these td-agent.conf settings:

  logstash_prefix apache-log
  logstash_dateformat %Y%m%d

  Index names therefore start with "apache-log-", followed by the date.

• "settings": { index
  For the kuromoji settings, see "Elasticsearch kuromoji" [18]

[17] DB
[18] http://tech.gmo-media.jp/post/70245090007/elasticsearch-kuromoji-japanese-fulltext-search
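How the template pattern lines up with the index names can be checked with a small Python sketch. It assumes the index name is the prefix, a hyphen, and the date formatted with `logstash_dateformat`; which time zone the plugin applies to the date is not covered by the slides and is left out here. `index_name` is a hypothetical helper.

```python
from datetime import datetime
from fnmatch import fnmatch


def index_name(prefix, dateformat, when):
    """Build an index name from logstash_prefix and logstash_dateformat."""
    return f"{prefix}-{when.strftime(dateformat)}"


# Values from td-agent.conf; the date is the day of the sample log entry.
name = index_name("apache-log", "%Y%m%d", datetime(2016, 2, 21))
print(name, fnmatch(name, "apache-log-*"))
```

An index whose name does not start with `apache-log-` would not pick up the template's mapping.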

elasticsearch 5 ∼ Json

• "mappings": { "access_log": {
  "access_log" is the type_name set in td-agent.conf:

  type_name access_log [19]

[19] "_default_"

elasticsearch 6 ∼ Json

• _source and _all

"mappings": {
  "access_log": {
    "_source": {
      "enabled": "false"        (true to enable)
    },
    "_all": {
      "enabled": "false"        (true to enable)
    },

elasticsearch 7 ∼ Json

"mappings": {
  "access_log": {
    "properties": {
      "@log_name": {            (tag_key in td-agent.conf)
        "type": "string",
        "store": "true",
        "index": "not_analyzed"
      },

elasticsearch 8 ∼ Json

"ref": {                        (field from td-agent.conf)
  "type": "multi_field",
  "fields": {
    "ref": {
      "type": "string",
      "index": "analyzed",
      "store": "true"
    },
    "full": {
      "type": "string",
      "index": "not_analyzed",
      "store": "true"
    }
  }
},

elasticsearch 9: ∼ Json

"action_name": {
  "type": "string",
  "analyzer": "kuromoji_analyzer",
  "store": "true"
},
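The JSON fragments on the last few slides can be put together into one template document. The Python sketch below does that and prints the result as JSON. The field definitions and the `_source` and `_all` values mirror the slides. The contents of `settings` are an assumption, since the slides only point to an external article for the kuromoji setup; the analyzer definition shown here is one plausible form, not the author's.

```python
import json

template = {
    "template": "apache-log-*",
    # Assumption: a custom analyzer named kuromoji_analyzer built on the
    # kuromoji_tokenizer of the analysis-kuromoji plugin.
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "kuromoji_analyzer": {
                        "type": "custom",
                        "tokenizer": "kuromoji_tokenizer",
                    }
                }
            }
        }
    },
    "mappings": {
        "access_log": {
            # Values as shown on the slide.
            "_source": {"enabled": "false"},
            "_all": {"enabled": "false"},
            "properties": {
                "@log_name": {
                    "type": "string",
                    "store": "true",
                    "index": "not_analyzed",
                },
                "ref": {
                    "type": "multi_field",
                    "fields": {
                        "ref": {"type": "string", "index": "analyzed", "store": "true"},
                        "full": {"type": "string", "index": "not_analyzed", "store": "true"},
                    },
                },
                "action_name": {
                    "type": "string",
                    "analyzer": "kuromoji_analyzer",
                    "store": "true",
                },
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

Only three fields are listed because only those appear on the slides; the project wiki referenced earlier holds the full mapping.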

• Start td-agent, elasticsearch, and kibana

# service td-agent start
# service elasticsearch start
# service kibana start

• Open kibana at http://your_elasticsearch_server:5601/