Like loggly using open source

Post on 11-Aug-2014

Category:

Data & Analytics

DESCRIPTION

Streaming logs in cloud

TRANSCRIPT

Stream your Cloud
Thomas Alrin, alrin@megam.co.in

We'll cover
- What to stream
- Choices for streaming
- Setting up streaming from a VM
- Chef recipes

What to stream
You can stream the following from the cloud:
- Traces (logs)
- Metrics
- Monitoring
- Status

Scenario
An app or service runs in the cloud.
- We need the log files of your app: web server logs, container logs, app logs.
- We need the log files of your service: service logs.

SaaS Vendors
You can get this as a SaaS service from vendors such as Loggly and Papertrail.
We plan to build a streamee...

Choices for streaming
- Logstash: logstash.net/
- Fluentd: www.fluentd.org/
- Beaver: github.com/josegonzalez/beaver
- Logstash-Forwarder: github.com/elasticsearch/logstash-forwarder
- Woodchuck: github.com/danryan/woodchuck
- RSYSLOG: http://rsyslog.com
- Heka: http://hekad.readthedocs.org/en/latest/

Name                Language     Collector  Shipper  Footprint  Ease of setting up
Logstash            JRuby (JVM)  Yes        No       High       > Easy
Fluentd             Ruby         Yes        No       High       > Easy
Beaver              Python       No         Yes      Low        Easy
Logstash-Forwarder  Go           No         Yes      Low        Difficult (uses SSL)
Woodchuck           Ruby         No         Yes      High       > Easy
RSYSLOG             C            Yes        Yes      Low        Difficult
Heka                Go           Yes        Yes      Low        Easy

Our requirements
Two sets of logs to collect:
- All the trace output produced when the VM is spun up.
- All the trace output of the application or service inside the VM.
Publish them to an in-memory store (queue) that can be accessed by a key.

We tried: Logstash, Beaver, Logstash-forwarder, Woodchuck, Heka, RSYSLOG
We use: Heka, Beaver, RSYSLOG

[Diagram: inside megamd, the shipper agent reads howdy.log and howdy_err.log for each host directory (fir.domain.com, doe.domain.com, gir.domain.com, her.domain.com) under /usr/share/megam/megamd/logs and publishes them over AMQP to Queue#1 through Queue#4.]

How does it work?
Heka resides inside our Megam Engine (megamd). Its job is to collect the trace information when a VM is run.
1. Reads the dynamically created VM execution log files.
2. Formats the log contents as JSON for every VM execution.
3. Publishes the log contents to a queue.

Beaver resides in each of the VMs.
It does the following steps:
1. Reads the log files inside the VM.
2. Formats the log contents as JSON.
3. Publishes the log contents to a queue.

Logstash
- A centralized logging framework that can transfer logs from multiple hosts to a central location.
- Written in JRuby, so it needs a JVM.
- The JVM consumes a lot of memory.
- Logstash is ideal as a centralized collector, not as a shipper.

Logstash Shipper Scenario
Let us ship logs from a VM, /usr/share/megam/megamd/logs/*/*, to Redis or AMQP. For example:
    ../megamd/logs/pogo.domain.com/howdy.log  ->  queue named pogo.domain.com in AMQP
    ../megamd/logs/doe.domain.com/howdy.log   ->  queue named doe.domain.com in AMQP

Logstash Shipper - Sample conf

    /opt/logstash/agent/etc$ sudo cat shipper.conf
    input {
      file {
        type => "access-log"
        path => [ "/usr/local/share/megam/megamd/logs/*/*" ]
      }
    }
    filter {
      grok {
        type => "access-log"
        match => [ "@source_path", "(//usr/local/share/megam/megamd/logs/)(?<source_key>.+)(//*)" ]
      }
    }
    output {
      stdout { debug => true debug_format => "json" }
      redis {
        key => '%{source_key}'
        type => "access-log"
        data_type => "channel"
        host => "my_redis_server.com"
      }
    }

Logs inside each directory are shipped to the Redis key named after that directory.

Logstash : Start the agent

    java -jar /opt/logstash/agent/lib/logstash-1.4.2.jar agent -f /opt/logstash/agent/etc/shipper.conf

If you don't have a JRE, install one:

    sudo apt-get install openjdk-7-jre-headless

Heka
- Mozilla uses it internally.
- Written in Go, so it runs as a native binary.
- Ideal as both a centralized collector and a shipper.
- We picked Heka.
Our modified version: https://github.com/megamsys/heka

Installation
Download the deb from https://github.com/mozilla-services/heka/releases, or build from source:

    git clone https://github.com/megamsys/heka.git
    cd heka
    source build.sh
    cd build
    make deb
    dpkg -i heka_0.6.0_amd64.deb

Our Heka usage
[Diagram: megamd (Megam Engine), Heka, RabbitMQ logs queue, Realtime Streamer.]

Heka configuration

    nano /etc/hekad.toml

    [TestWebserver]
    type = "LogstreamerInput"
    log_directory = "/usr/share/megam/heka/logs/"
    file_match = '(?P<DomainName>[^/]+)/(?P[^/]+)'
    differentiator = ["DomainName", "_log"]

    [AMQPOutput]
    url = "amqp://guest:guest@localhost/"
    exchange = "test_tom"
    queue = true
    exchangeType = "fanout"
    message_matcher = 'TRUE'
    encoder = "JsonEncoder"

    [JsonEncoder]
    fields = [ "Timestamp", "Type", "Logger", "Payload", "Hostname" ]

Run heka

    sudo hekad -config="/etc/hekad.toml"

We can see output like the following in the queue:

    {"Timestamp":"2014-07-08T12:53:44.004Z","Type":"logfile","Logger":"tom.com_log","Payload":"TEST\u000a","Hostname":"alrin"}

Beaver
Beaver is a lightweight Python log file shipper that sends logs to an intermediate broker for further processing.
Beaver is ideal when the VM does not have enough memory to run a large JVM application as a shipper.

Our Beaver usage
[Diagram: Beaver on VM#1, VM#2 ... VM#n; megamd (Megam Engine) with Heka; RabbitMQ logs queue; Realtime Streamer.]

Chef Recipe : Beaver
When a VM is run, recipe(megam_logstash::beaver) is included.

    node.set['logstash']['key'] = "#{node.name}"
    node.set['logstash']['amqp'] = "#{node.name}_log"
    node.set['logstash']['beaver']['inputs'] = [ "/var/log/upstart/nodejs.log", "/var/log/upstart/gulpd.log" ]
    include_recipe "megam_logstash::beaver"

Attributes such as the node name and log files are set dynamically.

RSYSLOG
RSYSLOG is the rocket-fast system for log processing. It offers high performance, great security features and a modular design.
Megam uses RSYSLOG to ship logs from VMs to Elasticsearch.

Chef Recipe : Rsyslog
When a VM is run, recipe(megam_logstash::rsyslog) is included.

    node.set['rsyslog']['index'] = "#{node.name}"
    node.set['rsyslog']['elastic_ip'] = "monitor.megam.co.in"
    node.set['rsyslog']['input']['files'] = [ "/var/log/upstart/nodejs.log", "/var/log/upstart/gulpd.log" ]
    include_recipe "megam_logstash::rsyslog"

Attributes such as the node name and log files are set dynamically.

For more details
http://www.gomegam.com
email: gomegam@megam.co.in
twitter: @megamsystems
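
Appendix: a sketch of the shipper steps
The three steps described for Heka and Beaver (read a log file, format each line as JSON, publish it to a queue named after the host) can be illustrated with a small Python sketch. This is not the Megam, Heka or Beaver code. The function names are hypothetical, an in-memory deque stands in for the RabbitMQ or Redis broker, and the logs root and JSON field names are assumed from the examples in the slides.

```python
import json
import re
from collections import defaultdict, deque
from datetime import datetime, timezone

# Assumed layout, taken from the slides: <logs root>/<domain>/<logfile>
LOG_PATH = re.compile(
    r"^/usr/share/megam/megamd/logs/(?P<domain>[^/]+)/(?P<name>[^/]+)$"
)

# Hypothetical stand-in for the broker: one in-memory queue per key.
queues = defaultdict(deque)


def queue_name(path):
    """Return the queue key, i.e. the host directory under the logs root."""
    match = LOG_PATH.match(path)
    if match is None:
        raise ValueError("not under the logs root: " + path)
    return match.group("domain")


def to_json(path, line, hostname, now=None):
    """Format one log line as JSON, using the field names from the Heka output."""
    now = now or datetime.now(timezone.utc)
    millis = "%03dZ" % (now.microsecond // 1000)
    record = {
        "Timestamp": now.strftime("%Y-%m-%dT%H:%M:%S.") + millis,
        "Type": "logfile",
        "Logger": queue_name(path) + "_log",
        "Payload": line,
        "Hostname": hostname,
    }
    return json.dumps(record)


def publish(path, line, hostname, now=None):
    """Append the JSON record to the queue keyed by the host directory name."""
    key = queue_name(path)
    queues[key].append(to_json(path, line, hostname, now))
    return key
```

Calling publish("/usr/share/megam/megamd/logs/doe.domain.com/howdy.log", "TEST\n", "alrin") returns the key doe.domain.com and leaves one JSON record in queues["doe.domain.com"]. A real shipper would replace the deque with an AMQP or Redis client and would tail the files instead of receiving lines as arguments.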