Integrate Spark Streaming, Kafka, and Logstash to Read and Analyze Logs in Real Time
Below are the steps to integrate Spark with Kafka and Logstash.

Installation of Logstash:

1. The first few steps install and configure Logstash. Open the repository file:

   sudo vi /etc/yum.repos.d/logstash.repo

2. Add the lines below to the file:

   [logstash-2.3]
   name=Logstash repository for 2.3.x packages
   baseurl=https://packages.elastic.co/logstash/2.3/centos
   gpgcheck=1
   gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
   enabled=1

3. Install Logstash:

   yum install logstash

4. Change into the install directory:

   cd /opt/logstash

5. Create a config file logstash-kafka.conf and add the content below (note that Logstash settings use the => assignment syntax, so topic_id => "logstash", not topic_id = 'logstash'):

   input {
     file {
       path => "/opt/gen_logs/logs/access.log"
     }
   }
   output {
     kafka {
       codec => plain {
         format => "%{message}"
       }
       topic_id => "logstash"
     }
   }

6. Check the configuration using bel
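Once Logstash ships the access-log lines into the Kafka topic, a Spark Streaming job can consume and parse each message. As a minimal, standalone sketch of the per-line parsing step only (the log format, field names, and the sample line are assumptions, since the post does not show the contents of access.log; a typical log generator emits Apache common log format):

```python
import re

# Assumed Apache common-log-format pattern; field names are illustrative.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_access_line(line):
    """Return a dict of parsed fields, or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])  # convert status code to int for filtering
    return rec

# Hypothetical sample line of the kind access.log might contain.
sample = '127.0.0.1 - - [10/Oct/2023:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326'
print(parse_access_line(sample)["path"])  # prints /index.html
```

In a real job this function would be applied to each record of the Kafka DStream (e.g. via a map over the stream), after which the parsed fields can be filtered or aggregated per batch.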