This project streams stock market data through Apache Kafka to simulate a real-time data feed, lands the events in Amazon S3, and queries them with AWS Glue and Amazon Athena.
- AWS account
- Python
- Python libraries: confluent_kafka, boto3, and pandas
- SSH into the EC2 instance.
- Install Apache Kafka on the EC2 instance.
- Change directory to your Kafka folder on the EC2 instance and open "config/server.properties" with "sudo nano config/server.properties". Set advertised.listeners to the public IP of the EC2 instance, e.g. "advertised.listeners=PLAINTEXT://{EC2 public IP}:9092" (uncomment the line if it is commented out).
- Start ZooKeeper using the command "bin/zookeeper-server-start.sh config/zookeeper.properties"
- Open another SSH session to the EC2 instance and change directory to the Kafka folder.
- Use the command "export KAFKA_HEAP_OPTS='-Xmx256M -Xms128M'" (no spaces around the "=") to set the Kafka heap size, since the default 1 GB heap can be too large for a small EC2 instance.
- Start the Kafka server using "bin/kafka-server-start.sh config/server.properties"
- Open another SSH session to the EC2 instance and change directory to the Kafka folder.
- Create a topic using the command "bin/kafka-topics.sh --create --topic {your topic name} --bootstrap-server {EC2 public IP}:9092 --replication-factor 1 --partitions 1"
- Make your kafka_consumer.py executable using the "chmod u+x" command.
- Run your producer script, then run your consumer script (minimal sketches of both are included after this list).
- Create an AWS Glue crawler to build a table from your S3 location (a boto3 sketch is included after this list).
- Use Amazon Athena to query the data (also covered in the boto3 sketch after this list).
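
The producer and consumer scripts referenced above are not shown in this section, so here is a minimal producer sketch. It assumes the stock market data sits in a local CSV file; the broker address, topic name, and file name ("stock_data.csv") are placeholders to replace with your own.

```python
# kafka_producer.py - minimal sketch: replays rows of a CSV file as JSON events.
import time

import pandas as pd
from confluent_kafka import Producer

BOOTSTRAP_SERVERS = "{EC2 public IP}:9092"   # replace with your EC2 public IP
TOPIC = "stock-market"                       # placeholder: use your topic name
CSV_PATH = "stock_data.csv"                  # placeholder: your stock market dataset

producer = Producer({"bootstrap.servers": BOOTSTRAP_SERVERS})
df = pd.read_csv(CSV_PATH)

try:
    while True:
        # Pick a random row and serialize it as a JSON object to mimic a live tick.
        payload = df.sample(1).iloc[0].to_json()
        producer.produce(TOPIC, value=payload.encode("utf-8"))
        producer.poll(0)   # serve delivery callbacks
        time.sleep(1)      # throttle the simulated feed
except KeyboardInterrupt:
    producer.flush()       # deliver any buffered messages before exiting
```

A matching consumer sketch, assuming each event is written to S3 as its own JSON object so the Glue crawler can infer a schema later; the bucket name ("my-stock-market-bucket") and consumer group id are placeholders.

```python
# kafka_consumer.py - minimal sketch: reads events from Kafka and writes them to S3.
import boto3
from confluent_kafka import Consumer

BOOTSTRAP_SERVERS = "{EC2 public IP}:9092"   # replace with your EC2 public IP
TOPIC = "stock-market"                       # must match the producer's topic
BUCKET = "my-stock-market-bucket"            # placeholder: your S3 bucket

consumer = Consumer({
    "bootstrap.servers": BOOTSTRAP_SERVERS,
    "group.id": "stock-market-consumers",    # placeholder consumer group id
    "auto.offset.reset": "earliest",
})
consumer.subscribe([TOPIC])
s3 = boto3.client("s3")

count = 0
try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # One JSON object per S3 key keeps the layout simple for the Glue crawler.
        s3.put_object(
            Bucket=BUCKET,
            Key=f"stock_market/stock_market_{count}.json",
            Body=msg.value(),
        )
        count += 1
finally:
    consumer.close()
```

The Glue crawler and Athena query in the last two steps are usually set up from the AWS console; the sketch below shows an equivalent path with boto3, assuming a hypothetical Glue database name, crawler name, and IAM role ARN, and that the crawler names the table "stock_market" after the S3 folder it crawls.

```python
# Glue + Athena sketch: crawl the S3 prefix into a table, then query it.
import time

import boto3

BUCKET = "my-stock-market-bucket"          # same bucket the consumer writes to
DATABASE = "stock_market_db"               # placeholder Glue database name
CRAWLER = "stock-market-crawler"           # placeholder crawler name
ROLE_ARN = "arn:aws:iam::{your account id}:role/{your glue crawler role}"  # placeholder IAM role

glue = boto3.client("glue")
athena = boto3.client("athena")

# Create the database and a crawler pointed at the consumer's S3 prefix, then run it.
glue.create_database(DatabaseInput={"Name": DATABASE})
glue.create_crawler(
    Name=CRAWLER,
    Role=ROLE_ARN,
    DatabaseName=DATABASE,
    Targets={"S3Targets": [{"Path": f"s3://{BUCKET}/stock_market/"}]},
)
glue.start_crawler(Name=CRAWLER)

# After the crawler finishes, query the table it created (name derived from the folder).
query = athena.start_query_execution(
    QueryString="SELECT * FROM stock_market LIMIT 10",
    QueryExecutionContext={"Database": DATABASE},
    ResultConfiguration={"OutputLocation": f"s3://{BUCKET}/athena-results/"},
)
query_id = query["QueryExecutionId"]

# Poll until the query reaches a terminal state, then print the result rows.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]
    if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if status["State"] == "SUCCEEDED":
    # The first row returned by Athena is the column header.
    for row in athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```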