rahul10mail/kafka_project
kafka_project

This project streams stock market data through Apache Kafka to simulate a real-time data feed.

Prerequisite

  1. An AWS account
  2. Python
  3. Libraries needed: confluent_kafka, boto3, and pandas

Steps to run the code

  1. SSH into the EC2 instance.
  2. Install Kafka on the EC2 instance.
  3. Change to your Kafka directory on EC2 and edit "config/server.properties" (for example with "sudo nano config/server.properties") so that "advertised.listeners" points to the public IP of the EC2 instance.
  4. Start ZooKeeper with "bin/zookeeper-server-start.sh config/zookeeper.properties".
  5. Open another SSH session to the EC2 instance and change to the Kafka directory.
  6. Run "export KAFKA_HEAP_OPTS='-Xmx256M -Xms128M'" (no spaces around "=") to set the Kafka server heap size.
  7. Start the Kafka server with "bin/kafka-server-start.sh config/server.properties".
  8. Open another SSH session to the EC2 instance and change to the Kafka directory.
    Create a topic with "bin/kafka-topics.sh --create --topic {your topic name} --bootstrap-server {EC2 public IP}:9092 --replication-factor 1 --partitions 1".
  9. Make kafka_consumer.py executable with "chmod u+x kafka_consumer.py". Run your producer script, then run your consumer script.
  10. Create an AWS Glue crawler to build a table from your S3 location.
  11. Use Amazon Athena to query the data.
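The producer script from step 9 can be sketched as follows. This is a minimal sketch, not the repository's actual kafka_producer.py: the CSV filename "stock_data.csv", the "{EC2 public IP}" address, and the "{your topic name}" topic are placeholders you must replace with your own values.

```python
import json
import time

def row_to_message(row):
    """Serialize one stock record (a dict) into a JSON-encoded payload."""
    return json.dumps(row).encode("utf-8")

def main():
    # confluent_kafka and pandas are imported here so the helper above
    # stays importable on machines without a Kafka client installed.
    import pandas as pd
    from confluent_kafka import Producer

    # Placeholders: substitute your EC2 public IP and your data file.
    producer = Producer({"bootstrap.servers": "{EC2 public IP}:9092"})
    df = pd.read_csv("stock_data.csv")

    # Send one random row per second to mimic a live market feed.
    while True:
        row = df.sample(1).to_dict(orient="records")[0]
        producer.produce("{your topic name}", value=row_to_message(row))
        producer.poll(0)  # serve delivery callbacks
        time.sleep(1)

if __name__ == "__main__":
    main()
```

Sampling random rows on a timer is what turns a static CSV into a stream; any per-row iteration with a delay would work equally well.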
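The matching consumer from step 9, which lands each message in S3 for the Glue crawler in step 10, could look like this. Again a hedged sketch rather than the repository's kafka_consumer.py: the "{your bucket name}" bucket, the "stock-consumer" group id, and the object-key layout are illustrative assumptions.

```python
def object_key(topic, count):
    """Build an S3 object key for the count-th message from a topic."""
    return f"{topic}/record_{count}.json"

def main():
    # boto3 and confluent_kafka are imported here so the helper above
    # stays importable without AWS or Kafka clients installed.
    import boto3
    from confluent_kafka import Consumer

    # Placeholders: substitute your EC2 public IP, topic, and bucket.
    consumer = Consumer({
        "bootstrap.servers": "{EC2 public IP}:9092",
        "group.id": "stock-consumer",          # assumed group id
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["{your topic name}"])
    s3 = boto3.client("s3")

    count = 0
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            # Each record becomes its own JSON object in S3, so the
            # Glue crawler in step 10 can infer a table schema.
            s3.put_object(
                Bucket="{your bucket name}",
                Key=object_key(msg.topic(), count),
                Body=msg.value(),
            )
            count += 1
    finally:
        consumer.close()
```

Writing one object per message keeps the sketch simple; batching records into larger files would reduce S3 request volume for a real workload.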
