This project scrapes rental listings from Anjuke (安居客).
Component versions:
Scrapy 2.6.3
Python >= 3.8
This project uses Grafana + MySQL to visualize the data.
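To get the data as a CSV file instead, change the pipeline configuration (ITEM_PIPELINES) in settings.py. The commands below comment or uncomment the pipeline entries. They use GNU sed and assume each entry is written with single quotes on its own line:
- Enable CSV file output
sed -i "/#.*'rentHouse\.pipelines\.RentPipelines'/s/#//" settings.py
- Disable CSV file output
sed -i "s/'rentHouse\.pipelines\.RentPipelines'/#&/" settings.py
- Enable the MySQL pipeline
sed -i "/#.*'rentHouse\.pipelines\.RentPipelinesMysql'/s/#//" settings.py
- Disable the MySQL pipeline
sed -i "s/'rentHouse\.pipelines\.RentPipelinesMysql'/#&/" settings.py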
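Where sed is not available (or its -i flag behaves differently), the same toggle can be sketched in Python. This is a minimal sketch, not part of the project: the function name set_pipeline_enabled is hypothetical, and it assumes each pipeline entry in settings.py sits on its own line as a quoted dotted path. The closing quote is part of the match, so toggling RentPipelines does not also touch RentPipelinesMysql.

```python
import re


def set_pipeline_enabled(settings_text: str, pipeline_path: str, enabled: bool) -> str:
    """Comment or uncomment the settings line for one pipeline (hypothetical helper)."""
    # Optional indent, optional single '#', then the quoted path and the rest of the line.
    entry = re.compile(
        r"^(?P<indent>[ \t]*)(?P<hash>#[ \t]*)?"
        r"(?P<rest>['\"]" + re.escape(pipeline_path) + r"['\"].*)$"
    )
    out = []
    for line in settings_text.splitlines():
        m = entry.match(line)
        if m:
            # Rebuild the line with or without the leading '#'.
            prefix = "" if enabled else "#"
            line = m.group("indent") + prefix + m.group("rest")
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if settings_text.endswith("\n") else "")
```

To apply it, read settings.py as text, pass it through the function with a path such as 'rentHouse.pipelines.RentPipelines', and write the result back. Calling it twice with the same arguments leaves the file unchanged.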
Edit the MySQL connection settings in pipelines.py to match your database:
self.conn = pymysql.connect(
    host='127.0.0.1',
    user='username',
    password='********',
    database='scrapy_result'
)
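To keep credentials out of pipelines.py, the connection arguments could be read from environment variables instead. The following is a minimal sketch under assumptions: the helper mysql_connect_kwargs and the RENT_MYSQL_* variable names are hypothetical and do not exist in the project. The defaults mirror the values shown above.

```python
import os


def mysql_connect_kwargs(environ=None):
    """Build keyword arguments for pymysql.connect (hypothetical helper)."""
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if environ is None else environ
    return {
        "host": env.get("RENT_MYSQL_HOST", "127.0.0.1"),
        "user": env.get("RENT_MYSQL_USER", "username"),
        # No default password: it must come from the environment.
        "password": env.get("RENT_MYSQL_PASSWORD", ""),
        "database": env.get("RENT_MYSQL_DATABASE", "scrapy_result"),
    }
```

In the pipeline this would be used as self.conn = pymysql.connect(**mysql_connect_kwargs()), since host, user, password and database are all keyword arguments accepted by pymysql.connect.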
Run the spider:
cd rentHouse && scrapy crawl rent
Default start URL:
https://gz.zu.anjuke.com/?from=esf_list