Merge pull request #13 from sammyfung/mv_to_subfolder
- Move to sub-folder, and remove django models support.
sammyfung authored Aug 31, 2023
2 parents 2d90cc6 + 9017230 commit 53c1275
Showing 18 changed files with 153 additions and 220 deletions.
30 changes: 0 additions & 30 deletions .github/workflows/hk0weather-tests.yml

This file was deleted.

55 changes: 55 additions & 0 deletions .github/workflows/hk0weather.yml
@@ -0,0 +1,55 @@
name: hk0weather
on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  schedule:
    - cron: '5 19 1 * *'
jobs:
  hk0weather-tests:
    runs-on: ubuntu-latest
    env:
      PYTHON: '3.9'
    steps:
      - uses: actions/checkout@master
      - name: Setup python
        uses: actions/setup-python@master
        with:
          # quoted so YAML does not read the version as the float 3.1
          python-version: '3.10'
      - name: Install required python packages
        run: |
          python -m pip install --upgrade pip
          pip install flake8 coverage
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
      - name: List available scrapers
        run: |
          coverage run -m scrapy list
      - name: Test a scraper of regional weather
        run: |
          coverage run -m scrapy crawl regional -o regional.csv
        working-directory: hk0weather
      - name: Test a scraper of daily weather forecast
        run: |
          coverage run -m scrapy crawl hkoforecast -o hkoforecast.csv
        working-directory: hk0weather
      - name: Test a scraper of 9-day weather forecast
        run: |
          coverage run -m scrapy crawl hko9dayforecast -o hko9dayforecast.csv
        working-directory: hk0weather
      - name: Generate coverage json report
        run: |
          coverage json
        working-directory: hk0weather
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        env:
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
        with:
          directory: hk0weather
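
The crawl steps in this workflow write CSV feeds, but a crawl can finish without an error while yielding no items, for example when the source page layout changes. A follow-up check along these lines could catch that case. This is only a sketch and not part of the commit: the function names `count_rows` and `empty_feeds` are hypothetical, and it assumes the feeds are the CSV files named in the steps above.

```python
import csv
from pathlib import Path


def count_rows(path):
    """Count data rows (excluding the header) in a CSV feed file.

    A missing file is treated as an empty feed.
    """
    feed = Path(path)
    if not feed.is_file():
        return 0
    with feed.open(newline="", encoding="utf-8") as fh:
        return sum(1 for _ in csv.DictReader(fh))


def empty_feeds(paths):
    """Return the feed paths that contain no data rows, in input order."""
    return [p for p in paths if count_rows(p) == 0]
```

Calling `empty_feeds(["regional.csv", "hkoforecast.csv", "hko9dayforecast.csv"])` from the `hk0weather` working directory and failing the job when the result is non-empty would turn a silent empty crawl into a visible failure.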
10 changes: 0 additions & 10 deletions .travis.yml

This file was deleted.

74 changes: 13 additions & 61 deletions README.md
@@ -8,96 +8,48 @@ hk0weather is an open source web scraper project using Scrapy to collect the use

Scrapy can output the collected weather data in machine-readable formats (e.g. CSV, JSON, XML).

Optionally, this project supports a Django app 'openweather' to store the collected weather data to Django web framework, and the data can be shown on web through the Django admin UI.

Available Spiders
Available Web Crawlers
---
1. **regional**: Hong Kong regional weather data from HKO, updated every 10 minutes.
1. **rainfall**: Hong Kong rainfall data from HKO, updated hourly.
1. **hkoforecast**: Hong Kong next-24-hour weather forecast report from HKO Open Data.
1. **hko9dayforecast**: Hong Kong 9-day weather report from HKO Open Data.

Installation Example
Installation
---

1) Clone and set up hk0weather in a Py3 virtual environment
Clone and set up hk0weather in a Py3 virtual environment

```
git clone https://github.com/sammyfung/hk0weather.git
virtualenv hk0weatherenv
source hk0weatherenv/bin/activate
cd hk0weather
pip install -r requirements.txt
$ git clone https://github.com/sammyfung/hk0weather.git
$ cd hk0weather
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
```

2) Optional: Setup hk0weather to use openweather

```
pip install -r requirements-django.txt
cd ..
django-admin startproject yourweatherproject
cd yourweatherproject
git clone https://github.com/sammyfung/openweather.git
```

Please add 'openweather' to INSTALLED_APPS in Django yourweatherproject/settings.py.

```
./manage.py makemigrations
./manage.py migrate
./manage.py createsuperuser
./manage.py runserver &
cd ../hk0weather
```

The Django daemon is now running in the background; its web admin UI can be accessed at [http://localhost:8000/admin](http://localhost:8000/admin).

```
export PYTHONPATH=/your-full-path-to/yourweatherproject
export DJANGO_SETTINGS_MODULE=yourweatherproject.settings
```

Please export PYTHONPATH and DJANGO_SETTINGS_MODULE again after every activation of the Py3 virtual environment.

Run a Scrapy spider
---

Activate the Py3 virtual environment before running the web spiders for the first time.

```
source hk0weatherenv/bin/activate
$ source venv/bin/activate
$ cd hk0weather
```

Optionally, if Django is in use, export PYTHONPATH and DJANGO_SETTINGS_MODULE.

```
export PYTHONPATH=/your-full-path-to/yourweatherproject
export DJANGO_SETTINGS_MODULE=yourweatherproject.settings
```
Optionally, list all available spiders.

```
scrapy list
$ scrapy list
```

Run a specific spider (eg. regional) in Scrapy
Run a regional weather data web crawler and export data to a JSON file.

```
scrapy crawl regional
$ scrapy crawl regional -o regional.json
```

and optionally use -t (file format) and -o (filename) to output the data in a json file.

```
scrapy crawl regional -t json -o test.json
```
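
A feed written with `-o regional.json` is a JSON array with one object per scraped item, so it can be read back with the standard library. The sketch below is illustrative only: the helper names `load_items` and `warmest_station` are hypothetical, and it assumes the `temperture` field (spelled as in `RegionalItem`) holds a number when present.

```python
import json


def load_items(path):
    """Load a Scrapy JSON feed (a JSON array of item objects)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def warmest_station(items):
    """Return the item with the highest 'temperture', or None if there is none.

    Items without a temperature reading are skipped.
    """
    with_temp = [item for item in items if item.get("temperture") is not None]
    return max(with_temp, key=lambda item: item["temperture"], default=None)
```

For example, `warmest_station(load_items("regional.json"))` returns the station record with the highest reading, from which `ename` or `cname` can be printed.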

## Sponsors

Calvin Tsang.

Thanks to my sponsors. Please consider [sponsoring](https://github.com/sponsors/sammyfung) my work.

References
--

File renamed without changes.
4 changes: 4 additions & 0 deletions hk0weather/hko.py → hk0weather/hk0weather/hko.py
@@ -49,6 +49,8 @@ class hko:
        (u'se','Kai Tak'),
        (u'cp1','Central'),
        (u'swh','Sai Wan Ho'),
        (u'cwb', 'Clear Water Bay'),
        (u'tls', 'Tai Lung'),
    ]

    cnameid = [
@@ -98,6 +100,8 @@ class hko:
        (u'啟德','se'),
        (u'中環','cp1'),
        (u'西灣河','swh'),
        (u'清水灣', 'cwb'),
        (u'大隴', 'tls'),
    ]

    def getename(self, id):
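
The two hunks above add the stations `cwb` and `tls` to both lookup lists: one pairs a station id with its English name, the other pairs a Chinese name with the station id. The sketch below shows how such pair lists can be turned into dictionaries for lookups in either direction. It is not the code in `hko.py`: the constant and function names are hypothetical, and only the five entries visible in this diff are included.

```python
# (station id, English name) pairs, as visible in the first hunk
ID_ENAME = [
    ("se", "Kai Tak"),
    ("cp1", "Central"),
    ("swh", "Sai Wan Ho"),
    ("cwb", "Clear Water Bay"),
    ("tls", "Tai Lung"),
]

# (Chinese name, station id) pairs, as visible in the second hunk
CNAME_ID = [
    ("啟德", "se"),
    ("中環", "cp1"),
    ("西灣河", "swh"),
    ("清水灣", "cwb"),
    ("大隴", "tls"),
]

ENAME_BY_ID = dict(ID_ENAME)                          # id -> English name
ID_BY_CNAME = dict(CNAME_ID)                          # Chinese name -> id
CNAME_BY_ID = {sid: cname for cname, sid in CNAME_ID}  # id -> Chinese name


def describe(station_id):
    """Return (English name, Chinese name) for a station id, None where unknown."""
    return ENAME_BY_ID.get(station_id), CNAME_BY_ID.get(station_id)
```

Keeping both lists in step matters here: a station added to only one of them would resolve in one language and come back as `None` in the other.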
72 changes: 72 additions & 0 deletions hk0weather/hk0weather/items.py
@@ -0,0 +1,72 @@
# -*- coding: utf-8 -*-
import scrapy

class RegionalItem(scrapy.Item):
    scraptime = scrapy.Field()
    reptime = scrapy.Field()
    station = scrapy.Field()
    ename = scrapy.Field()
    cname = scrapy.Field()
    temperture = scrapy.Field()
    humidity = scrapy.Field()
    temperturemax = scrapy.Field()
    temperturemin = scrapy.Field()
    winddirection = scrapy.Field()
    windspeed = scrapy.Field()
    maxgust = scrapy.Field()
    pressure = scrapy.Field()

class RainfallItem(scrapy.Item):
    scraptime = scrapy.Field()
    reptime = scrapy.Field()
    ename = scrapy.Field()
    cname = scrapy.Field()
    rainfallmin = scrapy.Field()
    rainfallmax = scrapy.Field()

class Hk0WeatherItem(scrapy.Item):
    time = scrapy.Field()
    station = scrapy.Field()
    ename = scrapy.Field()
    cname = scrapy.Field()
    temperture = scrapy.Field()
    humidity = scrapy.Field()

class Hk0TropicalItem(scrapy.Item):
    time = scrapy.Field()
    postime = scrapy.Field()
    x = scrapy.Field()
    y = scrapy.Field()
    category = scrapy.Field()
    windspeed = scrapy.Field()
    tctype = scrapy.Field()


class ForecastItem(scrapy.Item):
    update_time = scrapy.Field()
    date = scrapy.Field()
    general_en = scrapy.Field()
    general_hk = scrapy.Field()
    description_en = scrapy.Field()
    description_hk = scrapy.Field()
    wind_en = scrapy.Field()
    wind_hk = scrapy.Field()
    max_temp = scrapy.Field()
    min_temp = scrapy.Field()
    max_rh = scrapy.Field()
    min_rh = scrapy.Field()
    icon = scrapy.Field()


class ShortForecastItem(scrapy.Item):
    scrape_time = scrapy.Field()
    update_time = scrapy.Field()
    general_en = scrapy.Field()
    general_hk = scrapy.Field()
    period_en = scrapy.Field()
    period_hk = scrapy.Field()
    forecast_en = scrapy.Field()
    forecast_hk = scrapy.Field()
    outlook_en = scrapy.Field()
    outlook_hk = scrapy.Field()
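
A `scrapy.Item` only accepts the fields it declares and raises `KeyError` when an undeclared field is assigned, so the exact spellings above (including `temperture`) are what spiders and downstream consumers have to use. The sketch below is a plain-Python stand-in, not Scrapy itself: it copies the field names of `RegionalItem` into a set, and the helper name `undeclared_fields` is hypothetical. It reports which keys of a record would be rejected.

```python
# Field names copied from RegionalItem in items.py (spelling kept as declared).
REGIONAL_FIELDS = frozenset({
    "scraptime", "reptime", "station", "ename", "cname",
    "temperture", "humidity", "temperturemax", "temperturemin",
    "winddirection", "windspeed", "maxgust", "pressure",
})


def undeclared_fields(record, declared=REGIONAL_FIELDS):
    """Return the sorted keys of `record` that are not declared item fields."""
    return sorted(set(record) - declared)
```

A record keyed with the conventional spelling `temperature` would be flagged by this check, which is the kind of mismatch that otherwise surfaces as a `KeyError` inside the spider.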

File renamed without changes.
13 changes: 0 additions & 13 deletions hk0weather/settings.py → hk0weather/hk0weather/settings.py
@@ -87,16 +87,3 @@
#HTTPCACHE_DIR = 'httpcache'
#HTTPCACHE_IGNORE_HTTP_CODES = []
#HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'

# Initialize Django web framework for data store
# Use environment variable PYTHONPATH for abspath to Django project
# and DJANGO_SETTINGS_MODULE for Settings filename of Django project
try:
    import django
    try:
        django.setup()
    except django.core.exceptions.ImproperlyConfigured:
        pass
except ImportError:
    # Allow to work without Django
    pass
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -49,7 +49,10 @@ def parse(self, response):
if len(data) > 5:
    for j in range(0,len(data)):
        if data[j].isdigit():
            stations[laststation]['humidity'] = int(data[j])
            #try:
            stations[laststation]['humidity'] = int(data[j])
            #except:
            # print(i)
        elif laststation != '':
            try:
                if j == 1:
@@ -91,17 +94,9 @@ def parse(self, response):
pass

for key in stations:
    # __module__ and __name__
    # Scrapy Item: scrapy.item.ItemMeta
    # Scrapy DjangoItem: scrapy_djangoitem.DjangoItemMeta
    if RegionalItem.__class__.__module__ == 'scrapy_djangoitem':
        stationitem = RegionalItem()
        for key2 in stations[key]:
            stationitem[key2] = stations[key][key2]
    elif RegionalItem.__class__.__module__ == 'scrapy.item':
        stationitem = RegionalItem()
        for key2 in stations[key]:
            stationitem[key2] = stations[key][key2]
    stationitem = RegionalItem()
    for key2 in stations[key]:
        stationitem[key2] = stations[key][key2]
    stationitems.append(stationitem)

return stationitems