This repository has been archived by the owner on Aug 24, 2021. It is now read-only.

REST API for accessing Unicamp's (University of Campinas) data. All data is obtained by web scraping 🕸️


vpalmerini/unicamp-api


UNICAMP API

Description

This project intends to serve as a REST API for accessing data from Unicamp (University of Campinas). For now, all data is obtained by web scraping, and the main source of data is the university's website/system.

This first version has the following types of data:

  • Institutes
  • Courses
  • Subjects
  • Classes
  • Professors
  • Students

Project

The project is built on top of Django and has the following folder structure:

  • api/ - this is the project's main folder (created by Django when the project is created)
  • requirements.txt - list of dependencies needed to run the application
  • .env/ - folder with environment variables to be defined (more on that later)
  • .vscode/ - folder that contains tasks.json which has some VS Code tasks to be run
  • {apps}/ - all other folders are Django apps. Each entity acts as an app and it holds its own configuration
  • Dockerfile - file that defines how the API's Docker image should be built
  • docker-compose.yml - file that defines our docker services (in this case Django and Postgres) and allows us to start and stop these services in a very convenient way

Motivation

The motivation is to have a resource where people (mainly Unicamp students) can easily get data about Unicamp. It could be useful both for academic purposes and for applications in general (like bots and web apps). The idea is also to create a project where anybody can contribute more data or new requests.

This first version gets all data by web scraping, but it is very likely that other data sources will appear in the near future.

Tech Stack

If you want to contribute in some way, it is strongly recommended that you have at least a basic knowledge of the main technologies used in this project. Those are:

Don't worry if you have never worked with Django. It has a very simple and nice API.

Most of the data is collected by web scraping using:

And the relational database is managed by PostgreSQL.

Running Locally

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Running

  1. Clone the repository
# https
$ git clone https://github.com/vpalmerini/unicamp-api.git

# ssh
$ git clone git@github.com:vpalmerini/unicamp-api.git
  2. Initialize a virtual environment
# initialize environment
$ pipenv install

# start environment
$ pipenv shell

You can also create it by using venv

  3. Install all dependencies in the environment

You may need to install some system packages required by psycopg:

# debian
$ sudo apt-get install libpq-dev python3-dev

# redhat/fedora
$ sudo dnf install postgresql-libs postgresql-devel python-devel

Then:

$ pip install -r requirements.txt
  4. Now it's time to scrape the data. For that, run:
$ ./scrapy.sh

This assumes that ChromeDriver is located at /usr/local/lib/chromedriver
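Since the script depends on that path, a quick check before starting a run that takes hours can save some time. The snippet below is only a sketch: the function name `chromedriver_ready` is made up for illustration and is not part of this repository.

```python
import os
from pathlib import Path

# Location the instructions above assume; change it if your driver lives elsewhere.
CHROMEDRIVER_PATH = "/usr/local/lib/chromedriver"


def chromedriver_ready(path=CHROMEDRIVER_PATH):
    """Return True if path is an existing file the current user may execute."""
    candidate = Path(path)
    return candidate.is_file() and os.access(candidate, os.X_OK)
```

Calling `chromedriver_ready()` before `./scrapy.sh` tells you whether the driver is where the script expects it.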

This will scrape all the data that was defined: institutes, courses, subjects, classes, students and professors. If that's really what you want, be aware that it will take a while... I'm talking about hours ☕☕☕☕☕
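Once the script finishes, each entity's folder should contain a .json file with the scraped data. To see what was collected before loading it into the database, something like the sketch below may help. It assumes the files sit one level below the repository root and hold plain JSON; the function name `summarize_scraped` is hypothetical, and the actual file names and structure are not documented here.

```python
import json
from pathlib import Path


def summarize_scraped(root="."):
    """Map each <folder>/<file>.json under root to its top-level item count."""
    base = Path(root)
    summary = {}
    for path in sorted(base.glob("*/*.json")):
        # Skip hidden folders such as .vscode, whose JSON files are not scraped data.
        if path.parent.name.startswith("."):
            continue
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Lists and dicts have a length; any other JSON value counts as one item.
        count = len(data) if isinstance(data, (list, dict)) else 1
        summary[path.relative_to(base).as_posix()] = count
    return summary
```

Running `summarize_scraped()` from the repository root returns a dictionary keyed by relative file path, which makes an empty or missing entity easy to spot.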

  5. Create a folder named .env with the following files:
# django.env
SUPERUSER_NAME={superuser}
SUPERUSER_EMAIL={superuser-email}
SUPERUSER_PASSWORD={superuser-password}
DJANGO_SETTINGS_MODULE=api.settings
# postgres.env
POSTGRES_SERVER=postgres
POSTGRES_USER={user}
POSTGRES_PASSWORD={password}
POSTGRES_DB={database-name}

Replace the {} placeholders with your own values.
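A forgotten placeholder is easy to miss, so a small check can be handy before starting the containers. The following is a minimal sketch, assuming the files use plain KEY=VALUE lines as shown above; the helper names `parse_env` and `unfilled_keys` are illustrative and not part of the project.

```python
import re

# Matches a value that is still a template placeholder such as {superuser}.
PLACEHOLDER = re.compile(r"^\{[^{}]*\}$")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unfilled_keys(text):
    """Return the keys whose value is still a {placeholder}, in file order."""
    return [key for key, value in parse_env(text).items() if PLACEHOLDER.match(value)]
```

For example, passing the contents of .env/django.env to `unfilled_keys` returns an empty list once every field has been filled in.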

  6. The scrapy.sh script creates a .json file with the scraped data for each entity in its respective folder. After that, it's time to populate the database and start the server (finally!):
$ docker-compose up -d

To access Django's admin panel, go to localhost:8000/admin and log in with the superuser credentials defined in .env/django.env.

The API will be available at the localhost:8000/api/v1/ base route. More detailed documentation about the endpoints will be released soon.
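Until that documentation exists, here is a rough sketch of how a client might call the API using only Python's standard library. The resource name `institutes` and the trailing-slash convention are assumptions based on the entity list above, not confirmed endpoints, and `build_url` and `fetch_json` are hypothetical helpers.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Base route from the instructions above.
BASE_URL = "http://localhost:8000/api/v1/"


def build_url(resource, base=BASE_URL):
    """Join a resource name onto the base route, ending with a single slash."""
    return urljoin(base, resource.strip("/") + "/")


def fetch_json(resource, base=BASE_URL, timeout=10):
    """GET a resource and decode the JSON body (needs the server to be running)."""
    with urlopen(build_url(resource, base), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers up, `fetch_json("institutes")` would request localhost:8000/api/v1/institutes/, provided such an endpoint exists.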

License

This project is licensed under the MIT License
