This is a Jupyter Notebook that demonstrates a variety of data engineering and analysis tasks one can tackle with LangChain. It walks through using an LLM (via OpenAI) to write and execute SQL queries, and then passing the results of those queries to Python for data visualization. It uses publicly available voter files.
This Notebook was written with LangChain version 0.0.128 using the text-davinci-003 model from OpenAI.
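The workflow described above (SQL is generated, executed, and the result is handed to Python for plotting) can be sketched roughly as follows. This is not the notebook's code: it uses only the standard library's sqlite3 module, the `voters` table and its rows are made up for illustration, and the SQL string is hard-coded where the notebook would get it from the LLM. The helper name `run_generated_sql` is also hypothetical.

```python
import sqlite3


def run_generated_sql(conn, sql):
    """Execute a SQL string and return (column_names, rows)."""
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


# Hypothetical in-memory table with made-up rows; the real notebook
# works against the public voter files instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voters (county TEXT, party TEXT)")
conn.executemany(
    "INSERT INTO voters VALUES (?, ?)",
    [("WAKE", "DEM"), ("WAKE", "REP"), ("WAKE", "DEM"), ("DURHAM", "UNA")],
)

# Assumption: in the notebook this string would be produced by the LLM.
generated_sql = (
    "SELECT party, COUNT(*) AS n FROM voters GROUP BY party ORDER BY party"
)

columns, rows = run_generated_sql(conn, generated_sql)

# Reshape the rows into a label -> value mapping that a plotting
# library can consume directly.
counts = dict(rows)
```

From here, `counts` could be handed to a plotting library, for example as the labels and heights of a bar chart.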
Before you can run the Notebook, copy .env.example to .env (which is ignored by git) and fill in the OPENAI_API_KEY environment variable.
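As a quick way to confirm the .env file is set up correctly, here is a minimal sketch that reads simple KEY=VALUE lines into the process environment and checks for OPENAI_API_KEY. It is a standard-library stand-in, and the function names `load_env_file` and `require_api_key` are hypothetical. The Notebook itself may load the file differently (for example with a dotenv package), so treat this as an assumption rather than the project's actual loading code.

```python
import os


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from `path` and return them as a dict.

    Values are also copied into os.environ unless the variable is
    already set there. Blank lines and '#' comments are skipped.
    """
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            # Drop optional surrounding quotes around the value.
            value = value.strip().strip('"').strip("'")
            loaded[key] = value
            os.environ.setdefault(key, value)
    return loaded


def require_api_key():
    """Return OPENAI_API_KEY, or raise if it is missing or empty."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; fill it in .env")
    return key
```

Calling `load_env_file()` followed by `require_api_key()` at the top of a session fails early with a clear message if the key was never filled in.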
Also, be sure to install the requirements:
pip install -r requirements.txt
I wrote and tested this Notebook in Python 3.10.
All code is provided under the BSD 3-Clause license.
The data used by the code in this repository originated from the North Carolina State Board of Elections, with the following license:
/* *******************************************************************************
* name: ReadMe_PUBLIC_DATA.txt
* purpose: Notification to the public, media, and interested parties.
* The data and documents contained within this publicly accessible site
* and all subforders herein provided by the NC State Board of Elections
* are considered public information per NC General Statutes.
* URL: https://dl.ncsbe.gov/list.html
* updated: 09/16/2020
******************************************************************************* */
Citations:
§ 132-1. Public Records.
https://www.ncleg.gov/EnactedLegislation/Statutes/PDF/BySection/Chapter_132/GS_132-1.pdf
§ 163-82.10. Official record of voter registration.
https://www.ncleg.gov/EnactedLegislation/Statutes/PDF/BySection/Chapter_163/GS_163-82.10.pdf
This project is maintained by @MattHodges.
Please use it for good, not evil.