Where do users go for trusted Bitcoin educational resources? In the ever-changing world of Bitcoin, false information, distrust, and confusion about complex topics are common. This makes navigating the Bitcoin landscape challenging for many, especially for new users.
We believe open-source LLMs will develop exponentially faster than closed-source LLMs. In light of this, our team created Bitcoin-PAL, a bitcoin-focused AI chatbot coupled with an incentivized crowd-sourcing platform for training LLMs on bitcoin.
Using Bitcoin-PAL, bitcoiners can rely on and contribute to trusted and vetted bitcoin documentation rather than query the general internet and GPT models.
Bitcoin-PAL is a proof of concept and was built during the #Ai4ALL Hackathon presented by Bolt.fun, which ran from July 1 - August 1, 2023. The project also won the Training Track!
- Project Charter and User Stories
- Backend Documentation
- Project Board
- Figma designs
- GitHub Actions
- Demo Video
- Presentation Deck
The backend can be used as a standalone console application or as a Python Flask-based web service.
- git clone this repo
- pip3 install -r requirements.txt - install Python dependencies
Then, download the LLM model and place it in a directory of your choice:
- LLM: defaults to ggml-gpt4all-j-v1.3-groovy.bin. If you prefer a different GPT4All-J compatible model, just download it and reference it in your .env file.
- Copy the example.env template into .env (cp example.env .env) and edit the variables appropriately in the .env file:
  - MODEL_TYPE: supports LlamaCpp or GPT4All
  - PERSIST_DIRECTORY: the folder you want your vectorstore in
  - MODEL_PATH: path to your GPT4All or LlamaCpp supported LLM
  - MODEL_N_CTX: maximum token limit for the LLM model
  - MODEL_N_BATCH: number of tokens in the prompt that are fed into the model at a time. The optimal value differs a lot depending on the model (8 works well for GPT4All, and 1024 is better for LlamaCpp)
  - EMBEDDINGS_MODEL_NAME: SentenceTransformers embeddings model name (see https://www.sbert.net/docs/pretrained_models.html)
  - TARGET_SOURCE_CHUNKS: the number of chunks (sources) that will be used to answer a question
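For reference, a filled-in .env might look like the following. These values are illustrative assumptions (the model path and embeddings model name in particular depend on what you downloaded), not defaults shipped with the repo:

```
MODEL_TYPE=GPT4All
PERSIST_DIRECTORY=db
MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_N_CTX=1000
MODEL_N_BATCH=8
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
TARGET_SOURCE_CHUNKS=4
```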
Note: because of the way langchain loads the SentenceTransformers embeddings, the first time you run the script it will require an internet connection to download the embeddings model itself.
To ingest custom documentation, the source_documents folder must be populated first and ingestion must be run:
python3 ingest.py
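Because ingestion silently does nothing useful on an empty folder, it can help to check what will be picked up before running it. A minimal sketch (the folder name matches the repo convention; the helper itself is hypothetical, not part of the project):

```python
from pathlib import Path

# Folder that ingest.py reads custom documentation from.
SOURCE_DIR = Path("source_documents")
SOURCE_DIR.mkdir(exist_ok=True)  # create it if this is a fresh clone

# List the files that would be ingested.
docs = sorted(p.name for p in SOURCE_DIR.iterdir() if p.is_file())
print(f"{len(docs)} document(s) ready for ingestion: {docs}")
```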
python3 bitcoinPAL.py
- Ask your question on the command line
python3 server.py
- Ask a question by submitting a curl command with a json payload to localhost:8000
curl -X POST -H "Content-Type: application/json" -d '{"query":"What is bitcoin?"}' http://localhost:8000
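The same query can be sent from Python using only the standard library. This is a sketch assuming the Flask server from server.py is listening on localhost:8000 and accepts the JSON payload shown in the curl command above:

```python
import json
import urllib.request

# JSON payload matching the curl example.
payload = json.dumps({"query": "What is bitcoin?"}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8000",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the backend server is running:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```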
The backend server must also be running for the frontend to talk to the API for querying.
- git clone this repo
- cd to this repo directory
- cd frontend - change directory to the frontend
- npm install - installs all node modules
- npm start - starts the frontend on localhost:3000
- Navigate to http://localhost:3000 and ask your question
In addition to the libraries listed in requirements.txt
and npm packages, this project also uses:
- PrivateGPT
- Markdown Badges by Ileriayo
- GPT4All-J v1.3-groovy .bin
This software has no guarantees and was created during a hackathon. All software should be considered beta unless otherwise explicitly specified. The user assumes all risk by executing any code contained in this repository.
Bitcoin PAL is released under the terms of the MIT license. See LICENSE for more information or see https://opensource.org/licenses/MIT