add execution section in README.md
removed TODO as duplicate to paper
lpmeyer authored Nov 13, 2024
1 parent c75b37d commit 5a92873
Showing 1 changed file with 11 additions and 6 deletions.
17 changes: 11 additions & 6 deletions README.md
@@ -15,11 +15,16 @@ Right now, there are two lists of models to choose from (feel free to customize)
The first one contains models with fewer than ten billion parameters plus Microsoft's `Phi-3-Medium`.
The second one contains models with up to 34 billion parameters, as this was the largest size that our hardware could handle.

## TODO
## Execution

Requirements:
* Python
* `transformers`
* a GPU capable of running the selected models

1. Edit `config.yaml` to your liking.
2. Run `python pipeline.py`.

The script creates a folder for each pipeline step and writes that step's results into it.
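
Before launching the pipeline, it can help to confirm that the requirements listed above are in place. The following is a minimal sketch and not part of the repository: `check_requirements`, `gpu_tool_available`, and `main` are hypothetical helper names, including `torch` in the package list is an assumption (it is a common backend for `transformers`), and looking for `nvidia-smi` on the PATH is only a heuristic for an NVIDIA GPU.

```python
import importlib.util
import shutil
import subprocess
import sys


def check_requirements(packages=("transformers", "torch")):
    """Map each top-level package name to True if it can be imported.

    The default package list is an assumption, not taken from the repository.
    """
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def gpu_tool_available():
    """Heuristic only: True if the `nvidia-smi` executable is on the PATH."""
    return shutil.which("nvidia-smi") is not None


def main():
    """Check the requirements, then run step 2 (`python pipeline.py`)."""
    missing = [name for name, ok in check_requirements().items() if not ok]
    if missing:
        sys.exit(f"Missing packages: {', '.join(missing)}")
    if not gpu_tool_available():
        print("Warning: nvidia-smi not found; a capable GPU may be unavailable.")
    # Use the current interpreter so the same environment runs the pipeline.
    subprocess.run([sys.executable, "pipeline.py"], check=True)
```

Calling `main()` from the repository root would stop early with a list of missing packages, or otherwise hand over to `pipeline.py`.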

- Find GPU clusters that support models with more than 34B parameters
- Include more models and create a curated list of well-performing models
- Use the output of the pipeline to fine-tune a much smaller model and evaluate it
- Add a method to handle very large knowledge graphs (e.g. subsampling and splitting into multiple chunks that fit the context size)
- Add a GUI for model selection and progress monitoring (low priority)
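
The knowledge-graph item in the list above (splitting a large graph into chunks that fit the context size) could be approached roughly as follows. This is a sketch under stated assumptions, not the project's method: it assumes the graph is available as a list of serialized triples, uses a whitespace word count as a crude stand-in for a real tokenizer, and the names `count_words` and `chunk_triples` are hypothetical.

```python
def count_words(text):
    """Crude token estimate: number of whitespace-separated words."""
    return len(text.split())


def chunk_triples(triples, max_tokens, count_tokens=count_words):
    """Greedily pack serialized triples into chunks of at most max_tokens.

    `count_tokens` is a placeholder; a real tokenizer's length function
    could be passed in instead.
    """
    chunks, current, used = [], [], 0
    for triple in triples:
        cost = count_tokens(triple)
        if cost > max_tokens:
            raise ValueError(f"single triple exceeds the budget: {triple!r}")
        # Start a new chunk when adding this triple would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(triple)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be sent to the model as a separate prompt. Subsampling, the other option named in the item, would be a separate step applied before chunking.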
