Community Examples

This directory contains examples that explore the frontiers of Representation Engineering (RepE). Some of the examples were originally provided by the authors, and we welcome community contributions. If you'd like to contribute, please open a PR, and we will review and merge it promptly.

| Example | Description | Code Example | Author |
| --- | --- | --- | --- |
| Honesty | Monitoring and controlling the honesty of a model, using RepE techniques for lie detection, hallucination detection, etc. | honesty | - |
| Emotions | Controlling primary emotions in LLMs, illustrating the profound impact of emotions on model behavior. | primary_emotions | - |
| Fairness | Reducing bias and increasing fairness in model generations. | fairness | - |
| Harmless | Jailbreaking an aligned model by controlling its representation of harmlessness. | harmless_harmful | - |
| Memorization | Preventing memorized outputs during generation. | memorization | - |