result of causal_tracing_comparison #6

Open
matekrk opened this issue Nov 18, 2024 · 1 comment
Comments

@matekrk

matekrk commented Nov 18, 2024

Hi!

Thank you for providing the code for your excellent paper.

I am wondering how to process the output of the causal_tracing_comparison script. It is a fairly large .json file containing a list of lists of dicts.
I would like either to inspect it the way you did in Figure 5 (in fact, I want to reproduce that exact result), or to extract the circuit, compute parts of the network, and follow the logit similarities / mismatches through as in Figures 9 and 10.

Many thanks for your help and congratulations again!

@Boshi-Wang
Collaborator

Thanks for the interest!

In the result, the first dimension indexes the model checkpoints and the second indexes the examples. Each element is a dictionary containing the MRRs and causal strengths for different layers and positions, etc.

I added visualization.ipynb which contains the code for processing the results and plotting the figures, if it helps.
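
For a first look at the file before opening the notebook, here is a minimal sketch that relies only on the layout described above (outer list over checkpoints, inner list over examples, one dict per example). It is not taken from visualization.ipynb. The helper names `load_results`, `summarize`, and `mean_per_checkpoint` are made up for illustration, and so are the keys `"mrr"` and `"strength"` in the toy data; the real key names should be read off the output of `summarize` on the actual file.

```python
import json
from numbers import Number


def load_results(path):
    """Load a JSON file holding a list (checkpoints) of lists (examples) of dicts."""
    with open(path) as f:
        return json.load(f)


def summarize(results):
    """Return (num_checkpoints, examples_per_checkpoint, sorted top-level keys)."""
    examples_per_checkpoint = [len(examples) for examples in results]
    keys = set()
    for examples in results:
        for record in examples:
            keys.update(record.keys())
    return len(results), examples_per_checkpoint, sorted(keys)


def mean_per_checkpoint(results, key):
    """Average one top-level numeric field over the examples of each checkpoint.

    Records where the field is missing or non-numeric (e.g. a nested
    per-layer dict) are skipped; a checkpoint with no usable values gives None.
    """
    means = []
    for examples in results:
        values = [r[key] for r in examples if isinstance(r.get(key), Number)]
        means.append(sum(values) / len(values) if values else None)
    return means


# Toy stand-in for the real file: 2 checkpoints x 2 examples.
# The keys "mrr" and "strength" are hypothetical placeholders.
toy = [
    [{"mrr": 0.5, "strength": 0.1}, {"mrr": 1.0, "strength": 0.3}],
    [{"mrr": 1.0, "strength": 0.4}, {"mrr": 1.0, "strength": 0.6}],
]

print(summarize(toy))
print(mean_per_checkpoint(toy, "mrr"))
```

On the real output, `summarize(load_results(path))` shows how many checkpoints and examples the file holds and which fields each dictionary carries. If a field turns out to be nested per layer or position, `mean_per_checkpoint` will skip it, and the averaging would need to descend one more level.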
