
modify compare.py to consider patches #24

Open · wants to merge 1 commit into master

Conversation

nikomatsakis (Contributor)

This is not smart enough to track the time for each incremental change separately. I didn't do that because it would have been harder. =) But also because I think it'd be better to retool so that both `process.sh` and `compare.py` are using the same code (i.e., so we are measuring the same thing that the server is).

cc @nnethercote

nnethercote (Contributor)

I tried the patch. I get this error on `regex-0.1.80`:

Traceback (most recent call last):
  File "./compare.py", line 82, in <module>
    times2 = run_test(dir, rustc2)
  File "./compare.py", line 41, in run_test
    make(patches, make_env)
  File "./compare.py", line 25, in make
    subprocess.check_call('make all%s > /dev/null 2>&1' % patch, env=make_env, shell=True)
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'make all@030-compile_one > /dev/null 2>&1' returned non-zero exit status 2

TBH, I don't like the new "patches" design. Previously things were simple: for every benchmark you could do `make clean; make; make touch; make` and measure the final `make`. I have local scripts for running the benchmarks under Cachegrind and DHAT that depend on this; they are now also broken for syntex. I think it's better if each benchmark just has one measurement coming out of it.
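For reference, that old per-benchmark workflow amounts to something like the sketch below. The function name and timing logic are illustrative only, not the project's actual code; it just runs the four `make` steps and times the last one.

```python
import subprocess
import time

def measure_benchmark(bench_dir, make_env):
    """Time the final `make` after `make clean; make; make touch`."""
    def make(*target):
        subprocess.check_call(['make'] + list(target),
                              cwd=bench_dir, env=make_env)

    make('clean')   # start from a pristine tree
    make()          # full build
    make('touch')   # touch the sources so the next build redoes the work
    start = time.time()
    make()          # the run that is actually measured
    return time.time() - start
```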

nikomatsakis (Contributor, Author)

@nnethercote

> I get this error on `regex-0.1.80`.

Huh, I thought it was working. I'll check it out.

> Previously things were simple: for every benchmark you could do `make clean; make; make touch; make` and measure the final `make`.

Yes, but that model did not gracefully handle the needs of incremental compilation, nor model the cumulative effects of stepping through a series of patches. In any case, the end model is still quite simple. Each test still has only one measurement. What has changed is that there is no longer one test per directory.
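For concreteness, here is a rough sketch of what stepping through the patches could look like, inferred from the `make all@030-compile_one` target visible in the traceback above. The `patches/` directory layout, helper names, and timing logic are illustrative assumptions, not the actual `compare.py` code:

```python
import glob
import os
import subprocess
import time

def run_patched_tests(bench_dir, make_env):
    """Apply each patch in sequence, timing one build per patch."""
    # assumed layout: patches/010-....patch, patches/030-compile_one.patch, ...
    patch_files = sorted(glob.glob(os.path.join(bench_dir, 'patches', '*.patch')))
    subprocess.check_call(['make', 'clean'], cwd=bench_dir, env=make_env)
    subprocess.check_call(['make', 'all'], cwd=bench_dir, env=make_env)
    times = {}
    for patch_file in patch_files:
        name = os.path.splitext(os.path.basename(patch_file))[0]
        start = time.time()
        # the all@<patch> target presumably applies the patch and rebuilds
        subprocess.check_call(['make', 'all@%s' % name],
                              cwd=bench_dir, env=make_env)
        times[name] = time.time() - start
    return times
```

Each patch still yields exactly one measurement; there are simply several measurements per benchmark directory.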

Regarding the Python script, I think I'll investigate rewriting `compare.py` to just be a front end for `process.sh`. My feeling is that we ought to run `process.sh` for the tests and save the results into some temporary directory, and then compare those. That way there is only one way to gather data, and any changes will be automatically propagated. (It also avoids measuring things like the time it takes for `make` to get started, though I don't think our compilation is fast enough for that to matter -- but it might matter if we start adding runtime benchmarks that report the results of running `cargo bench`, which I want to do.)
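A hedged sketch of that front-end idea follows; how `process.sh` is invoked and where it writes its results are assumptions made for illustration, not the real script's interface:

```python
import os
import subprocess
import tempfile

def gather(rustc, out_dir):
    # Assumed interface: process.sh picks up the compiler and output
    # directory from the environment; the real script may differ.
    env = dict(os.environ, RUSTC=rustc, OUT_DIR=out_dir)
    subprocess.check_call(['sh', 'process.sh'], env=env)

def compare(rustc1, rustc2):
    """Gather results for both compilers, then diff the saved timings."""
    dir1 = tempfile.mkdtemp(prefix='compare-a-')
    dir2 = tempfile.mkdtemp(prefix='compare-b-')
    gather(rustc1, dir1)
    gather(rustc2, dir2)
    # ... load the per-test timings from dir1 and dir2 and report deltas;
    # process.sh is then the only code path that gathers data ...
```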
