MiniWoB++

Usage

pip install -r requirement.txt
python main.py [cudas] [test-amount] [model-path] [result-path]

Parameter Description

Parameter	Format	Mandatory	Use
cudas	0,1,2	Yes	IDs of the GPUs to use, separated by commas with no spaces
test-amount	10	Yes	Number of test cases per task. The paper uses 100, but 10 is generally a more practical choice for efficiency
model-path	model_path/	Yes	Path to the model to be tested. If set to 'manual', the tasks can be executed manually
result-path	result/	Yes	Location for the model's output. Tasks already completed under the same path are not executed again
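
The four positional arguments can be assembled programmatically when launching several runs. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, and it assumes only the argument order shown in the Usage section.

```python
def build_command(gpu_ids, test_amount, model_path, result_path):
    """Build the argument list for main.py (hypothetical helper).

    Assumes the positional order from the Usage section:
    cudas, test-amount, model-path, result-path.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU ID is required")
    if int(test_amount) <= 0:
        raise ValueError("test-amount must be a positive integer")
    # cudas must be comma-separated with no spaces, e.g. "0,1,2"
    cudas = ",".join(str(int(gpu)) for gpu in gpu_ids)
    return ["python", "main.py", cudas, str(int(test_amount)), model_path, result_path]


command = build_command([0, 1, 2], 10, "model_path/", "result/")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` from the standard library if the run should be started from Python.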

Results

After running the above command, a log_files folder should appear in the current directory. The .log files inside contain the run results. Each time a test case finishes, you should see output like the following, where result is the test case score, which can be 0 or 1:

2023-11-30 06:28:13,283 - INFO - {"task": "click-button", "case_id": 10, "result": 1.0}

When all test cases for a task have been run, the following record with the task's average score is written to the log:

2023-11-30 07:10:13,593 - INFO - {"task": "grid-coordinate", "avg_score": 0.3}
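
The per-case records can also be aggregated outside the tool, for example to check an avg_score value. The sketch below assumes every record follows the layout shown above (a timestamp, ` - INFO - `, then a JSON object); `parse_record` and `average_scores` are hypothetical names, not functions from the repository.

```python
import json
from collections import defaultdict


def parse_record(line):
    """Return the JSON payload of a log line, or None if it has none."""
    _, sep, payload = line.partition(" - INFO - ")
    if not sep or not payload.lstrip().startswith("{"):
        return None
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None


def average_scores(lines):
    """Average the per-case "result" values for each task."""
    scores = defaultdict(list)
    for line in lines:
        record = parse_record(line)
        # Only per-case records carry a "result" field.
        if record is not None and "result" in record:
            scores[record["task"]].append(record["result"])
    return {task: sum(values) / len(values) for task, values in scores.items()}


# Illustrative lines in the format shown above (the second case is made up).
sample = [
    '2023-11-30 06:28:13,283 - INFO - {"task": "click-button", "case_id": 10, "result": 1.0}',
    '2023-11-30 06:28:15,101 - INFO - {"task": "click-button", "case_id": 11, "result": 0.0}',
    '2023-11-30 07:10:13,836 - INFO - ------',
]
print(average_scores(sample))
```

Lines without a JSON payload, such as the separator line, are skipped rather than treated as errors.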

When all tasks in a process are completed, the log will record the following information:

2023-11-30 07:10:13,836 - INFO - ------
2023-11-30 07:10:13,836 - INFO - click-button-sequence            1.00
2023-11-30 07:10:13,836 - INFO - click-checkboxes                 0.62
2023-11-30 07:10:13,837 - INFO - click-checkboxes-large           0.07
2023-11-30 07:10:13,837 - INFO - click-color                      0.24
... (50 lines omitted)
2023-11-30 07:10:13,839 - INFO - enter-date                       1.00
2023-11-30 07:10:13,839 - INFO - grid-coordinate                  0.30
2023-11-30 07:10:13,839 - INFO - all                             0.442
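
To compare runs, the summary table can be read back into a dictionary. This is a minimal sketch that assumes each summary line has the form shown above (a task name followed by a score after ` - INFO - `); `parse_summary` is a hypothetical helper, not part of the repository.

```python
def parse_summary(lines):
    """Map each task name in the summary table to its score."""
    summary = {}
    for line in lines:
        _, sep, message = line.partition(" - INFO - ")
        parts = message.split()
        # Summary rows have exactly two fields: task name and score.
        if not sep or len(parts) != 2:
            continue
        name, value = parts
        try:
            summary[name] = float(value)
        except ValueError:
            continue
    return summary


# Lines copied from the example log above.
sample_summary = [
    "2023-11-30 07:10:13,836 - INFO - ------",
    "2023-11-30 07:10:13,836 - INFO - click-checkboxes                 0.62",
    "2023-11-30 07:10:13,839 - INFO - grid-coordinate                  0.30",
    "2023-11-30 07:10:13,839 - INFO - all                             0.442",
]
print(parse_summary(sample_summary))
```

The `all` row is kept as an ordinary key, so the overall score is available as `parse_summary(...)["all"]`.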
