Reproducing Agentless-1.5 Results on SWE-bench Lite #39

GCVulnerability · 2024-11-12T09:25:20Z

Thanks for improving Agentless. However, I can't reproduce the performance reported in the technical report using the code you provided.
After generating all the files in the 'repair_sample_1' through 'repair_sample_4' folders, I could not generate the 'all_preds.jsonl' file from the 40 samples while they were split across the 4 folders. So I merged the folders and renamed the output sample files to run from 'output_0_normalized.jsonl' to 'output_39_normalized.jsonl'. After merging, I ran 'rerank.py' and generated 'all_preds.jsonl'.

Using the gpt-4o-08-06 model and following the instructions in 'readme_swebench.md', I only got a 26% pass rate (78/300) on SWE-bench Lite. Moreover, even when I use all the intermediate results you provided in release 1.5 and only run 'rerank.py', I still only reach a pass rate of 29.67% (89/300).

Is my use of the 40 samples across 4 folders incorrect? And how can I reach the 32% pass rate that you submitted to SWE-bench using your intermediate results?

brutalsavage · 2024-11-12T16:34:23Z

Hi @GCVulnerability

You should not merge the output sample files together. Instead, pass all four folders to the rerank script like this:

python agentless/repair/rerank.py --patch_folder results/swe-bench-lite/repair_sample_1/,results/swe-bench-lite/repair_sample_2/,results/swe-bench-lite/repair_sample_3/,results/swe-bench-lite/repair_sample_4 \
                                  --num_samples 40 \
                                  --deduplicate \
                                  --regression \
                                  --reproduction
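
If you script your runs, the same invocation can be assembled programmatically so the four folders always end up in one comma-separated `--patch_folder` value. This is only a rough sketch: `build_rerank_command` is a hypothetical helper, not part of Agentless, and it uses only the flags and paths shown in the command above.

```python
import shlex


def build_rerank_command(patch_folders, num_samples=40):
    """Return the rerank.py invocation as an argument list.

    Hypothetical helper (not part of Agentless); it only reuses the
    flags that appear in the command above.
    """
    return [
        "python",
        "agentless/repair/rerank.py",
        "--patch_folder",
        ",".join(patch_folders),  # one comma-separated value, no merging of files
        "--num_samples",
        str(num_samples),
        "--deduplicate",
        "--regression",
        "--reproduction",
    ]


# Folder layout taken from the command above.
folders = [f"results/swe-bench-lite/repair_sample_{i}/" for i in range(1, 5)]
command = build_rerank_command(folders)

# Print a shell-safe string; subprocess.run(command, check=True) would execute it.
print(shlex.join(command))
```

The list form can be handed to `subprocess.run` from the repository root, which avoids shell quoting problems with the long folder argument.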
