LEADERBOARD

Comparing optimization strategies across chemical datasets

How to Read the Data

Table View: Each value shows the median or mean performance, depending on the selected view. Hover over any value to see the full range (min–max) and the number of runs.

Plot View: Horizontal lines indicate median/mean values (colored by method type). Red circles (●) mark minimum values, green triangles (▲) mark maximum values, with thin gray lines connecting the range.

Views: Toggle between Median (Min-Max), a robust measure of central tendency, and Mean (Min-Max), the arithmetic average.
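The difference between the two views can be sketched in a few lines of Python. This is only an illustration, not the leaderboard's own code: the `summarize` helper and the per-run scores are hypothetical, and the sketch assumes each table cell is derived from a list of per-run scores in percent.

```python
from statistics import mean, median


def summarize(scores, view="median"):
    """Return (center, lowest, highest, n_runs) for one hypothetical table cell."""
    if not scores:
        raise ValueError("need at least one run")
    if view not in ("median", "mean"):
        raise ValueError("view must be 'median' or 'mean'")
    # The view only changes the central value; the range is the same in both.
    center = median(scores) if view == "median" else mean(scores)
    return center, min(scores), max(scores), len(scores)


# Hypothetical per-run scores (percent), including one poor run.
runs = [80.0, 82.0, 84.0, 86.0, 20.0]

print(summarize(runs, "median"))
print(summarize(runs, "mean"))
```

With these made-up numbers, the single poor run pulls the mean well below the median while the min–max range stays the same, which is why the median view is described as the more robust one.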

Showing: Median (Min-Max)

| # | Method | Pass@3 (Early) | Pass@5 (Early-Mid) | Pass@10 (Mid) | Pass@20 (Final) |
|---|--------|----------------|--------------------|---------------|-----------------|
| 1 | BO atlas-ei | 60.0% ± 21.8% (20) | 66.6% ± 17.3% (20) | 78.5% ± 12.4% (20) | 82.2% ± 12.1% (20) |
| 2 | BO atlas-pi | 59.9% ± 12.7% (20) | 61.5% ± 13.9% (20) | 74.4% ± 10.2% (20) | 77.3% ± 11.4% (20) |
| 3 | BO atlas-ucb | 57.6% ± 24.4% (20) | 67.8% ± 23.2% (20) | 80.1% ± 12.5% (20) | 87.5% ± 8.1% (20) |
|   | LLM claude-3-5-haiku-latest | 57.4% ± 15.1% (20) | 67.0% ± 15.3% (20) | 78.3% ± 11.8% (20) | 82.5% ± 13.0% (20) |
|   | LLM claude-3-7-sonnet-latest | 54.1% ± 5.8% (20) | 64.5% ± 14.3% (20) | 89.2% ± 9.2% (20) | 95.6% ± 2.5% (20) |
|   | LLM claude-3-7-sonnet-latest-thinking | 55.0% ± 7.3% (20) | 61.5% ± 14.4% (20) | 77.6% ± 18.9% (20) | 95.5% ± 2.0% (20) |
|   | LLM gpt-4o-mini | 55.7% ± 20.3% (20) | 60.7% ± 22.5% (20) | 65.8% ± 23.3% (20) | 65.8% ± 23.3% (20) |
|   | LLM gemini-2.5-pro-preview-03-25-medium | 52.4% ± 6.1% (20) | 55.3% ± 6.7% (20) | 60.2% ± 11.8% (20) | 85.9% ± 13.8% (20) |
|   | LLM gpt-4o | 52.2% ± 18.5% (20) | 61.4% ± 21.6% (20) | 69.5% ± 23.3% (20) | 71.8% ± 21.2% (20) |
|   | LLM gemini-2.5-flash-preview-04-17-medium | 50.6% ± 12.0% (20) | 53.6% ± 8.4% (20) | 65.5% ± 17.0% (20) | 86.3% ± 15.9% (20) |

Currently showing top 10 of 25 results