Different Score on BrowseComp for Cladue Opus-4.5
#56
by Shaobo1103 - opened
https://huggingface.co/MiniMaxAI/MiniMax-M2.1
In M2.1's report, the score on Browsecomp for Opus-4.5 with context management is 57.8
But in this report, I'm wondering why it increased to 67.8?