I tried four vibe-coding tools, including Cursor and Replit, with no coding background. Here's what worked (and what didn't).
🔔 The automatic evaluation on CodaLab is under construction. The MathVista dataset is derived from three newly collected datasets: IQTest, FunctionQA, and Paper, as well as 28 other source datasets.
I'm not a programmer, but I tried four vibe-coding tools to see if I could build anything at all on my own. Here's what I did and did not accomplish.