Dokimos is an evaluation framework for LLM applications in Java. It helps you evaluate responses, track quality over time, and catch regressions before they reach production.