locomo-eval-kit / scripts
Latest commit: 7e8be3b9 by weixin_44204324, created 6 days ago
style: format stat_judge_result.py
| File | Last commit | Last updated |
| --- | --- | --- |
| config.toml.example | Add locomo test script | 1 month ago |
| eval.py | Fix failure to read session files | 1 month ago |
| import_to_ov.py | Add locomo test script | 1 month ago |
| judge.py | feat: make base-url and model read from env vars by default - base-url reads OPENAI_BASE_URL, model reads JUDGE_MODEL | 6 days ago |
| memory-benchmark-error-analysis_skill.md | add report skill | 24 days ago |
| report.py | add report skill | 24 days ago |
| run_eval_case0.sh | feat: add test date, git commit and test scenario to summary report | 6 days ago |
| run_eval_full.sh | feat: add test date, git commit and test scenario to summary report | 6 days ago |
| run_eval_small.sh | feat: add test date, git commit and test scenario to summary report | 6 days ago |
| stat_judge_result.py | style: format stat_judge_result.py | 6 days ago |
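
The commit message for judge.py says that base-url now defaults to OPENAI_BASE_URL and model defaults to JUDGE_MODEL. The listing does not show how judge.py implements this, so the following is only a minimal sketch of one common way to do it with `argparse`. The flag names `--base-url` and `--model` and the helper `build_parser` are assumptions for illustration, not taken from the script.

```python
import argparse
import os


def build_parser(env=None):
    """Build a CLI parser whose defaults come from environment variables.

    Hypothetical helper: the real judge.py may be organized differently.
    `env` is injectable so the behavior can be checked without touching
    the real process environment.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Judge evaluation answers (sketch)")
    # Assumed flag name; falls back to OPENAI_BASE_URL when the flag is omitted.
    parser.add_argument("--base-url", default=env.get("OPENAI_BASE_URL"))
    # Assumed flag name; falls back to JUDGE_MODEL when the flag is omitted.
    parser.add_argument("--model", default=env.get("JUDGE_MODEL"))
    return parser
```

With this arrangement an explicit command-line flag still overrides the environment variable, and both values are `None` when neither source provides them, so the caller can decide whether to fail or fall back to something else.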