CVE-Bench: testing LLM agents on real-world vulnerability patches

(giovannigatti.github.io)

9 points | by logickkk1 10 hours ago

1 comment