news · Reddit r/MachineLearning

DeepSWE: new benchmark looking at how well today's frontier models can actually write code [R]

Reddit thread: https://www.reddit.com/r/MachineLearning/comments/1ue0hlp/deepswe_new_benchmark_looking_at_how_well_todays/
