Abstract: Tools based on Large Language Models (LLMs) have improved the teaching of computer programming, automated feedback, facilitated program repair, and enabled ...
Abstract: Simply measuring a language model’s performance after deployment is not enough. Rigorous and inclusive benchmarking isn’t just a checkbox along the way—it’s the foundation upon which ...
Background: Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various natural language processing tasks, particularly in text generation. However, their ...