We introduce the Berkeley Function Calling Leaderboard (BFCL), the first comprehensive and executable function-call evaluation dedicated to assessing the ability of Large Language Models (LLMs) to invoke functions.
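To make the idea of an "executable" function-call evaluation concrete, here is a minimal sketch in Python. It is not the leaderboard's own harness: the names `add`, `REGISTRY`, `execute_call`, and `is_correct` are hypothetical, and the sketch assumes the model emits a single call with literal arguments, written in Python syntax.

```python
import ast


def add(a, b):
    """Toy target function (hypothetical, for illustration only)."""
    return a + b


# Hypothetical registry of functions the model is allowed to invoke.
REGISTRY = {"add": add}


def execute_call(call_text, registry):
    """Parse one call such as `add(1, b=2)` and run it against the registry."""
    node = ast.parse(call_text, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a simple function call")
    func = registry[node.func.id]
    # literal_eval accepts AST nodes, so only literal arguments are allowed.
    args = [ast.literal_eval(arg) for arg in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return func(*args, **kwargs)


def is_correct(call_text, expected, registry=REGISTRY):
    """Score a model output as correct if executing it yields `expected`."""
    try:
        return execute_call(call_text, registry) == expected
    except Exception:
        # Unparseable text, unknown functions, and bad arguments all count as wrong.
        return False
```

Under these assumptions, `is_correct("add(1, b=2)", 3)` passes, while a call with a missing argument or an unregistered function name is scored as incorrect rather than raising.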
Importing modules and calling top-level functions from them; passing multiple positional and keyword arguments; receiving return values, including nested lists and dicts; getting Python exceptions across ...
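The capabilities listed above can be sketched from the Python side alone. The snippet is truncated before it names the other side of the boundary, so this example makes no claim about it and stays entirely inside Python, using only the standard-library modules `importlib`, `math`, and `json`. The variable names are illustrative assumptions.

```python
import importlib

# Import modules by name and call top-level functions from them.
math_mod = importlib.import_module("math")
json_mod = importlib.import_module("json")
root = math_mod.sqrt(16.0)

# Pass multiple positional arguments together with a keyword argument.
close = math_mod.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9)

# Receive a return value containing nested lists and dicts.
decoded = json_mod.loads('{"a": {"c": null}, "b": [1, [2, 3]]}')

# Observe a Python exception raised by the called function.
try:
    json_mod.loads("{not valid json")
    error_name = None
except ValueError as exc:
    # json.JSONDecodeError is a subclass of ValueError.
    error_name = type(exc).__name__
```

Importing by string name with `importlib.import_module` mirrors how a caller outside Python would typically have to refer to a module, which is why it is used here instead of a plain `import` statement.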
Abstract: Source code summarization is the task of writing natural language descriptions of source code. The primary use of these descriptions is in documentation for programmers. Automatic generation ...