* first pass at offline agent eval integration
* Integrating AgentEval for offline scenarios
* removing old changes
* fixing notebook, updating docs
* fixing subcriteria bug
* updating class comment
* cleaning up agent constructors
* moving AgentEval agents to separate folder and adding a brief README
* fixing build breaks
* fixing formatting break
* fixing comments
* consolidating files in the agenteval folder under contrib and cleaning up imports
* fixing import ordering
* adding basic agenteval tests and fixing criteria parsing bug
* first try at adding openai agenteval tests to build process
* adding non-openai agenteval tests to build process
* updating test settings
* updating openai test
* Update test/agentchat/contrib/agent_eval/test_agent_eval.py
Co-authored-by: Wael Karkoub <wael.karkoub96@gmail.com>
* Update .github/workflows/contrib-openai.yml
Co-authored-by: Wael Karkoub <wael.karkoub96@gmail.com>
* test commit
* updating typing and converting to pydantic objects
* fixing test file
---------
Co-authored-by: Beibin Li <BeibinLi@users.noreply.github.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Wael Karkoub <wael.karkoub96@gmail.com>
* Add isort
* Apply isort on py files
* Fix circular import
* Fix format for notebooks
* Fix format
---------
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* set use_docker to default to true
* black formatting
* centralize checking and add env variable option
* set docker env flag for contrib tests
* better error message and cleanup
* disable explicit docker tests
* docker is installed so can't check for that in test
* pr comments and fix test
* rename and fix function descriptions
* documentation
* update notebooks so that they can be run with change in default
* add unit tests for new code
* cache and restore env var
* skip on windows because docker is running in the CI but there are problems connecting the volume
* update documentation
* move header
* update contrib tests
* fixed spelling, minor errors and reformatted using black
* polishing
* added codespell to pre-commit hooks, fixed a number of spelling errors and a few minor bugs in the code
* update autogen library version in notebooks
* add agenteval-notebook for math problems and the blog post about it
* update gitignore
* updates to notebook
* adding folder for the logs
* adding math problems logs
* adding folder for alfworld logs
* added limitations and future work to blog post
* minor edits blog post
* adding changes
* reorg
* modify the main notebook
* modification of the main notebook
* remove wrong notebook
* uploading new notebook
* update agenteval notebook
* change the sample
* Update agenteval_cq_math.ipynb
* adding final changes to notebook
* updated framework picture
* Update index.mdx
* Update index.md
* Add files via upload
* updates to notebook
* revise the blog
* revise the blog
* update the agent img
* revise the blog
* revise the blog
* Excluded model logs from the main branch; they are available in the agenteval branch
* Fixed pre-commit formatting.
* Update website/blog/2023-11-11-AgentEval/index.mdx
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* update gitignore
* update index.mdx
* update authors.yml by adding Negar and Julia
* remove md file
* update gitignore
* update authors file
* pre-commit checks
* pre-commit checks on authors.yml
* update index.mdx
* update authors.yml by adding Negar and Julia
* updated the blog-post version 1
* updated the blog-post: TL;DR is ready
* updated the blog-post: first part of introduction is ready
* updated figures: typos on fig 1, changed terminology on the fig 2
* updated the Framework part
* fixed rendering issues
* upload zip file instead of single samples
* update prealgebra.zip
* update
* upload
* update z
* update naming
* update zip
* update the agenteval notebook
* update the notebook - removing unnecessary logs
* updated fig 1 and references to it
* updated fig 1
* incorporated PR comments
* merged agenteval branch
* final changes to the blog
* updated taxonomy
* update notebook
* minor changes to the blog
* Fixed formatting
* Update the link in agenteval_cq_math.ipynb
* update the blog and link in notebook
* Update index.mdx
* change folder name
* modified OAI_CONFIG_LIST_sample.txt
* add sample OAI file
* fix the url link to colab and typos
* add authors
* update profile pic
* update authors
* fixing the problem in test_groupchat.py
* update the title lower case
* reverting changes in setup.py
* rerun pre-commit
---------
Co-authored-by: Negar Arabzadeh <ngr.arabzadeh@gmail.com>
Co-authored-by: Julia Kiseleva <jukisele@microsoft.com>
Co-authored-by: afourney <adamfo@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>