silent; code_execution_config; exit; version (#1179)

* silent; code_execution_config; exit; version
* url
* url
* readme
* preview
* doc
* url
* endpoints
* timeout
* chess
* Fix retrieve chat
* config
* mathchat

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-08-14 00:09:45 -07:00 · 7ab4d114d7
parent 700ff05874
commit 7ab4d114d7
43 changed files with 985 additions and 1855 deletions
--- a/README.md
+++ b/README.md
@ -14,38 +14,37 @@
    <br>
 </p>
 :fire: The automated multi-agent chat framework in [autogen](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen) is in preview from v2.0.0.
 :fire: FLAML is highlighted in OpenAI's [cookbook](https://github.com/openai/openai-cookbook#related-resources-from-around-the-web).
-:fire: [autogen](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation) is released with support for ChatGPT and GPT-4, based on [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673).
+:fire: [autogen](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen) is released with support for ChatGPT and GPT-4, based on [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673).
 :fire: FLAML supports AutoML and Hyperparameter Tuning features in [Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/get-started/microsoft-fabric-overview) private preview. Sign up for these features at: https://aka.ms/fabric/data-science/sign-up.
 ## What is FLAML
 FLAML is a lightweight Python library for efficient automation of machine
-learning and AI operations, including selection of
+learning and AI operations. It automates workflow based on large language models, machine learning models, etc.
-models, hyperparameters, and other tunable choices of an application (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations).
+and optimizes their performance.
-* For foundation models like the GPT models, it automates the experimentation and optimization of their performance to maximize the effectiveness for applications and minimize the inference cost. FLAML enables users to build and use adaptive AI agents with minimal effort.
+* FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness.
-* For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., search space and metric), or full customization (arbitrary training/inference/evaluation code).
+* For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range.
-* It supports fast and economical automatic tuning, capable of handling complex constraints/guidance/early stopping. FLAML is powered by a [cost-effective
-hyperparameter optimization](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#hyperparameter-optimization-algorithm)
-and model selection method invented by Microsoft Research, and many followup [research studies](https://microsoft.github.io/FLAML/docs/Research).
+* It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.
-FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source, cross-platform machine learning framework for .NET. In ML.NET, you can use FLAML via low-code solutions like [Model Builder](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet/model-builder) Visual Studio extension and the cross-platform [ML.NET CLI](https://docs.microsoft.com/dotnet/machine-learning/automate-training-with-cli). Alternatively, you can use the [ML.NET AutoML API](https://www.nuget.org/packages/Microsoft.ML.AutoML/#versions-body-tab) for a code-first experience.
+FLAML is powered by a series of [research studies](/docs/Research) from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.
+FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source, cross-platform machine learning framework for .NET.
 ## Installation
-### Python
+FLAML requires **Python version >= 3.8**. It can be installed from pip:
-FLAML requires **Python version >= 3.7**. It can be installed from pip:
 ```bash
 pip install flaml
 ```
-Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`autogen`](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation) package.
+Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`autogen`](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen) package.
 ```bash
 pip install "flaml[autogen]"
 ```
@ -53,41 +52,34 @@ pip install "flaml[autogen]"
 Find more options in [Installation](https://microsoft.github.io/FLAML/docs/Installation).
 Each of the [`notebook examples`](https://github.com/microsoft/FLAML/tree/main/notebook) may require a specific option to be installed.
 ### .NET
 Use the following guides to get started with FLAML in .NET:
 - [Install Model Builder](https://docs.microsoft.com/dotnet/machine-learning/how-to-guides/install-model-builder?tabs=visual-studio-2022)
 - [Install ML.NET CLI](https://docs.microsoft.com/dotnet/machine-learning/how-to-guides/install-ml-net-cli?tabs=windows)
 - [Microsoft.AutoML](https://www.nuget.org/packages/Microsoft.ML.AutoML/0.20.0)
 ## Quickstart
-* (New) The [autogen](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation) package can help you maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4, including:
+* (New) The [autogen](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen) package enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
-    - A drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalites like tuning, caching, templating, filtering. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.
+It offers customizable and conversable agents which integrate LLMs, tools and human.
-    ```python
+By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
-    from flaml import autogen
+```python
+from flaml import autogen
+assistant = autogen.AssistantAgent("assistant")
+user_proxy = autogen.UserProxyAgent("user_proxy")
+user_proxy.initiate_chat(assistant, message="Show me the YTD gain of 10 largest technology companies as of today.")
+# This initiates an automated chat between the two agents to solve the task
+```
-    # perform tuning
+Autogen also helps maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalites like tuning, caching, templating, filtering. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.
-    config, analysis = autogen.Completion.tune(
+```python
-        data=tune_data,
+# perform tuning
-        metric="success",
+config, analysis = autogen.Completion.tune(
-        mode="max",
+    data=tune_data,
-        eval_func=eval_func,
+    metric="success",
-        inference_budget=0.05,
+    mode="max",
-        optimization_budget=3,
+    eval_func=eval_func,
-        num_samples=-1,
+    inference_budget=0.05,
-    )
+    optimization_budget=3,
-
+    num_samples=-1,
-    # perform inference for a test instance
+)
-    response = autogen.Completion.create(context=test_instance, **config)
+# perform inference for a test instance
-    ```
+response = autogen.Completion.create(context=test_instance, **config)
-    - LLM-driven intelligent agents which can collaborately perform tasks autonomously or with human feedback, including tasks that require using tools via code.
+```
-    ```python
-    assistant = autogen.AssistantAgent("assistant")
-    user_proxy = autogen.UserProxyAgent("user_proxy")
-    user_proxy.initiate_chat(assistant, message="Show me the YTD gain of 10 largest technology companies as of today.")
-    ```
 * With three lines of code, you can start using this economical and fast
 AutoML engine as a [scikit-learn style estimator](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML).
@ -124,7 +116,7 @@ estimator.fit(X_train, y_train)
 ## Documentation
-You can find a detailed documentation about FLAML [here](https://microsoft.github.io/FLAML/) where you can find the API documentation, use cases and examples.
+You can find a detailed documentation about FLAML [here](https://microsoft.github.io/FLAML/).
 In addition, you can find:
--- a/flaml/autogen/agentchat/agent.py
+++ b/flaml/autogen/agentchat/agent.py
@ -2,7 +2,7 @@ from typing import Dict, List, Optional, Union
 class Agent:
-    """(Experimental) An abstract class for AI agent.
+    """(In preview) An abstract class for AI agent.
    An agent can communicate with other agents and perform actions.
    Different agents can differ in what actions they perform in the `receive` method.
--- a/flaml/autogen/agentchat/assistant_agent.py
+++ b/flaml/autogen/agentchat/assistant_agent.py
@ -3,7 +3,7 @@ from typing import Callable, Dict, Optional, Union
 class AssistantAgent(ResponsiveAgent):
-    """(Experimental) Assistant agent, designed to solve a task with LLM.
+    """(In preview) Assistant agent, designed to solve a task with LLM.
    AssistantAgent is a subclass of ResponsiveAgent configured with a default system message.
    The default system message is designed to solve a task with LLM,
--- a/flaml/autogen/agentchat/contrib/math_user_proxy_agent.py
+++ b/flaml/autogen/agentchat/contrib/math_user_proxy_agent.py
@ -203,7 +203,7 @@ class MathUserProxyAgent(UserProxyAgent):
        return PROMPTS[prompt_type] + problem
    def _reset(self):
-        super().reset()
+        # super().reset()
        self._valid_q_count = 0
        self._total_q_count = 0
        self._accum_invalid_q_per_step = 0
@ -280,6 +280,7 @@ class MathUserProxyAgent(UserProxyAgent):
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
+        config: Optional[Any] = None,
    ):
        """Generate an auto reply."""
        if messages is None:
--- a/flaml/autogen/agentchat/contrib/retrieve_assistant_agent.py
+++ b/flaml/autogen/agentchat/contrib/retrieve_assistant_agent.py
@ -22,10 +22,10 @@ class RetrieveAssistantAgent(AssistantAgent):
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
    ) -> Tuple[bool, Union[str, Dict, None]]:
-        if context is None:
+        if config is None:
-            context = self
+            config = self
        if messages is None:
            messages = self._oai_messages[sender]
        message = messages[-1]
--- a/flaml/autogen/agentchat/contrib/retrieve_user_proxy_agent.py
+++ b/flaml/autogen/agentchat/contrib/retrieve_user_proxy_agent.py
@ -207,10 +207,10 @@ class RetrieveUserProxyAgent(UserProxyAgent):
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
    ) -> Tuple[bool, Union[str, Dict, None]]:
-        if context is None:
+        if config is None:
-            context = self
+            config = self
        if messages is None:
            messages = self._oai_messages[sender]
        message = messages[-1]
--- a/flaml/autogen/agentchat/groupchat.py
+++ b/flaml/autogen/agentchat/groupchat.py
@ -33,7 +33,9 @@ class GroupChat:
    def select_speaker_msg(self):
        """Return the message for selecting the next speaker."""
        return f"""You are in a role play game. The following roles are available:
-{self._participant_roles()}. Read the following conversation.
+{self._participant_roles()}.
+
+Read the following conversation.
 Then select the next role from {self.agent_names} to play. Only return the role."""
    def select_speaker(self, last_speaker: Agent, selctor: ResponsiveAgent):
@ -73,32 +75,35 @@ class GroupChatManager(ResponsiveAgent):
            system_message=system_message,
            **kwargs,
        )
-        self.register_auto_reply(Agent, GroupChatManager.run_chat, context=groupchat, reset_context=GroupChat.reset)
+        self.register_auto_reply(Agent, GroupChatManager.run_chat, config=groupchat, reset_config=GroupChat.reset)
        # self._random = random.Random(seed)
    def run_chat(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[GroupChat] = None,
+        config: Optional[GroupChat] = None,
    ) -> Union[str, Dict, None]:
        """Run a group chat."""
        if messages is None:
            messages = self._oai_messages[sender]
        message = messages[-1]
        speaker = sender
-        for i in range(context.max_round):
+        for i in range(config.max_round):
            # set the name to speaker's name if the role is not function
            if message["role"] != "function":
                message["name"] = speaker.name
-            context.messages.append(message)
+            config.messages.append(message)
            # broadcast the message to all agents except the speaker
-            for agent in context.agents:
+            for agent in config.agents:
                if agent != speaker:
-                    self.send(message, agent, request_reply=False)
+                    self.send(message, agent, request_reply=False, silent=True)
-            if i != context.max_round - 1:
+            if i != config.max_round - 1:
                # speaker selection msg from an agent
-                speaker = context.select_speaker(speaker, self)
+                speaker = config.select_speaker(speaker, self)
-                speaker.send(speaker.generate_reply(sender=self), self, request_reply=False)
+                reply = speaker.generate_reply(sender=self)
+                if reply is None:
+                    break
+                speaker.send(reply, self, request_reply=False)
                message = self.last_message(speaker)
        return True, None
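Review note: the loop above now broadcasts with `silent=True` and stops as soon as the selected speaker returns `None`. The following is a minimal standalone sketch of that control flow, not the flaml implementation. `EchoAgent`, its scripted replies, and the round-robin speaker choice are hypothetical stand-ins (the real code asks an LLM to pick the next speaker via `select_speaker`).

```python
from typing import List, Optional


class EchoAgent:
    """Hypothetical stand-in for an agent; not the flaml class."""

    def __init__(self, name: str, replies: List[Optional[str]]):
        self.name = name
        self._replies = list(replies)
        self.inbox: List[str] = []

    def receive(self, message: str, silent: bool = False) -> None:
        self.inbox.append(message)
        if not silent:
            print(f"{self.name} received: {message}")

    def generate_reply(self) -> Optional[str]:
        # Next scripted reply, or None when the agent has nothing left to say.
        return self._replies.pop(0) if self._replies else None


def run_chat(agents: List[EchoAgent], first_message: str, max_round: int) -> List[str]:
    """Broadcast each message silently; stop early when a reply is None."""
    transcript: List[str] = []
    message = first_message
    speaker_index = 0  # assumption: agents[0] sent the first message
    for i in range(max_round):
        transcript.append(message)
        speaker = agents[speaker_index]
        # Everyone except the current speaker gets the message, without printing.
        for agent in agents:
            if agent is not speaker:
                agent.receive(message, silent=True)
        if i != max_round - 1:
            # Round-robin selection is a simplification for this sketch.
            speaker_index = (speaker_index + 1) % len(agents)
            reply = agents[speaker_index].generate_reply()
            if reply is None:
                break
            message = reply
    return transcript
```

With two agents that each have one scripted reply, `run_chat` returns three messages and then stops, because the third call to `generate_reply` yields `None`.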
--- a/flaml/autogen/agentchat/responsive_agent.py
+++ b/flaml/autogen/agentchat/responsive_agent.py
@ -87,6 +87,7 @@ class ResponsiveAgent(Agent):
                    If the code is executed in the current environment,
                    the code must be trusted.
                - timeout (Optional, int): The maximum execution time in seconds.
+                - last_n_messages (Experimental, Optional, int): The number of messages to look back for code execution. Default to 1.
            llm_config (dict or False): llm inference configuration.
                Please refer to [autogen.Completion.create](/docs/reference/autogen/oai/completion#create)
                for available options.
@ -128,8 +129,8 @@ class ResponsiveAgent(Agent):
        trigger: Union[Type[Agent], str, Agent, Callable[[Agent], bool], List],
        reply_func: Callable,
        position: Optional[int] = 0,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
-        reset_context: Optional[Callable] = None,
+        reset_config: Optional[Callable] = None,
    ):
        """Register a reply function.
@ -145,22 +146,22 @@ class ResponsiveAgent(Agent):
                - If a callable is provided, the reply function will be called when the callable returns True.
                - If a list is provided, the reply function will be called when any of the triggers in the list is activated.
            reply_func (Callable): the reply function.
-                The function takes a recipient agent, a list of messages, a sender agent and a context as input and returns a reply message.
+                The function takes a recipient agent, a list of messages, a sender agent and a config as input and returns a reply message.
        ```python
        def reply_func(
            recipient: ResponsiveAgent,
            messages: Optional[List[Dict]] = None,
            sender: Optional[Agent] = None,
-            context: Optional[Any] = None,
+            config: Optional[Any] = None,
        ) -> Union[str, Dict, None]:
        ```
            position (int): the position of the reply function in the reply function list.
                The function registered later will be checked earlier by default.
                To change the order, set the position to a positive integer.
-            context (Any): the context to be passed to the reply function.
+            config (Any): the config to be passed to the reply function.
-                When an agent is reset, the context will be reset to the original value.
+                When an agent is reset, the config will be reset to the original value.
-            reset_context (Callable): the function to reset the context.
+            reset_config (Callable): the function to reset the config.
-                The function returns None. Signature: ```def reset_context(context: Any)```
+                The function returns None. Signature: ```def reset_config(config: Any)```
        """
        if not isinstance(trigger, (type, str, Agent, Callable, list)):
            raise ValueError("trigger must be a class, a string, an agent, a callable or a list.")
@ -169,9 +170,9 @@ class ResponsiveAgent(Agent):
            {
                "trigger": trigger,
                "reply_func": reply_func,
-                "context": copy.copy(context),
+                "config": copy.copy(config),
-                "init_context": context,
+                "init_config": config,
-                "reset_context": reset_context,
+                "reset_config": reset_config,
            },
        )
@ -280,6 +281,7 @@ class ResponsiveAgent(Agent):
        message: Union[Dict, str],
        recipient: Agent,
        request_reply: Optional[bool] = None,
+        silent: Optional[bool] = False,
    ) -> bool:
        """Send a message to another agent.
@ -308,6 +310,7 @@ class ResponsiveAgent(Agent):
                    the content of the "link" later.
            recipient (Agent): the recipient of the message.
            request_reply (bool or None): whether to request a reply from the recipient.
+            silent (bool or None): (Experimental) whether to print the message sent.
        Raises:
            ValueError: if the message can't be converted into a valid ChatCompletion message.
@ -316,13 +319,19 @@ class ResponsiveAgent(Agent):
        # unless it's "function".
        valid = self._append_oai_message(message, "assistant", recipient)
        if valid:
-            recipient.receive(message, self, request_reply)
+            recipient.receive(message, self, request_reply, silent)
        else:
            raise ValueError(
                "Message can't be converted into a valid ChatCompletion message. Either content or function_call must be provided."
            )
-    async def a_send(self, message: Union[Dict, str], recipient: Agent, request_reply: Optional[bool] = None) -> bool:
+    async def a_send(
+        self,
+        message: Union[Dict, str],
+        recipient: Agent,
+        request_reply: Optional[bool] = None,
+        silent: Optional[bool] = False,
+    ) -> bool:
        """(async) Send a message to another agent.
        Args:
@ -350,6 +359,7 @@ class ResponsiveAgent(Agent):
                    the content of the "link" later.
            recipient (Agent): the recipient of the message.
            request_reply (bool or None): whether to request a reply from the recipient.
+            silent (bool or None): (Experimental) whether to print the message sent.
        Raises:
            ValueError: if the message can't be converted into a valid ChatCompletion message.
@ -358,7 +368,7 @@ class ResponsiveAgent(Agent):
        # unless it's "function".
        valid = self._append_oai_message(message, "assistant", recipient)
        if valid:
-            await recipient.a_receive(message, self, request_reply)
+            await recipient.a_receive(message, self, request_reply, silent)
        else:
            raise ValueError(
                "Message can't be converted into a valid ChatCompletion message. Either content or function_call must be provided."
@ -394,7 +404,7 @@ class ResponsiveAgent(Agent):
                print(colored("*" * len(func_print), "green"), flush=True)
        print("\n", "-" * 80, flush=True, sep="")
-    def _process_received_message(self, message, sender):
+    def _process_received_message(self, message, sender, silent):
        message = self._message_to_dict(message)
        # When the agent receives a message, the role of the message is "user". (If 'role' exists and is 'function', it will remain unchanged.)
        valid = self._append_oai_message(message, "user", sender)
@ -402,9 +412,16 @@ class ResponsiveAgent(Agent):
            raise ValueError(
                "Received message can't be converted into a valid ChatCompletion message. Either content or function_call must be provided."
            )
-        self._print_received_message(message, sender)
+        if not silent:
+            self._print_received_message(message, sender)
-    def receive(self, message: Union[Dict, str], sender: Agent, request_reply: Optional[bool] = None):
+    def receive(
+        self,
+        message: Union[Dict, str],
+        sender: Agent,
+        request_reply: Optional[bool] = None,
+        silent: Optional[bool] = False,
+    ):
        """Receive a message from another agent.
        Once a message is received, this function sends a reply to the sender or stop.
@ -422,18 +439,25 @@ class ResponsiveAgent(Agent):
            sender: sender of an Agent instance.
            request_reply (bool or None): whether a reply is requested from the sender.
                If None, the value is determined by `self.reply_at_receive[sender]`.
+            silent (bool or None): (Experimental) whether to print the message received.
        Raises:
            ValueError: if the message can't be converted into a valid ChatCompletion message.
        """
-        self._process_received_message(message, sender)
+        self._process_received_message(message, sender, silent)
        if request_reply is False or request_reply is None and self.reply_at_receive[sender] is False:
            return
-        reply = self.generate_reply(sender=sender)
+        reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
        if reply is not None:
-            self.send(reply, sender)
+            self.send(reply, sender, silent=silent)
-    async def a_receive(self, message: Union[Dict, str], sender: Agent, request_reply: Optional[bool] = None):
+    async def a_receive(
+        self,
+        message: Union[Dict, str],
+        sender: Agent,
+        request_reply: Optional[bool] = None,
+        silent: Optional[bool] = False,
+    ):
        """(async) Receive a message from another agent.
        Once a message is received, this function sends a reply to the sender or stop.
@ -451,16 +475,17 @@ class ResponsiveAgent(Agent):
            sender: sender of an Agent instance.
            request_reply (bool or None): whether a reply is requested from the sender.
                If None, the value is determined by `self.reply_at_receive[sender]`.
+            silent (bool or None): (Experimental) whether to print the message received.
        Raises:
            ValueError: if the message can't be converted into a valid ChatCompletion message.
        """
-        self._process_received_message(message, sender)
+        self._process_received_message(message, sender, silent)
        if request_reply is False or request_reply is None and self.reply_at_receive[sender] is False:
            return
        reply = await self.a_generate_reply(sender=sender)
        if reply is not None:
-            await self.a_send(reply, sender)
+            await self.a_send(reply, sender, silent=silent)
    def _prepare_chat(self, recipient, clear_history):
        self.reset_consecutive_auto_reply_counter(recipient)
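Review note: the hunks above thread `silent` from `send` through `receive` and back into the automatic reply, so a single flag suppresses printing for the whole exchange while the message history is still recorded. The sketch below traces that flow under simplifying assumptions. `MiniAgent`, `auto_reply`, and the `printed` list are hypothetical and exist only for illustration.

```python
from typing import List, Optional


class MiniAgent:
    """Hypothetical two-method agent used only to trace the `silent` flag."""

    def __init__(self, name: str, auto_reply: Optional[str] = None):
        self.name = name
        self.auto_reply = auto_reply
        self.history: List[str] = []
        self.printed: List[str] = []

    def send(self, message: str, recipient: "MiniAgent", silent: bool = False) -> None:
        recipient.receive(message, self, silent)

    def receive(self, message: str, sender: "MiniAgent", silent: bool = False) -> None:
        self.history.append(message)  # history is recorded regardless of `silent`
        if not silent:
            line = f"{sender.name} (to {self.name}): {message}"
            self.printed.append(line)
            print(line)
        if self.auto_reply is not None:
            # The reply inherits the flag, so one call silences the whole exchange.
            self.send(self.auto_reply, sender, silent=silent)
```

Only one side should have an `auto_reply` in this toy model; otherwise the two agents would answer each other indefinitely, which the real code prevents with termination checks and a reply counter.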
@ -470,7 +495,13 @@ class ResponsiveAgent(Agent):
            self.clear_history(recipient)
            recipient.clear_history(self)
-    def initiate_chat(self, recipient: "ResponsiveAgent", clear_history: Optional[bool] = True, **context):
+    def initiate_chat(
+        self,
+        recipient: "ResponsiveAgent",
+        clear_history: Optional[bool] = True,
+        silent: Optional[bool] = False,
+        **context,
+    ):
        """Initiate a chat with the recipient agent.
        Reset the consecutive auto reply counter.
@ -480,13 +511,20 @@ class ResponsiveAgent(Agent):
        Args:
            recipient: the recipient agent.
            clear_history (bool): whether to clear the chat history with the agent.
+            silent (bool or None): (Experimental) whether to print the messages for this conversation.
            **context: any context information.
                "message" needs to be provided if the `generate_init_message` method is not overridden.
        """
        self._prepare_chat(recipient, clear_history)
-        self.send(self.generate_init_message(**context), recipient)
+        self.send(self.generate_init_message(**context), recipient, silent=silent)
-    async def a_initiate_chat(self, recipient: "ResponsiveAgent", clear_history: Optional[bool] = True, **context):
+    async def a_initiate_chat(
+        self,
+        recipient: "ResponsiveAgent",
+        clear_history: Optional[bool] = True,
+        silent: Optional[bool] = False,
+        **context,
+    ):
        """(async) Initiate a chat with the recipient agent.
        Reset the consecutive auto reply counter.
@ -496,11 +534,12 @@ class ResponsiveAgent(Agent):
        Args:
            recipient: the recipient agent.
            clear_history (bool): whether to clear the chat history with the agent.
+            silent (bool or None): (Experimental) whether to print the messages for this conversation.
            **context: any context information.
                "message" needs to be provided if the `generate_init_message` method is not overridden.
        """
        self._prepare_chat(recipient, clear_history)
-        await self.a_send(self.generate_init_message(**context), recipient)
+        await self.a_send(self.generate_init_message(**context), recipient, silent=silent)
    def reset(self):
        """Reset the agent."""
@ -508,10 +547,10 @@ class ResponsiveAgent(Agent):
        self.reset_consecutive_auto_reply_counter()
        self.stop_reply_at_receive()
        for reply_func_tuple in self._reply_func_list:
-            if reply_func_tuple["reset_context"] is not None:
+            if reply_func_tuple["reset_config"] is not None:
-                reply_func_tuple["reset_context"](reply_func_tuple["context"])
+                reply_func_tuple["reset_config"](reply_func_tuple["config"])
            else:
-                reply_func_tuple["context"] = copy.copy(reply_func_tuple["init_context"])
+                reply_func_tuple["config"] = copy.copy(reply_func_tuple["init_config"])
    def stop_reply_at_receive(self, sender: Optional[Agent] = None):
        """Reset the reply_at_receive of the sender."""
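Review note: after the rename, each registered reply function carries a working `config` (a shallow copy), the original `init_config`, and an optional `reset_config` callback. On `reset`, the callback is used if present; otherwise the working copy is replaced by a fresh shallow copy of the original. A minimal sketch of that bookkeeping follows. `ReplyRegistry` and `count_calls` are hypothetical names, and triggers and positions are omitted.

```python
import copy
from typing import Any, Callable, Dict, List, Optional


class ReplyRegistry:
    """Hypothetical, stripped-down model of the reply-function bookkeeping."""

    def __init__(self) -> None:
        self._reply_func_list: List[Dict[str, Any]] = []

    def register(
        self,
        reply_func: Callable,
        config: Any = None,
        reset_config: Optional[Callable[[Any], None]] = None,
    ) -> None:
        # Newest registration goes first, mirroring the default position of 0.
        self._reply_func_list.insert(
            0,
            {
                "reply_func": reply_func,
                "config": copy.copy(config),  # working copy that replies may mutate
                "init_config": config,  # kept untouched for later resets
                "reset_config": reset_config,
            },
        )

    def reset(self) -> None:
        for entry in self._reply_func_list:
            if entry["reset_config"] is not None:
                entry["reset_config"](entry["config"])
            else:
                entry["config"] = copy.copy(entry["init_config"])


def count_calls(config: Dict[str, int]) -> int:
    """Toy reply function that keeps a counter in its config."""
    config["calls"] += 1
    return config["calls"]
```

Because `copy.copy` is shallow, nested mutable values inside a config would still be shared with `init_config`; a custom `reset_config` is the safer choice for such cases.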
@ -542,10 +581,10 @@ class ResponsiveAgent(Agent):
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
    ) -> Tuple[bool, Union[str, Dict, None]]:
        """Generate a reply using autogen.oai."""
-        llm_config = self.llm_config if context is None else context
+        llm_config = self.llm_config if config is None else config
        if llm_config is False:
            return False, None
        if messages is None:
@ -561,36 +600,44 @@ class ResponsiveAgent(Agent):
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
    ):
        """Generate a reply using code execution."""
-        code_execution_config = context if context is not None else self._code_execution_config
+        code_execution_config = config if config is not None else self._code_execution_config
        if code_execution_config is False:
            return False, None
        if messages is None:
            messages = self._oai_messages[sender]
-        message = messages[-1]
+        last_n_messages = code_execution_config.pop("last_n_messages", 1)
-        code_blocks = extract_code(message["content"])
+        for i in range(last_n_messages):
-        if len(code_blocks) == 1 and code_blocks[0][0] == UNKNOWN:
+            message = messages[-(i + 1)]
-            # no code block is found, lang should be `UNKNOWN`
+            code_blocks = extract_code(message["content"])
-            return False, None
+            if len(code_blocks) == 1 and code_blocks[0][0] == UNKNOWN:
-            # code_blocks, _ = find_code(messages, sys_msg=self._oai_system_message, **self.llm_config)
+                # no code block is found, lang should be `UNKNOWN`
-            # if len(code_blocks) == 1 and code_blocks[0][0] == UNKNOWN:
+
-            #     return code_blocks[0][1]
+                if i == last_n_messages - 1:
-        # try to execute the code
+                    code_execution_config["last_n_messages"] = last_n_messages
-        exitcode, logs = self.execute_code_blocks(code_blocks)
+                    return False, None
-        exitcode2str = "execution succeeded" if exitcode == 0 else "execution failed"
+                continue
+                # code_blocks, _ = find_code(messages, sys_msg=self._oai_system_message, **self.llm_config)
+                # if len(code_blocks) == 1 and code_blocks[0][0] == UNKNOWN:
+                #     return code_blocks[0][1]
+            # try to execute the code
+            exitcode, logs = self.execute_code_blocks(code_blocks)
+            exitcode2str = "execution succeeded" if exitcode == 0 else "execution failed"
+            break
+        code_execution_config["last_n_messages"] = last_n_messages
        return True, f"exitcode: {exitcode} ({exitcode2str})\nCode output: {logs}"
    def generate_function_call_reply(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
    ):
        """Generate a reply using function call."""
-        if context is None:
+        if config is None:
-            context = self
+            config = self
        if messages is None:
            messages = self._oai_messages[sender]
        message = messages[-1]
@ -603,11 +650,11 @@ class ResponsiveAgent(Agent):
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[Agent] = None,
-        context: Optional[Any] = None,
+        config: Optional[Any] = None,
    ) -> Tuple[bool, Union[str, Dict, None]]:
        """Check if the conversation should be terminated, and if human reply is provided."""
-        if context is None:
+        if config is None:
-            context = self
+            config = self
        if messages is None:
            messages = self._oai_messages[sender]
        message = messages[-1]
@ -709,9 +756,7 @@ class ResponsiveAgent(Agent):
                if asyncio.coroutines.iscoroutinefunction(reply_func):
                    continue
                if self._match_trigger(reply_func_tuple["trigger"], sender):
-                    final, reply = reply_func(
-                        self, messages=messages, sender=sender, context=reply_func_tuple["context"]
-                    )
+                    final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
                    if final:
                        return reply
        return self._default_auto_reply
@ -755,11 +800,11 @@ class ResponsiveAgent(Agent):
                if self._match_trigger(reply_func_tuple["trigger"], sender):
                    if asyncio.coroutines.iscoroutinefunction(reply_func):
                        final, reply = await reply_func(
-                            self, messages=messages, sender=sender, context=reply_func_tuple["context"]
+                            self, messages=messages, sender=sender, config=reply_func_tuple["config"]
                        )
                    else:
                        final, reply = reply_func(
-                            self, messages=messages, sender=sender, context=reply_func_tuple["context"]
+                            self, messages=messages, sender=sender, config=reply_func_tuple["config"]
                        )
                    if final:
                        return reply
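
Review note: the `last_n_messages` lookback added to the code-execution reply in this file can be illustrated in isolation. The following is a minimal sketch, not the library's implementation: `extract_code_sketch` is a hypothetical, simplified stand-in for `extract_code`, `find_recent_code` is an illustrative name, and the `min(...)` guard against short histories is an assumption that does not appear in the diff.

```python
import re
from typing import Dict, List, Optional, Tuple

UNKNOWN = "unknown"
FENCE = "`" * 3  # built programmatically so this example contains no literal fence
# Hypothetical stand-in for extract_code: it only recognizes fenced blocks.
_FENCED_BLOCK = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)


def extract_code_sketch(text: str) -> List[Tuple[str, str]]:
    """Return (language, code) pairs, or [(UNKNOWN, text)] when no fenced block exists."""
    blocks = _FENCED_BLOCK.findall(text)
    return blocks if blocks else [(UNKNOWN, text)]


def find_recent_code(
    messages: List[Dict], last_n_messages: int = 1
) -> Optional[List[Tuple[str, str]]]:
    """Scan the newest `last_n_messages` messages, newest first, for code blocks."""
    # Assumption: clamp to the history length (the diff indexes without this guard).
    for i in range(min(last_n_messages, len(messages))):
        code_blocks = extract_code_sketch(messages[-(i + 1)]["content"])
        if len(code_blocks) == 1 and code_blocks[0][0] == UNKNOWN:
            continue  # no code in this message; look one message further back
        return code_blocks  # the most recent message containing code wins
    return None  # nothing found within the lookback window
```

With the default of 1, only the latest message is inspected, which matches the behavior before this change; a larger value lets the agent pick up code from an earlier message when the latest one is plain text.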
--- a/flaml/autogen/agentchat/user_proxy_agent.py
+++ b/flaml/autogen/agentchat/user_proxy_agent.py
@ -3,7 +3,7 @@ from typing import Callable, Dict, Optional, Union
 class UserProxyAgent(ResponsiveAgent):
-    """(Experimental) A proxy agent for the user, that can execute code and provide feedback to the other agents.
+    """(In preview) A proxy agent for the user, that can execute code and provide feedback to the other agents.
    UserProxyAgent is a subclass of ResponsiveAgent configured with `human_input_mode` to ALWAYS
    and `llm_config` to False. By default, the agent will prompt for human input every time a message is received.
@ -60,6 +60,7 @@ class UserProxyAgent(ResponsiveAgent):
                    If the code is executed in the current environment,
                    the code must be trusted.
                - timeout (Optional, int): The maximum execution time in seconds.
+                - last_n_messages (Experimental, Optional, int): The number of messages to look back for code execution. Default to 1.
            default_auto_reply (str or dict or None): the default auto reply message when no code execution or llm based reply is generated.
            llm_config (dict or False): llm inference configuration.
                Please refer to [autogen.Completion.create](/docs/reference/autogen/oai/completion#create)
--- a/flaml/autogen/oai/completion.py
+++ b/flaml/autogen/oai/completion.py
@ -695,7 +695,7 @@ class Completion(openai_Completion):
                E.g., `prompt="Complete the following sentence: {prefix}", context={"prefix": "Today I feel"}`.
                The actual prompt will be:
                "Complete the following sentence: Today I feel".
-                More examples can be found at [templating](/docs/Use-Cases/Auto-Generation#templating).
+                More examples can be found at [templating](/docs/Use-Cases/Autogen#templating).
            use_cache (bool, Optional): Whether to use cached responses.
            config_list (List, Optional): List of configurations for the completion to try.
                The first one that does not raise an error will be used.
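
Review note: the templating described in the docstring above can be reproduced with plain Python string formatting. This is a sketch under the assumption that placeholders follow `str.format` semantics; `render_prompt` is a hypothetical helper, and the library's own instantiation logic may differ.

```python
def render_prompt(template: str, context: dict) -> str:
    """Fill `{name}` placeholders in `template` with values from `context`."""
    # Assumption: plain str.format substitution, used here only for illustration.
    return template.format(**context)


prompt = render_prompt(
    "Complete the following sentence: {prefix}",
    {"prefix": "Today I feel"},
)
print(prompt)  # Complete the following sentence: Today I feel
```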
--- a/flaml/automl/automl.py
+++ b/flaml/automl/automl.py
@ -230,7 +230,7 @@ class AutoML(BaseEstimator):
        ```
            seed: int or None, default=None | The random seed for hpo.
-            n_concurrent_trials: [Experimental] int, default=1 | The number of
+            n_concurrent_trials: [In preview] int, default=1 | The number of
                concurrent trials. When n_concurrent_trials > 1, flaml performs
                [parallel tuning](/docs/Use-Cases/Task-Oriented-AutoML#parallel-tuning)
                and installation of ray or spark is required: `pip install flaml[ray]`
@ -1366,7 +1366,7 @@ class AutoML(BaseEstimator):
        ```
            seed: int or None, default=None | The random seed for hpo.
-            n_concurrent_trials: [Experimental] int, default=1 | The number of
+            n_concurrent_trials: [In preview] int, default=1 | The number of
                concurrent trials. When n_concurrent_trials > 1, flaml performs
                [parallel tuning](/docs/Use-Cases/Task-Oriented-AutoML#parallel-tuning)
                and installation of ray or spark is required: `pip install flaml[ray]`
--- a/flaml/onlineml/trial.py
+++ b/flaml/onlineml/trial.py
@ -76,7 +76,7 @@ class OnlineResult:
            init_cb: a float to specify the initial confidence bound.
            mode: A string in ['min', 'max'] to specify the objective as
                minimization or maximization.
-            sliding_window_size: An int to specify the size of the sliding windown
+            sliding_window_size: An int to specify the size of the sliding window
                (for experimental purpose).
        """
        self._result_type_name = result_type_name  # for example 'mse' or 'mae'
--- a/flaml/version.py
+++ b/flaml/version.py
@ -1 +1 @@
-__version__ = "2.0.0rc5"
+__version__ = "2.0.0"
--- a/notebook/autogen_agentchat_MathChat.ipynb
+++ b/notebook/autogen_agentchat_MathChat.ipynb
@ -15,7 +15,9 @@
   "source": [
    "# Auto Generated Agent Chat: Using MathChat to Solve Math Problems\n",
    "\n",
-    "MathChat is a convesational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. The original implementation and exeperiments of MathChat are in this [branch](https://github.com/kevin666aa/FLAML/tree/gpt_math_solver/flaml/autogen/math), and you can find more details in our paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337).\n",
+    "`flaml.autogen` offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
+    "MathChat is an experimental conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://microsoft.github.io/FLAML/blog/2023/06/28/MathChat).\n",
    "\n",
    "## Requirements\n",
    "\n",
@ -27,11 +29,11 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
-    "# %pip install flaml[mathchat]~=2.0.0rc4"
+    "# %pip install flaml[mathchat]~=2.0.0"
   ]
  },
  {
@ -46,7 +48,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
@ -60,11 +62,13 @@
    "            \"gpt4\",\n",
    "            \"gpt-4-32k\",\n",
    "            \"gpt-4-32k-0314\",\n",
    "            \"gpt-4-32k-v0314\",\n",
    "            \"gpt-3.5-turbo\",\n",
    "            \"gpt-3.5-turbo-16k\",\n",
    "            \"gpt-3.5-turbo-0301\",\n",
    "            \"chatgpt-35-turbo-0301\",\n",
    "            \"gpt-35-turbo-v0301\",\n",
    "            \"gpt\",\n",
    "        }\n",
    "    }\n",
    ")"
@ -75,7 +79,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well). Only the gpt-4 and gpt-3.5-turbo models are kept in the list based on the filter condition.\n",
+    "It first looks for environment variable \"OAI_CONFIG_LIST\" which needs to be a valid json string. If that variable is not found, it then looks for a json file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well).\n",
    "\n",
    "The config list looks like the following:\n",
    "```python\n",
@ -118,7 +122,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
@ -167,112 +171,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
+   "outputs": [],
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mathproxyagent (to assistant):\n",
      "\n",
      "Let's use Python to solve a math problem.\n",
      "\n",
      "Query requirements:\n",
      "You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
      "You can use packages like sympy to help you.\n",
      "You must follow the formats below to write your code:\n",
      "```python\n",
      "# your code\n",
      "```\n",
      "\n",
      "First state the key idea to solve the problem. You may choose from three ways to solve the problem:\n",
      "Case 1: If the problem can be solved with Python code directly, please write a program to solve it. You can enumerate all possible arrangements if needed.\n",
      "Case 2: If the problem is mostly reasoning, you can solve it by yourself directly.\n",
      "Case 3: If the problem cannot be handled in the above two ways, please follow this process:\n",
      "1. Solve the problem step by step (do not over-divide the steps).\n",
      "2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
      "3. Wait for me to give the results.\n",
      "4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
      "\n",
      "After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
      "\n",
      "Problem:\n",
      "Find all $x$ that satisfy the inequality $(2x+10)(x+3)<(3x+9)(x+8)$. Express your answer in interval notation.\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "assistant (to mathproxyagent):\n",
      "\n",
      "This problem can be solved by first simplifying the inequality, finding the critical points, and then testing points from each interval defined by the critical points to find where the inequality holds true. We can use Python with the sympy package for these calculations. Here is how:\n",
      "\n",
      "Case 1: Solving with Python directly\n",
      "\n",
      "We will solve this problem in the following steps:\n",
      "1. First, we simplify the inequality by expanding both sides and bringing all terms to one side.\n",
      "2. Second, we find the critical points by solving the simplified equation.\n",
      "3. Third, we test the sign of the simplified function with a number in each interval defined by the critical points.\n",
      "4. Finally, we collect all the intervals where the inequality is satisfied.\n",
      "\n",
      "Here is the Python code to accomplish these:\n",
      "\n",
      "```python\n",
      "from sympy import symbols, Eq, solve, simplify\n",
      "\n",
      "# Step 1: Simplify the inequality\n",
      "x = symbols('x')\n",
      "expr1 = simplify((2 * x + 10) * (x + 3))\n",
      "expr2 = simplify((3 * x + 9) * (x + 8))\n",
      "equation = simplify(expr1 - expr2)\n",
      "\n",
      "# Step 2: Find the critical points\n",
      "critical_points = sorted(solve(Eq(equation, 0)))\n",
      "\n",
      "# Step 3 and 4: Test the sign of the simplified function for each interval\n",
      "\n",
      "# First, let's check for x in (-oo, first critical point)\n",
      "test_point = critical_points[0] - 1\n",
      "if equation.subs(x, test_point) < 0:\n",
      "    print(\"The inequality holds for x in (-oo, \" + str(critical_points[0]) + \")\")\n",
      "\n",
      "# Second, let's check for x in each (previous critical point, next critical point)\n",
      "for i in range(len(critical_points) - 1):\n",
      "    test_point = (critical_points[i] + critical_points[i + 1]) / 2\n",
      "    if equation.subs(x, test_point) < 0:\n",
      "        print(\"The inequality holds for x in (\" + str(critical_points[i]) + \", \" + str(critical_points[i + 1]) + \")\")\n",
      "\n",
      "# Third, let's check for x in (last critical point, oo)\n",
      "test_point = critical_points[-1] + 1\n",
      "if equation.subs(x, test_point) < 0:\n",
      "    print(\"The inequality holds for x in (\" + str(critical_points[-1]) + \", oo)\")\n",
      "\n",
      "# The intervals output in the print statements represent the solution to the inequality in the problem.\n",
      "```\n",
      "\n",
      "After running the above code, you will find the exact interval(s) that satisfy the inequality $(2x+10)(x+3)<(3x+9)(x+8)$.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "The inequality holds for x in (-oo, -14)\n",
      "The inequality holds for x in (-3, oo)\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Great! So the solution to the inequality $(2x+10)(x+3)<(3x+9)(x+8)$ is given by the union of the two intervals where the inequality holds true. In interval notation, we can express the solution as:\n",
      "\n",
      "$$\\boxed{x \\in (-\\infty, -14) \\cup (-3, \\infty)}$$\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# given a math problem, we use the mathproxyagent to generate a prompt to be sent to the assistant as the initial message.\n",
    "# the assistant receives the message and generates a response. The response will be sent back to the mathproxyagent for processing.\n",
@ -297,127 +198,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
+   "outputs": [],
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mathproxyagent (to assistant):\n",
      "\n",
      "Let's use Python to solve a math problem.\n",
      "\n",
      "Query requirements:\n",
      "You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
      "You can use packages like sympy to help you.\n",
      "You must follow the formats below to write your code:\n",
      "```python\n",
      "# your code\n",
      "```\n",
      "\n",
      "First state the key idea to solve the problem. You may choose from three ways to solve the problem:\n",
      "Case 1: If the problem can be solved with Python code directly, please write a program to solve it. You can enumerate all possible arrangements if needed.\n",
      "Case 2: If the problem is mostly reasoning, you can solve it by yourself directly.\n",
      "Case 3: If the problem cannot be handled in the above two ways, please follow this process:\n",
      "1. Solve the problem step by step (do not over-divide the steps).\n",
      "2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
      "3. Wait for me to give the results.\n",
      "4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
      "\n",
      "After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
      "\n",
      "Problem:\n",
      "For what negative value of $k$ is there exactly one solution to the system of equations \\begin{align*}\n",
      "y &= 2x^2 + kx + 6 \\\\\n",
      "y &= -x + 4?\n",
      "\\end{align*}\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "The key idea to solve this problem is to set the two equations equal to each other and solve for x. Then we substitute x back into one of the equations to get the quadratic equation in terms of y. We set the discriminant of this quadratic equation equals to zero, because when discriminant (D = b² - 4ac) equals to zero in a quadratic equation, the equation has exactly one root. Solve the equation for k.\n",
      "\n",
      "```python\n",
      "from sympy import symbols, Eq, solve\n",
      "\n",
      "x, k = symbols('x k')\n",
      "\n",
      "# The two Y's are equal, so we set the two equations equal to each other\n",
      "eq1 = Eq(2*x**2 + k*x + 6, -x + 4)\n",
      "solve_for_x = solve(eq1, x)\n",
      "\n",
      "# Insert x solutions back to the original equation\n",
      "eq2 = Eq(2*x + k, -x + 4)\n",
      "result = []\n",
      "\n",
      "for solution in solve_for_x:\n",
      "    temp = eq2.subs(x, solution)\n",
      "    sol = solve(temp, k)\n",
      "    if sol != []:\n",
      "        result.append(sol)\n",
      "\n",
      "# The solution should be negative\n",
      "negative_k = [s for s in result if s<0]\n",
      "\n",
      "print(negative_k)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "Error: Traceback (most recent call last):\n",
      "      negative_k = [s for s in result if s<0]\n",
      "      negative_k = [s for s in result if s<0]\n",
      "TypeError: '<' not supported between instances of 'list' and 'int'\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Apologies for the oversight in the code. It seems I made a mistake in checking for the negative value of k. Let's modify the code to fix the issue.\n",
      "\n",
      "```python\n",
      "from sympy import symbols, Eq, solve\n",
      "\n",
      "x, k = symbols('x k')\n",
      "\n",
      "# The two Y's are equal, so we set the two equations equal to each other\n",
      "eq1 = Eq(2*x**2 + k*x + 6, -x + 4)\n",
      "solve_for_x = solve(eq1, x)\n",
      "\n",
      "# Insert x solutions back to the original equation\n",
      "eq2 = Eq(2*x + k, -x + 4)\n",
      "result = []\n",
      "\n",
      "for solution in solve_for_x:\n",
      "    temp = eq2.subs(x, solution)\n",
      "    sol = solve(temp, k)\n",
      "    if sol != []:\n",
      "        result.extend(sol)\n",
      "\n",
      "# The solution should be negative\n",
      "negative_k = [s for s in result if s<0]\n",
      "\n",
      "print(negative_k)\n",
      "```\n",
      "This code should now properly identify the negative value of k for which there is exactly one solution to the system of equations.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "[-3*sqrt(33)/2 - 7/2]\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Great! Now we have the correct negative value of k for which there is exactly one solution to the system of equations. Therefore, the answer is:\n",
      "\n",
      "$$k = \\boxed{-\\frac{3\\sqrt{33}}{2}-\\frac{7}{2}}$$\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "math_problem = \"For what negative value of $k$ is there exactly one solution to the system of equations \\\\begin{align*}\\ny &= 2x^2 + kx + 6 \\\\\\\\\\ny &= -x + 4?\\n\\\\end{align*}\"\n",
    "mathproxyagent.initiate_chat(assistant, problem=math_problem)"
@ -436,109 +219,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
+   "outputs": [],
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mathproxyagent (to assistant):\n",
      "\n",
      "Let's use Python to solve a math problem.\n",
      "\n",
      "Query requirements:\n",
      "You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
      "You can use packages like sympy to help you.\n",
      "You must follow the formats below to write your code:\n",
      "```python\n",
      "# your code\n",
      "```\n",
      "\n",
      "First state the key idea to solve the problem. You may choose from three ways to solve the problem:\n",
      "Case 1: If the problem can be solved with Python code directly, please write a program to solve it. You can enumerate all possible arrangements if needed.\n",
      "Case 2: If the problem is mostly reasoning, you can solve it by yourself directly.\n",
      "Case 3: If the problem cannot be handled in the above two ways, please follow this process:\n",
      "1. Solve the problem step by step (do not over-divide the steps).\n",
      "2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
      "3. Wait for me to give the results.\n",
      "4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
      "\n",
      "After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
      "\n",
      "Problem:\n",
      "Find all positive integer values of $c$ such that the equation $x^2-7x+c=0$ only has roots that are real and rational. Express them in decreasing order, separated by commas.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "To ensure the roots of the quadratic equation $x^2 - 7x + c = 0$ are real and rational, we will use two key ideas:\n",
      "\n",
      "1. The discriminant of the quadratic equation, $\\Delta = b^2 - 4ac$, must be non-negative so that the roots are real. \n",
      "2. The discriminant must be a perfect square so that the roots are rational.\n",
      "\n",
      "We can find the values of $c$ using Python. First, we'll calculate the discriminant and check the conditions for each value of c. We only need to consider positive integers for $c$ until we reach a value where the discriminant becomes negative since after that point, there will be no real roots.\n",
      "\n",
      "```python\n",
      "from sympy import *\n",
      "\n",
      "c_values = []\n",
      "c = 1\n",
      "\n",
      "while True:\n",
      "    discriminant = Rational(49 - 4 * c)\n",
      "    if discriminant < 0:\n",
      "        break\n",
      "    if int(discriminant.sqrt())**2 == discriminant:\n",
      "        c_values.append(c)\n",
      "    c += 1\n",
      "\n",
      "print(c_values)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "Error: Traceback (most recent call last):\n",
      "      if int(discriminant.sqrt())**2 == discriminant:\n",
      "AttributeError: 'Integer' object has no attribute 'sqrt'\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Apologies for the mistake. I should have used the `sqrt` function from the `sympy` library. Here's the corrected code:\n",
      "\n",
      "```python\n",
      "from sympy import *\n",
      "\n",
      "c_values = []\n",
      "c = 1\n",
      "\n",
      "while True:\n",
      "    discriminant = Rational(49 - 4 * c)\n",
      "    if discriminant < 0:\n",
      "        break\n",
      "    if int(sqrt(discriminant))**2 == discriminant:\n",
      "        c_values.append(c)\n",
      "    c += 1\n",
      "\n",
      "print(c_values)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "[6, 10, 12]\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "The positive integer values of $c$ such that the equation $x^2-7x+c=0$ only has roots that are real and rational are \\boxed{12, 10, 6}.\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "math_problem = \"Find all positive integer values of $c$ such that the equation $x^2-7x+c=0$ only has roots that are real and rational. Express them in decreasing order, separated by commas.\"\n",
    "mathproxyagent.initiate_chat(assistant, problem=math_problem)"
@ -581,157 +264,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
+   "outputs": [],
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mathproxyagent (to assistant):\n",
      "\n",
      "Let's use Python to solve a math problem.\n",
      "\n",
      "Query requirements:\n",
      "You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
      "You can use packages like sympy to help you.\n",
      "You must follow the formats below to write your code:\n",
      "```python\n",
      "# your code\n",
      "```\n",
      "\n",
      "Please follow this process:\n",
      "1. Solve the problem step by step (do not over-divide the steps).\n",
      "2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
      "3. Wait for me to give the results.\n",
      "4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
      "\n",
      "After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
      "\n",
      "Problem: Problem: If $725x + 727y = 1500$ and $729x+ 731y = 1508$, what is the value of $x - y$ ?\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Let's first solve the two equations as a system of linear equations. We can do this by either substitution or elimination method. Let's use the elimination method.\n",
      "\n",
      "We will multiply both equations by necessary multiples such that the coefficients of either x or y will cancel out when we subtract/add the two equations.\n",
      "\n",
      "We'll multiply the first equation by 729 and the second equation by -725. This will allow us to cancel out the x terms when we add the two equations.\n",
      "\n",
      "```python\n",
      "from sympy import symbols, Eq, solve\n",
      "\n",
      "x, y = symbols('x y')\n",
      "equation1 = Eq(725 * x + 727 * y, 1500)\n",
      "equation2 = Eq(729 * x + 731 * y, 1508)\n",
      "\n",
      "# Multiply equations with necessary multiples\n",
      "equation1 = Eq(729 * (725 * x + 727 * y), 729 * 1500)\n",
      "equation2 = Eq(-725 * (729 * x + 731 * y), -725 * 1508)\n",
      "```\n",
      "Now let's simplify the equations and add them up to cancel out the x terms.\n",
      "\n",
      "```python\n",
      "equation1_simplified = Eq(729 * 725 * x + 729 * 727 * y, 729 * 1500)\n",
      "equation2_simplified = Eq(-725 * 729 * x - 725 * 731 * y, -725 * 1508)\n",
      "\n",
      "# Add the two equations to cancel out the x terms\n",
      "combined_equation = Eq(equation1_simplified.lhs + equation2_simplified.lhs, equation1_simplified.rhs + equation2_simplified.rhs)\n",
      "```\n",
      "Now let's solve the combined equation for y.\n",
      "\n",
      "```python\n",
      "y_solution = solve(combined_equation, y)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "No output found. Make sure you print the results.\n",
      "Error: Traceback (most recent call last):\n",
      "      equation1_simplified = Eq(729 * 725 * x + 729 * 727 * y, 729 * 1500)\n",
      "NameError: name 'x' is not defined\n",
      "\n",
      "Error: Traceback (most recent call last):\n",
      "      y_solution = solve(combined_equation, y)\n",
      "NameError: name 'combined_equation' is not defined\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "I apologize for not providing the output. I'll correct the code and print the results for you.\n",
      "\n",
      "```python\n",
      "from sympy import symbols, Eq, solve\n",
      "\n",
      "x, y = symbols('x y')\n",
      "equation1 = Eq(725 * x + 727 * y, 1500)\n",
      "equation2 = Eq(729 * x + 731 * y, 1508)\n",
      "\n",
      "# Multiply equations with necessary multiples\n",
      "equation1 = Eq(729 * (725 * x + 727 * y), 729 * 1500)\n",
      "equation2 = Eq(-725 * (729 * x + 731 * y), -725 * 1508)\n",
      "\n",
      "equation1_simplified = Eq(729 * 725 * x + 729 * 727 * y, 729 * 1500)\n",
      "equation2_simplified = Eq(-725 * 729 * x - 725 * 731 * y, -725 * 1508)\n",
      "\n",
      "# Add the two equations to cancel out the x terms\n",
      "combined_equation = Eq(equation1_simplified.lhs + equation2_simplified.lhs, equation1_simplified.rhs + equation2_simplified.rhs)\n",
      "\n",
      "# Solve the combined equation for y\n",
      "y_solution = solve(combined_equation, y)\n",
      "print(y_solution)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "[25]\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Great, we have found the solution for y: $y = 25$. Now, let's substitute the y value back into one of the original equations to solve for x. We can use the first equation for this.\n",
      "\n",
      "```python\n",
      "# Substitute y in the first equation\n",
      "equation1_substituted = equation1.subs(y, 25)\n",
      "\n",
      "# Solve the equation for x\n",
      "x_solution = solve(equation1_substituted, x)\n",
      "print(x_solution)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "[-23]\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Now we have found the solution for x: $x = -23$. To find the value of $x - y$, we can simply subtract y from x.\n",
      "\n",
      "```python\n",
      "x_minus_y = x_solution[0] - y_solution[0]\n",
      "print(x_minus_y)\n",
      "```\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "-48\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "The value of $x - y$ is $-48$. So the final answer is $\\boxed{-48}$.\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# we set the prompt_type to \"python\", which is a simplied version of the default prompt.\n",
    "math_problem = \"Problem: If $725x + 727y = 1500$ and $729x+ 731y = 1508$, what is the value of $x - y$ ?\"\n",
@ -753,98 +288,9 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
   "metadata": {},
-   "outputs": [
+   "outputs": [],
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mathproxyagent (to assistant):\n",
      "\n",
      "Let's use two tools (Python and Wolfram alpha) to solve a math problem.\n",
      "\n",
      "Query requirements:\n",
      "You must follow the formats below to write your query:\n",
      "For Wolfram Alpha:\n",
      "```wolfram\n",
      "# one wolfram query\n",
      "```\n",
      "For Python:\n",
      "```python\n",
      "# your code\n",
      "```\n",
      "When using Python, you should always use the 'print' function for the output and use fractions/radical forms instead of decimals. You can use packages like sympy to help you.\n",
      "When using wolfram, give one query in each code block.\n",
      "\n",
      "Please follow this process:\n",
      "1. Solve the problem step by step (do not over-divide the steps).\n",
      "2. Take out any queries that can be asked through Python or Wolfram Alpha, select the most suitable tool to be used (for example, any calculations or equations that can be calculated).\n",
      "3. Wait for me to give the results.\n",
      "4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
      "\n",
      "After all the queries are run and you get the answer, put the final answer in \\boxed{}.\n",
      "\n",
      "Problem: Find all numbers $a$ for which the graph of $y=x^2+a$ and the graph of $y=ax$ intersect. Express your answer in interval notation.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "First, let's find the intersection points of the two graphs by setting $y = x^2 + a$ equal to $y = ax$. \n",
      "\n",
      "$x^2 + a = ax$\n",
      "\n",
      "Next, let's rearrange it to a quadratic equation:\n",
      "\n",
      "$x^2 - ax + a = 0$\n",
      "\n",
      "We will find the discriminant, $\\Delta$, using the following formula: \n",
      "$\\Delta = b^2 - 4ac$, where $a = 1$, $b = -a$ and $c = a$.\n",
      "\n",
      "Now, let's calculate the discriminant:\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "Continue. Please keep solving the problem until you need to query. (If you get to the answer, put it in \\boxed{}.)\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "Sure, let's calculate the discriminant:\n",
      "\n",
      "$\\Delta = (-a)^2 - 4(1)(a) = a^2 - 4a$\n",
      "\n",
      "For the quadratic equation to have at least one real solution, the discriminant must be greater than or equal to zero:\n",
      "\n",
      "$a^2 - 4a \\ge 0$\n",
      "\n",
      "Now, let's find the values of $a$ that satisfy this inequality. We can do this by factoring the quadratic:\n",
      "\n",
      "$a(a - 4) \\ge 0$\n",
      "\n",
      "This inequality is true whenever $a \\le 0$ or $a - 4 \\ge 0$. Now let's find the intervals for $a$.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "mathproxyagent (to assistant):\n",
      "\n",
      "Continue. Please keep solving the problem until you need to query. (If you get to the answer, put it in \\boxed{}.)\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "assistant (to mathproxyagent):\n",
      "\n",
      "The inequality $a \\le 0$ represents the interval $(-\\infty, 0]$. \n",
      "\n",
      "The inequality $a - 4 \\ge 0$ can be rewritten as $a \\ge 4$, which represents the interval $[4, \\infty)$. \n",
      "\n",
      "Since we are looking for the values of $a$ where the graphs intersect, we need to consider both intervals. Therefore, the final answer would be the union of the two intervals:\n",
      "\n",
      "\\[\\boxed{(-\\infty, 0] \\cup [4, \\infty)}\\]\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# The wolfram alpha appid is required for this example (the assistant may choose to query Wolfram Alpha).\n",
    "import os\n",
--- a/notebook/autogen_agentchat_RetrieveChat.ipynb
+++ b/notebook/autogen_agentchat_RetrieveChat.ipynb
@ -8,25 +8,28 @@
    "<a id=\"toc\"></a>\n",
    "# Auto Generated Agent Chat: Using RetrieveChat for Retrieve Augmented Code Generation and Question Answering\n",
    "\n",
-    "RetrieveChat is a convesational framework for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `RetrieveAssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially,`RetrieveAssistantAgent` and  `RetrieveUserProxyAgent` implements a different auto reply mechanism corresponding to the RetrieveChat prompts.\n",
+    "`flaml.autogen` offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "RetrieveChat is a convesational system for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `RetrieveAssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveAssistantAgent` and  `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
    "\n",
    "## Table of Contents\n",
    "We'll demonstrates five examples of using RetrieveChat for code generation and question answering:\n",
    "\n",
-    "[Example 1: Generate code based off docstrings w/o human feedbacks](#example-1)\n",
+    "[Example 1: Generate code based off docstrings w/o human feedback](#example-1)\n",
    "\n",
-    "[Example 2: Answer a question based off docstrings w/o human feedbacks](#example-2)\n",
+    "[Example 2: Answer a question based off docstrings w/o human feedback](#example-2)\n",
    "\n",
-    "[Example 3: Generate code based off docstrings w/ human feedbacks](#example-3)\n",
+    "[Example 3: Generate code based off docstrings w/ human feedback](#example-3)\n",
    "\n",
-    "[Example 4: Answer a question based off docstrings w/ human feedbacks](#example-4)\n",
+    "[Example 4: Answer a question based off docstrings w/ human feedback](#example-4)\n",
    "\n",
    "[Example 5: Solve comprehensive QA problems with RetrieveChat's unique feature `Update Context`](#example-5)\n",
    "\n",
    "\n",
    "## Requirements\n",
    "\n",
-    "FLAML requires `Python>=3.8`. To run this notebook example, please install flaml with the [mathchat] option.\n",
+    "FLAML requires `Python>=3.8`. To run this notebook example, please install flaml with the [retrievechat] option.\n",
    "```bash\n",
    "pip install flaml[retrievechat]\n",
    "```"
@ -38,10 +41,11 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# %pip install flaml[retrievechat]~=2.0.0rc5"
+    "# %pip install flaml[retrievechat]~=2.0.0"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -86,6 +90,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -127,7 +132,7 @@
   "source": [
    "## Construct agents for RetrieveChat\n",
    "\n",
-    "We start by initialzing the `RetrieveAssistantAgent` and `RetrieveUserProxyAgent`. The system message needs to be set to \"You are a helpful assistant.\" for RetrieveAssistantAgent. The detailed instructions are given in the user message. Later we will use the `RetrieveUserProxyAgent.generate_init_prompt` to combine the instructions and a math problem for an initial prompt to be sent to the LLM assistant."
+    "We start by initialzing the `RetrieveAssistantAgent` and `RetrieveUserProxyAgent`. The system message needs to be set to \"You are a helpful assistant.\" for RetrieveAssistantAgent. The detailed instructions are given in the user message. Later we will use the `RetrieveUserProxyAgent.generate_init_prompt` to combine the instructions and a retrieval augmented generation task for an initial prompt to be sent to the LLM assistant."
   ]
  },
  {
@ -175,6 +180,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -1076,6 +1082,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -2409,6 +2416,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -3266,6 +3274,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -4431,6 +4440,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
@ -4961,6 +4971,7 @@
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
--- a/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb
+++ b/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb
@ -19,8 +19,8 @@
   "source": [
    "# Auto Generated Agent Chat: Task Solving with Code Generation, Execution & Debugging\n",
    "\n",
-    "FLAML offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code.\n",
+    "`flaml.autogen` offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
-    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to write code and execute the code. Here `AssistantAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for the human user to execute the code written by `AssistantAgent`, or automatically execute the code. Depending on the setting of `human_input_mode` and `max_consecutive_auto_reply`, the `UserProxyAgent` either solicits feedback from the human user or returns auto-feedback based on the result of code execution (success or failure and corresponding outputs) to `AssistantAgent`. `AssistantAgent` will debug the code and suggest new code if the result contains error. The two agents keep communicating to each other until the task is done.\n",
    "\n",
@ -45,7 +45,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen]~=2.0.0rc4"
+    "# %pip install flaml[autogen]~=2.0.0"
   ]
  },
  {
@ -69,7 +69,7 @@
    "config_list = autogen.config_list_from_json(\n",
    "    \"OAI_CONFIG_LIST\",\n",
    "    filter_dict={\n",
-    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\"],\n",
+    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
    "    },\n",
    ")"
   ]
@ -778,7 +778,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.9.17"
+   "version": "3.9.16"
  },
  "vscode": {
   "interpreter": {
--- a/notebook/autogen_agentchat_chess.ipynb
+++ b/notebook/autogen_agentchat_chess.ipynb
--- a/notebook/autogen_agentchat_function_call.ipynb
+++ b/notebook/autogen_agentchat_function_call.ipynb
@ -17,7 +17,7 @@
   "source": [
    "# Auto Generated Agent Chat: Task Solving with Provided Tools as Functions\n",
    "\n",
-    "FLAML offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code. Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "`flaml.autogen` offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to make function calls with the new feature of OpenAI models (in model version 0613). A specified prompt and function configs need to be passed to `AssistantAgent` to initialize the agent. The corresponding functions need to be passed to `UserProxyAgent`, which will be responsible for executing any function calls made by `AssistantAgent`. Besides this requirement of matching descriptions with functions, we recommend checking the system message in the `AssistantAgent` to make sure the instructions align with the function call descriptions.\n",
    "\n",
@ -36,7 +36,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# %pip install flaml[mathchat]~=2.0.0rc4"
+    "# %pip install flaml[mathchat]~=2.0.0"
   ]
  },
  {
--- a/notebook/autogen_agentchat_groupchat.ipynb
+++ b/notebook/autogen_agentchat_groupchat.ipynb
@ -15,7 +15,10 @@
   "source": [
    "# Auto Generated Agent Chat: Group Chat\n",
    "\n",
-    "Modified based on https://github.com/microsoft/FLAML/blob/4ea686af5c3e8ff24d9076a7a626c8b28ab5b1d7/notebook/autogen_multiagent_roleplay_chat.ipynb\n",
+    "`flaml.autogen` offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "This notebook is modified based on https://github.com/microsoft/FLAML/blob/4ea686af5c3e8ff24d9076a7a626c8b28ab5b1d7/notebook/autogen_multiagent_roleplay_chat.ipynb\n",
    "\n",
    "## Requirements\n",
    "\n",
@ -27,12 +30,12 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture --no-stderr\n",
-    "# %pip install flaml[autogen]~=2.0.0rc5"
+    "# %pip install flaml[autogen]~=2.0.0"
   ]
  },
  {
@ -47,7 +50,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
@ -56,7 +59,7 @@
    "config_list_gpt4 = autogen.config_list_from_json(\n",
    "    \"OAI_CONFIG_LIST\",\n",
    "    filter_dict={\n",
-    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\"],\n",
+    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
    "    },\n",
    ")\n",
    "# config_list_gpt35 = autogen.config_list_from_json(\n",
@ -119,7 +122,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
@ -127,6 +130,7 @@
    "human = autogen.UserProxyAgent(\n",
    "   name=\"Human\",\n",
    "   system_message=\"A human admin.\",\n",
    "   code_execution_config={\"last_n_messages\": 2, \"work_dir\": \"groupchat\"},\n",
    ")\n",
    "alice = autogen.AssistantAgent(\n",
    "    name=\"Alice\",\n",
@ -137,7 +141,7 @@
    "    system_message=\"Code reviewer. Prevent code execution if unsafe or not well documented. Suggest changes. Otherwise, approve and return the final code to execute.\",\n",
    "    llm_config=llm_config,\n",
    ")\n",
-    "groupchat = autogen.GroupChat(agents=[human, alice, bob], messages=[], max_round=4)\n",
+    "groupchat = autogen.GroupChat(agents=[human, alice, bob], messages=[], max_round=12)\n",
    "manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)"
   ]
  },
@ -151,7 +155,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
@ -163,295 +167,287 @@
      "find a latest paper about generative agents\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mchat_manager\u001b[0m (to Alice):\n",
      "\n",
      "find a latest paper about generative agents\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mchat_manager\u001b[0m (to Bob):\n",
      "\n",
      "find a latest paper about generative agents\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mAlice\u001b[0m (to chat_manager):\n",
      "\n",
-      "As an AI, I am unable to browse or search the web, download or read a file directly. But I can provide you with a Python script to scrape Google Scholar for the latest papers on generative agents.\n",
+      "To accomplish this, we can utilize the \"scholarly\" library in Python, which enables us to search Google Scholar for papers. Here's the Python code to achieve this:\n",
      "\n",
      "Make sure that you have the BeautifulSoup and requests libraries installed. If not, you can install them using the pip command:\n",
      "\n",
      "```bash\n",
      "pip install beautifulsoup4 requests\n",
      "```\n",
      "\n",
      "Then you can use this Python script to fetch and print the title of the latest paper:\n",
      "\n",
      "Python code:\n",
      "```python\n",
-      "import requests\n",
+      "# filename: googlescholar_search.py\n",
      "from bs4 import BeautifulSoup\n",
      "\n",
-      "# Send HTTP request to Google Scholar with the query \"generative agents\"\n",
+      "import scholarly\n",
      "res = requests.get('https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=generative+agents&btnG=')\n",
      "\n",
-      "# Parse the HTML content of the page\n",
+      "def get_latest_paper(query):\n",
-      "soup = BeautifulSoup(res.text, 'html.parser')\n",
+      "    search_query = scholarly.search_pubs(query)\n",
      "    paper = next(search_query)\n",
      "    print(\"The latest paper is:\", paper.bib['title'])\n",
      "    print(\"The abstract of the paper is:\", paper.bib['abstract'])\n",
      "    print(\"The year of publication is:\", paper.bib['year'])\n",
      "\n",
-      "# Find the first result (which is the latest) and print its title\n",
+      "get_latest_paper(\"Generative agents\")\n",
      "title = soup.find('h3', {'class': 'gs_rt'}).a.text\n",
      "print(f\"The title of the latest paper about 'generative agents' is:\\n{title}\")\n",
      "```\n",
      "Please note that scraping platforms like Google Scholar may not always yield consistent results and is not always advised as it could violate the terms of service. Please use this code responsibly.\n",
      "\n",
      "If you are affiliated with a university or an organization that gives you access to paid scientific repositories (like IEEE, Springer, Elsevier), it's best to use those platforms as they provide more specific and legal access to scientific papers.\n",
      "\n",
      "Alternatively, databases like PubMed or arXiv.org provide free access to a large number of scientific papers - you might want to check them out for latest research papers on your topic of interest.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mchat_manager\u001b[0m (to Human):\n",
      "\n",
      "As an AI, I am unable to browse or search the web, download or read a file directly. But I can provide you with a Python script to scrape Google Scholar for the latest papers on generative agents.\n",
      "\n",
      "Make sure that you have the BeautifulSoup and requests libraries installed. If not, you can install them using the pip command:\n",
      "\n",
      "```bash\n",
      "pip install beautifulsoup4 requests\n",
      "```\n",
      "\n",
-      "Then you can use this Python script to fetch and print the title of the latest paper:\n",
+      "To execute this script:\n",
      "1. Save the code to a file named googlescholar_search.py\n",
      "2. Run `pip install scholarly` to install the necessary library if you haven't installed it yet.\n",
      "3. Run `python googlescholar_search.py` to execute the script and get the latest paper on generative agents.\n",
      "\n",
-      "Python code:\n",
+      "Please note that Google Scholar doesn't provide a stable API and has rate limit restrictions in place, meaning that if you run this code multiple times in a short period, Google might temporarily block your IP.\n",
      "```python\n",
      "import requests\n",
      "from bs4 import BeautifulSoup\n",
      "\n",
-      "# Send HTTP request to Google Scholar with the query \"generative agents\"\n",
+      "If the rate limit becomes an issue, you might want to consider using a different database or method, such as utilizing APIs from databases like arXiv or Pubmed, or web scraping, in a manner compliant with the website's robots.txt and terms of usage.\n",
      "res = requests.get('https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=generative+agents&btnG=')\n",
      "\n",
      "# Parse the HTML content of the page\n",
      "soup = BeautifulSoup(res.text, 'html.parser')\n",
      "\n",
      "# Find the first result (which is the latest) and print its title\n",
      "title = soup.find('h3', {'class': 'gs_rt'}).a.text\n",
      "print(f\"The title of the latest paper about 'generative agents' is:\\n{title}\")\n",
      "```\n",
      "Please note that scraping platforms like Google Scholar may not always yield consistent results and is not always advised as it could violate the terms of service. Please use this code responsibly.\n",
      "\n",
      "If you are affiliated with a university or an organization that gives you access to paid scientific repositories (like IEEE, Springer, Elsevier), it's best to use those platforms as they provide more specific and legal access to scientific papers.\n",
      "\n",
      "Alternatively, databases like PubMed or arXiv.org provide free access to a large number of scientific papers - you might want to check them out for latest research papers on your topic of interest.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mchat_manager\u001b[0m (to Bob):\n",
      "\n",
      "As an AI, I am unable to browse or search the web, download or read a file directly. But I can provide you with a Python script to scrape Google Scholar for the latest papers on generative agents.\n",
      "\n",
      "Make sure that you have the BeautifulSoup and requests libraries installed. If not, you can install them using the pip command:\n",
      "\n",
      "```bash\n",
      "pip install beautifulsoup4 requests\n",
      "```\n",
      "\n",
      "Then you can use this Python script to fetch and print the title of the latest paper:\n",
      "\n",
      "Python code:\n",
      "```python\n",
      "import requests\n",
      "from bs4 import BeautifulSoup\n",
      "\n",
      "# Send HTTP request to Google Scholar with the query \"generative agents\"\n",
      "res = requests.get('https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=generative+agents&btnG=')\n",
      "\n",
      "# Parse the HTML content of the page\n",
      "soup = BeautifulSoup(res.text, 'html.parser')\n",
      "\n",
      "# Find the first result (which is the latest) and print its title\n",
      "title = soup.find('h3', {'class': 'gs_rt'}).a.text\n",
      "print(f\"The title of the latest paper about 'generative agents' is:\\n{title}\")\n",
      "```\n",
      "Please note that scraping platforms like Google Scholar may not always yield consistent results and is not always advised as it could violate the terms of service. Please use this code responsibly.\n",
      "\n",
      "If you are affiliated with a university or an organization that gives you access to paid scientific repositories (like IEEE, Springer, Elsevier), it's best to use those platforms as they provide more specific and legal access to scientific papers.\n",
      "\n",
      "Alternatively, databases like PubMed or arXiv.org provide free access to a large number of scientific papers - you might want to check them out for latest research papers on your topic of interest.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mBob\u001b[0m (to chat_manager):\n",
      "\n",
-      "Your code as it stands can throw an exception and result in an error if the HTTP request fails or if no search results are found. Also, the use of 'beautifulsoup4' and 'requests' should be well-documented.\n",
+      "The provided code seems fine, however, according to Google's Terms of Service API users are restricted from programmatically sending requests to Google Scholar. Even being an unofficial API, it doesn't make using scholarly legal as per the use policy. It's important you consider these limitations when handling this tool and any consequences that may arise on its usage. \n",
      "\n",
-      "Here is the more secure and documented code:\n",
+      "Remember to use APIs responsibly and always in accordance with their terms of service. Without explicit permission from Google, using such a tool can get your IP banned. \n",
      "\n",
-      "```python\n",
+      "I will suggest to use APIs from databases like arXiv or Pubmed, or webscraping, in a manner that is compliant with the website's robots.txt-file and terms of usage.\n",
      "import requests\n",
      "from bs4 import BeautifulSoup\n",
      "\n",
      "# Function that uses requests.get to fetch an URL's content\n",
      "def get_url_content(url):\n",
      "    try:\n",
      "        response = requests.get(url)\n",
      "        response.raise_for_status()\n",
      "        return response.text\n",
      "    except (requests.RequestException, ValueError) as error:\n",
      "        print(f'Google scholar cannot be accessed because of: {error}')\n",
      "        return None\n",
      "\n",
      "# Function to find the title of the latest paper about \"generative agents\"\n",
      "def find_latest_paper(url):\n",
      "    html = get_url_content(url)\n",
      "    if html:\n",
      "        # Parse the HTML content of the page\n",
      "        soup = BeautifulSoup(html, 'html.parser')\n",
      "        # Find the first result (which is the latest one)\n",
      "        result = soup.find('h3', {'class': 'gs_rt'})\n",
      "        \n",
      "        # If result found, print its title; Otherwise, print paper not found\n",
      "        if result:\n",
      "            title = result.a.text\n",
      "            print(f\"The title of the latest paper about 'generative agents' is:\\n{title}\")\n",
      "        else:\n",
      "            print(\"No papers about 'generative agents' found.\")\n",
      "    else:\n",
      "        print(\"No internet or Google scholar is down.\")\n",
      "\n",
      "# URL of Google scholar with a search query \"generative agents\"\n",
      "google_scholar_url = 'https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=generative+agents&btnG='\n",
      "\n",
      "find_latest_paper(google_scholar_url)\n",
      "```\n",
      "\n",
      "Always use this script carefully because web-scraping isn't always reliable or legal on all web pages. Always ensure you have express permission or that the website's terms and conditions don't forbid this kind of usage.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
-      "\u001b[33mchat_manager\u001b[0m (to Human):\n",
+      "\u001b[33mAlice\u001b[0m (to chat_manager):\n",
      "\n",
-      "Your code as it stands can throw an exception and result in an error if the HTTP request fails or if no search results are found. Also, the use of 'beautifulsoup4' and 'requests' should be well-documented.\n",
+      "I apologize for the oversight. You're correct, direct scraping of Google Scholar violates Google's terms of service. Let's change to use the arXiv API which doesn't have this issue and is more reliable. Here's the python code:\n",
      "\n",
      "Here is the more secure and documented code:\n",
      "\n",
      "```python\n",
-      "import requests\n",
+      "# filename: arxiv_search.py\n",
-      "from bs4 import BeautifulSoup\n",
+      "import urllib\n",
      "import feedparser\n",
      "\n",
-      "# Function that uses requests.get to fetch an URL's content\n",
+      "def search_arxiv(query: str):\n",
-      "def get_url_content(url):\n",
+      "    base_url = 'http://export.arxiv.org/api/query?'\n",
-      "    try:\n",
+      "    query = {'search_query' : f'ti:{query}', 'start' : 0, 'max_results' : 1, 'sortBy' : 'submittedDate', 'sortOrder' : 'descending'}\n",
-      "        response = requests.get(url)\n",
+      "    url = base_url + urllib.parse.urlencode(query)\n",
-      "        response.raise_for_status()\n",
+      "  \n",
-      "        return response.text\n",
+      "    # connect to arXiv API and get response\n",
-      "    except (requests.RequestException, ValueError) as error:\n",
+      "    response = urllib.request.urlopen(url).read()\n",
      "        print(f'Google scholar cannot be accessed because of: {error}')\n",
      "        return None\n",
      "\n",
-      "# Function to find the title of the latest paper about \"generative agents\"\n",
+      "    # parse the response using feedparser\n",
-      "def find_latest_paper(url):\n",
+      "    feed = feedparser.parse(response)\n",
-      "    html = get_url_content(url)\n",
+      "  \n",
-      "    if html:\n",
+      "    # get the first (and presumably, the most recent) article in the result\n",
-      "        # Parse the HTML content of the page\n",
+      "    entry = feed.entries[0]\n",
      "        soup = BeautifulSoup(html, 'html.parser')\n",
      "        # Find the first result (which is the latest one)\n",
      "        result = soup.find('h3', {'class': 'gs_rt'})\n",
      "        \n",
      "        # If result found, print its title; Otherwise, print paper not found\n",
      "        if result:\n",
      "            title = result.a.text\n",
      "            print(f\"The title of the latest paper about 'generative agents' is:\\n{title}\")\n",
      "        else:\n",
      "            print(\"No papers about 'generative agents' found.\")\n",
      "    else:\n",
      "        print(\"No internet or Google scholar is down.\")\n",
      "\n",
-      "# URL of Google scholar with a search query \"generative agents\"\n",
+      "    # print details of the most recent article\n",
-      "google_scholar_url = 'https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=generative+agents&btnG='\n",
+      "    print('The latest paper on', query['search_query'], 'that I could find is:\\n')\n",
      "    print('Title: ', entry.title)\n",
      "    print('Author: ', entry.author)\n",
      "    print('Link: ', entry.link)\n",
      "    print('\\nAbstract: ', entry.summary)\n",
      "\n",
-      "find_latest_paper(google_scholar_url)\n",
+      "# search for the latest paper about \"generative agents\"\n",
      "search_arxiv(\"generative agents\")\n",
      "```\n",
      "\n",
-      "Always use this script carefully because web-scraping isn't always reliable or legal on all web pages. Always ensure you have express permission or that the website's terms and conditions don't forbid this kind of usage.\n",
+      "To execute this script:\n",
      "1. Save the code to a file named arxiv_search.py\n",
      "2. Run `pip install feedparser` to install the necessary library.\n",
      "3. Run `python arxiv_search.py` to execute the script and get the latest paper on generative agents.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
-      "\u001b[33mchat_manager\u001b[0m (to Alice):\n",
+      "\u001b[33mBob\u001b[0m (to chat_manager):\n",
      "\n",
-      "Your code as it stands can throw an exception and result in an error if the HTTP request fails or if no search results are found. Also, the use of 'beautifulsoup4' and 'requests' should be well-documented.\n",
+      "The provided code snippet is clear, efficient, and well-documented. It appropriately uses the arXiv API to retrieve the most recent papers about \"generative agents\". The search terms are correctly URI-encoded and passed to the arXiv query API, and proper error handling is in place.\n",
      "\n",
-      "Here is the more secure and documented code:\n",
+      "However, you should ensure that you handle potential exceptions that may occur when trying to connect to the URL and parse the response. For example, if the internet is disconnected or something is wrong with the server, `urllib.request.urlopen()` will raise a `URLError`. If the returned content is not properly formatted, `feedparser.parse()` may also fail. You should catch and properly handle these exceptions.\n",
      "\n",
      "Therefore, I would add these modifications:\n",
      "\n",
      "```python\n",
-      "import requests\n",
+      "from urllib.error import URLError\n",
      "from bs4 import BeautifulSoup\n",
      "\n",
-      "# Function that uses requests.get to fetch an URL's content\n",
+      "try:\n",
-      "def get_url_content(url):\n",
+      "    # connect to arXiv API and get response\n",
      "    response = urllib.request.urlopen(url).read()\n",
      "\n",
      "except URLError as e:\n",
      "    print(\"There was a problem connecting to the arXiv API:\")\n",
      "    print(e.reason)\n",
      "\n",
      "else:\n",
      "    try:\n",
-      "        response = requests.get(url)\n",
+      "        # parse the response using feedparser\n",
-      "        response.raise_for_status()\n",
+      "        feed = feedparser.parse(response)\n",
-      "        return response.text\n",
+      "      \n",
-      "    except (requests.RequestException, ValueError) as error:\n",
+      "        # get the first (and presumably, the most recent) article in the result\n",
-      "        print(f'Google scholar cannot be accessed because of: {error}')\n",
+      "        entry = feed.entries[0]\n",
-      "        return None\n",
+      "\n",
      "    except Exception as e:\n",
      "        print(\"There was a problem parsing the result:\")\n",
      "        print(e)\n",
      "\n",
      "# Function to find the title of the latest paper about \"generative agents\"\n",
      "def find_latest_paper(url):\n",
      "    html = get_url_content(url)\n",
      "    if html:\n",
      "        # Parse the HTML content of the page\n",
      "        soup = BeautifulSoup(html, 'html.parser')\n",
      "        # Find the first result (which is the latest one)\n",
      "        result = soup.find('h3', {'class': 'gs_rt'})\n",
      "        \n",
      "        # If result found, print its title; Otherwise, print paper not found\n",
      "        if result:\n",
      "            title = result.a.text\n",
      "            print(f\"The title of the latest paper about 'generative agents' is:\\n{title}\")\n",
      "        else:\n",
      "            print(\"No papers about 'generative agents' found.\")\n",
      "    else:\n",
-      "        print(\"No internet or Google scholar is down.\")\n",
+      "        # print details of the most recent article\n",
-      "\n",
+      "        print('The latest paper on', query['search_query'], 'that I could find is:\\n')\n",
-      "# URL of Google scholar with a search query \"generative agents\"\n",
+      "        print('Title: ', entry.title)\n",
-      "google_scholar_url = 'https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=generative+agents&btnG='\n",
+      "        print('Author: ', entry.author)\n",
-      "\n",
+      "        print('Link: ', entry.link)\n",
-      "find_latest_paper(google_scholar_url)\n",
+      "        print('\\nAbstract: ', entry.summary)\n",
      "```\n",
      "\n",
-      "Always use this script carefully because web-scraping isn't always reliable or legal on all web pages. Always ensure you have express permission or that the website's terms and conditions don't forbid this kind of usage.\n",
+      "The keyword `except` is used to catch and handle exceptions. The suggested modifications include exception handlers for `URLError` (which is raised if there was a problem connecting to the arXiv API) and a generic `Exception` (which could be any other exception raised while parsing the response). The `else` keyword allows us to group together the normal operation code, separating it from the error handling code.\n",
      "\n",
-      "--------------------------------------------------------------------------------\n"
+      "The code is ready to be executed now.\n",
-     ]
+      "\n",
-    },
+      "--------------------------------------------------------------------------------\n",
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[31m\n",
      ">>>>>>>> NO HUMAN INPUT RECEIVED.\u001b[0m\n",
      "\u001b[31m\n",
      ">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
      "\u001b[31m\n",
      ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
      "\u001b[33mHuman\u001b[0m (to chat_manager):\n",
      "\n",
-      "exitcode: 0 (execution succeeded)\n",
+      "exitcode: 1 (execution failed)\n",
      "Code output: \n",
-      "The title of the latest paper about 'generative agents' is:\n",
+      "Traceback (most recent call last):\n",
-      "Generative agents for player decision modeling in games\n",
+      "  File \"\", line 5, in <module>\n",
      "    response = urllib.request.urlopen(url).read()\n",
      "NameError: name 'urllib' is not defined\n",
      "\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
-      "\u001b[33mchat_manager\u001b[0m (to Alice):\n",
+      "\u001b[33mAlice\u001b[0m (to chat_manager):\n",
      "\n",
-      "exitcode: 0 (execution succeeded)\n",
+      "Apologies for the oversight. It looks like I missed importing the required `urllib.request` module. Please use the following updated code with the necessary import statement:\n",
      "\n",
      "```python\n",
      "# filename: arxiv_search.py\n",
      "import urllib.request\n",
      "import urllib.parse\n",
      "import feedparser\n",
      "from urllib.error import URLError\n",
      "\n",
      "def search_arxiv(query: str):\n",
      "    base_url = 'http://export.arxiv.org/api/query?'\n",
      "    query = {'search_query' : f'ti:{query}', 'start' : 0, 'max_results' : 1, 'sortBy' : 'submittedDate', 'sortOrder' : 'descending'}\n",
      "    url = base_url + urllib.parse.urlencode(query)\n",
      "  \n",
      "    try:\n",
      "        # connect to arXiv API and get response\n",
      "        response = urllib.request.urlopen(url).read()\n",
      "\n",
      "    except URLError as e:\n",
      "        print(\"There was a problem connecting to the arXiv API:\")\n",
      "        print(e.reason)\n",
      "\n",
      "    else:\n",
      "        try:\n",
      "            # parse the response using feedparser\n",
      "            feed = feedparser.parse(response)\n",
      "          \n",
      "            # get the first (and presumably, the most recent) article in the result\n",
      "            entry = feed.entries[0]\n",
      "\n",
      "        except Exception as e:\n",
      "            print(\"There was a problem parsing the result:\")\n",
      "            print(e)\n",
      "\n",
      "        else:\n",
      "            # print details of the most recent article\n",
      "            print('The latest paper on', query['search_query'], 'that I could find is:\\n')\n",
      "            print('Title: ', entry.title)\n",
      "            print('Author: ', entry.author)\n",
      "            print('Link: ', entry.link)\n",
      "            print('\\nAbstract: ', entry.summary)\n",
      "\n",
      "# search for the latest paper about \"generative agents\"\n",
      "search_arxiv(\"generative agents\")\n",
      "```\n",
      "\n",
      "To execute this script:\n",
      "1. Save the code to a file named arxiv_search.py\n",
      "2. Run `pip install feedparser` to install the necessary library.\n",
      "3. Run `python arxiv_search.py` to execute the script and get the latest paper on generative agents.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mBob\u001b[0m (to chat_manager):\n",
      "\n",
      "The amendment is correctly appended. You have properly imported the required `urllib.request` module. The code is now ready for execution. It correctly searches for the latest paper about \"generative agents\" using the arXiv API, and handles potential exceptions that may arise during the connection to the API or while parsing the response. Please proceed to execute this code.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[31m\n",
      ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
      "\u001b[33mHuman\u001b[0m (to chat_manager):\n",
      "\n",
      "exitcode: 1 (execution failed)\n",
      "Code output: \n",
-      "The title of the latest paper about 'generative agents' is:\n",
+      "Traceback (most recent call last):\n",
-      "Generative agents for player decision modeling in games\n",
+      "  File \"arxiv_search.py\", line 4, in <module>\n",
      "    import feedparser\n",
      "ModuleNotFoundError: No module named 'feedparser'\n",
      "\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
-      "\u001b[33mchat_manager\u001b[0m (to Bob):\n",
+      "\u001b[33mAlice\u001b[0m (to chat_manager):\n",
      "\n",
      "My apologies for the inconvenience. It appears that the 'feedparser' module is not installed. Please install the 'feedparser' module by running the following command:\n",
      "\n",
      "```sh\n",
      "pip install feedparser\n",
      "```\n",
      "\n",
      "Once you have installed the 'feedparser' module, please execute the arxiv_search.py script again:\n",
      "\n",
      "```sh\n",
      "python arxiv_search.py\n",
      "```\n",
      "\n",
      "This should execute the script and fetch the latest paper on generative agents.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[33mBob\u001b[0m (to chat_manager):\n",
      "\n",
      "That's correct. Make sure to install the 'feedparser' module using the provided command, and then you should be able to execute the updated arxiv_search.py script successfully. The script will search for the latest paper about \"generative agents\" using the arXiv API, and return information about the most recent article it finds.\n",
      "\n",
      "--------------------------------------------------------------------------------\n",
      "\u001b[31m\n",
      ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is sh)...\u001b[0m\n",
      "\u001b[31m\n",
      ">>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is sh)...\u001b[0m\n",
      "\u001b[33mHuman\u001b[0m (to chat_manager):\n",
      "\n",
      "exitcode: 0 (execution succeeded)\n",
      "Code output: \n",
-      "The title of the latest paper about 'generative agents' is:\n",
+      "Defaulting to user installation because normal site-packages is not writeable\n",
-      "Generative agents for player decision modeling in games\n",
+      "Collecting feedparser\n",
      "  Downloading feedparser-6.0.10-py3-none-any.whl (81 kB)\n",
      "     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.1/81.1 KB 17.2 MB/s eta 0:00:00\n",
      "Collecting sgmllib3k\n",
      "  Downloading sgmllib3k-1.0.0.tar.gz (5.8 kB)\n",
      "  Preparing metadata (setup.py): started\n",
      "  Preparing metadata (setup.py): finished with status 'done'\n",
      "Building wheels for collected packages: sgmllib3k\n",
      "  Building wheel for sgmllib3k (setup.py): started\n",
      "  Building wheel for sgmllib3k (setup.py): finished with status 'done'\n",
      "  Created wheel for sgmllib3k: filename=sgmllib3k-1.0.0-py3-none-any.whl size=6046 sha256=867dc31954f27685ad79808f2ca2b5d8235496de750c61f110c200ba664a50e4\n",
      "  Stored in directory: /home/vscode/.cache/pip/wheels/65/7a/a7/78c287f64e401255dff4c13fdbc672fed5efbfd21c530114e1\n",
      "Successfully built sgmllib3k\n",
      "Installing collected packages: sgmllib3k, feedparser\n",
      "Successfully installed feedparser-6.0.10 sgmllib3k-1.0.0\n",
      "\n",
      "The latest paper on ti:generative agents that I could find is:\n",
      "\n",
      "Title:  LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image\n",
      "  Generation\n",
      "Author:  Tat-Seng Chua\n",
      "Link:  http://arxiv.org/abs/2308.05095v1\n",
      "\n",
      "Abstract:  In the text-to-image generation field, recent remarkable progress in Stable\n",
      "Diffusion makes it possible to generate rich kinds of novel photorealistic\n",
      "images. However, current models still face misalignment issues (e.g.,\n",
      "problematic spatial relation understanding and numeration failure) in complex\n",
      "natural scenes, which impedes the high-faithfulness text-to-image generation.\n",
      "Although recent efforts have been made to improve controllability by giving\n",
      "fine-grained guidance (e.g., sketch and scribbles), this issue has not been\n",
      "fundamentally tackled since users have to provide such guidance information\n",
      "manually. In this work, we strive to synthesize high-fidelity images that are\n",
      "semantically aligned with a given textual prompt without any guidance. Toward\n",
      "this end, we propose a coarse-to-fine paradigm to achieve layout planning and\n",
      "image generation. Concretely, we first generate the coarse-grained layout\n",
      "conditioned on a given textual prompt via in-context learning based on Large\n",
      "Language Models. Afterward, we propose a fine-grained object-interaction\n",
      "diffusion method to synthesize high-faithfulness images conditioned on the\n",
      "prompt and the automatically generated layout. Extensive experiments\n",
      "demonstrate that our proposed method outperforms the state-of-the-art models in\n",
      "terms of layout and image generation. Our code and settings are available at\n",
      "\\url{https://layoutllm-t2i.github.io}.\n",
      "\n",
      "\n",
      "--------------------------------------------------------------------------------\n"
--- a/notebook/autogen_agentchat_human_feedback.ipynb
+++ b/notebook/autogen_agentchat_human_feedback.ipynb
@ -19,8 +19,8 @@
   "source": [
    "# Auto Generated Agent Chat: Task Solving with Code Generation, Execution, Debugging & Human Feedback\n",
    "\n",
-    "FLAML offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code.\n",
+    "`flaml.autogen` offers conversable agents powered by LLMs, tools, or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
-    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to solve a challenging math problem with human feedback. Here `AssistantAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. By setting `human_input_mode` properly, the `UserProxyAgent` can also prompt the user for feedback to `AssistantAgent`. For example, when `human_input_mode` is set to \"ALWAYS\", the `UserProxyAgent` will always prompt the user for feedback. When user feedback is provided, the `UserProxyAgent` will directly pass the feedback to `AssistantAgent`. When no user feedback is provided, the `UserProxyAgent` will execute the code written by `AssistantAgent` and return the execution results (success or failure and corresponding outputs) to `AssistantAgent`.\n",
    "\n",
@ -45,7 +45,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen]~=2.0.0rc4"
+    "# %pip install flaml[autogen]~=2.0.0"
   ]
  },
  {
--- a/notebook/autogen_agentchat_planning.ipynb
+++ b/notebook/autogen_agentchat_planning.ipynb
@ -19,7 +19,8 @@
   "source": [
    "# Auto Generated Agent Chat: Collaborative Task Solving with Coding and Planning Agent\n",
    "\n",
-    "FLAML offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code. Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "`flaml.autogen` offers conversable agents powered by LLMs, tools, or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate how to use multiple agents to work together and accomplish a task which requires finding info from the web and coding. `AssistantAgent` is an LLM-based agent that can write and debug Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. We further create a planning agent for the assistant agent to consult. The planning agent is a variation of the LLM-based `AssistantAgent` with a different system message.\n",
    "\n",
@ -44,7 +45,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen]~=2.0.0rc4 docker"
+    "# %pip install flaml[autogen]~=2.0.0 docker"
   ]
  },
  {
@ -74,7 +75,7 @@
    "config_list = autogen.config_list_from_json(\n",
    "    \"OAI_CONFIG_LIST\",\n",
    "    filter_dict={\n",
-    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\"],\n",
+    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
    "    },\n",
    ")"
   ]
--- a/notebook/autogen_agentchat_stream.ipynb
+++ b/notebook/autogen_agentchat_stream.ipynb
@ -19,8 +19,8 @@
   "source": [
    "# Interactive LLM Agent Dealing with Data Stream\n",
    "\n",
-    "`flaml.autogen` offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code.\n",
+    "`flaml.autogen` offers conversable agents powered by LLMs, tools, or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
-    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate how to use customized agents to continuously acquires news from the web and ask for investment suggestions.\n",
    "\n",
@ -45,7 +45,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen]~=2.0.0rc5"
+    "# %pip install flaml[autogen]~=2.0.0"
   ]
  },
  {
@ -244,9 +244,9 @@
    "    default_auto_reply=None,\n",
    ")\n",
    "\n",
-    "async def add_data_reply(recipient, messages, sender, context):\n",
+    "async def add_data_reply(recipient, messages, sender, config):\n",
    "    await asyncio.sleep(0.1)\n",
-    "    data = context[\"news_stream\"]\n",
+    "    data = config[\"news_stream\"]\n",
    "    if data.done():\n",
    "        result = data.result()\n",
    "        if result:\n",
@ -258,7 +258,7 @@
    "            )\n",
    "        return False, None\n",
    "\n",
-    "user_proxy.register_auto_reply(autogen.AssistantAgent, add_data_reply, 1, context={\"news_stream\": data})"
+    "user_proxy.register_auto_reply(autogen.AssistantAgent, add_data_reply, 1, config={\"news_stream\": data})"
   ]
  },
  {
--- a/notebook/autogen_agentchat_two_users.ipynb
+++ b/notebook/autogen_agentchat_two_users.ipynb
@ -19,7 +19,7 @@
   "source": [
    "# Auto Generated Agent Chat: Collaborative Task Solving with Multiple Agents and Human Users\n",
    "\n",
-    "FLAML offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code. Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "`flaml.autogen` offers conversable agents powered by LLMs, tools, or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate an application involving multiple agents and human users to work together and accomplish a task. `AssistantAgent` is an LLM-based agent that can write Python code (in a Python coding block) for a user to execute for a given task. `UserProxyAgent` is an agent which serves as a proxy for a user to execute the code written by `AssistantAgent`. We create multiple `UserProxyAgent` instances which can represent different human users.\n",
    "\n",
@ -44,7 +44,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen]~=2.0.0rc4"
+    "# %pip install flaml[autogen]~=2.0.0"
   ]
  },
  {
@ -74,7 +74,7 @@
    "config_list = autogen.config_list_from_json(\n",
    "    \"OAI_CONFIG_LIST\",\n",
    "    filter_dict={\n",
-    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\"],\n",
+    "        \"model\": [\"gpt-4\", \"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
    "    },\n",
    ")"
   ]
--- a/notebook/autogen_agentchat_web_info.ipynb
+++ b/notebook/autogen_agentchat_web_info.ipynb
@ -19,8 +19,8 @@
   "source": [
    "# Auto Generated Agent Chat: Solving Tasks Requiring Web Info\n",
    "\n",
-    "FLAML offers conversable LLM agents, which can be used to solve various tasks with human or automatic feedback, including tasks that require using tools via code.\n",
+    "`flaml.autogen` offers conversable agents powered by LLMs, tools, or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
-    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation#agents).\n",
+    "Please find documentation about this feature [here](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen#agents).\n",
    "\n",
    "In this notebook, we demonstrate how to use `AssistantAgent` and `UserProxyAgent` to perform tasks which require acquiring info from the web:\n",
    "* discuss a paper based on its URL.\n",
@ -49,7 +49,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen]~=2.0.0rc4 docker"
+    "# %pip install flaml[autogen]~=2.0.0 docker"
   ]
  },
  {
@ -73,7 +73,7 @@
    "config_list = autogen.config_list_from_json(\n",
    "    \"OAI_CONFIG_LIST\",\n",
    "    filter_dict={\n",
-    "        \"model\": [\"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\"],\n",
+    "        \"model\": [\"gpt4\", \"gpt-4-32k\", \"gpt-4-32k-0314\", \"gpt-4-32k-v0314\"],\n",
    "    },\n",
    ")\n",
    "\n",
@ -81,7 +81,6 @@
    "    \"request_timeout\": 600,\n",
    "    \"seed\": 42,\n",
    "    \"config_list\": config_list,\n",
    "    \"model\": \"gpt-4-32k\",  # modify if the endpoint you use doesn't support this model\n",
    "    \"temperature\": 0,\n",
    "}"
   ]
--- a/notebook/autogen_chatgpt_gpt4.ipynb
+++ b/notebook/autogen_chatgpt_gpt4.ipynb
@ -23,7 +23,8 @@
    "\n",
    "# Use FLAML to Tune ChatGPT\n",
    "\n",
-    "FLAML offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. Our study finds that tuning hyperparameters can significantly improve the utility of LLMs.\n",
+    "`flaml.autogen` offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. The study finds that tuning hyperparameters can significantly improve the utility of LLMs.\n",
    "Please find documentation about this feature [here](/docs/Use-Cases/AutoGen#enhanced-inference).\n",
    "\n",
    "In this notebook, we tune OpenAI ChatGPT (both GPT-3.5 and GPT-4) models for math problem solving. We use [the MATH benchmark](https://crfm.stanford.edu/helm/latest/?group=math_chain_of_thought) for measuring mathematical problem solving on competition math problems with chain-of-thoughts style reasoning.\n",
    "\n",
--- a/notebook/autogen_openai_completion.ipynb
+++ b/notebook/autogen_openai_completion.ipynb
@ -23,9 +23,10 @@
    "\n",
    "# Use FLAML to Tune OpenAI Models\n",
    "\n",
-    "FLAML offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. Our study finds that tuning hyperparameters can significantly improve the utility of LLMs.\n",
+    "`flaml.autogen` offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. The research study finds that tuning hyperparameters can significantly improve the utility of LLMs.\n",
    "Please find documentation about this feature [here](/docs/Use-Cases/AutoGen#enhanced-inference).\n",
    "\n",
-    "In this notebook, we tune OpenAI models for code generation. We use [the HumanEval benchmark](https://huggingface.co/datasets/openai_humaneval) released by OpenAI for synthesizing programs from docstrings. \n",
+    "In this notebook, we tune OpenAI models for code generation. We use [the HumanEval benchmark](https://huggingface.co/datasets/openai_humaneval) released by OpenAI for synthesizing programs from docstrings.\n",
    "\n",
    "## Requirements\n",
    "\n",
@ -48,7 +49,7 @@
   },
   "outputs": [],
   "source": [
-    "# %pip install flaml[autogen,blendsearch]~=2.0.0rc4 datasets"
+    "# %pip install flaml[autogen,blendsearch]~=2.0.0 datasets"
   ]
  },
  {
@ -111,6 +112,7 @@
    "            \"gpt-3.5-turbo-0301\",\n",
    "            \"chatgpt-35-turbo-0301\",\n",
    "            \"gpt-35-turbo-v0301\",\n",
    "            \"gpt\",\n",
    "        },\n",
    "    },\n",
    ")\n",
--- a/test/autogen/agentchat/test_assistant_agent.py
+++ b/test/autogen/agentchat/test_assistant_agent.py
@ -72,6 +72,7 @@ def test_gpt35(human_input_mode="NEVER", max_consecutive_auto_reply=5):
                "gpt-3.5-turbo-0301",
                "chatgpt-35-turbo-0301",
                "gpt-35-turbo-v0301",
                "gpt",
            },
        },
    )
@ -162,7 +163,7 @@ def test_tsp(human_input_mode="NEVER", max_consecutive_auto_reply=10):
        OAI_CONFIG_LIST,
        file_location=KEY_LOC,
        filter_dict={
-            "model": ["gpt-4", "gpt4", "gpt-4-32k", "gpt-4-32k-0314"],
+            "model": ["gpt-4", "gpt4", "gpt-4-32k", "gpt-4-32k-0314", "gpt-4-32k-v0314"],
        },
    )
    hard_questions = [
--- a/test/autogen/agentchat/test_async.py
+++ b/test/autogen/agentchat/test_async.py
@ -84,9 +84,9 @@ async def test_stream():
        default_auto_reply=None,
    )
-    async def add_data_reply(recipient, messages, sender, context):
+    async def add_data_reply(recipient, messages, sender, config):
        await asyncio.sleep(0.1)
-        data = context["news_stream"]
+        data = config["news_stream"]
        if data.done():
            result = data.result()
            if result:
@ -98,7 +98,7 @@ async def test_stream():
                )
            return False, None
-    user_proxy.register_auto_reply(autogen.AssistantAgent, add_data_reply, 1, context={"news_stream": data})
+    user_proxy.register_auto_reply(autogen.AssistantAgent, add_data_reply, 1, config={"news_stream": data})
    await user_proxy.a_initiate_chat(
        assistant,
--- a/test/autogen/agentchat/test_groupchat.py
+++ b/test/autogen/agentchat/test_groupchat.py
@ -52,8 +52,8 @@ def test_plugin():
    group_chat_manager.register_auto_reply(
        autogen.Agent,
        reply_func=autogen.GroupChatManager.run_chat,
-        context=groupchat,
+        config=groupchat,
-        reset_context=autogen.GroupChat.reset,
+        reset_config=autogen.GroupChat.reset,
    )
    agent1.initiate_chat(group_chat_manager, message="hello")
--- a/test/autogen/agentchat/test_math_user_proxy_agent.py
+++ b/test/autogen/agentchat/test_math_user_proxy_agent.py
@ -28,7 +28,7 @@ def test_math_user_proxy_agent():
        OAI_CONFIG_LIST,
        file_location=KEY_LOC,
        filter_dict={
-            "model": ["gpt-4", "gpt4", "gpt-4-32k", "gpt-4-32k-0314"],
+            "model": ["gpt-4", "gpt4", "gpt-4-32k", "gpt-4-32k-0314", "gpt-4-32k-v0314"],
        },
    )
    assistant = AssistantAgent(
@ -45,10 +45,11 @@ def test_math_user_proxy_agent():
    assistant.reset()
    math_problem = "$x^3=125$. What is x?"
-    assistant.receive(
+    # assistant.receive(
-        message=mathproxyagent.generate_init_message(math_problem),
+    #     message=mathproxyagent.generate_init_message(math_problem),
-        sender=mathproxyagent,
+    #     sender=mathproxyagent,
-    )
+    # )
    mathproxyagent.initiate_chat(assistant, problem=math_problem)
    print(conversations)
@ -116,7 +117,7 @@ def test_generate_prompt():
 if __name__ == "__main__":
-    test_add_remove_print()
+    # test_add_remove_print()
-    test_execute_one_python_code()
+    # test_execute_one_python_code()
-    test_generate_prompt()
+    # test_generate_prompt()
    test_math_user_proxy_agent()
--- a/test/autogen/agentchat/test_responsive_agent.py
+++ b/test/autogen/agentchat/test_responsive_agent.py
@ -5,38 +5,38 @@ from flaml.autogen.agentchat import ResponsiveAgent
 def test_trigger():
    agent = ResponsiveAgent("a0", max_consecutive_auto_reply=0, llm_config=False, human_input_mode="NEVER")
    agent1 = ResponsiveAgent("a1", max_consecutive_auto_reply=0, human_input_mode="NEVER")
-    agent.register_auto_reply(agent1, lambda recipient, messages, sender, context: (True, "hello"))
+    agent.register_auto_reply(agent1, lambda recipient, messages, sender, config: (True, "hello"))
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello"
-    agent.register_auto_reply("a1", lambda recipient, messages, sender, context: (True, "hello a1"))
+    agent.register_auto_reply("a1", lambda recipient, messages, sender, config: (True, "hello a1"))
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello a1"
    agent.register_auto_reply(
-        ResponsiveAgent, lambda recipient, messages, sender, context: (True, "hello responsive agent")
+        ResponsiveAgent, lambda recipient, messages, sender, config: (True, "hello responsive agent")
    )
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello responsive agent"
    agent.register_auto_reply(
-        lambda sender: sender.name.startswith("a"), lambda recipient, messages, sender, context: (True, "hello a")
+        lambda sender: sender.name.startswith("a"), lambda recipient, messages, sender, config: (True, "hello a")
    )
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello a"
    agent.register_auto_reply(
-        lambda sender: sender.name.startswith("b"), lambda recipient, messages, sender, context: (True, "hello b")
+        lambda sender: sender.name.startswith("b"), lambda recipient, messages, sender, config: (True, "hello b")
    )
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello a"
    agent.register_auto_reply(
-        ["agent2", agent1], lambda recipient, messages, sender, context: (True, "hello agent2 or agent1")
+        ["agent2", agent1], lambda recipient, messages, sender, config: (True, "hello agent2 or agent1")
    )
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello agent2 or agent1"
    agent.register_auto_reply(
-        ["agent2", "agent3"], lambda recipient, messages, sender, context: (True, "hello agent2 or agent3")
+        ["agent2", "agent3"], lambda recipient, messages, sender, config: (True, "hello agent2 or agent3")
    )
    agent1.initiate_chat(agent, message="hi")
    assert agent1.last_message(agent)["content"] == "hello agent2 or agent1"
-    pytest.raises(ValueError, agent.register_auto_reply, 1, lambda recipient, messages, sender, context: (True, "hi"))
+    pytest.raises(ValueError, agent.register_auto_reply, 1, lambda recipient, messages, sender, config: (True, "hi"))
    pytest.raises(ValueError, agent._match_trigger, 1, agent1)
--- a/test/autogen/agentchat/test_retrievechat.py
+++ b/test/autogen/agentchat/test_retrievechat.py
@ -59,7 +59,7 @@ def test_retrievechat():
    assistant.reset()
    code_problem = "How can I use FLAML to perform a classification task, set use_spark=True, train 30 seconds and force cancel jobs if time limit is reached."
-    ragproxyagent.initiate_chat(assistant, problem=code_problem, search_string="spark")
+    ragproxyagent.initiate_chat(assistant, problem=code_problem, search_string="spark", silent=True)
    print(conversations)
--- a/test/autogen/oai/test_completion.py
+++ b/test/autogen/oai/test_completion.py
@ -137,6 +137,7 @@ def test_nocontext():
                    "gpt-3.5-turbo-0301",
                    "chatgpt-35-turbo-0301",
                    "gpt-35-turbo-v0301",
                    "gpt",
                },
            },
        ),
@ -171,6 +172,7 @@ def test_humaneval(num_samples=1):
                "gpt-3.5-turbo-0301",
                "chatgpt-35-turbo-0301",
                "gpt-35-turbo-v0301",
                "gpt",
            },
        },
    )
@ -252,6 +254,7 @@ def test_humaneval(num_samples=1):
        messages=[{"role": "user", "content": "{definition}"}],
        config_list=config_list,
        allow_format_str_template=True,
        request_timeout=120,
    )
    response = autogen.ChatCompletion.create(context=test_data[0], config_list=config_list, **config)
    print(response)
@ -427,10 +430,9 @@ if __name__ == "__main__":
    assert len(config_list) >= 3, config_list
    openai.api_key = os.environ["OPENAI_API_KEY"]
-    # test_filter()
+    test_filter()
    test_chatcompletion()
-    # test_multi_model()
+    test_multi_model()
-    # test_improve()
+    test_nocontext()
-    # test_nocontext()
+    test_humaneval(1)
-    # test_humaneval(1)
+    test_math(1)
    # test_math(1)
--- a/website/blog/2023-04-21-LLM-tuning-math/index.mdx
+++ b/website/blog/2023-04-21-LLM-tuning-math/index.mdx
@ -16,7 +16,7 @@ Large language models (LLMs) are powerful tools that can generate natural langua
 In this blog post, we will explore how model and inference parameters matter in LLM applications, using a case study for [MATH](https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/be83ab3ecd0db773eb2dc1b0a17836a1-Abstract-round2.html), a benchmark for evaluating LLMs on advanced mathematical problem solving. MATH consists of 12K math competition problems from AMC-10, AMC-12 and AIME. Each problem is accompanied by a step-by-step solution.
-We will use the new subpackage [`flaml.autogen`](docs/Use-Cases/Auto-Generation) to automatically find the best model and inference parameter for LLMs on a given task and dataset given an inference budget, using a novel low-cost search & pruning strategy. FLAML currently supports all the LLMs from OpenAI, such as GPT-3.5 and GPT-4.
+We will use the new subpackage [`flaml.autogen`](docs/Use-Cases/Autogen) to automatically find the best model and inference parameter for LLMs on a given task and dataset given an inference budget, using a novel low-cost search & pruning strategy. FLAML currently supports all the LLMs from OpenAI, such as GPT-3.5 and GPT-4.
 We will use FLAML to perform model selection and inference parameter tuning. Then we compare the performance and inference cost on solving algebra problems with the untuned gpt-4. We will also analyze how different difficulty levels affect the results.
@ -69,6 +69,6 @@ The need for model selection, parameter tuning and cost saving is not specific t
 ## For Further Reading
 * [Research paper about the tuning technique](https://arxiv.org/abs/2303.04673)
-* [Documentation about `flaml.autogen`](/docs/Use-Cases/Auto-Generation)
+* [Documentation about `flaml.autogen`](/docs/Use-Cases/Autogen)
 *Do you have any experience to share about LLM applications? Would you like to see more support or research of LLM optimization or automation? Please join our [Discord](https://discord.gg/Cppx2vSPVP) server for discussion.*
--- a/website/blog/2023-05-07-1M-milestone/index.mdx
+++ b/website/blog/2023-05-07-1M-milestone/index.mdx
@ -19,7 +19,7 @@ We'd also like to take the opportunity to reflect on FLAML's past achievements a
 ### Bring AutoML to One's Fingertips
 FLAML offers an off-the-shelf AutoML solution that enables users to quickly discover high-quality models or configurations for common ML/AI tasks. By automatically selecting models and hyperparameters for training or inference, FLAML saves users time and effort. FLAML has significantly reduced development time for developers and data scientists alike, while also providing a convenient way to integrate new algorithms into the pipeline, enabling easy extensions and large-scale parallel tuning. These features make FLAML a valuable tool in R&D efforts for many enterprise users.
-FLAML is capable of handling a variety of common ML tasks, such as [classification](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Classification), [regression](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Regression), [time series forecasting](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Time%20series%20forecast), [NLP tasks](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Rank), and [generative tasks](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation), providing a comprehensive solution for various applications.
+FLAML is capable of handling a variety of common ML tasks, such as [classification](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Classification), [regression](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Regression), [time series forecasting](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Time%20series%20forecast), [NLP tasks](https://microsoft.github.io/FLAML/docs/Examples/AutoML-Rank), and [generative tasks](https://microsoft.github.io/FLAML/docs/Use-Cases/Autogen), providing a comprehensive solution for various applications.
 ### Speed and Efficiency: The FLAML Advantage
 What sets FLAML apart from other AutoML libraries is its exceptional efficiency, thanks to the economical and efficient hyperparameter optimization and model selection methods developed in our [research](https://microsoft.github.io/FLAML/docs/Research). FLAML is also capable of handling large search spaces with heterogeneous evaluation costs, complex constraints, guidance, and early stopping. The [zero-shot AutoML](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML) option further reduces the cost of AutoML, making FLAML an even more attractive solution for a wide range of applications with low resources.
@ -37,7 +37,7 @@ We invite contributions from anyone interested in this topic and look forward to
 ## For Further Reading
-* [Documentation about `flaml.autogen`](/docs/Use-Cases/Auto-Generation)
+* [Documentation about `flaml.autogen`](/docs/Use-Cases/Autogen)
 * [Code Example: Tune chatGPT for Math Problem Solving with FLAML](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
 *Do you have any experience to share about LLM applications? Would you like to see more support or research of LLMOps? Please join our [Discord](https://discord.gg/Cppx2vSPVP) server for discussion.*
--- a/website/blog/2023-05-18-GPT-adaptive-humaneval/index.mdx
+++ b/website/blog/2023-05-18-GPT-adaptive-humaneval/index.mdx
@ -144,7 +144,7 @@ An example notebook to run this experiment can be found at: https://github.com/m
 ## Discussion
-Our solution is quite simple to [implement](/docs/reference/autogen/code_utils#implement) using a generic interface offered in [`flaml.autogen`](/docs/Use-Cases/Auto-Generation#logic-error), yet the result is quite encouraging.
+Our solution is quite simple to [implement](/docs/reference/autogen/code_utils#implement) using a generic interface offered in [`flaml.autogen`](/docs/Use-Cases/Autogen#logic-error), yet the result is quite encouraging.
 While the specific way of generating assertions is application-specific, the main ideas are general in LLM operations:
 * Generate multiple responses to select - especially useful when selecting a good response is relatively easier than generating a good response at one shot.
@ -164,5 +164,5 @@ There are many directions of extensions in research and development:
 ## For Further Reading
-* [Documentation](/docs/Use-Cases/Auto-Generation) about `flaml.autogen` and [Research paper](https://arxiv.org/abs/2303.04673).
+* [Documentation](/docs/Use-Cases/Autogen) about `flaml.autogen` and [Research paper](https://arxiv.org/abs/2303.04673).
 * [Blog post](/blog/2023/04/21/LLM-tuning-math) about a related study for math.
--- a/website/blog/2023-06-28-MathChat/index.mdx
+++ b/website/blog/2023-06-28-MathChat/index.mdx
@ -89,6 +89,6 @@ Further work can be done to enhance this framework or math problem-solving in ge
 ## For Further Reading
 * [Research paper of MathChat](https://arxiv.org/abs/2306.01337)
-* [Documentation about `flaml.autogen`](/docs/Use-Cases/Auto-Generation)
+* [Documentation about `flaml.autogen`](/docs/Use-Cases/Autogen)
 *Are you working on applications that involve math problem-solving? Would you appreciate additional research or support on the application of LLM-based agents for math problem-solving? Please join our [Discord](https://discord.gg/Cppx2vSPVP) server for discussion.*
--- a/website/blog/2023-07-14-Local-LLMs/index.mdx
+++ b/website/blog/2023-07-14-Local-LLMs/index.mdx
@ -143,5 +143,5 @@ print(response)
 ## For Further Reading
-* [Documentation](/docs/Use-Cases/Auto-Generation) about `flaml.autogen`
+* [Documentation](/docs/Use-Cases/Autogen) about `flaml.autogen`
 * [Documentation](https://github.com/lm-sys/FastChat) about FastChat.
--- a/website/docs/Examples/AutoGen-AgentChat.md
+++ b/website/docs/Examples/AutoGen-AgentChat.md
@ -0,0 +1,15 @@
 # AutoGen - Automated Multi Agent Chat
 `flaml.autogen` offers conversable agents powered by LLMs, tools or humans, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation via multi-agent conversation.
 Please find documentation about this feature [here](/docs/Use-Cases/Autogen#agents).
 Links to notebook examples:
 * [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)
 * [Auto Code Generation, Execution, Debugging and Human Feedback](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_human_feedback.ipynb)
 * [Solve Tasks Requiring Web Info](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_web_info.ipynb)
 * [Use Provided Tools as Functions](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_function_call.ipynb)
 * [Automated Task Solving with Coding & Planning Agents](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_planning.ipynb)
 * [Automated Task Solving with GPT-4 + Multiple Human Users](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_two_users.ipynb)
 * [Automated Chess Game Playing & Chitchatting by GPT-4 Agents](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_chess.ipynb)
 * [Automated Task Solving by Group Chat](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_groupchat.ipynb)
 * [Automated Continual Learning from New Data](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_stream.ipynb)
--- a/website/docs/Examples/AutoGen-OpenAI.md
+++ b/website/docs/Examples/AutoGen-OpenAI.md
@ -1,138 +1,8 @@
-# AutoGen - OpenAI
+# AutoGen - Tune GPT Models
-FLAML offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. Our study finds that tuning hyperparameters can significantly improve the utility of them.
+`flaml.autogen` offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. The research study finds that tuning hyperparameters can significantly improve their utility.
-In this example, we will tune several hyperparameters for the OpenAI's completion API, including the temperature, prompt and n (number of completions), to optimize the inference performance for a code generation task.
+Please find documentation about this feature [here](/docs/Use-Cases/Autogen#enhanced-inference).
-### Prerequisites
+Links to notebook examples:
-
+* [Optimize for Code Generation](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
-Install the [autogen,blendsearch] option.
+* [Optimize for Math](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
 ```bash
 pip install "flaml[autogen,blendsearch] datasets"
 ```
 Set up your OpenAI key:
 ```python
 import os
 if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = "<your OpenAI API key here>"
 ```
 If you use Azure OpenAI, set up Azure using the following code:
 ```python
 import openai
 openai.api_type = "azure"
 openai.api_base = "https://<your_endpoint>.openai.azure.com/"
 openai.api_version = "2023-03-15-preview"  # change if necessary
 ```
 ### Load the dataset
 We use the HumanEval dataset as an example. The dataset contains 164 examples. We use the first 20 for tuning the generation hyperparameters and the remaining for evaluation. In each example, the "prompt" is the prompt string for eliciting the code generation, "test" is the Python code for unit test for the example, and "entry_point" is the function name to be tested.
 ```python
 import datasets
 seed = 41
 data = datasets.load_dataset("openai_humaneval")["test"].shuffle(seed=seed)
 n_tune_data = 20
 tune_data = [
    {
        "definition": data[x]["prompt"],
        "test": data[x]["test"],
        "entry_point": data[x]["entry_point"],
    }
    for x in range(n_tune_data)
 ]
 test_data = [
    {
        "definition": data[x]["prompt"],
        "test": data[x]["test"],
        "entry_point": data[x]["entry_point"],
    }
    for x in range(n_tune_data, len(data))
 ]
 ```
 ### Define the metric
 Before starting tuning, you need to define the metric for the optimization. For each code generation task, we can use the model to generate multiple candidate responses, and then select one from them. If the final selected response can pass a unit test, we consider the task as successfully solved. Then we can define the average success rate on a collection of tasks as the optimization metric.
 ```python
 from functools import partial
 from flaml.autogen.code_utils import eval_function_completions, generate_assertions
 eval_with_generated_assertions = partial(
    eval_function_completions, assertions=generate_assertions,
 )
 ```
 This function will first generate assertion statements for each problem. Then, it uses the assertions to select the generated responses.
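The metric described above can be sketched in a few lines. This is only an illustration of "select one candidate, then check it against a unit test, then average over tasks"; the names `is_solved`, `average_success_rate`, `pick_first` and `toy_test` are hypothetical and are not part of `flaml.autogen`, and the real `eval_function_completions` may work differently.

```python
def is_solved(candidates, select, passes_test):
    """Return True if the single selected candidate passes the unit test."""
    chosen = select(candidates)
    return chosen is not None and passes_test(chosen)


def average_success_rate(tasks, select, passes_test):
    """Average the per-task success indicator over a collection of tasks."""
    results = [is_solved(task["candidates"], select, passes_test) for task in tasks]
    return sum(results) / len(results) if results else 0.0


# Toy stand-ins (assumptions for illustration only).
def pick_first(candidates):
    return candidates[0] if candidates else None


def toy_test(candidate):
    # Pretend a candidate "passes" if it contains a return statement.
    return "return" in candidate


tasks = [
    {"candidates": ["def f(x): return x + 1", "def f(x): pass"]},
    {"candidates": ["def g(x): pass"]},
]
print(average_success_rate(tasks, pick_first, toy_test))
```

The selection step matters: a task only counts as solved when the response that is actually chosen passes, not when any candidate would have passed.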
 ### Tune the hyperparameters
 The tuning will be performed under the specified optimization budgets.
 * inference_budget is the target average inference budget per instance in the benchmark. For example, 0.02 means the target inference budget is 0.02 dollars, which translates to 1000 tokens (input + output combined) if the text Davinci model is used.
 * optimization_budget is the total budget allowed to perform the tuning. For example, 5 means 5 dollars are allowed in total, which translates to 250K tokens for the text Davinci model.
 * num_samples is the number of different hyperparameter configurations that the tuner is allowed to try. The tuning stops after either num_samples trials or after optimization_budget dollars are spent, whichever happens first. -1 means no hard restriction on the number of trials; the actual number is then decided by optimization_budget.
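As a rough check of the budget arithmetic above, the following sketch converts dollars to tokens and mimics the stopping rule. The price of $0.02 per 1000 tokens is inferred from the example figures in the list, not taken from a price sheet, and `budget_to_tokens` and `should_stop` are hypothetical helpers rather than FLAML functions.

```python
# Assumed price, inferred from the text: $0.02 buys 1000 tokens
# on the text Davinci model. Actual pricing may differ.
PRICE_PER_1K_TOKENS = 0.02


def budget_to_tokens(budget_dollars, price_per_1k=PRICE_PER_1K_TOKENS):
    """Convert a dollar budget into an approximate token count."""
    return round(budget_dollars / price_per_1k * 1000)


def should_stop(trials_done, dollars_spent, num_samples, optimization_budget):
    """Hypothetical stopping rule: trial limit or budget, whichever comes first."""
    out_of_budget = dollars_spent >= optimization_budget
    # num_samples == -1 means the trial count is not limited.
    out_of_trials = num_samples != -1 and trials_done >= num_samples
    return out_of_budget or out_of_trials


print(budget_to_tokens(0.02))  # inference budget per instance
print(budget_to_tokens(5))     # total optimization budget
```

Under this assumed price, an inference budget of 0.02 corresponds to about 1000 tokens per instance and an optimization budget of 5 to about 250K tokens, matching the figures quoted in the list.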
 Users can specify tuning data, optimization metric, optimization mode, evaluation function, search spaces etc.
 ```python
 from flaml import autogen
 config, analysis = autogen.Completion.tune(
    data=tune_data,  # the data for tuning
    metric="success",  # the metric to optimize
    mode="max",  # the optimization mode
    eval_func=eval_with_generated_assertions,  # the evaluation function to return the success metrics
    # log_file_name="logs/humaneval.log",  # the log file name
    inference_budget=0.05,  # the inference budget (dollar per instance)
    optimization_budget=3,  # the optimization budget (dollar in total)
    # num_samples can further limit the number of trials for different hyperparameter configurations;
    # -1 means decided by the optimization budget only
    num_samples=-1,
    prompt=[
        "{definition}",
        "# Python 3{definition}",
        "Complete the following Python function:{definition}",
    ],  # the prompt templates to choose from
    stop=[["\nclass", "\ndef", "\nif", "\nprint"], None],  # the stop sequences
    allow_format_str_template=True,
 )
 ```
 #### Output tuning results
 After the tuning, we can print out the optimized config and the result:
 ```python
 print("optimized config", config)
 print("best result on tuning data", analysis.best_result)
 ```
 #### Make a request with the tuned config
 We can apply the tuned config to the request for an instance:
 ```python
 response = autogen.Completion.create(context=tune_data[1], **config)
 print(response)
 print(eval_with_generated_assertions(autogen.Completion.extract_text(response), **tune_data[1]))
 ```
 #### Evaluate the success rate on the test data
 You can use `autogen.Completion.test` to evaluate the performance of an entire dataset with the tuned config.
 ```python
 result = autogen.Completion.test(test_data, **config)
 print("performance on test data with the tuned config:", result)
 ```
 The result will vary with the inference budget and optimization budget.
 [Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
--- a/website/docs/Getting-Started.md
+++ b/website/docs/Getting-Started.md
@ -3,16 +3,16 @@
 <!-- ### Welcome to FLAML, a Fast Library for Automated Machine Learning & Tuning! -->
 FLAML is a lightweight Python library for efficient automation of machine
-learning and AI operations, including selection of
+learning and AI operations. It automates workflow based on large language models, machine learning models, etc.
-models, hyperparameters, and other tunable choices of an application.
+and optimizes their performance.
 ### Main Features
-* For foundation models like the GPT models, it automates the experimentation and optimization of their performance to maximize the effectiveness for applications and minimize the inference cost. FLAML enables users to build and use adaptive AI agents with minimal effort.
+* FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and compensates for their weaknesses.
-* For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., search space and metric), or full customization (arbitrary training/inference/evaluation code).
+* For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend.
-* It supports fast and economical automatic tuning, capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping. FLAML is powered by a [cost-effective
+* It supports fast and economical automatic tuning, capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.
-hyperparameter optimization](/docs/Use-Cases/Tune-User-Defined-Function#hyperparameter-optimization-algorithm)
+
-and model selection method invented by Microsoft Research, and many followup [research studies](/docs/Research).
+FLAML is powered by a series of [research studies](/docs/Research) from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.
 ### Quickstart
@ -20,13 +20,21 @@ Install FLAML from pip: `pip install flaml`. Find more options in [Installation]
 There are several ways of using flaml:
-#### (New) [Auto Generation](/docs/Use-Cases/Auto-Generation)
+#### (New) [Autogen](/docs/Use-Cases/Autogen)
-Maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4, including:
+Autogen enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
- A drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalities like tuning, caching, templating, filtering. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.
+It offers customizable and conversable agents which integrate LLMs, tools and humans.
 By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
 ```python
 from flaml import autogen
 assistant = autogen.AssistantAgent("assistant")
 user_proxy = autogen.UserProxyAgent("user_proxy")
 user_proxy.initiate_chat(assistant, message="Show me the YTD gain of 10 largest technology companies as of today.")
 # This initiates an automated chat between the two agents to solve the task
 ```
 Autogen also helps maximize the utility of expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalities like tuning, caching, error handling, templating. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.
 ```python
 # perform tuning
 config, analysis = autogen.Completion.tune(
    data=tune_data,
@ -37,20 +45,13 @@ config, analysis = autogen.Completion.tune(
    optimization_budget=3,
    num_samples=-1,
 )
 # perform inference for a test instance
 response = autogen.Completion.create(context=test_instance, **config)
 ```
 - LLM-driven intelligent agents which can perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
 ```python
 assistant = autogen.AssistantAgent("assistant")
 user_proxy = autogen.UserProxyAgent("user_proxy")
 user_proxy.initiate_chat(assistant, message="Show me the YTD gain of 10 largest technology companies as of today.")
 ```
 #### [Task-oriented AutoML](/docs/Use-Cases/task-oriented-automl)
-For example, with three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
+With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
 ```python
 from flaml import AutoML
@ -117,8 +118,8 @@ Then, you can use it just like you use the original `LGBMClassifier`. Your other
 ### Where to Go Next?
-* Understand the use cases for [Auto Generation](/docs/Use-Cases/Auto-Generation), [Task-oriented AutoML](/docs/Use-Cases/Task-Oriented-Automl), [Tune user-defined function](/docs/Use-Cases/Tune-User-Defined-Function) and [Zero-shot AutoML](/docs/Use-Cases/Zero-Shot-AutoML).
+* Understand the use cases for [Autogen](/docs/Use-Cases/Autogen), [Task-oriented AutoML](/docs/Use-Cases/Task-Oriented-Automl), [Tune user-defined function](/docs/Use-Cases/Tune-User-Defined-Function) and [Zero-shot AutoML](/docs/Use-Cases/Zero-Shot-AutoML).
-* Find code examples under "Examples": from [AutoGen - OpenAI](/docs/Examples/AutoGen-OpenAI) to [Tune - PyTorch](/docs/Examples/Tune-PyTorch).
+* Find code examples under "Examples": from [AutoGen - AgentChat](/docs/Examples/AutoGen-AgentChat) to [Tune - PyTorch](/docs/Examples/Tune-PyTorch).
 * Learn about [research](/docs/Research) around FLAML and check [blogposts](/blog).
 * Chat on [Discord](https://discord.gg/Cppx2vSPVP).
--- a/website/docs/Installation.md
+++ b/website/docs/Installation.md
@ -15,7 +15,7 @@ conda install flaml -c conda-forge
 ### Optional Dependencies
-#### [Auto Generation](Use-Cases/Auto-Generation)
+#### [Autogen](Use-Cases/Autogen)
 ```bash
 pip install "flaml[autogen]"
--- a/website/docs/Use-Cases/Auto-Generation.md
+++ b/website/docs/Use-Cases/Auto-Generation.md
@ -1,23 +1,22 @@
-# AutoGen: AutoML for GPT-X Applications
+# AutoGen: Enabling Next-Gen GPT-X Applications
-`flaml.autogen` simplifies hard choices (such as model, prompt, inference parameters and orchestration choices) for developers when finding an optimal operating point in a large and complex design space of large language model (LLM) hierarchy, and offers a virtual interface to highly capable, economical, and fast LLM agents.
+`flaml.autogen` simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness. It enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort.
 ## Features
-* An enhanced inference API as a drop-in replacement of `openai.Completion.create` or `openai.ChatCompletion.create`. It allows easy performance tuning and advanced usage patterns, including:
+* A unified multi-agent conversation framework as a high-level abstraction of using foundation models. It offers customizable and conversable agents which integrate LLM, tool and human.
-  - Leveraging [`flaml.tune`](/docs/reference/tune/tune) to adapt LLMs to applications, to maximize the utility out of using expensive foundation models and reduce the inference cost by using cheaper models or configurations which achieve equal or better performance.
+By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code.
-  - Utilities like API unification, caching, error handling, multi-config inference, context programming etc.
+* A drop-in replacement of `openai.Completion` or `openai.ChatCompletion` as an enhanced inference API. It allows easy performance tuning, utilities like API unification & caching, and advanced usage patterns, such as error handling, multi-config inference, context programming etc.
 * A higher-level abstraction of using foundation models: intelligent agents which can perform tasks autonomously or with human feedback. The same abstraction allows both automated feedback and human feedback sent between agents, so that complex tasks can be accomplished via agent collaborations, including tasks that require using tools via code.
 The package is under active development with more features upcoming.
 ## Agents
-[`flaml.autogen.agentchat`](/docs/reference/autogen/agentchat/agent) offers conversable agents which can adapt to human or simulated feedback. This subpackage is under active development.
+[`flaml.autogen.agentchat`](/docs/reference/autogen/agentchat/agent) offers a multi-agent conversation framework, featuring capable, customizable and conversable agents which integrate LLM, tool and human via automated agent chat.
 ### Basic Concept
-We have designed a generic `ResponsiveAgent` class for Agents that are capable of conversing with each other through the exchange of messages to collaboratively finish a task. An agent can communicate with other agents and perform actions. Different agents can differ in what actions they perform after receiving messages. Two representative subclasses are `AssistantAgent` and `UserProxyAgent`.
+We have designed a generic `ResponsiveAgent` class for Agents that are capable of conversing with each other through the exchange of messages to jointly finish a task. An agent can communicate with other agents and perform actions. Different agents can differ in what actions they perform after receiving messages. Two representative subclasses are `AssistantAgent` and `UserProxyAgent`.
 - `AssistantAgent`. Designed to act as an assistant by responding to user requests. It could write Python code (in a Python coding block) for a user to execute when a message (typically a description of a task that needs to be solved) is received. Under the hood, the Python code is written by LLM (e.g., GPT-4). It can also receive the execution results and suggest code with bug fix. Its behavior can be altered by passing a new system message. The LLM [inference](#enhanced-inference) configuration can be configured via `llm_config`.
 - `UserProxyAgent`. Serves as a proxy for the human user. Upon receiving a message, the UserProxyAgent will either solicit the human user's input or prepare an automatically generated reply. The chosen action depends on the settings of the `human_input_mode` and `max_consecutive_auto_reply` when the `UserProxyAgent` instance is constructed, and whether a human user input is available.
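The reply decision described for `UserProxyAgent` can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not FLAML's implementation: the function `choose_action` and its return labels are hypothetical, and the mode names `"ALWAYS"`, `"TERMINATE"` and `"NEVER"` are assumed values of `human_input_mode`.

```python
def choose_action(human_input_mode, consecutive_auto_replies,
                  max_consecutive_auto_reply, is_termination_msg=False):
    """Toy decision rule; returns "ask_human", "auto_reply" or "stop".

    All names and labels here are hypothetical and only illustrate how the
    two settings could interact.
    """
    if human_input_mode not in ("ALWAYS", "TERMINATE", "NEVER"):
        raise ValueError(f"unknown human_input_mode: {human_input_mode!r}")

    # Assumed: "ALWAYS" solicits human input on every received message.
    if human_input_mode == "ALWAYS":
        return "ask_human"

    # The auto-reply budget is exhausted once the counter hits the limit.
    limit_reached = consecutive_auto_replies >= max_consecutive_auto_reply

    if limit_reached or is_termination_msg:
        # Assumed: "TERMINATE" hands control to the human at this point,
        # while "NEVER" ends the conversation without asking anyone.
        return "ask_human" if human_input_mode == "TERMINATE" else "stop"

    # Otherwise the agent prepares an automatically generated reply.
    return "auto_reply"


if __name__ == "__main__":
    print(choose_action("TERMINATE", 2, 10))
    print(choose_action("TERMINATE", 10, 10))
    print(choose_action("NEVER", 10, 10))
```

In this sketch the counter would be reset whenever the human actually replies, which is why the limit is described as applying to consecutive auto replies.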
@ -150,30 +149,26 @@ user_proxy.initiate_chat(
 *Interested in trying it yourself? Please check the following notebook examples:*
 * [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)
 * [Auto Code Generation, Execution, Debugging and Human Feedback](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_human_feedback.ipynb)
 * [Solve Tasks Requiring Web Info](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_web_info.ipynb)
 * [Use Provided Tools as Functions](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_function_call.ipynb)
 * [Automated Task Solving with Coding & Planning Agents](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_planning.ipynb)
 * [Automated Task Solving with GPT-4 + Multiple Human Users](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_two_users.ipynb)
 * [Automated Chess Game Playing & Chitchatting by GPT-4 Agents](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_chess.ipynb)
 * [Automated Task Solving by Group Chat](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_groupchat.ipynb)
 * [Automated Continual Learning from New Data](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_stream.ipynb)
 ## Enhanced Inference
 One can use [`flaml.autogen.Completion.create`](/docs/reference/autogen/oai/completion#create) to perform inference.
-There are a number of benefits of using `autogen` to perform inference.
+There are a number of benefits of using `autogen` to perform inference: performance tuning, API unification, caching, error handling, multi-config inference, result filtering, templating and so on.
 ### Tune Inference Parameters
 *Links to notebook examples:*
 * [Optimize for Code Generation](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
 * [Optimize for Math](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
 #### Choices to optimize
 The cost of using foundation models for text generation is typically measured in terms of the number of tokens in the input and output combined. From the perspective of an application builder using foundation models, the use case is to maximize the utility of the generated text under an inference budget constraint (e.g., measured by the average dollar cost needed to solve a coding problem). This can be achieved by optimizing the hyperparameters of the inference,
@ -271,11 +266,6 @@ The returned `config` contains the optimized configuration and `analysis` contai
 The tuned config can be used to perform inference.
 *Refer to this [page](/docs/Examples/AutoGen-OpenAI) for a full example. Or check the following notebook examples:*
 * [Optimize for Code Generation](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_openai_completion.ipynb)
 * [Optimize for Math](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
 ### API unification
 `flaml.autogen.Completion.create` is compatible with both `openai.Completion.create` and `openai.ChatCompletion.create`, and both OpenAI API and Azure OpenAI API. So models such as "text-davinci-003", "gpt-3.5-turbo" and "gpt-4" can share a common API.
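To illustrate what sharing a common API across completion-style and chat-style models involves, here is a minimal sketch that builds the request payload for either endpoint style from a single prompt. It is not FLAML's implementation: `build_request` and `CHAT_MODEL_PREFIXES` are hypothetical names, the prefix list is an assumption, and no network call is made.

```python
# Assumption for illustration: models whose names start with these prefixes
# use the chat endpoint; everything else uses the plain completion endpoint.
CHAT_MODEL_PREFIXES = ("gpt-3.5-turbo", "gpt-4")


def build_request(model, prompt, **params):
    """Build a payload dict for the endpoint style that `model` expects."""
    if model.startswith(CHAT_MODEL_PREFIXES):
        # Chat models take a list of role-tagged messages.
        return {
            "endpoint": "chat",
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            **params,
        }
    # Completion models take the raw prompt string.
    return {"endpoint": "completion", "model": model, "prompt": prompt, **params}


if __name__ == "__main__":
    for name in ("text-davinci-003", "gpt-3.5-turbo", "gpt-4"):
        print(build_request(name, "Say hello.", max_tokens=16))
```

The caller passes the same arguments regardless of model; only the payload shape differs, which is the idea behind a unified interface.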
@ -1 +1 @@
-__version__ = "2.0.0rc5"
+__version__ = "2.0.0"