autogen/website/docs/FAQ.mdx

297 lines
14 KiB
Plaintext

import TOCInline from "@theme/TOCInline";
# Frequently Asked Questions
<TOCInline toc={toc} />
## Install the correct package - `pyautogen`
The name of Autogen package at PyPI is `pyautogen`:
```
pip install pyautogen
```
Typical errors that you might face when using the wrong package are `AttributeError: module 'autogen' has no attribute 'Agent'`, `AttributeError: module 'autogen' has no attribute 'config_list_from_json'` etc.
## Set your API endpoints
This documentation has been moved [here](/docs/topics/llm_configuration).
### Use the constructed configuration list in agents
This documentation has been moved [here](/docs/topics/llm_configuration).
### How does an agent decide which model to pick out of the list?
This documentation has been moved [here](/docs/topics/llm_configuration#how-does-an-agent-decide-which-model-to-pick-out-of-the-list).
### Unexpected keyword argument 'base_url'
In version >=1, OpenAI renamed their `api_base` parameter to `base_url`. So for older versions, use `api_base` but for newer versions use `base_url`.
### Can I use non-OpenAI models?
Yes. You currently have two options:
- Autogen can work with any API endpoint which complies with OpenAI-compatible RESTful APIs - e.g. serving local LLM via FastChat or LM Studio. Please check https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs for an example.
- You can supply your own custom model implementation and use it with Autogen. Please check https://microsoft.github.io/autogen/blog/2024/01/26/Custom-Models for more information.
## Handling API Rate Limits
### Setting the API Rate Limit
You can set the `api_rate_limit` in a `config_list` for an agent, which will be used to control the rate at which API requests are sent.
- `api_rate_limit` (float): the maximum number of API requests allowed per second.
### Handle Rate Limit Error and Timeout Error
You can set `max_retries` to handle rate limit error. And you can set `timeout` to handle timeout error. They can all be specified in `llm_config` for an agent, which will be used in the OpenAI client for LLM inference. They can be set differently for different clients if they are set in the `config_list`.
- `max_retries` (int): the total number of times allowed for retrying failed requests for a single client.
- `timeout` (int): the timeout (in seconds) for a single client.
Please refer to the [documentation](/docs/Use-Cases/enhanced_inference#runtime-error) for more info.
## How to continue a finished conversation
When you call `initiate_chat` the conversation restarts by default. You can use `send` or `initiate_chat(clear_history=False)` to continue the conversation.
## `max_consecutive_auto_reply` vs `max_turn` vs `max_round`
- [`max_consecutive_auto_reply`](https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent#max_consecutive_auto_reply) the maximum number of consecutive auto replie (a reply from an agent without human input is considered an auto reply). It plays a role when `human_input_mode` is not "ALWAYS".
- [`max_turns` in `ConversableAgent.initiate_chat`](https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent#initiate_chat) limits the number of conversation turns between two conversable agents (without differentiating auto-reply and reply/input from human)
- [`max_round` in GroupChat](https://microsoft.github.io/autogen/docs/reference/agentchat/groupchat#groupchat-objects) specifies the maximum number of rounds in a group chat session.
## How do we decide what LLM is used for each agent? How many agents can be used? How do we decide how many agents in the group?
Each agent can be customized. You can use LLMs, tools, or humans behind each agent. If you use an LLM for an agent, use the one best suited for its role. There is no limit of the number of agents, but start from a small number like 2, 3. The more capable is the LLM and the fewer roles you need, the fewer agents you need.
The default user proxy agent doesn't use LLM. If you'd like to use an LLM in UserProxyAgent, the use case could be to simulate user's behavior.
The default assistant agent is instructed to use both coding and language skills. It doesn't have to do coding, depending on the tasks. And you can customize the system message. So if you want to use it for coding, use a model that's good at coding.
## Why is code not saved as file?
If you are using a custom system message for the coding agent, please include something like:
`If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line.`
in the system message. This line is in the default system message of the `AssistantAgent`.
If the `# filename` doesn't appear in the suggested code still, consider adding explicit instructions such as "save the code to disk" in the initial user message in `initiate_chat`.
The `AssistantAgent` doesn't save all the code by default, because there are cases in which one would just like to finish a task without saving the code.
## Legacy code executor
:::note
The new code executors offers more choices of execution backend.
Read more about [code executors](/docs/tutorial/code-executors).
:::
The legacy code executor is used by specifying the `code_execution_config` in the agent's constructor.
```python
from autogen import UserProxyAgent
user_proxy = UserProxyAgent(
name="user_proxy",
code_execution_config={"work_dir":"_output", "use_docker":"python:3"},
)
```
In this example, the `code_execution_config` specifies that the code will be
executed in a docker container with the image `python:3`.
By default, the image name is `python:3-slim` if not specified.
The `work_dir` specifies the directory where the code will be executed.
If you have problems with agents running `pip install` or get errors similar to
`Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory')`,
you can choose **'python:3'** as image as shown in the code example above and
that should solve the problem.
By default it runs code in a docker container. If you want to run code locally
(not recommended) then `use_docker` can be set to `False` in `code_execution_config`
for each code-execution agent, or set `AUTOGEN_USE_DOCKER` to `False` as an
environment variable.
You can also develop your AutoGen application in a docker container.
For example, when developing in [GitHub codespace](https://codespaces.new/microsoft/autogen?quickstart=1),
AutoGen runs in a docker container.
If you are not developing in GitHub Codespaces,
follow instructions [here](installation/Docker.md#option-1-install-and-run-autogen-in-docker)
to install and run AutoGen in docker.
## Agents keep thanking each other when using `gpt-3.5-turbo`
When using `gpt-3.5-turbo` you may often encounter agents going into a "gratitude loop", meaning when they complete a task they will begin congratulating and thanking each other in a continuous loop. This is a limitation in the performance of `gpt-3.5-turbo`, in contrast to `gpt-4` which has no problem remembering instructions. This can hinder the experimentation experience when trying to test out your own use case with cheaper models.
A workaround is to add an additional termination notice to the prompt. This acts a "little nudge" for the LLM to remember that they need to terminate the conversation when their task is complete. You can do this by appending a string such as the following to your user input string:
```python
prompt = "Some user query"
termination_notice = (
'\n\nDo not show appreciation in your responses, say only what is necessary. '
'if "Thank you" or "You\'re welcome" are said in the conversation, then say TERMINATE '
'to indicate the conversation is finished and this is your last message.'
)
prompt += termination_notice
```
**Note**: This workaround gets the job done around 90% of the time, but there are occurrences where the LLM still forgets to terminate the conversation.
## ChromaDB fails in codespaces because of old version of sqlite3
(from [issue #251](https://github.com/microsoft/autogen/issues/251))
Code examples that use chromadb (like retrieval) fail in codespaces due to a sqlite3 requirement.
```
>>> import chromadb
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/vscode/.local/lib/python3.10/site-packages/chromadb/__init__.py", line 69, in <module>
raise RuntimeError(
RuntimeError: Your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0.
Please visit https://docs.trychroma.com/troubleshooting#sqlite to learn how to upgrade.
```
Workaround:
1. `pip install pysqlite3-binary`
2. `mkdir /home/vscode/.local/lib/python3.10/site-packages/google/colab`
Explanation: Per [this gist](https://gist.github.com/defulmere/8b9695e415a44271061cc8e272f3c300?permalink_comment_id=4711478#gistcomment-4711478), linked from the official [chromadb docs](https://docs.trychroma.com/troubleshooting#sqlite), adding this folder triggers chromadb to use pysqlite3 instead of the default.
## How to register a reply function
(from [issue #478](https://github.com/microsoft/autogen/issues/478))
See here https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent/#register_reply
For example, you can register a reply function that gets called when `generate_reply` is called for an agent.
```python
def print_messages(recipient, messages, sender, config):
if "callback" in config and config["callback"] is not None:
callback = config["callback"]
callback(sender, recipient, messages[-1])
print(f"Messages sent to: {recipient.name} | num messages: {len(messages)}")
return False, None # required to ensure the agent communication flow continues
user_proxy.register_reply(
[autogen.Agent, None],
reply_func=print_messages,
config={"callback": None},
)
assistant.register_reply(
[autogen.Agent, None],
reply_func=print_messages,
config={"callback": None},
)
```
In the above, we register a `print_messages` function that is called each time the agent's `generate_reply` is triggered after receiving a message.
## How to get last message ?
Refer to https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent/#last_message
## How to get each agent message ?
Please refer to https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent#chat_messages
## When using autogen docker, is it always necessary to reinstall modules?
The "use_docker" arg in an agent's code_execution_config will be set to the name of the image containing the change after execution, when the conversation finishes.
You can save that image name. For a new conversation, you can set "use_docker" to the saved name of the image to start execution there.
## Database locked error
When using VMs such as Azure Machine Learning compute instances,
you may encounter a "database locked error". This is because the
[LLM cache](./Use-Cases/agent_chat.md#cache)
is trying to write to a location that the application does not have access to.
You can set the `cache_path_root` to a location where the application has access.
For example,
```python
from autogen import Cache
with Cache.disk(cache_path_root="/tmp/.cache") as cache:
agent_a.initate_chat(agent_b, ..., cache=cache)
```
You can also use Redis cache instead of disk cache. For example,
```python
from autogen import Cache
with Cache.redis(redis_url=...) as cache:
agent_a.initate_chat(agent_b, ..., cache=cache)
```
You can also disable the cache. See [here](./Use-Cases/agent_chat.md#llm-caching) for details.
## Agents are throwing due to docker not running, how can I resolve this?
If running AutoGen locally the default for agents who execute code is for them to try and perform code execution within a docker container. If docker is not running, this will cause the agent to throw an error. To resolve this you have some options.
### If you want to disable code execution entirely
- Set `code_execution_config` to `False` for each code-execution agent. E.g.:
```python
user_proxy = autogen.UserProxyAgent(
name="agent",
llm_config=llm_config,
code_execution_config=False)
```
### If you want to run code execution in docker
- **Recommended**: Make sure docker is up and running.
### If you want to run code execution locally
- `use_docker` can be set to `False` in `code_execution_config` for each code-execution agent.
- To set it for all code-execution agents at once: set `AUTOGEN_USE_DOCKER` to `False` as an environment variable.
E.g.:
```python
user_proxy = autogen.UserProxyAgent(
name="agent", llm_config=llm_config,
code_execution_config={"work_dir":"coding", "use_docker":False})
```
### What should I do if I get the error "TypeError: Assistants.create() got an unexpected keyword argument 'file_ids'"?
This error typically occurs when using Autogen version earlier than 0.2.27 in combination with OpenAI library version 1.21 or later. The issue arises because the older version of Autogen does not support the file_ids parameter used by newer versions of the OpenAI API.
To resolve this issue, you need to upgrade your Autogen library to version 0.2.27 or higher that ensures compatibility between Autogen and the OpenAI library.
```python
pip install --upgrade autogen
```
## None of the devcontainers are building due to "Hash sum mismatch", what should I do?
This is an intermittent issue that appears to be caused by some combination of mirror and proxy issues.
If it arises, try to replace the `apt-get update` step with the following:
```bash
RUN echo "Acquire::http::Pipeline-Depth 0;" > /etc/apt/apt.conf.d/99custom && \
echo "Acquire::http::No-Cache true;" >> /etc/apt/apt.conf.d/99custom && \
echo "Acquire::BrokenProxy true;" >> /etc/apt/apt.conf.d/99custom
RUN apt-get clean && \
rm -r /var/lib/apt/lists/* && \
apt-get update -o Acquire::CompressionTypes::Order::=gz && \
apt-get -y update && \
apt-get install sudo git npm # and whatever packages need to be installed in this specific version of the devcontainer
```
This is a combination of StackOverflow suggestions [here](https://stackoverflow.com/a/48777773/2114580) and [here](https://stackoverflow.com/a/76092743/2114580).