[AutoBuild] Supporting build agents from library; supporting generating agent descriptions (#1039)

* try to fix blog

* modify blog

* fix test error in #717; fix blog typo in installation; update blogs with output examples.

* pre-commit

* pre-commit

* Update website/blog/2023-11-26-Agent-AutoBuild/index.mdx

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>

* add future work

* fix grammar

* update agent_builder

* solve #941; add detailed debug info; support json string config

* pre-commit

* solve #954

* pre-commit

* [new feature] build group chat agents from library.

* pre-commit

* add authors' info in notebook; add a new notebook for build_from_library; reduce prompt effort

* update test and example for build_from_library

* pre-commit

* add notebook; update docs

* change notebook name

* change description for notebook and doc

* remove default value for default_llm_config

* add embedding similarity agent selection

* pre-commit

* update test

* add dependency installation in github workflow

* update test

* pre-commit

* update notebook

* support passing a JSON string directly as the library; support customized embedding models

* update test

* pre-commit

* update github test workflow

* Update autobuild_agent_library.ipynb

* add agent description

* refine prompt; update notebook

* pre-commit

* update test example

* update test

* update test

* update test

* change `config_path` to `config_path_or_env`; update test

* pre-commit

* update test

* update test

* update test: add config_file_location

* change `config_path_or_env` to `config_file_or_env`

* update test

* solve noqa

* fix import error for conftest

* fix test error

* pre-commit

* update error message in `_create_agent`; replace `gpt-4-1106-preview` with `gpt-4` in the test file.

* add comment on local server creation; modify notebook; update contrib-openai.yml for test; add autobuild option in setup.py; add autotest model name statement

* move import huggingface_hub to _create_agent

* pre-commit

* add a comment noting that the endpoint creation code block is not covered by tests

* recover contrib-openai.yml for merge

---------

Co-authored-by: Jieyu Zhang <jieyuz2@cs.washington.edu>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Linxin Song 2024-01-07 02:23:23 +09:00 committed by GitHub
parent e5ebdb66bf
commit e673500129
12 changed files with 3021 additions and 1350 deletions


@ -200,6 +200,9 @@ jobs:
pip install -e .
python -c "import autogen"
pip install coverage pytest-asyncio
- name: Install packages for test when needed
run: |
pip install -e .[autobuild]
- name: Coverage
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
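A quick local sanity check (a sketch, assuming the `autobuild` extra installs chromadb, sentence-transformers, and huggingface-hub, as declared in the setup.py change in this commit) to confirm the optional dependencies are importable before running the contrib tests:

import importlib

# Optional dependencies pulled in by `pip install -e .[autobuild]` (per setup.py in this commit).
for module in ("chromadb", "sentence_transformers", "huggingface_hub"):
    try:
        importlib.import_module(module)
        print(f"{module}: OK")
    except ImportError:
        print(f"{module}: missing - run `pip install -e .[autobuild]`")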


@ -4,7 +4,7 @@ import subprocess as sp
import socket
import json
import hashlib
from typing import Optional, List, Dict, Tuple, Union
from typing import Optional, List, Dict, Tuple
def _config_check(config: Dict):
@ -19,24 +19,22 @@ def _config_check(config: Dict):
assert (
agent_config.get("system_message", None) is not None
), 'Missing agent "system_message" in your agent_configs.'
assert agent_config.get("description", None) is not None, 'Missing agent "description" in your agent_configs.'
class AgentBuilder:
"""
AgentBuilder can help users build an automatic task-solving process powered by a multi-agent system.
Specifically, our building pipeline includes initialize and build.
In build(), we prompt a gpt-4 model to create multiple participant agents, and specify whether
this task need programming to solve.
In build(), we prompt an LLM to create multiple participant agents and specify whether the task needs programming to solve.
Users can save the built agents' configs by calling save() and load the saved configs with load(), which skips the
building process.
"""
openai_server_name = "openai"
max_tokens = 945
max_agents = 5 # maximum number of agents build manager can create.
online_server_name = "online"
CODING_PROMPT = """Does the following task need programming (i.e., access external API or tool by coding) to solve,
or use program may help the following task become easier?
or coding may help the following task become easier?
TASK: {task}
@ -44,53 +42,104 @@ class AgentBuilder:
# Answer only YES or NO.
"""
AGENT_NAME_PROMPT = """To complete the following task, what positions/jobs should be set to maximize the efficiency?
AGENT_NAME_PROMPT = """To complete the following task, what positions/jobs should be set to maximize efficiency?
TASK: {task}
Hint:
# Considering the effort, the position in this task should be no more then {max_agents}, less is better.
# Answer the name of those positions/jobs.
# Separated names by comma and use "_" instead of space. For example: Product_manager,Programmer
# Considering the effort, the positions in this task should be no more than {max_agents}; fewer is better.
# Each position's name should include enough information to help a group chat manager know when to let this position speak.
# The position name should be as specific as possible. For example, use "python_programmer" instead of "programmer".
# Do not use ambiguous position names, such as "domain expert" with no specific description of the domain or "technical writer" with no description of what it should write.
# Each position should have a unique function, and the position name should reflect this.
# The positions should relate to the task and be significantly different in function.
# Add ONLY ONE programming-related position if the task needs coding.
# Each generated agent's name should follow the format ^[a-zA-Z0-9_-]{{1,64}}$; use "_" to split words.
# Answer the names of those positions/jobs, separated by commas.
# Only return the list of positions.
"""
AGENT_SYS_MSG_PROMPT = """Considering the following position and corresponding task:
AGENT_SYS_MSG_PROMPT = """Considering the following position and task:
TASK: {task}
POSITION: {position}
Modify the following position requirement, let it more suitable for the above task and position:
Modify the following position requirement, making it more suitable for the above task and position:
REQUIREMENT: {default_sys_msg}
Hint:
# Your answer should be natural, starting from "You are now in a group chat. You need to complete a task with other participants. As a ...".
# [IMPORTANT] You should let them reply "TERMINATE" when they think the task is completed (the user's need has actually been satisfied).
# The modified requirement should not contain the code interpreter skill.
# You should remove the related skill description when the position is not a programmer or developer.
# Coding skill is limited to Python.
# Your answer should omit the word "REQUIREMENT".
# Your should let them reply "TERMINATE" in the end when the task complete (user's need has been satisfied).
# People with the above position can doubt previous messages or code in the group chat (for example, if there is no
output after executing the code) and provide a corrected answer or code.
# People in the above position should ask for help from the group chat manager when confused and let the manager select another participant.
"""
AGENT_DESCRIPTION_PROMPT = """Considering the following position:
POSITION: {position}
What requirements should be satisfied for this position?
Hint:
# This description should include enough information to help a group chat manager know when to let this position speak.
# People with the above position can doubt previous messages or code in the group chat (for example, if there is no
output after executing the code) and provide a corrected answer or code.
# Your answer should be in at most three sentences.
# Your answer should be natural, starting from "[POSITION's name] is a ...".
# Your answer should include the skills that this position should have.
# Your answer should not contain coding-related skills when the position is not a programmer or developer.
# Coding skills should be limited to Python.
"""
AGENT_SEARCHING_PROMPT = """Considering the following task:
TASK: {task}
Which of the following agents should be involved in the task?
AGENT LIST:
{agent_list}
Hint:
# You should consider whether the agent's name and profile match the task.
# Considering the effort, you should select fewer than {max_agents} agents; fewer is better.
# Separate agent names by commas and use "_" instead of space. For example, Product_manager,Programmer
# Only return the list of agent names.
"""
def __init__(
self,
config_path: Optional[str] = "OAI_CONFIG_LIST",
config_file_or_env: Optional[str] = "OAI_CONFIG_LIST",
config_file_location: Optional[str] = "",
builder_model: Optional[str] = "gpt-4",
agent_model: Optional[str] = "gpt-4",
host: Optional[str] = "localhost",
endpoint_building_timeout: Optional[int] = 600,
max_tokens: Optional[int] = 945,
max_agents: Optional[int] = 5,
):
"""
(These APIs are experimental and may change in the future.)
Args:
config_path: path of the OpenAI api configs.
config_file_or_env: path of the JSON config file or name of the environment variable that contains the OpenAI API configs.
config_file_location: the directory that contains the config file, used when config_file_or_env is a file name.
builder_model: specify a model as the backbone of the build manager.
agent_model: specify a model as the backbone of the participant agents.
host: endpoint host.
endpoint_building_timeout: timeout for building up an endpoint server.
max_tokens: max tokens for each agent.
max_agents: max agents for each task.
"""
self.host = host
self.builder_model = builder_model
self.agent_model = agent_model
self.config_path = config_path
self.config_file_or_env = config_file_or_env
self.config_file_location = config_file_location
self.endpoint_building_timeout = endpoint_building_timeout
self.building_task: str = None
@ -100,6 +149,9 @@ class AgentBuilder:
self.agent_procs_assign: Dict[str, Tuple[autogen.ConversableAgent, str]] = {}
self.cached_configs: Dict = {}
self.max_tokens = max_tokens
self.max_agents = max_agents
for port in range(8000, 65535):
if self._is_port_open(host, port):
self.open_ports.append(str(port))
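A minimal construction sketch for the updated constructor (assuming the class lives at autogen.agentchat.contrib.agent_builder and an OAI_CONFIG_LIST file exists; the values shown are illustrative):

from autogen.agentchat.contrib.agent_builder import AgentBuilder

# config_file_or_env may be a file name or an environment variable name;
# config_file_location points at the directory holding the file when needed.
builder = AgentBuilder(
    config_file_or_env="OAI_CONFIG_LIST",
    config_file_location=".",
    builder_model="gpt-4",
    agent_model="gpt-4",
    max_agents=5,
)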
@ -128,6 +180,7 @@ class AgentBuilder:
model_name_or_hf_repo: str,
llm_config: dict,
system_message: Optional[str] = autogen.AssistantAgent.DEFAULT_SYSTEM_MESSAGE,
description: Optional[str] = autogen.AssistantAgent.DEFAULT_DESCRIPTION,
use_oai_assistant: Optional[bool] = False,
world_size: Optional[int] = 1,
) -> autogen.AssistantAgent:
@ -139,26 +192,42 @@ class AgentBuilder:
Args:
agent_name: the name that identifies the function of the agent (e.g., Coder, Product Manager, ...)
model_name_or_hf_repo:
model_name_or_hf_repo: the name of the model or the huggingface repo.
llm_config: specific configs for LLM (e.g., config_list, seed, temperature, ...).
system_message: system prompt used to format an agent's behavior.
description: a brief description of the agent. This will improve the group chat performance.
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
world_size: the max size of parallel tensors (in most cases, this is identical to the number of GPUs).
Returns:
agent: a set-up agent.
"""
config_list = autogen.config_list_from_json(self.config_path, filter_dict={"model": [model_name_or_hf_repo]})
from huggingface_hub import HfApi
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError
config_list = autogen.config_list_from_json(
self.config_file_or_env,
file_location=self.config_file_location,
filter_dict={"model": [model_name_or_hf_repo]},
)
if len(config_list) == 0:
raise RuntimeError(
f"Fail to initialize agent:{agent_name}: {self.builder_model} does not exist in {self.config_path}. "
f'If you would like to change this model, please specify the "agent_model" in the constructor.'
f"Fail to initialize agent {agent_name}: {model_name_or_hf_repo} does not exist in {self.config_file_or_env}.\n"
f'If you would like to change this model, please specify the "agent_model" in the constructor.\n'
f"If you load configs from json, make sure the model in agent_configs is in the {self.config_file_or_env}."
)
if "gpt-" in model_name_or_hf_repo:
server_id = self.openai_server_name
else:
try:
hf_api = HfApi()
hf_api.model_info(model_name_or_hf_repo)
model_name = model_name_or_hf_repo.split("/")[-1]
server_id = f"{model_name}_{self.host}"
except GatedRepoError as e:
raise e
except RepositoryNotFoundError:
server_id = self.online_server_name
if server_id != self.online_server_name:
# The code in this block is not covered by tests because the online environment does not support GPU use.
if self.agent_procs.get(server_id, None) is None:
while True:
port = self.open_ports.pop()
@ -227,7 +296,10 @@ class AgentBuilder:
)
else:
agent = autogen.AssistantAgent(
name=agent_name, llm_config=current_config.copy(), system_message=system_message
name=agent_name,
llm_config=current_config.copy(),
system_message=system_message,
description=description,
)
self.agent_procs_assign[agent_name] = (agent, server_id)
return agent
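The HfApi lookup above is what decides whether a non-OpenAI model name maps to a local Hugging Face endpoint or to an online server. An illustrative standalone version of just that check (the helper name is ours):

from huggingface_hub import HfApi
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError

def is_hf_repo(model_name_or_hf_repo: str) -> bool:
    """Return True if the name resolves to a Hugging Face repo (a local endpoint would be spawned)."""
    try:
        HfApi().model_info(model_name_or_hf_repo)  # raises if the repo does not exist
        return True
    except GatedRepoError:
        raise  # gated repos need explicit access; surface the error
    except RepositoryNotFoundError:
        return False  # not a HF repo; treated as an online model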
@ -244,7 +316,7 @@ class AgentBuilder:
_, server_id = self.agent_procs_assign[agent_name]
del self.agent_procs_assign[agent_name]
if recycle_endpoint:
if server_id == self.openai_server_name:
if server_id == self.online_server_name:
return
else:
for _, iter_sid in self.agent_procs_assign.values():
@ -264,38 +336,27 @@ class AgentBuilder:
def build(
self,
building_task: Optional[str] = None,
default_llm_config: Optional[Dict] = None,
building_task: str,
default_llm_config: Dict,
coding: Optional[bool] = None,
cached_configs: Optional[Dict] = None,
use_oai_assistant: Optional[bool] = False,
code_execution_config: Optional[Dict] = None,
use_oai_assistant: Optional[bool] = False,
**kwargs,
):
) -> Tuple[List[autogen.ConversableAgent], Dict]:
"""
Auto build agents based on the building task.
Args:
building_task: instruction that helps the build manager decide which agents should be built.
default_llm_config: specific configs for LLM (e.g., config_list, seed, temperature, ...).
coding: used to identify whether the user proxy (a code interpreter) should be added.
cached_configs: previously saved agent configs.
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
code_execution_config: specific configs for user proxy (e.g., last_n_messages, work_dir, ...).
default_llm_config: specific configs for LLM (e.g., config_list, seed, temperature, ...).
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
Returns:
agent_list: a list of agents.
cached_configs: cached configs.
"""
use_api = False
if cached_configs is None:
use_api = True
agent_configs = []
self.building_task = building_task
else:
self.building_task = building_task = cached_configs["building_task"]
default_llm_config = cached_configs["default_llm_config"]
coding = cached_configs["coding"]
agent_configs = cached_configs["agent_configs"]
code_execution_config = cached_configs["code_execution_config"]
if code_execution_config is None:
code_execution_config = {
"last_n_messages": 2,
@ -304,90 +365,91 @@ class AgentBuilder:
"timeout": 60,
}
if use_api:
config_list = autogen.config_list_from_json(self.config_path, filter_dict={"model": [self.builder_model]})
if len(config_list) == 0:
raise RuntimeError(
f"Fail to initialize build manager: {self.builder_model} does not exist in {self.config_path}. "
f'If you want to change this model, please specify the "builder_model" in the constructor.'
)
build_manager = autogen.OpenAIWrapper(config_list=config_list)
agent_configs = []
self.building_task = building_task
print("Generating agents...")
resp_agent_name = (
config_list = autogen.config_list_from_json(
self.config_file_or_env,
file_location=self.config_file_location,
filter_dict={"model": [self.builder_model]},
)
if len(config_list) == 0:
raise RuntimeError(
f"Fail to initialize build manager: {self.builder_model} does not exist in {self.config_file_or_env}. "
f'If you want to change this model, please specify the "builder_model" in the constructor.'
)
build_manager = autogen.OpenAIWrapper(config_list=config_list)
print("==> Generating agents...")
resp_agent_name = (
build_manager.create(
messages=[
{
"role": "user",
"content": self.AGENT_NAME_PROMPT.format(task=building_task, max_agents=self.max_agents),
}
]
)
.choices[0]
.message.content
)
agent_name_list = [agent_name.strip().replace(" ", "_") for agent_name in resp_agent_name.split(",")]
print(f"{agent_name_list} are generated.")
print("==> Generating system message...")
agent_sys_msg_list = []
for name in agent_name_list:
print(f"Preparing system message for {name}")
resp_agent_sys_msg = (
build_manager.create(
messages=[
{
"role": "user",
"content": self.AGENT_NAME_PROMPT.format(task=building_task, max_agents=self.max_agents),
"content": self.AGENT_SYS_MSG_PROMPT.format(
task=building_task,
position=name,
default_sys_msg=autogen.AssistantAgent.DEFAULT_SYSTEM_MESSAGE,
),
}
]
)
.choices[0]
.message.content
)
agent_name_list = [agent_name.strip().replace(" ", "_") for agent_name in resp_agent_name.split(",")]
print(f"{agent_name_list} are generated.")
agent_sys_msg_list.append(resp_agent_sys_msg)
agent_sys_msg_list = []
for name in agent_name_list:
print(f"Preparing configuration for {name}...")
resp_agent_sys_msg = (
build_manager.create(
messages=[
{
"role": "user",
"content": self.AGENT_SYS_MSG_PROMPT.format(
task=building_task,
position=name,
default_sys_msg=autogen.AssistantAgent.DEFAULT_SYSTEM_MESSAGE,
),
}
]
)
.choices[0]
.message.content
print("==> Generating description...")
agent_description_list = []
for name in agent_name_list:
print(f"Preparing description for {name}")
resp_agent_description = (
build_manager.create(
messages=[
{
"role": "user",
"content": self.AGENT_DESCRIPTION_PROMPT.format(position=name),
}
]
)
agent_sys_msg_list.append(resp_agent_sys_msg)
for i in range(len(agent_name_list)):
agent_configs.append(
{"name": agent_name_list[i], "model": self.agent_model, "system_message": agent_sys_msg_list[i]}
)
if coding is None:
resp = (
build_manager.create(
messages=[{"role": "user", "content": self.CODING_PROMPT.format(task=building_task)}]
)
.choices[0]
.message.content
)
coding = True if resp == "YES" else False
for config in agent_configs:
print(f"Creating agent {config['name']} with backbone {config['model']}...")
self._create_agent(
config["name"],
config["model"],
default_llm_config,
system_message=config["system_message"],
use_oai_assistant=use_oai_assistant,
**kwargs,
.choices[0]
.message.content
)
agent_list = [agent_config[0] for agent_config in self.agent_procs_assign.values()]
agent_description_list.append(resp_agent_description)
if coding is True:
print("Adding user console proxy...")
agent_list = [
autogen.UserProxyAgent(
name="User_console_and_Python_code_interpreter",
is_termination_msg=lambda x: "TERMINATE" in x.get("content"),
system_message="User console with a python code interpreter interface.",
code_execution_config=code_execution_config,
human_input_mode="NEVER",
for name, sys_msg, description in list(zip(agent_name_list, agent_sys_msg_list, agent_description_list)):
agent_configs.append(
{"name": name, "model": self.agent_model, "system_message": sys_msg, "description": description}
)
if coding is None:
resp = (
build_manager.create(
messages=[{"role": "user", "content": self.CODING_PROMPT.format(task=building_task)}]
)
] + agent_list
.choices[0]
.message.content
)
coding = True if resp == "YES" else False
self.cached_configs.update(
{
@ -399,6 +461,225 @@ class AgentBuilder:
}
)
return self._build_agents(use_oai_assistant, **kwargs)
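A hedged usage sketch for the refactored build(): building_task and default_llm_config are now required, and the method returns the (agent_list, cached_configs) tuple via _build_agents. The task text and settings below are illustrative, reusing the builder from the earlier sketch:

agent_list, agent_configs = builder.build(
    building_task="Find a recent paper about gpt-4 on arxiv and analyze its potential applications in software.",
    default_llm_config={"temperature": 0},
    coding=True,  # set explicitly to skip the CODING_PROMPT round trip
    use_oai_assistant=False,
)
print([agent.name for agent in agent_list])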
def build_from_library(
self,
building_task: str,
library_path_or_json: str,
default_llm_config: Dict,
coding: Optional[bool] = True,
code_execution_config: Optional[Dict] = None,
use_oai_assistant: Optional[bool] = False,
embedding_model: Optional[str] = None,
**kwargs,
) -> Tuple[List[autogen.ConversableAgent], Dict]:
"""
Build agents from a library.
The library is a list of agent configs, each containing a name and a profile for the agent.
We use a build manager to decide which agents in that library should be involved in the task.
Args:
building_task: instruction that helps the build manager decide which agents should be built.
library_path_or_json: path of the agent library file, or a JSON string of the library.
default_llm_config: specific configs for LLM (e.g., config_list, seed, temperature, ...).
coding: used to identify whether the user proxy (a code interpreter) should be added.
code_execution_config: specific configs for user proxy (e.g., last_n_messages, work_dir, ...).
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
embedding_model: a Sentence-Transformers model used to select agents from the library by embedding similarity.
If None, the builder LLM will be prompted to select agents. For reference, chromadb uses "all-mpnet-base-v2" as default.
Returns:
agent_list: a list of agents.
cached_configs: cached configs.
"""
import chromadb
from chromadb.utils import embedding_functions
if code_execution_config is None:
code_execution_config = {
"last_n_messages": 2,
"work_dir": "groupchat",
"use_docker": False,
"timeout": 60,
}
agent_configs = []
config_list = autogen.config_list_from_json(
self.config_file_or_env,
file_location=self.config_file_location,
filter_dict={"model": [self.builder_model]},
)
if len(config_list) == 0:
raise RuntimeError(
f"Fail to initialize build manager: {self.builder_model} does not exist in {self.config_file_or_env}. "
f'If you want to change this model, please specify the "builder_model" in the constructor.'
)
build_manager = autogen.OpenAIWrapper(config_list=config_list)
try:
agent_library = json.loads(library_path_or_json)
except json.decoder.JSONDecodeError:
with open(library_path_or_json, "r") as f:
agent_library = json.load(f)
print("==> Looking for suitable agents in library...")
if embedding_model is not None:
chroma_client = chromadb.Client()
collection = chroma_client.create_collection(
name="agent_list",
embedding_function=embedding_functions.SentenceTransformerEmbeddingFunction(model_name=embedding_model),
)
collection.add(
documents=[agent["profile"] for agent in agent_library],
metadatas=[{"source": "agent_profile"} for _ in range(len(agent_library))],
ids=[f"agent_{i}" for i in range(len(agent_library))],
)
agent_profile_list = collection.query(query_texts=[building_task], n_results=self.max_agents)["documents"][
0
]
# search name from library
agent_name_list = []
for profile in agent_profile_list:
for agent in agent_library:
if agent["profile"] == profile:
agent_name_list.append(agent["name"])
break
chroma_client.delete_collection(collection.name)
print(f"{agent_name_list} are selected.")
else:
agent_profiles = [
f"No.{i + 1} AGENT's NAME: {agent['name']}\nNo.{i + 1} AGENT's PROFILE: {agent['profile']}\n\n"
for i, agent in enumerate(agent_library)
]
resp_agent_name = (
build_manager.create(
messages=[
{
"role": "user",
"content": self.AGENT_SEARCHING_PROMPT.format(
task=building_task, agent_list="".join(agent_profiles), max_agents=self.max_agents
),
}
]
)
.choices[0]
.message.content
)
agent_name_list = [agent_name.strip().replace(" ", "_") for agent_name in resp_agent_name.split(",")]
# search profile from library
agent_profile_list = []
for name in agent_name_list:
for agent in agent_library:
if agent["name"] == name:
agent_profile_list.append(agent["profile"])
break
print(f"{agent_name_list} are selected.")
print("==> Generating system message...")
# generate system message from profile
agent_sys_msg_list = []
for name, profile in list(zip(agent_name_list, agent_profile_list)):
print(f"Preparing system message for {name}...")
resp_agent_sys_msg = (
build_manager.create(
messages=[
{
"role": "user",
"content": self.AGENT_SYS_MSG_PROMPT.format(
task=building_task,
position=f"{name}\nPOSITION PROFILE: {profile}",
default_sys_msg=autogen.AssistantAgent.DEFAULT_SYSTEM_MESSAGE,
),
}
]
)
.choices[0]
.message.content
)
agent_sys_msg_list.append(resp_agent_sys_msg)
for name, sys_msg, description in list(zip(agent_name_list, agent_sys_msg_list, agent_profile_list)):
agent_configs.append(
{"name": name, "model": self.agent_model, "system_message": sys_msg, "description": description}
)
if coding is None:
resp = (
build_manager.create(
messages=[{"role": "user", "content": self.CODING_PROMPT.format(task=building_task)}]
)
.choices[0]
.message.content
)
coding = True if resp == "YES" else False
self.cached_configs.update(
{
"building_task": building_task,
"agent_configs": agent_configs,
"coding": coding,
"default_llm_config": default_llm_config,
"code_execution_config": code_execution_config,
}
)
return self._build_agents(use_oai_assistant, **kwargs)
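A hedged sketch of the new build_from_library() entry point; the library path and embedding model are illustrative (chromadb's own default is all-mpnet-base-v2), and passing embedding_model=None would fall back to prompting the builder LLM with AGENT_SEARCHING_PROMPT:

agent_list, agent_configs = builder.build_from_library(
    building_task="Find papers on arxiv about LLM applications in medical science and summarize them.",
    library_path_or_json="agent_library_example.json",  # a path or a JSON string of the library
    default_llm_config={"temperature": 0},
    embedding_model="all-mpnet-base-v2",  # select agents by embedding similarity
)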
def _build_agents(
self, use_oai_assistant: Optional[bool] = False, **kwargs
) -> Tuple[List[autogen.ConversableAgent], Dict]:
"""
Build agents with generated configs.
Args:
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
Returns:
agent_list: a list of agents.
cached_configs: cached configs.
"""
agent_configs = self.cached_configs["agent_configs"]
default_llm_config = self.cached_configs["default_llm_config"]
coding = self.cached_configs["coding"]
code_execution_config = self.cached_configs["code_execution_config"]
print("==> Creating agents...")
for config in agent_configs:
print(f"Creating agent {config['name']} with backbone {config['model']}...")
self._create_agent(
config["name"],
config["model"],
default_llm_config,
system_message=config["system_message"],
description=config["description"],
use_oai_assistant=use_oai_assistant,
**kwargs,
)
agent_list = [agent_config[0] for agent_config in self.agent_procs_assign.values()]
if coding is True:
print("Adding user console proxy...")
agent_list = (
[
autogen.UserProxyAgent(
name="User_console_and_code_interpreter",
is_termination_msg=lambda x: "TERMINATE" in x.get("content"),
system_message="User console with a python code interpreter interface.",
description="""A user console with a code interpreter interface.
It can provide the code execution results. Select this player when other players provide some code that needs to be executed.
DO NOT SELECT THIS PLAYER WHEN THERE IS NO CODE TO EXECUTE; IT WILL NOT ANSWER ANYTHING.""",
code_execution_config=code_execution_config,
human_input_mode="NEVER",
)
]
+ agent_list
)
return agent_list, self.cached_configs.copy()
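Downstream, the returned agent_list is typically dropped into a group chat; a sketch assuming agent_list comes from one of the build calls above and that a gpt-4 config is available:

import autogen

config_list = autogen.config_list_from_json("OAI_CONFIG_LIST", filter_dict={"model": ["gpt-4"]})
group_chat = autogen.GroupChat(agents=agent_list, messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=group_chat, llm_config={"config_list": config_list, "temperature": 0})
agent_list[0].initiate_chat(manager, message="Find a recent paper about gpt-4 on arxiv and list its potential software applications.")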
def save(self, filepath: Optional[str] = None) -> str:
@ -424,29 +705,60 @@ class AgentBuilder:
self,
filepath: Optional[str] = None,
config_json: Optional[str] = None,
use_oai_assistant: Optional[bool] = False,
**kwargs,
):
) -> Tuple[List[autogen.ConversableAgent], Dict]:
"""
Load building configs and call the build function to complete building without calling online LLM APIs.
Args:
filepath: path of the saved config file.
config_json: JSON string of the saved config.
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
Returns:
agent_list: a list of agents.
cached_configs: cached configs.
"""
# load json string.
if config_json is not None:
cached_configs = json.loads(config_json)
print("Loading config from JSON...")
_config_check(cached_configs)
return self.build(cached_configs=cached_configs, **kwargs)
cached_configs = json.loads(config_json)
# load from path.
if filepath is not None:
print(f"Loading config from {filepath}")
try:
with open(filepath) as f:
cached_configs = json.load(f)
except FileNotFoundError as e:
raise FileNotFoundError(f"{filepath} does not exist.") from e
_config_check(cached_configs)
return self.build(cached_configs=cached_configs, **kwargs)
with open(filepath) as f:
cached_configs = json.load(f)
_config_check(cached_configs)
agent_configs = cached_configs["agent_configs"]
default_llm_config = cached_configs["default_llm_config"]
coding = cached_configs["coding"]
if kwargs.get("code_execution_config", None) is not None:
# for test
self.cached_configs.update(
{
"building_task": cached_configs["building_task"],
"agent_configs": agent_configs,
"coding": coding,
"default_llm_config": default_llm_config,
"code_execution_config": kwargs["code_execution_config"],
}
)
del kwargs["code_execution_config"]
return self._build_agents(use_oai_assistant, **kwargs)
else:
code_execution_config = cached_configs["code_execution_config"]
self.cached_configs.update(
{
"building_task": cached_configs["building_task"],
"agent_configs": agent_configs,
"coding": coding,
"default_llm_config": default_llm_config,
"code_execution_config": code_execution_config,
}
)
return self._build_agents(use_oai_assistant, **kwargs)
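A hedged save/load round trip, showing how load() now rebuilds agents from cached configs without any online LLM calls (the file name is an assumption):

saved_path = builder.save("./saved_agent_configs.json")  # save() returns the path it wrote to

new_builder = AgentBuilder(config_file_or_env="OAI_CONFIG_LIST")
agent_list, agent_configs = new_builder.load(saved_path)  # no builder-LLM calls needed here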


@ -0,0 +1,74 @@
[
{
"name": "Environmental_Scientist",
"profile": "As an Environmental Scientist, the candidate should possess a strong background in environmental science, demonstrate the ability to effectively collaborate with a diverse team in a group chat to solve tasks, and have proficiency in Python for data analysis, without the need for code interpretation skills."
},
{
"name": "Astronomer",
"profile": "As an astronomer required to work collaboratively in a group chat setting, the candidate must possess strong proficiency in Python for data analysis and research purposes, alongside the ability to efficiently complete tasks assigned by leadership or colleagues without the need for code interpretation skills."
},
{
"name": "Software_Developer",
"profile": "As a Software Developer for this position, you must be able to work collaboratively in a group chat environment to complete tasks assigned by a leader or colleague, primarily using Python programming expertise, excluding the need for code interpretation skills."
},
{
"name": "Data_Analyst",
"profile": "As a Data Analyst for this position, you must be adept at analyzing data using Python, completing tasks assigned by leaders or colleagues, and collaboratively solving problems in a group chat setting with professionals of various roles."
},
{
"name": "Journalist",
"profile": "As a journalist in this position, you must possess strong collaboration and communication abilities to efficiently complete tasks assigned by leaders or colleagues within a group chat environment, without the need for code interpretation skills, although a basic understanding of Python is preferred."
},
{
"name": "Teacher",
"profile": "As a teacher, you need to possess a bachelor's degree in education or a related field, have a valid teaching certificate, be able to complete assignments provided by supervisors or colleagues, work collaboratively in group chats with professionals from various fields, and have a basic understanding of Python for educational purposes, excluding the need to interpret code."
},
{
"name": "Lawyer",
"profile": "As a lawyer in this position, you must possess a Juris Doctor degree, be licensed to practice law, have strong analytical and communication skills, be able to complete tasks assigned by leaders or colleagues, and collaborate effectively in group chat environments with professionals across various disciplines, while having a basic understanding of Python for task-related purposes, excluding code interpretation."
},
{
"name": "Programmer",
"profile": "As a Programmer for this position, you should be proficient in Python, able to effectively collaborate and solve problems within a group chat environment, and complete tasks assigned by leaders or colleagues without requiring expertise in code interpretation."
},
{
"name": "Accountant",
"profile": "As an accountant in this position, one should possess a strong proficiency in accounting principles, the ability to effectively collaborate within team environments, such as group chats, to solve tasks, and have a basic understanding of Python for limited coding tasks, all while being able to follow directives from leaders and colleagues."
},
{
"name": "Mathematician",
"profile": "As a mathematician in this position, you should possess an advanced degree in mathematics, excel at collaborating and communicating within a group chat to solve complex tasks alongside professionals from various disciplines, and have proficiency in Python for any required computational work."
},
{
"name": "Physicist",
"profile": "As a physicist for this position, one must hold a strong foundation in physics principles, possess a minimum of a master's degree in physics or related fields, demonstrate proficiency in Python for task-specific computations, be willing to collaborate and solve problems within a multidisciplinary group chat, and not be required to interpret code from languages other than Python."
},
{
"name": "Biologist",
"profile": "As a biologist for this position, one must hold a degree in biology or a related field, have proficiency in Python for data analysis, be able to complete tasks assigned by leaders or colleagues, and collaborate effectively in a group chat with professionals from various disciplines."
},
{
"name": "Chemist",
"profile": "As a chemist, one should possess a degree in chemistry or a related field, have strong analytical skills, work collaboratively within a team setting to complete tasks assigned by supervisors or peers, and have a basic proficiency in Python for any necessary data analysis."
},
{
"name": "Statistician",
"profile": "As a Statistician, the applicant should possess a strong background in statistics or mathematics, proficiency in Python for data analysis, the ability to work collaboratively in a team setting through group chats, and readiness to tackle and solve tasks delegated by supervisors or peers."
},
{
"name": "IT_Specialist",
"profile": "As an IT Specialist, you should possess strong problem-solving skills, be able to effectively collaborate within a team setting through group chats, complete tasks assigned by leaders or colleagues, and have proficiency in Python programming, excluding the need for code interpretation expertise."
},
{
"name": "Cybersecurity_Expert",
"profile": "As a Cybersecurity Expert, you must have the ability to collaborate in a group chat, completing tasks assigned by leaders or peers, and possess proficiency in Python, albeit without the need for code interpretation skills."
},
{
"name": "Artificial_Intelligence_Engineer",
"profile": "As an Artificial Intelligence Engineer, you should be adept in Python, able to fulfill tasks assigned by leaders or colleagues, and capable of collaboratively solving problems in a group chat with diverse professionals."
},
{
"name": "Financial_Analyst",
"profile": "As a Financial Analyst, one must possess strong analytical and problem-solving abilities, be proficient in Python for data analysis, have excellent communication skills to collaborate effectively in group chats, and be capable of completing assignments delegated by leaders or colleagues."
}
]
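Each library entry carries only a "name" and a free-text "profile". A small sketch (file name assumed) for validating a custom library before passing it to build_from_library:

import json

with open("agent_library_example.json") as f:
    library = json.load(f)

for entry in library:
    assert {"name", "profile"} <= entry.keys(), f"Malformed library entry: {entry}"
print(f"{len(library)} agents available:", [entry["name"] for entry in library])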

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large


@ -51,6 +51,7 @@ setuptools.setup(
"blendsearch": ["flaml[blendsearch]"],
"mathchat": ["sympy", "pydantic==1.10.9", "wolframalpha"],
"retrievechat": ["chromadb", "sentence_transformers", "pypdf", "ipython"],
"autobuild": ["chromadb", "sentence-transformers", "huggingface-hub"],
"teachable": ["chromadb"],
"lmm": ["replicate", "pillow"],
"graphs": ["networkx~=3.2.1", "matplotlib~=3.8.1"],


@ -0,0 +1,74 @@
[
{
"name": "Environmental_Scientist",
"profile": "As an Environmental Scientist, the candidate should possess a strong background in environmental science, demonstrate the ability to effectively collaborate with a diverse team in a group chat to solve tasks, and have proficiency in Python for data analysis, without the need for code interpretation skills."
},
{
"name": "Astronomer",
"profile": "As an astronomer required to work collaboratively in a group chat setting, the candidate must possess strong proficiency in Python for data analysis and research purposes, alongside the ability to efficiently complete tasks assigned by leadership or colleagues without the need for code interpretation skills."
},
{
"name": "Software_Developer",
"profile": "As a Software Developer for this position, you must be able to work collaboratively in a group chat environment to complete tasks assigned by a leader or colleague, primarily using Python programming expertise, excluding the need for code interpretation skills."
},
{
"name": "Data_Analyst",
"profile": "As a Data Analyst for this position, you must be adept at analyzing data using Python, completing tasks assigned by leaders or colleagues, and collaboratively solving problems in a group chat setting with professionals of various roles."
},
{
"name": "Journalist",
"profile": "As a journalist in this position, you must possess strong collaboration and communication abilities to efficiently complete tasks assigned by leaders or colleagues within a group chat environment, without the need for code interpretation skills, although a basic understanding of Python is preferred."
},
{
"name": "Teacher",
"profile": "As a teacher, you need to possess a bachelor's degree in education or a related field, have a valid teaching certificate, be able to complete assignments provided by supervisors or colleagues, work collaboratively in group chats with professionals from various fields, and have a basic understanding of Python for educational purposes, excluding the need to interpret code."
},
{
"name": "Lawyer",
"profile": "As a lawyer in this position, you must possess a Juris Doctor degree, be licensed to practice law, have strong analytical and communication skills, be able to complete tasks assigned by leaders or colleagues, and collaborate effectively in group chat environments with professionals across various disciplines, while having a basic understanding of Python for task-related purposes, excluding code interpretation."
},
{
"name": "Programmer",
"profile": "As a Programmer for this position, you should be proficient in Python, able to effectively collaborate and solve problems within a group chat environment, and complete tasks assigned by leaders or colleagues without requiring expertise in code interpretation."
},
{
"name": "Accountant",
"profile": "As an accountant in this position, one should possess a strong proficiency in accounting principles, the ability to effectively collaborate within team environments, such as group chats, to solve tasks, and have a basic understanding of Python for limited coding tasks, all while being able to follow directives from leaders and colleagues."
},
{
"name": "Mathematician",
"profile": "As a mathematician in this position, you should possess an advanced degree in mathematics, excel at collaborating and communicating within a group chat to solve complex tasks alongside professionals from various disciplines, and have proficiency in Python for any required computational work."
},
{
"name": "Physicist",
"profile": "As a physicist for this position, one must hold a strong foundation in physics principles, possess a minimum of a master's degree in physics or related fields, demonstrate proficiency in Python for task-specific computations, be willing to collaborate and solve problems within a multidisciplinary group chat, and not be required to interpret code from languages other than Python."
},
{
"name": "Biologist",
"profile": "As a biologist for this position, one must hold a degree in biology or a related field, have proficiency in Python for data analysis, be able to complete tasks assigned by leaders or colleagues, and collaborate effectively in a group chat with professionals from various disciplines."
},
{
"name": "Chemist",
"profile": "As a chemist, one should possess a degree in chemistry or a related field, have strong analytical skills, work collaboratively within a team setting to complete tasks assigned by supervisors or peers, and have a basic proficiency in Python for any necessary data analysis."
},
{
"name": "Statistician",
"profile": "As a Statistician, the applicant should possess a strong background in statistics or mathematics, proficiency in Python for data analysis, the ability to work collaboratively in a team setting through group chats, and readiness to tackle and solve tasks delegated by supervisors or peers."
},
{
"name": "IT_Specialist",
"profile": "As an IT Specialist, you should possess strong problem-solving skills, be able to effectively collaborate within a team setting through group chats, complete tasks assigned by leaders or colleagues, and have proficiency in Python programming, excluding the need for code interpretation expertise."
},
{
"name": "Cybersecurity_Expert",
"profile": "As a Cybersecurity Expert, you must have the ability to collaborate in a group chat, completing tasks assigned by leaders or peers, and possess proficiency in Python, albeit without the need for code interpretation skills."
},
{
"name": "Artificial_Intelligence_Engineer",
"profile": "As an Artificial Intelligence Engineer, you should be adept in Python, able to fulfill tasks assigned by leaders or colleagues, and capable of collaboratively solving problems in a group chat with diverse professionals."
},
{
"name": "Financial_Analyst",
"profile": "As a Financial Analyst, one must possess strong analytical and problem-solving abilities, be proficient in Python for data analysis, have excellent communication skills to collaborate effectively in group chats, and be capable of completing assignments delegated by leaders or colleagues."
}
]


@ -1,20 +1,35 @@
{
"building_task": "Find a paper on arxiv by programming, and analyze its application in some domain. For example, find a recent paper about gpt-4 on arxiv and find its potential applications in software.",
"building_task": "Generate some agents that can find papers on arxiv by programming and analyzing them in specific domains related to computer science and medical science.",
"agent_configs": [
{
"name": "Data_Scientist",
"name": "ArXiv_Data_Scraper_Developer",
"model": "gpt-4",
"system_message": "You are a proficient Data Scientist with strong Python skills and the ability to analyze academic papers, particularly from arxiv in the domain of programming. Ideally, your tasks involve identifying significant work in the field, such as recent papers on topics like gpt-4, and evaluating their potential applications in areas like software. You should be confident in providing outputs in the form of recommendations, insights, or analytical summaries based solely on the result of your analysis without any additional user feedback or actions. \n\nDetails of your work should include: \n\n 1. Identifying and obtaining the information needed for your task, such as browsing or searching the web, downloading/reading a file, printing the content of a webpage or a file. You'll use Python code to achieve these and more. The output should be comprehensive enough that your following steps based on data analysis can be conducted without requiring any user intervention.\n 2. Performing your main task, which is executing Python code to extract insights and applying your data science expertise to analyze those insights. You will present these results in a manner that satisfies the user's goals without needing further modification or user input. \n 3. Explaining your work in a step-by-step manner. If a plan is not provided initially, you need to formulate and explain your plan first. Clearly distinguish between steps involving coding and those dependent on your data science skills.\n 4. Indicating any errors in the code execution and proposing immediate fixes. If a fix isn't possible, or if the results don't satisfy the goals even after successful execution, you need to adjust your approach accordingly.\n 5. Verifying your results to ensure accuracy. If verifiable evidence can be provided to support your conclusion, make sure to include it in your response.\n \nWhen the task is completed to the satisfaction of the user, you should recognize this and reply with \"TERMINATE\"."
"system_message": "You are now in a group chat. You need to complete a task with other participants. As an ArXiv_Data_Scraper_Developer, your focus is to create and refine tools capable of intelligent search and data extraction from arXiv, honing in on topics within the realms of computer science and medical science. Utilize your proficiency in Python programming to design scripts that navigate, query, and parse information from the platform, generating valuable insights and datasets for analysis. \n\nDuring your mission, it\u2019s not just about formulating queries; your role encompasses the optimization and precision of the data retrieval process, ensuring relevance and accuracy of the information extracted. If you encounter an issue with a script or a discrepancy in the expected output, you are encouraged to troubleshoot and offer revisions to the code you find in the group chat.\n\nWhen you reach a point where the existing codebase does not fulfill task requirements or if the operation of provided code is unclear, you should ask for help from the group chat manager. They will facilitate your advancement by providing guidance or appointing another participant to assist you. Your ability to adapt and enhance scripts based on peer feedback is critical, as the dynamic nature of data scraping demands ongoing refinement of techniques and approaches.\n\nWrap up your participation by confirming the user's need has been satisfied with the data scraping solutions you've provided. Indicate the completion of your task by replying \"TERMINATE\" in the group chat.",
"description": "ArXiv_Data_Scraper_Developer is a specialized software development role requiring proficiency in Python, including familiarity with web scraping libraries such as BeautifulSoup or Scrapy, and a solid understanding of APIs and data parsing. They must possess the ability to identify and correct errors in existing scripts and confidently engage in technical discussions to improve data retrieval processes. The role also involves a critical eye for troubleshooting and optimizing code to ensure efficient data extraction from the ArXiv platform for research and analysis purposes."
},
{
"name": "Machine_Learning_Engineer",
"name": "Computer_Science_Research_Analyst",
"model": "gpt-4",
"system_message": "As a Machine Learning Engineer, your primary tasks involve researching, developing, and applying machine learning and data analysis for complex tasks. In relation to the task at hand, you are expected to find a paper on arxiv using programming techniques, analyze the paper, and discuss its applications in a specific domain, using GPT-4 as an example.\n\nYou will need expertise in Python for implementing your programming skills. If any additional information is required, utilize Python scripts to collect, retrieve, and present the required data by browsing or searching the internet, downloading or reading a file, printing content from a webpage or a file, retrieving the current date/time, or checking the operating system.\n\nUpon collecting the necessary information, use your professional judgment to analyze the data and solve the task at hand. Ensure to perform each task comprehensively and intelligently, presenting each step clearly, specifying when Python code was used and when it was purely your analytical skills. Specify the type of script used in the code block while suggesting a one-time executable Python code to the user, making sure that the code doesn't need modification or addition by the user. If necessary, instruct the user on how to store code into a file prior to execution.\n\nAlways confirm the execution results returned by the user. If there is an error in the execution, you are to correct the error, provide the user with the corrected full script, and prevent suggesting partial or incomplete codes. If an issue persists, revisit your assumptions, gather more data, and consider alternate approaches. Whenever you attain a solution to a task, carefully validate the answer and provide verifiable evidence where possible.\n\nLastly, reply \"TERMINATE\" once the task is complete and all needs have been addressed."
"system_message": "You are now in a group chat. You need to complete a task with other participants. As a Computer Science Research Analyst, your objective is to utilize your analytical capabilities to identify and examine scholarly articles on arXiv, focusing on areas bridging computer science and medical science. Employ Python for automation where appropriate and leverage your expertise in the subject matter to draw insights from the research.\n\nEnsure that the information is acquired systematically; tap into online databases, interpret data sets, and perform literature reviews to pinpoint relevant findings. Should you encounter a complex problem or if you find your progress stalled, feel free to question the existing approaches discussed in the chat or contribute an improved method or analysis.\n\nIf the task proves to be beyond your current means or if you face uncertainty at any stage, seek assistance from the group chat manager. The manager is available to provide guidance or to involve another expert if necessary to move forward effectively.\n\nYour contributions are crucial, and it is important to communicate your findings and conclusions clearly. Once you believe the task is complete and the group's need has been satisfied, please affirm the completion by replying \"TERMINATE\".",
"description": "Computer_Science_Research_Analyst is a role requiring strong analytical skills, a deep understanding of computer science concepts, and proficiency in Python for data analysis and automation. This position should have the ability to critically assess the validity of information, challenge assumptions, and provide evidence-based corrections or alternatives. They should also have excellent communication skills to articulate their findings and suggestions effectively within the group chat."
},
{
"name": "Research_Analyst",
"name": "Medical_Science_Research_Analyst",
"model": "gpt-4",
"system_message": "You are a proficient Research Analyst with a knack for finding and interpreting cutting-edge research in technical fields. Your ability to use Python programming to search, collect and present relevant information is a substantial part of your role.\n\nCarrying out tasks, such as navigating web platforms and downloading/reading files, requires expert use of Python code for execution. You can create detailed scripts like browsing the internet, printing webpage content or a file, obtaining the current date and time, and confirming the operating system. Once enough information has been amassed, harness your understanding of the subject matter to solve the task without the need for more code.\n\nDemonstrating intelligent problem-solving, as well as precise and efficient code execution, is paramount in this job. Perform tasks smartly and in a planned sequence if required. If a plan isn't given, outline your own first.\n\nBe especially clear about the steps that necessitate code and those that use your language competence. Specify the script type within Python code blocks, and ensure the code does not need to be altered by the user before execution. There should be only one code block per response.\n\nIf you need to save codes in a file, signify this by starting your Python code block with # filename: <filename>. Avoid asking the user to copy and paste results. Instead, generate output using the Python 'print' function.\n\nScrutinize the user's execution results and if an error crops up, rectify it immediately. Focus on providing the complete code rather than partial code snippets. If an error persists despite numerous attempts, reassess your assumptions, gather more information if needed, and explore different problem-solving strategies.\n\nPrecision is key when fruitful answers come into view. Strive for careful validation of all answers and, if feasible, include verifiable evidence in your post.\n\nOnce all matters have been diligently addressed, calmly respond back with \"TERMINATE\" to indicate the successful completion of the task."
"system_message": "You are now in a group chat. You need to complete a task with other participants. As a Medical_Science_Research_Analyst, your function is to harness your analytical strengths and understanding of medical research to source and evaluate pertinent papers from the arXiv database, focusing on the intersection of computer science and medical science. Utilize your Python programming skills to automate data retrieval and analysis tasks. Engage in systematic data mining to extract relevant content, then apply your analytical expertise to interpret the findings qualitatively. \n\nWhen there is a requirement to gather information, employ Python scripts to automate the aggregation process. This could include scraping web data, retrieving and processing documents, and performing content analyses. When these scripts produce outputs, use your subject matter expertise to evaluate the results. \n\nProgress through your task step by step. When an explicit plan is absent, present a structured outline of your intended methodology. Clarify which segments of the task are handled through automation, and which necessitate your interpretative skills. \n\nIn the event code is utilized, the script type must be specified. You are expected to execute the scripts provided without making changes. Scripts are to be complete and functionally standalone. Should you encounter an error upon execution, critically review the output, and if needed, present a revised script for the task at hand. \n\nFor tasks that require saving and executing scripts, indicate the intended filename at the beginning of the script. \n\nMaintain clear communication of the results by harnessing the 'print' function where applicable. If an error arises or a task remains unsolved after successful code execution, regroup to collect additional information, reassess your approach, and explore alternative strategies. \n\nUpon reaching a conclusion, substantiate your findings with credible evidence where possible.\n\nConclude your participation by confirming the task's completion with a \"TERMINATE\" response.\n\nShould uncertainty arise at any point, seek guidance from the group chat manager for further directives or reassignment of the task.",
"description": "The Medical Science Research Analyst is a professionally trained individual with strong analytical skills, specializing in interpreting and evaluating scientific research within the medical field. They should possess expertise in data analysis, likely with proficiency in Python for analyzing datasets, and have the ability to critically assess the validity and relevance of previous messages or findings relayed in the group chat. This role requires a solid foundation in medical knowledge to provide accurate and evidence-based corrections or insights."
},
{
"name": "Data_Analysis_Engineer",
"model": "gpt-4",
"system_message": "You are now in a group chat. You need to complete a task with other participants. As a Data Analysis Engineer, your role involves leveraging your analytical skills to gather, process, and analyze large datasets. You will employ various data analysis techniques and tools, particularly Python for scripting, to extract insights from the data related to computer science and medical science domains on arxiv.\n\nIn scenarios where information needs to be collected or analyzed, you will develop Python scripts to automate the data retrieval and processing tasks. For example, you may write scripts to scrape the arXiv website, parse metadata of research papers, filter content based on specific criteria, and perform statistical analysis or data visualization. \n\nYour workflow will include the following steps:\n\n1. Use your Python coding abilities to design scripts for data extraction and analysis. This can involve browsing or searching the web, downloading and reading files, or printing the content of web pages or files relevant to the given domains.\n2. After gathering the necessary data, apply your data analysis expertise to derive meaningful insights or patterns present in the data. This should be done methodically, making the most of your Python skills for data manipulation and interpretation.\n3. Communicate your findings clearly to the group chat. Ensure the results are straightforward for others to understand and act upon.\n4. If any issues arise from executing the code, such as lack of output or unexpected results, you can question the previous messages or code in the group chat and attempt to provide a corrected script or analysis.\n5. When uncertain or facing a complex problem that you cannot solve alone, ask for assistance from the group chat manager. They can either provide guidance or assign another participant to help you.\n\nOnce you believe the task is completed satisfactorily, and you have fulfilled the user's need, respond with \"TERMINATE\" to signify the end of your contribution to the task. Remember, while technical proficiency in Python is essential for this role, the ability to work collaboratively within the group chat, communicate effectively, and adapt to challenges is equally important.",
"description": "Data_Analysis_Engineer is a professional adept in collecting, analyzing, and interpreting large datasets, using statistical tools and machine learning techniques to provide actionable insights. They should possess strong Python coding skills for data manipulation and analysis, an understanding of database management, as well as the ability to communicate complex results effectively to non-technical stakeholders. This position should be allowed to speak when data-driven clarity is needed or when existing analyses or methodologies are called into question."
},
{
"name": "ML_Paper_Summarization_Specialist",
"model": "gpt-4",
"system_message": "You are now in a group chat. You need to complete a task with other participants. As an ML_Paper_Summarization_Specialist, your role entails leveraging machine learning techniques to extract and analyze academic papers from arXiv, focusing on domains that intersect computer science and medical science. Utilize your expertise in natural language processing and data analysis to identify relevant papers, extract key insights, and generate summaries that accurately reflect the advancements and findings within those papers.\n\nYou are expected to apply your deep understanding of machine learning algorithms, data mining, and information retrieval to construct models and systems that can efficiently process and interpret scientific literature.\n\nIf you encounter any challenges in accessing papers, parsing content, or algorithmic processing, you may seek assistance by presenting your issue to the group chat. Should there be a disagreement regarding the efficacy of a method or the accuracy of a summarization, you are encouraged to critically evaluate previous messages or outputs and offer improved solutions to enhance the group's task performance.\n\nShould confusion arise during the task, rather than relying on coding scripts, please request guidance from the group chat manager, and allow them to facilitate the necessary support by inviting another participant who can aid in overcoming the current obstacle.\n\nRemember, your primary duty is to synthesize complex academic content into concise, accessible summaries that will serve as a valuable resource for researchers and professionals seeking to stay abreast of the latest developments in their respective fields. \n\nOnce you believe your task is completed and the summaries provided meet the necessary standards of accuracy and comprehensiveness, reply \"TERMINATE\" to signal the end of your contribution to the group's task.",
"description": "The ML_Paper_Summarization_Specialist is a professional adept in machine learning concepts and current research trends, with strong analytical skills to critically evaluate information, synthesizing knowledge from academic papers into digestible summaries. This specialist should be proficient in Python for text processing and have the ability to provide constructive feedback on technical discussions, guide effective implementation, and correct misconceptions or errors related to machine learning theory and practice in the chat. They should be a reliable resource for clarifying complex information and ensuring accurate application of machine learning techniques within the group chat context."
}
],
"coding": true,
@ -22,9 +37,9 @@
"temperature": 0
},
"code_execution_config": {
"last_n_messages": 2,
"work_dir": "/home/elpis_ubuntu/autogen/test/agentchat/contrib/test_agent_scripts",
"work_dir": "groupchat",
"use_docker": false,
"timeout": 60,
"use_docker": false
"last_n_messages": 2
}
}

View File

@ -2,45 +2,50 @@ import pytest
import os
import json
import sys
from packaging.requirements import Requirement
from autogen.agentchat.contrib.agent_builder import AgentBuilder
from autogen import UserProxyAgent
sys.path.append(os.path.join(os.path.dirname(__file__), "../.."))
from conftest import skip_openai # noqa: E402
sys.path.append(os.path.join(os.path.dirname(__file__), ".."))
from test_assistant_agent import KEY_LOC, OAI_CONFIG_LIST # noqa: E402
sys.path.append(os.path.join(os.path.dirname(__file__), "../.."))
from conftest import skip_openai # noqa: E402
from test_assistant_agent import OAI_CONFIG_LIST, KEY_LOC # noqa: E402
here = os.path.abspath(os.path.dirname(__file__))
oai_config_path = OAI_CONFIG_LIST
# openai>=1 required
try:
from openai import OpenAI, APIError
from openai.types.chat import ChatCompletion
from openai.types.chat.chat_completion import ChatCompletionMessage, Choice
from openai.types.completion import Completion
from openai.types.completion_usage import CompletionUsage
import diskcache
import openai
except ImportError:
skip = True
else:
skip = False or skip_openai
def _config_check(config):
# check config loading
assert config.get("coding", None) is not None
assert config.get("default_llm_config", None) is not None
assert config.get("code_execution_config", None) is not None
for agent_config in config["agent_configs"]:
assert agent_config.get("name", None) is not None
assert agent_config.get("model", None) is not None
assert agent_config.get("description", None) is not None
assert agent_config.get("system_message", None) is not None
@pytest.mark.skipif(
skip,
reason="openai not installed OR requested to skip",
reason="do not run when dependency is not installed or requested to skip",
)
def test_build():
builder = AgentBuilder(config_path=oai_config_path, builder_model="gpt-4", agent_model="gpt-4")
builder = AgentBuilder(
config_file_or_env=OAI_CONFIG_LIST, config_file_location=KEY_LOC, builder_model="gpt-4", agent_model="gpt-4"
)
building_task = (
"Find a paper on arxiv by programming, and analyze its application in some domain. "
"For example, find a recent paper about gpt-4 on arxiv "
"and find its potential applications in software."
)
builder.build(
agent_list, agent_config = builder.build(
building_task=building_task,
default_llm_config={"temperature": 0},
code_execution_config={
@ -50,21 +55,82 @@ def test_build():
"use_docker": "python:3",
},
)
_config_check(agent_config)
# check number of agents
assert len(builder.agent_procs_assign.keys()) <= builder.max_agents
assert len(agent_config["agent_configs"]) <= builder.max_agents
# check system message
for agent, proc in builder.agent_procs_assign.values():
assert "TERMINATE" in agent.system_message
for cfg in agent_config["agent_configs"]:
assert "TERMINATE" in cfg["system_message"]
@pytest.mark.skipif(
skip,
reason="openai not installed OR requested to skip",
reason="do not run when dependency is not installed or requested to skip",
)
def test_build_from_library():
builder = AgentBuilder(
config_file_or_env=OAI_CONFIG_LIST, config_file_location=KEY_LOC, builder_model="gpt-4", agent_model="gpt-4"
)
building_task = (
"Find a paper on arxiv by programming, and analyze its application in some domain. "
"For example, find a recent paper about gpt-4 on arxiv "
"and find its potential applications in software."
)
agent_list, agent_config = builder.build_from_library(
building_task=building_task,
library_path_or_json=f"{here}/example_agent_builder_library.json",
default_llm_config={"temperature": 0},
code_execution_config={
"last_n_messages": 2,
"work_dir": f"{here}/test_agent_scripts",
"timeout": 60,
"use_docker": "python:3",
},
)
_config_check(agent_config)
# check number of agents
assert len(agent_config["agent_configs"]) <= builder.max_agents
# check system message
for cfg in agent_config["agent_configs"]:
assert "TERMINATE" in cfg["system_message"]
builder.clear_all_agents()
# test embedding similarity selection
agent_list, agent_config = builder.build_from_library(
building_task=building_task,
library_path_or_json=f"{here}/example_agent_builder_library.json",
default_llm_config={"temperature": 0},
embedding_model="all-mpnet-base-v2",
code_execution_config={
"last_n_messages": 2,
"work_dir": f"{here}/test_agent_scripts",
"timeout": 60,
"use_docker": "python:3",
},
)
_config_check(agent_config)
# check number of agents
assert len(agent_config["agent_configs"]) <= builder.max_agents
# check system message
for cfg in agent_config["agent_configs"]:
assert "TERMINATE" in cfg["system_message"]
@pytest.mark.skipif(
skip,
reason="do not run when dependency is not installed or requested to skip",
)
def test_save():
builder = AgentBuilder(config_path=oai_config_path, builder_model="gpt-4", agent_model="gpt-4")
builder = AgentBuilder(
config_file_or_env=OAI_CONFIG_LIST, config_file_location=KEY_LOC, builder_model="gpt-4", agent_model="gpt-4"
)
building_task = (
"Find a paper on arxiv by programming, and analyze its application in some domain. "
"For example, find a recent paper about gpt-4 on arxiv "
@ -88,25 +154,20 @@ def test_save():
saved_configs = json.load(open(saved_files))
# check config format
assert saved_configs.get("building_task", None) is not None
assert saved_configs.get("agent_configs", None) is not None
assert saved_configs.get("coding", None) is not None
assert saved_configs.get("default_llm_config", None) is not None
_config_check(saved_configs)
@pytest.mark.skipif(
skip,
reason="openai not installed OR requested to skip",
reason="do not run when dependency is not installed or requested to skip",
)
def test_load():
builder = AgentBuilder(config_path=oai_config_path, builder_model="gpt-4", agent_model="gpt-4")
builder = AgentBuilder(
config_file_or_env=OAI_CONFIG_LIST, config_file_location=KEY_LOC, builder_model="gpt-4", agent_model="gpt-4"
)
config_save_path = f"{here}/example_test_agent_builder_config.json"
configs = json.load(open(config_save_path))
agent_configs = {
e["name"]: {"model": e["model"], "system_message": e["system_message"]} for e in configs["agent_configs"]
}
json.load(open(config_save_path, "r"))
agent_list, loaded_agent_configs = builder.load(
config_save_path,
@ -117,25 +178,19 @@ def test_load():
"use_docker": "python:3",
},
)
print(loaded_agent_configs)
# check config loading
assert loaded_agent_configs["coding"] == configs["coding"]
if loaded_agent_configs["coding"] is True:
assert isinstance(agent_list[0], UserProxyAgent)
agent_list = agent_list[1:]
for agent in agent_list:
agent_name = agent.name
assert agent_configs.get(agent_name, None) is not None
assert agent_configs[agent_name]["model"] == agent.llm_config["model"]
assert agent_configs[agent_name]["system_message"] == agent.system_message
_config_check(loaded_agent_configs)
@pytest.mark.skipif(
skip,
reason="openai not installed OR requested to skip",
reason="do not run when dependency is not installed or requested to skip",
)
def test_clear_agent():
builder = AgentBuilder(config_path=oai_config_path, builder_model="gpt-4", agent_model="gpt-4")
builder = AgentBuilder(
config_file_or_env=OAI_CONFIG_LIST, config_file_location=KEY_LOC, builder_model="gpt-4", agent_model="gpt-4"
)
config_save_path = f"{here}/example_test_agent_builder_config.json"
builder.load(
@ -151,3 +206,11 @@ def test_clear_agent():
# check if the agent cleared
assert len(builder.agent_procs_assign) == 0
if __name__ == "__main__":
test_build()
test_build_from_library()
test_save()
test_load()
test_clear_agent()

View File

@ -14,7 +14,7 @@ user prompt required, powered by a new designed class **AgentBuilder**. AgentBui
leveraging [vLLM](https://docs.vllm.ai/en/latest/index.html) and [FastChat](https://github.com/lm-sys/FastChat).
Check out the example notebooks and source code for reference:
- [AutoBuild Examples](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_autobuild.ipynb)
- [AutoBuild Examples](https://github.com/microsoft/autogen/blob/main/notebook/autobuild_basic.ipynb)
- [AgentBuilder](https://github.com/microsoft/autogen/blob/main/autogen/agentchat/contrib/agent_builder.py)
## Introduction
@ -29,7 +29,7 @@ up an endpoint server automatically without any user participation.
## Installation
- AutoGen:
```bash
pip install pyautogen~=0.2.0
pip install pyautogen[autobuild]
```
- (Optional: if you want to use open-source LLMs) vLLM and FastChat
```bash
@ -43,7 +43,7 @@ In this section, we provide a step-by-step example of how to use AgentBuilder to
First, we need to prepare the Agent configurations.
Specifically, a config file (or environment variable) containing the model name and API key, and a default LLM config for each agent, are required.
```python
config_path = '/home/elpis_ubuntu/LLM/autogen/OAI_CONFIG_LIST' # modify path
config_file_or_env = '/home/elpis_ubuntu/LLM/autogen/OAI_CONFIG_LIST' # modify path
default_llm_config = {
'temperature': 0
}
@ -55,7 +55,7 @@ You can also specify the builder model and agent model, which are the LLMs used
```python
from autogen.agentchat.contrib.agent_builder import AgentBuilder
builder = AgentBuilder(config_path=config_path, builder_model='gpt-4-1106-preview', agent_model='gpt-4-1106-preview')
builder = AgentBuilder(config_file_or_env=config_file_or_env, builder_model='gpt-4-1106-preview', agent_model='gpt-4-1106-preview')
```
### Step 3: specify the building task
@ -80,9 +80,10 @@ For example
// an example of agent_configs. AgentBuilder will generate agents with the following configurations.
[
{
"name": "Data_scientist",
"name": "ArXiv_Data_Scraper_Developer",
"model": "gpt-4-1106-preview",
"system_message": "As a Data Scientist, you are tasked with automating the retrieval and analysis of academic papers from arXiv. Utilize your Python programming acumen to develop scripts for gathering necessary information such as searching for relevant papers, downloading them, and processing their contents. Apply your analytical and language skills to interpret the data and deduce the applications of the research within specific domains.\n\n1. To compile information, write and implement Python scripts that search and interact with online resources, download and read files, extract content from documents, and perform other information-gathering tasks. Use the printed output as the foundation for your subsequent analysis.\n\n2. Execute tasks programmatically with Python scripts when possible, ensuring results are directly displayed. Approach each task with efficiency and strategic thinking.\n\nProgress through tasks systematically. In instances where a strategy is not provided, outline your plan before executing. Clearly distinguish between tasks handled via code and those utilizing your analytical expertise.\n\nWhen providing code, include only Python scripts meant to be run without user alterations. Users should execute your script as is, without modifications:\n\n```python\n# filename: <filename>\n# Python script\nprint(\"Your output\")\n```\n\nUsers should not perform any actions other than running the scripts you provide. Avoid presenting partial or incomplete scripts that require user adjustments. Refrain from requesting users to copy-paste results; instead, use the 'print' function when suitable to display outputs. Monitor the execution results they share.\n\nIf an error surfaces, supply corrected scripts for a re-run. If the strategy fails to resolve the issue, reassess your assumptions, gather additional details as needed, and explore alternative approaches.\n\nUpon successful completion of a task and verification of the results, confirm the achievement of the stated objective. Ensuring accuracy and validity of the findings is paramount. Evidence supporting your conclusions should be provided when feasible.\n\nUpon satisfying the user's needs and ensuring all tasks are finalized, conclude your assistance with \"TERMINATE\"."
"system_message": "You are now in a group chat. You need to complete a task with other participants. As an ArXiv_Data_Scraper_Developer, your focus is to create and refine tools capable of intelligent search and data extraction from arXiv, honing in on topics within the realms of computer science and medical science. Utilize your proficiency in Python programming to design scripts that navigate, query, and parse information from the platform, generating valuable insights and datasets for analysis. \n\nDuring your mission, it\u2019s not just about formulating queries; your role encompasses the optimization and precision of the data retrieval process, ensuring relevance and accuracy of the information extracted. If you encounter an issue with a script or a discrepancy in the expected output, you are encouraged to troubleshoot and offer revisions to the code you find in the group chat.\n\nWhen you reach a point where the existing codebase does not fulfill task requirements or if the operation of provided code is unclear, you should ask for help from the group chat manager. They will facilitate your advancement by providing guidance or appointing another participant to assist you. Your ability to adapt and enhance scripts based on peer feedback is critical, as the dynamic nature of data scraping demands ongoing refinement of techniques and approaches.\n\nWrap up your participation by confirming the user's need has been satisfied with the data scraping solutions you've provided. Indicate the completion of your task by replying \"TERMINATE\" in the group chat.",
"description": "ArXiv_Data_Scraper_Developer is a specialized software development role requiring proficiency in Python, including familiarity with web scraping libraries such as BeautifulSoup or Scrapy, and a solid understanding of APIs and data parsing. They must possess the ability to identify and correct errors in existing scripts and confidently engage in technical discussions to improve data retrieval processes. The role also involves a critical eye for troubleshooting and optimizing code to ensure efficient data extraction from the ArXiv platform for research and analysis purposes."
},
...
]
@ -94,7 +95,7 @@ Let agents generated in `build()` complete the task collaboratively in a group c
import autogen
def start_task(execution_task: str, agent_list: list, llm_config: dict):
config_list = autogen.config_list_from_json(config_path, filter_dict={"model": ["gpt-4-1106-preview"]})
config_list = autogen.config_list_from_json(config_file_or_env, filter_dict={"model": ["gpt-4-1106-preview"]})
group_chat = autogen.GroupChat(agents=agent_list, messages=[], max_round=12)
manager = autogen.GroupChatManager(
@ -131,27 +132,36 @@ Configurations will be saved in JSON format with the following content:
{
"name": "...",
"model": "...",
"system_message": "..."
"system_message": "...",
"description": "..."
},
...
],
"manager_system_message": "...",
"coding": true,
"default_llm_config": {
"temperature": 0
}
"code_execution_config": {...},
"default_llm_config": {...}
}
```
You can provide a specific filename; otherwise, AgentBuilder will save the config to the current path with the generated filename `save_config_TASK_MD5.json`.
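A minimal sketch of the save call (assuming `builder` has already completed `build()`; the explicit filename is illustrative):
```python
# Save the generated team; returns the path of the written JSON config.
saved_path = builder.save()  # auto-named save_config_TASK_MD5.json in the current path
# Or choose the filename yourself:
saved_path = builder.save("./autobuild_team_config.json")
```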
You can load the saved config and skip the building process. AgentBuilder will create the agents from that information without prompting the build manager.
```python
new_builder = AgentBuilder(config_path=config_path)
new_builder = AgentBuilder(config_file_or_env=config_file_or_env)
agent_list, agent_config = new_builder.load(saved_path)
start_task(...) # skip build()
```
## Use Open-source LLM
## Use OpenAI Assistant
[Assistants API](https://platform.openai.com/docs/assistants/overview) allows you to build AI assistants within your own applications.
An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries.
AutoBuild also supports the Assistants API: pass `use_oai_assistant=True` to `build()`.
```python
# Transfer to the OpenAI Assistant API.
agent_list, agent_config = new_builder.build(building_task, default_llm_config, use_oai_assistant=True)
...
```
## (Experimental) Use Open-source LLM
AutoBuild supports open-source LLMs via [vLLM](https://docs.vllm.ai/en/latest/index.html) and [FastChat](https://github.com/lm-sys/FastChat).
Check the supported model list [here](https://docs.vllm.ai/en/latest/models/supported_models.html).
After satisfying the requirements, you can add an open-source LLM's huggingface repository to the config file,
@ -168,16 +178,6 @@ After satisfying the requirements, you can add an open-source LLM's huggingface
and specify it when initializing AgentBuilder.
AgentBuilder will automatically set up an endpoint server for the open-source LLM. Make sure you have sufficient GPU resources.
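For illustration, the pieces might fit together as follows (a sketch, not part of this PR: it assumes a vLLM/FastChat server exposing an OpenAI-compatible API at `http://localhost:8000/v1`, and the model name and config fields are placeholders to adapt to your setup):
```python
# Example OAI_CONFIG_LIST entry for a locally served open-source model
# (illustrative values; the local server is assumed to expose an
#  OpenAI-compatible API, e.g. via vLLM + FastChat):
#
#   {
#       "model": "meta-llama/Llama-2-13b-chat-hf",
#       "base_url": "http://localhost:8000/v1",
#       "api_key": "NULL"
#   }

from autogen.agentchat.contrib.agent_builder import AgentBuilder

builder = AgentBuilder(
    config_file_or_env=config_file_or_env,           # path to the config file containing the entry above
    builder_model="meta-llama/Llama-2-13b-chat-hf",  # open-source model that designs the team
    agent_model="meta-llama/Llama-2-13b-chat-hf",    # open-source model that powers the generated agents
)
```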
## Use OpenAI Assistant
[Assistants API](https://platform.openai.com/docs/assistants/overview) allows you to build AI assistants within your own applications.
An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries.
AutoBuild also supports the assistant API by adding `use_oai_assistant=True` to `build()`.
```python
# Transfer to the OpenAI Assistant API.
agent_list, agent_config = new_builder.build(building_task, default_llm_config, use_oai_assistant=True)
...
```
## Future work/Roadmap
- Let the builder select the best agents from a given library/database to solve the task (a first step ships in this PR as `build_from_library`, sketched below).
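As a reference for that first step, a minimal usage sketch of `build_from_library` (its signature mirrors the test above; the config location, library path, and task text are illustrative):
```python
from autogen.agentchat.contrib.agent_builder import AgentBuilder

builder = AgentBuilder(
    config_file_or_env="OAI_CONFIG_LIST",  # a config file path or the name of an environment variable
    builder_model="gpt-4",
    agent_model="gpt-4",
)

# Select suitable agents from a prebuilt library instead of generating them from scratch.
agent_list, agent_config = builder.build_from_library(
    building_task="Find a recent paper about gpt-4 on arxiv and find its potential applications in software.",
    library_path_or_json="./agent_library_example.json",  # illustrative path to a JSON agent library
    default_llm_config={"temperature": 0},
)
```
Passing an optional `embedding_model` (for example `"all-mpnet-base-v2"`, as exercised in the test above) switches agent selection to embedding similarity.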

View File

@ -65,7 +65,8 @@ Links to notebook examples:
1. **Evaluation and Assessment**
- AgentEval: A Multi-Agent System for Assess Utility of LLM-powered Applications - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agenteval_cq_math.ipynb)
1. **Automatic Agent Building**
- Automatically Build Multi-agent System with AgentBuilder - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_autobuild.ipynb)
- Automatically Build Multi-agent System with AgentBuilder - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/autobuild_basic.ipynb)
- Automatically Build Multi-agent System from Agent Library - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/autobuild_agent_library.ipynb)
## Enhanced Inferences
### Utilities