mirror of https://github.com/microsoft/autogen.git
WebSurfer Updated (Selenium, Playwright, and support for many filetypes) (#1929)
* Feat/headless browser (retargeted) (#1832)
* Add headless browser to the WebSurferAgent, closes #1481
* replace soup.get_text() with markdownify.MarkdownConverter().convert_soup(soup)
* import HeadlessChromeBrowser
* implicitly wait for 10s
* increase max. wait time to 99s
* fix: trim trailing whitespace
* test: fix headless tests
* better bing query search
* docs: add example 3 for headless option

---------

Co-authored-by: Vijay Ramesh <vijay@regrello.com>

* Handle missing Selenium package.
* Added browser_chat.py example to simplify testing.
* Based browser on mdconvert. (#1847)
* Based browser on mdconvert.
* Updated web_surfer.
* Renamed HeadlessChromeBrowser to SeleniumChromeBrowser
* Added an initial POC with Playwright.
* Separated Bing search into its own utility module.
* Simple browser now uses Bing tools.
* Updated Playwright browser to inherit from SimpleTextBrowser
* Got Selenium working too.
* Renamed classes and files for consistency.
* Added more instructions.
* Initial work to support other search providers.
* Added some basic behavior when the BING_API_KEY is missing.
* Cleaned up some search results.
* Moved to using the request.Sessions object. Moved Bing SERP parsing to mdconvert to be more broadly useful.
* Added backward compatibility to WebSurferAgent
* Selenium and Playwright now grab the whole DOM, not just the body, allowing the converters access to metadata.
* Fixed printing of page titles in Playwright.
* Moved installation of WebSurfer dependencies to contrib-tests.yml
* Fixing pre-commit issues.
* Reverting conversable_agent, which should not have been changed in prior commit.
* Added RequestMarkdownBrowser tests.
* Fixed a bug with Bing search, and added search test cases.
* Added tests for Bing search.
* Added tests for md_convert
* Added test files.
* Added missing pptx.
* Added more tests for WebSurfer coverage.
* Fixed guard on requests_markdown_browser test.
* Updated test coverage for mdconvert.
* Fix browser_utils tests.
* Removed image test from browser, since exiftool isn't installed on test machine.
* Removed image test from browser, since exiftool isn't installed on test machine.
* Disable Selenium GPU and sandbox to ensure it runs headless in Docker.
* Added option for Bing API results to be interleaved (as Bing specifies), or presented in a categorized list (Web, News, Videos), etc.
* Print more details when requests exceptions are thrown.
* Added additional documentation to markdown_search
* Added documentation to the selenium_markdown_browser.
* Added documentation to playwright_markdown_browser.py
* Added documentation to requests_markdown_browser
* Added documentation to mdconvert.py
* Updated agentchat_surfer notebook.
* Update .github/workflows/contrib-tests.yml

Co-authored-by: Davor Runje <davor@airt.ai>

* Merge main. Resolve conflicts.
* Resolve pre-commit checks.
* Removed offending LFS file.
* Re-added offending LFS file.
* Fixed browser_utils tests.
* Fixed style errors.

---------

Co-authored-by: Asapanna Rakesh <45640029+INF800@users.noreply.github.com>
Co-authored-by: Vijay Ramesh <vijay@regrello.com>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Co-authored-by: Davor Runje <davor@airt.ai>
This commit is contained in:
parent 2e1f788293
commit 0d5163b78a
@@ -134,6 +134,9 @@ jobs:
      - name: Install packages and dependencies for RetrieveChat
        run: |
          pip install -e .[retrievechat]
      - name: Install packages and dependencies for WebSurfer and browser_utils
        run: |
          pip install -e .[test,websurfer]
      - name: Set AUTOGEN_USE_DOCKER based on OS
        shell: bash
        run: |
@@ -275,7 +278,7 @@ jobs:
          fi
      - name: Coverage
        run: |
          pytest test/test_browser_utils.py test/agentchat/contrib/test_web_surfer.py --skip-openai
          pytest test/browser_utils test/agentchat/contrib/test_web_surfer.py --skip-openai
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
@@ -46,7 +46,8 @@ repos:
            website/docs/tutorial/code-executors.ipynb |
            website/docs/topics/code-execution/custom-executor.ipynb |
            website/docs/topics/non-openai-models/cloud-gemini.ipynb |
            notebook/.*
            notebook/.* |
            test/browser_utils/test_files/.*
        )$
  # See https://jaredkhan.com/blog/mypy-pre-commit
  - repo: local
@@ -1,15 +1,13 @@
import copy
import json
import logging
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Dict, List, Literal, Optional, Tuple, Union

from typing_extensions import Annotated

from ... import Agent, AssistantAgent, ConversableAgent, GroupChat, GroupChatManager, OpenAIWrapper, UserProxyAgent
from ...browser_utils import SimpleTextBrowser
from ...browser_utils import AbstractMarkdownBrowser, BingMarkdownSearch, RequestsMarkdownBrowser
from ...code_utils import content_str
from ...oai.openai_utils import filter_config
from ...token_count_utils import count_token, get_max_token_limit
@@ -20,12 +18,9 @@ logger = logging.getLogger(__name__)
class WebSurferAgent(ConversableAgent):
    """(In preview) An agent that acts as a basic web surfer that can search the web and visit web pages."""

    DEFAULT_PROMPT = (
        "You are a helpful AI assistant with access to a web browser (via the provided functions). In fact, YOU ARE THE ONLY MEMBER OF YOUR PARTY WITH ACCESS TO A WEB BROWSER, so please help out where you can by performing web searches, navigating pages, and reporting what you find. Today's date is "
        + datetime.now().date().isoformat()
    )
    DEFAULT_PROMPT = "You are a helpful AI assistant with access to a web browser (via the provided functions). In fact, YOU ARE THE ONLY MEMBER OF YOUR PARTY WITH ACCESS TO A WEB BROWSER, so please help out where you can by performing web searches, navigating pages, and reporting what you find."

    DEFAULT_DESCRIPTION = "A helpful assistant with access to a web browser. Ask them to perform web searches, open pages, navigate to Wikipedia, answer questions from pages, and or generate summaries."
    DEFAULT_DESCRIPTION = "A helpful assistant with access to a web browser. Ask them to perform web searches, open pages, navigate to Wikipedia, download files, etc. Once on a desired page, ask them to answer questions by reading the page, generate summaries, find specific words or phrases on the page (ctrl+f), or even just scroll up or down in the viewport."

    def __init__(
        self,
@@ -40,7 +35,8 @@ class WebSurferAgent(ConversableAgent):
        llm_config: Optional[Union[Dict, Literal[False]]] = None,
        summarizer_llm_config: Optional[Union[Dict, Literal[False]]] = None,
        default_auto_reply: Optional[Union[str, Dict, None]] = "",
        browser_config: Optional[Union[Dict, None]] = None,
        browser_config: Optional[Union[Dict, None]] = None,  # Deprecated
        browser: Optional[Union[AbstractMarkdownBrowser, None]] = None,
        **kwargs,
    ):
        super().__init__(
@@ -60,11 +56,39 @@ class WebSurferAgent(ConversableAgent):
        self._create_summarizer_client(summarizer_llm_config, llm_config)

        # Create the browser
        self.browser = SimpleTextBrowser(**(browser_config if browser_config else {}))
        if browser_config is not None:
            if browser is not None:
                raise ValueError(
                    "WebSurferAgent cannot accept both a 'browser_config' (deprecated) parameter and 'browser' parameter at the same time. Use only one or the other."
                )

        inner_llm_config = copy.deepcopy(llm_config)
            # Print a warning
            logger.warning(
                "Warning: the parameter 'browser_config' in WebSurferAgent.__init__() is deprecated. Use 'browser' instead."
            )

            # Update the settings to the new format
            _bconfig = {}
            _bconfig.update(browser_config)

            if "bing_api_key" in _bconfig:
                _bconfig["search_engine"] = BingMarkdownSearch(
                    bing_api_key=_bconfig["bing_api_key"], interleave_results=False
                )
                del _bconfig["bing_api_key"]
            else:
                _bconfig["search_engine"] = BingMarkdownSearch()

            if "request_kwargs" in _bconfig:
                _bconfig["requests_get_kwargs"] = _bconfig["request_kwargs"]
                del _bconfig["request_kwargs"]

            self.browser = RequestsMarkdownBrowser(**_bconfig)
        else:
            self.browser = browser

        # Set up the inner monologue
        inner_llm_config = copy.deepcopy(llm_config)
        self._assistant = AssistantAgent(
            self.name + "_inner_assistant",
            system_message=system_message,  # type: ignore[arg-type]
@@ -130,6 +154,7 @@ class WebSurferAgent(ConversableAgent):
            total_pages = len(self.browser.viewport_pages)

            header += f"Viewport position: Showing page {current_page+1} of {total_pages}.\n"

            return (header, self.browser.viewport)

        @self._user_proxy.register_for_execution()
@@ -138,7 +163,7 @@ class WebSurferAgent(ConversableAgent):
            description="Perform an INFORMATIONAL web search query then return the search results.",
        )
        def _informational_search(query: Annotated[str, "The informational web search query to perform."]) -> str:
            self.browser.visit_page(f"bing: {query}")
            self.browser.visit_page(f"search: {query}")
            header, content = _browser_state()
            return header.strip() + "\n=======================\n" + content

@@ -148,9 +173,9 @@ class WebSurferAgent(ConversableAgent):
            description="Perform a NAVIGATIONAL web search query then immediately navigate to the top result. Useful, for example, to navigate to a particular Wikipedia article or other known destination. Equivalent to Google's \"I'm Feeling Lucky\" button.",
        )
        def _navigational_search(query: Annotated[str, "The navigational web search query to perform."]) -> str:
            self.browser.visit_page(f"bing: {query}")
            self.browser.visit_page(f"search: {query}")

            # Extract the first linl
            # Extract the first link
            m = re.search(r"\[.*?\]\((http.*?)\)", self.browser.page_content)
            if m:
                self.browser.visit_page(m.group(1))
@@ -168,6 +193,15 @@ class WebSurferAgent(ConversableAgent):
            header, content = _browser_state()
            return header.strip() + "\n=======================\n" + content

        @self._user_proxy.register_for_execution()
        @self._assistant.register_for_llm(
            name="download_file", description="Download a file at a given URL and, if possible, return its text."
        )
        def _download_file(url: Annotated[str, "The relative or absolute url of the file to be downloaded."]) -> str:
            self.browser.visit_page(url)
            header, content = _browser_state()
            return header.strip() + "\n=======================\n" + content

        @self._user_proxy.register_for_execution()
        @self._assistant.register_for_llm(
            name="page_up",
@@ -188,14 +222,51 @@ class WebSurferAgent(ConversableAgent):
            header, content = _browser_state()
            return header.strip() + "\n=======================\n" + content

        @self._user_proxy.register_for_execution()
        @self._assistant.register_for_llm(
            name="find_on_page_ctrl_f",
            description="Scroll the viewport to the first occurrence of the search string. This is equivalent to Ctrl+F.",
        )
        def _find_on_page_ctrl_f(
            search_string: Annotated[
                str, "The string to search for on the page. This search string supports wildcards like '*'"
            ]
        ) -> str:
            find_result = self.browser.find_on_page(search_string)
            header, content = _browser_state()

            if find_result is None:
                return (
                    header.strip()
                    + "\n=======================\nThe search string '"
                    + search_string
                    + "' was not found on this page."
                )
            else:
                return header.strip() + "\n=======================\n" + content

        @self._user_proxy.register_for_execution()
        @self._assistant.register_for_llm(
            name="find_next",
            description="Scroll the viewport to next occurrence of the search string.",
        )
        def _find_next() -> str:
            find_result = self.browser.find_next()
            header, content = _browser_state()

            if find_result is None:
                return header.strip() + "\n=======================\nThe search string was not found on this page."
            else:
                return header.strip() + "\n=======================\n" + content

        if self.summarization_client is not None:

            @self._user_proxy.register_for_execution()
            @self._assistant.register_for_llm(
                name="answer_from_page",
                name="read_page_and_answer",
                description="Uses AI to read the page and directly answer a given question based on the content.",
            )
            def _answer_from_page(
            def _read_page_and_answer(
                question: Annotated[Optional[str], "The question to directly answer."],
                url: Annotated[Optional[str], "[Optional] The url of the page. (Defaults to the current page)"] = None,
            ) -> str:
@@ -256,7 +327,7 @@ class WebSurferAgent(ConversableAgent):
                    Optional[str], "[Optional] The url of the page to summarize. (Defaults to current page)"
                ] = None,
            ) -> str:
                return _answer_from_page(url=url, question=None)
                return _read_page_and_answer(url=url, question=None)

    def generate_surfer_reply(
        self,
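The hunk above shows the deprecation path from the old `browser_config` dict to the new `browser` parameter. A minimal sketch of new-style construction, assuming a valid `llm_config` is defined elsewhere and a Bing key is available in the environment (the agent name and downloads folder are illustrative):

from autogen.agentchat.contrib.web_surfer import WebSurferAgent
from autogen.browser_utils import BingMarkdownSearch, RequestsMarkdownBrowser

browser = RequestsMarkdownBrowser(
    viewport_size=1024 * 8,
    downloads_folder="./downloads",      # illustrative path
    search_engine=BingMarkdownSearch(),  # reads BING_API_KEY from the environment if not passed explicitly
)

web_surfer = WebSurferAgent(
    "web_surfer",
    llm_config=llm_config,             # assumed to be defined elsewhere
    summarizer_llm_config=llm_config,  # assumed to be defined elsewhere
    browser=browser,                   # new-style parameter; browser_config is deprecated
)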
@@ -1,285 +0,0 @@
import io
import json
import mimetypes
import os
import re
import uuid
from typing import Any, Dict, List, Optional, Tuple, Union
from urllib.parse import urljoin, urlparse

import markdownify
import requests
from bs4 import BeautifulSoup

# Optional PDF support
IS_PDF_CAPABLE = False
try:
    import pdfminer
    import pdfminer.high_level

    IS_PDF_CAPABLE = True
except ModuleNotFoundError:
    pass

# Other optional dependencies
try:
    import pathvalidate
except ModuleNotFoundError:
    pass


class SimpleTextBrowser:
    """(In preview) An extremely simple text-based web browser comparable to Lynx. Suitable for Agentic use."""

    def __init__(
        self,
        start_page: Optional[str] = None,
        viewport_size: Optional[int] = 1024 * 8,
        downloads_folder: Optional[Union[str, None]] = None,
        bing_base_url: str = "https://api.bing.microsoft.com/v7.0/search",
        bing_api_key: Optional[Union[str, None]] = None,
        request_kwargs: Optional[Union[Dict[str, Any], None]] = None,
    ):
        self.start_page: str = start_page if start_page else "about:blank"
        self.viewport_size = viewport_size  # Applies only to the standard uri types
        self.downloads_folder = downloads_folder
        self.history: List[str] = list()
        self.page_title: Optional[str] = None
        self.viewport_current_page = 0
        self.viewport_pages: List[Tuple[int, int]] = list()
        self.set_address(self.start_page)
        self.bing_base_url = bing_base_url
        self.bing_api_key = bing_api_key
        self.request_kwargs = request_kwargs

        self._page_content = ""

    @property
    def address(self) -> str:
        """Return the address of the current page."""
        return self.history[-1]

    def set_address(self, uri_or_path: str) -> None:
        self.history.append(uri_or_path)

        # Handle special URIs
        if uri_or_path == "about:blank":
            self._set_page_content("")
        elif uri_or_path.startswith("bing:"):
            self._bing_search(uri_or_path[len("bing:") :].strip())
        else:
            if not uri_or_path.startswith("http:") and not uri_or_path.startswith("https:"):
                uri_or_path = urljoin(self.address, uri_or_path)
                self.history[-1] = uri_or_path  # Update the address with the fully-qualified path
            self._fetch_page(uri_or_path)

        self.viewport_current_page = 0

    @property
    def viewport(self) -> str:
        """Return the content of the current viewport."""
        bounds = self.viewport_pages[self.viewport_current_page]
        return self.page_content[bounds[0] : bounds[1]]

    @property
    def page_content(self) -> str:
        """Return the full contents of the current page."""
        return self._page_content

    def _set_page_content(self, content: str) -> None:
        """Sets the text content of the current page."""
        self._page_content = content
        self._split_pages()
        if self.viewport_current_page >= len(self.viewport_pages):
            self.viewport_current_page = len(self.viewport_pages) - 1

    def page_down(self) -> None:
        self.viewport_current_page = min(self.viewport_current_page + 1, len(self.viewport_pages) - 1)

    def page_up(self) -> None:
        self.viewport_current_page = max(self.viewport_current_page - 1, 0)

    def visit_page(self, path_or_uri: str) -> str:
        """Update the address, visit the page, and return the content of the viewport."""
        self.set_address(path_or_uri)
        return self.viewport

    def _split_pages(self) -> None:
        # Split only regular pages
        if not self.address.startswith("http:") and not self.address.startswith("https:"):
            self.viewport_pages = [(0, len(self._page_content))]
            return

        # Handle empty pages
        if len(self._page_content) == 0:
            self.viewport_pages = [(0, 0)]
            return

        # Break the viewport into pages
        self.viewport_pages = []
        start_idx = 0
        while start_idx < len(self._page_content):
            end_idx = min(start_idx + self.viewport_size, len(self._page_content))  # type: ignore[operator]
            # Adjust to end on a space
            while end_idx < len(self._page_content) and self._page_content[end_idx - 1] not in [" ", "\t", "\r", "\n"]:
                end_idx += 1
            self.viewport_pages.append((start_idx, end_idx))
            start_idx = end_idx

    def _bing_api_call(self, query: str) -> Dict[str, Dict[str, List[Dict[str, Union[str, Dict[str, str]]]]]]:
        # Make sure the key was set
        if self.bing_api_key is None:
            raise ValueError("Missing Bing API key.")

        # Prepare the request parameters
        request_kwargs = self.request_kwargs.copy() if self.request_kwargs is not None else {}

        if "headers" not in request_kwargs:
            request_kwargs["headers"] = {}
        request_kwargs["headers"]["Ocp-Apim-Subscription-Key"] = self.bing_api_key

        if "params" not in request_kwargs:
            request_kwargs["params"] = {}
        request_kwargs["params"]["q"] = query
        request_kwargs["params"]["textDecorations"] = False
        request_kwargs["params"]["textFormat"] = "raw"

        request_kwargs["stream"] = False

        # Make the request
        response = requests.get(self.bing_base_url, **request_kwargs)
        response.raise_for_status()
        results = response.json()

        return results  # type: ignore[no-any-return]

    def _bing_search(self, query: str) -> None:
        results = self._bing_api_call(query)

        web_snippets: List[str] = list()
        idx = 0
        for page in results["webPages"]["value"]:
            idx += 1
            web_snippets.append(f"{idx}. [{page['name']}]({page['url']})\n{page['snippet']}")
            if "deepLinks" in page:
                for dl in page["deepLinks"]:
                    idx += 1
                    web_snippets.append(
                        f"{idx}. [{dl['name']}]({dl['url']})\n{dl['snippet'] if 'snippet' in dl else ''}"  # type: ignore[index]
                    )

        news_snippets = list()
        if "news" in results:
            for page in results["news"]["value"]:
                idx += 1
                news_snippets.append(f"{idx}. [{page['name']}]({page['url']})\n{page['description']}")

        self.page_title = f"{query} - Search"

        content = (
            f"A Bing search for '{query}' found {len(web_snippets) + len(news_snippets)} results:\n\n## Web Results\n"
            + "\n\n".join(web_snippets)
        )
        if len(news_snippets) > 0:
            content += "\n\n## News Results:\n" + "\n\n".join(news_snippets)
        self._set_page_content(content)

    def _fetch_page(self, url: str) -> None:
        try:
            # Prepare the request parameters
            request_kwargs = self.request_kwargs.copy() if self.request_kwargs is not None else {}
            request_kwargs["stream"] = True

            # Send a HTTP request to the URL
            response = requests.get(url, **request_kwargs)
            response.raise_for_status()

            # If the HTTP request returns a status code 200, proceed
            if response.status_code == 200:
                content_type = response.headers.get("content-type", "")
                for ct in ["text/html", "text/plain", "application/pdf"]:
                    if ct in content_type.lower():
                        content_type = ct
                        break

                if content_type == "text/html":
                    # Get the content of the response
                    html = ""
                    for chunk in response.iter_content(chunk_size=512, decode_unicode=True):
                        html += chunk

                    soup = BeautifulSoup(html, "html.parser")

                    # Remove javascript and style blocks
                    for script in soup(["script", "style"]):
                        script.extract()

                    # Convert to markdown -- Wikipedia gets special attention to get a clean version of the page
                    if url.startswith("https://en.wikipedia.org/"):
                        body_elm = soup.find("div", {"id": "mw-content-text"})
                        title_elm = soup.find("span", {"class": "mw-page-title-main"})

                        if body_elm:
                            # What's the title
                            main_title = soup.title.string
                            if title_elm and len(title_elm) > 0:
                                main_title = title_elm.string
                            webpage_text = (
                                "# " + main_title + "\n\n" + markdownify.MarkdownConverter().convert_soup(body_elm)
                            )
                        else:
                            webpage_text = markdownify.MarkdownConverter().convert_soup(soup)
                    else:
                        webpage_text = markdownify.MarkdownConverter().convert_soup(soup)

                    # Convert newlines
                    webpage_text = re.sub(r"\r\n", "\n", webpage_text)

                    # Remove excessive blank lines
                    self.page_title = soup.title.string
                    self._set_page_content(re.sub(r"\n{2,}", "\n\n", webpage_text).strip())
                elif content_type == "text/plain":
                    # Get the content of the response
                    plain_text = ""
                    for chunk in response.iter_content(chunk_size=512, decode_unicode=True):
                        plain_text += chunk

                    self.page_title = None
                    self._set_page_content(plain_text)
                elif IS_PDF_CAPABLE and content_type == "application/pdf":
                    pdf_data = io.BytesIO(response.raw.read())
                    self.page_title = None
                    self._set_page_content(pdfminer.high_level.extract_text(pdf_data))
                elif self.downloads_folder is not None:
                    # Try producing a safe filename
                    fname = None
                    try:
                        fname = pathvalidate.sanitize_filename(os.path.basename(urlparse(url).path)).strip()
                    except NameError:
                        pass

                    # No suitable name, so make one
                    if fname is None:
                        extension = mimetypes.guess_extension(content_type)
                        if extension is None:
                            extension = ".download"
                        fname = str(uuid.uuid4()) + extension

                    # Open a file for writing
                    download_path = os.path.abspath(os.path.join(self.downloads_folder, fname))
                    with open(download_path, "wb") as fh:
                        for chunk in response.iter_content(chunk_size=512):
                            fh.write(chunk)

                    # Return a page describing what just happened
                    self.page_title = "Download complete."
                    self._set_page_content(f"Downloaded '{url}' to '{download_path}'.")
                else:
                    self.page_title = f"Error - Unsupported Content-Type '{content_type}'"
                    self._set_page_content(self.page_title)
            else:
                self.page_title = "Error"
                self._set_page_content("Failed to retrieve " + url)
        except requests.exceptions.RequestException as e:
            self.page_title = "Error"
            self._set_page_content(str(e))
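The deleted SimpleTextBrowser above handled searches itself through a `bing:` pseudo-URI and the Bing REST API; the replacement browsers instead route a `search:` pseudo-URI through a pluggable search engine. A rough sketch of the difference in calling style, assuming the new package is installed (the query is only an example):

# Old (removed): SimpleTextBrowser issued the Bing API call itself.
# SimpleTextBrowser(bing_api_key="...").visit_page("bing: autogen agents")

# New: RequestsMarkdownBrowser delegates the query to its search_engine.
from autogen.browser_utils import BingMarkdownSearch, RequestsMarkdownBrowser

browser = RequestsMarkdownBrowser(search_engine=BingMarkdownSearch())
print(browser.visit_page("search: autogen agents"))  # first viewport of the Markdown results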
@@ -0,0 +1,19 @@
from .abstract_markdown_browser import AbstractMarkdownBrowser
from .markdown_search import AbstractMarkdownSearch, BingMarkdownSearch
from .mdconvert import DocumentConverterResult, FileConversionException, MarkdownConverter, UnsupportedFormatException
from .playwright_markdown_browser import PlaywrightMarkdownBrowser
from .requests_markdown_browser import RequestsMarkdownBrowser
from .selenium_markdown_browser import SeleniumMarkdownBrowser

__all__ = (
    "AbstractMarkdownBrowser",
    "RequestsMarkdownBrowser",
    "SeleniumMarkdownBrowser",
    "PlaywrightMarkdownBrowser",
    "AbstractMarkdownSearch",
    "BingMarkdownSearch",
    "MarkdownConverter",
    "UnsupportedFormatException",
    "FileConversionException",
    "DocumentConverterResult",
)
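A short usage sketch of the exported classes, assuming the optional websurfer extra (`pip install -e .[websurfer]`) is installed; the URL and search string are only examples:

from autogen.browser_utils import BingMarkdownSearch, RequestsMarkdownBrowser

browser = RequestsMarkdownBrowser(viewport_size=4096, search_engine=BingMarkdownSearch())
browser.visit_page("https://en.wikipedia.org/wiki/Microsoft")  # example URL
print(browser.page_title)
print(browser.viewport)                    # roughly the first 4096 characters, split on a word boundary
browser.page_down()                        # scroll to the next viewport
match = browser.find_on_page("research")   # Ctrl+F-style search; returns the matching viewport or None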
@@ -0,0 +1,64 @@
from abc import ABC, abstractmethod
from typing import Dict, Optional, Union


class AbstractMarkdownBrowser(ABC):
    """
    An abstract class for a Markdown web browser.

    All MarkdownBrowers work by:

    (1) fetching a web page by URL (via requests, Selenium, Playwright, etc.)
    (2) converting the page's HTML or DOM to Markdown
    (3) operating on the Markdown

    Such browsers are simple, and suitable for read-only agentic use.
    They cannot be used to interact with complex web applications.
    """

    @abstractmethod
    def __init__(self):
        pass

    @property
    @abstractmethod
    def address(self) -> str:
        pass

    @abstractmethod
    def set_address(self, uri_or_path):
        pass

    @property
    @abstractmethod
    def viewport(self) -> str:
        pass

    @property
    @abstractmethod
    def page_content(self) -> str:
        pass

    @abstractmethod
    def page_down(self):
        pass

    @abstractmethod
    def page_up(self):
        pass

    @abstractmethod
    def visit_page(self, path_or_uri):
        pass

    @abstractmethod
    def open_local_file(self, local_path):
        pass

    @abstractmethod
    def find_on_page(self, query: str):
        pass

    @abstractmethod
    def find_next(self):
        pass
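Because WebSurferAgent and its tools only rely on this interface, the backends are interchangeable. A small sketch of code written against the abstraction (the helper function and URL are hypothetical):

from autogen.browser_utils import AbstractMarkdownBrowser, PlaywrightMarkdownBrowser, RequestsMarkdownBrowser

def first_viewport(browser: AbstractMarkdownBrowser, url: str) -> str:
    # Works with any backend that implements the abstract interface.
    browser.visit_page(url)
    return browser.viewport

print(first_viewport(RequestsMarkdownBrowser(), "https://example.com"))
print(first_viewport(PlaywrightMarkdownBrowser(), "https://example.com"))  # requires Playwright and Chromium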
@@ -0,0 +1,290 @@
# ruff: noqa: E722
import json
import logging
import os
import re
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Tuple, Union
from urllib.parse import parse_qs, quote, quote_plus, unquote, urlparse, urlunparse

import requests
from bs4 import BeautifulSoup

from .mdconvert import MarkdownConverter

logger = logging.getLogger(__name__)


class AbstractMarkdownSearch(ABC):
    """
    An abstract class for providing search capabilities to a Markdown browser.
    """

    @abstractmethod
    def __init__(self):
        pass

    @property
    @abstractmethod
    def search(self, query) -> str:
        pass


class BingMarkdownSearch(AbstractMarkdownSearch):
    """
    Provides Bing web search capabilities to Markdown browsers.
    """

    def __init__(self, bing_api_key: str = None, interleave_results: bool = True):
        """
        Perform a Bing web search, and return the results formatted in Markdown.

        Args:
            bing_api_key: key for the Bing search API. If omitted, an attempt is made to read the key from the BING_API_KEY environment variable. If no key is found, BingMarkdownSearch will print a warning, and will fall back to visiting and scraping the live Bing results page. Scraping is objectively worse than using the API, and thus is not recommended.
            interleave_results: When using the Bing API, results are returned based on category (web, news, videos, etc.), along with instructions for how they should be interleaved on the page. When `interleave` is set to True, these interleaving instructions are followed, and a single results list is returned by BingMarkdownSearch. When `interleave` is set to false, results are separated by category, and no interleaving is done.
        """
        super().__init__()

        self._mdconvert = MarkdownConverter()
        self._interleave_results = interleave_results

        if bing_api_key is None or bing_api_key.strip() == "":
            self._bing_api_key = os.environ.get("BING_API_KEY")
        else:
            self._bing_api_key = bing_api_key

        if self._bing_api_key is None:
            if not self._interleave_results:
                raise ValueError(
                    "No Bing API key was provided. This is incompatible with setting `interleave_results` to False. Please provide a key, or set `interleave_results` to True."
                )

            logger.warning(
                "Warning: No Bing API key provided. BingMarkdownSearch will submit an HTTP request to the Bing landing page, but results may be missing or low quality. To resolve this warning, provide a Bing API key by setting the BING_API_KEY environment variable, or using the 'bing_api_key' parameter in by BingMarkdownSearch's constructor. Bing API keys can be obtained via https://www.microsoft.com/en-us/bing/apis/bing-web-search-api\n"
            )

    def search(self, query: str):
        """Search Bing and return the results formatted in Markdown. If a Bing API key is available, the API is used to perform the search. If no API key is available, the search is performed by submitting an HTTPs GET request directly to Bing. Searches performed with the API are much higher quality, and are more reliable.

        Args:
            query: The search query to issue

        Returns:
            A Markdown rendering of the search results.
        """

        if self._bing_api_key is None:
            return self._fallback_search(query)
        else:
            return self._api_search(query)

    def _api_search(self, query: str):
        """Search Bing using the API, and return the results formatted in Markdown.

        Args:
            query: The search query to issue

        Returns:
            A Markdown rendering of the search results.
        """
        results = self._bing_api_call(query)

        snippets = dict()

        def _processFacts(elm):
            facts = list()
            for e in elm:
                k = e["label"]["text"]
                v = " ".join(item["text"] for item in e["items"])
                facts.append(f"{k}: {v}")
            return "\n".join(facts)

        # Web pages
        # __POS__ is a placeholder for the final ranking position, added at the end
        web_snippets = list()
        if "webPages" in results:
            for page in results["webPages"]["value"]:
                snippet = f"__POS__. {self._markdown_link(page['name'], page['url'])}\n{page['snippet']}"

                if "richFacts" in page:
                    snippet += "\n" + _processFacts(page["richFacts"])

                if "mentions" in page:
                    snippet += "\nMentions: " + ", ".join(e["name"] for e in page["mentions"])

                if page["id"] not in snippets:
                    snippets[page["id"]] = list()
                snippets[page["id"]].append(snippet)
                web_snippets.append(snippet)

                if "deepLinks" in page:
                    for dl in page["deepLinks"]:
                        deep_snippet = f"__POS__. {self._markdown_link(dl['name'], dl['url'])}\n{dl['snippet'] if 'snippet' in dl else ''}"
                        snippets[page["id"]].append(deep_snippet)
                        web_snippets.append(deep_snippet)

        # News results
        news_snippets = list()
        if "news" in results:
            for page in results["news"]["value"]:
                snippet = (
                    f"__POS__. {self._markdown_link(page['name'], page['url'])}\n{page.get('description', '')}".strip()
                )

                if "datePublished" in page:
                    snippet += "\nDate published: " + page["datePublished"].split("T")[0]

                if "richFacts" in page:
                    snippet += "\n" + _processFacts(page["richFacts"])

                if "mentions" in page:
                    snippet += "\nMentions: " + ", ".join(e["name"] for e in page["mentions"])

                news_snippets.append(snippet)

            if len(news_snippets) > 0:
                snippets[results["news"]["id"]] = news_snippets

        # Videos
        video_snippets = list()
        if "videos" in results:
            for page in results["videos"]["value"]:
                if not page["contentUrl"].startswith("https://www.youtube.com/watch?v="):
                    continue

                snippet = f"__POS__. {self._markdown_link(page['name'], page['contentUrl'])}\n{page.get('description', '')}".strip()

                if "datePublished" in page:
                    snippet += "\nDate published: " + page["datePublished"].split("T")[0]

                if "richFacts" in page:
                    snippet += "\n" + _processFacts(page["richFacts"])

                if "mentions" in page:
                    snippet += "\nMentions: " + ", ".join(e["name"] for e in page["mentions"])

                video_snippets.append(snippet)

            if len(video_snippets) > 0:
                snippets[results["videos"]["id"]] = video_snippets

        # Related searches
        related_searches = ""
        if "relatedSearches" in results:
            related_searches = "## Related Searches:\n"
            for s in results["relatedSearches"]["value"]:
                related_searches += "- " + s["text"] + "\n"
            snippets[results["relatedSearches"]["id"]] = [related_searches.strip()]

        idx = 0
        content = ""
        if self._interleave_results:
            # Interleaved
            for item in results["rankingResponse"]["mainline"]["items"]:
                _id = item["value"]["id"]
                if _id in snippets:
                    for s in snippets[_id]:
                        if "__POS__" in s:
                            idx += 1
                            content += s.replace("__POS__", str(idx)) + "\n\n"
                        else:
                            content += s + "\n\n"
        else:
            # Categorized
            if len(web_snippets) > 0:
                content += "## Web Results\n\n"
                for s in web_snippets:
                    if "__POS__" in s:
                        idx += 1
                        content += s.replace("__POS__", str(idx)) + "\n\n"
                    else:
                        content += s + "\n\n"
            if len(news_snippets) > 0:
                content += "## News Results\n\n"
                for s in news_snippets:
                    if "__POS__" in s:
                        idx += 1
                        content += s.replace("__POS__", str(idx)) + "\n\n"
                    else:
                        content += s + "\n\n"
            if len(video_snippets) > 0:
                content += "## Video Results\n\n"
                for s in video_snippets:
                    if "__POS__" in s:
                        idx += 1
                        content += s.replace("__POS__", str(idx)) + "\n\n"
                    else:
                        content += s + "\n\n"
            if len(related_searches) > 0:
                content += related_searches

        return f"## A Bing search for '{query}' found {idx} results:\n\n" + content.strip()

    def _bing_api_call(self, query: str):
        """Make a Bing API call, and return a Python representation of the JSON response."

        Args:
            query: The search query to issue

        Returns:
            A Python representation of the Bing API's JSON response (as parsed by `json.loads()`).
        """
        # Make sure the key was set
        if not self._bing_api_key:
            raise ValueError("Missing Bing API key.")

        # Prepare the request parameters
        request_kwargs = {}
        request_kwargs["headers"] = {}
        request_kwargs["headers"]["Ocp-Apim-Subscription-Key"] = self._bing_api_key

        request_kwargs["params"] = {}
        request_kwargs["params"]["q"] = query
        request_kwargs["params"]["textDecorations"] = False
        request_kwargs["params"]["textFormat"] = "raw"

        request_kwargs["stream"] = False

        # Make the request
        response = requests.get("https://api.bing.microsoft.com/v7.0/search", **request_kwargs)
        response.raise_for_status()
        results = response.json()

        return results

    def _fallback_search(self, query: str):
        """When no Bing API key is provided, we issue a simple HTTPs GET call to the Bing landing page and convert it to Markdown.

        Args:
            query: The search query to issue

        Returns:
            The Bing search results page, converted to Markdown.
        """
        user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0"
        headers = {"User-Agent": user_agent}

        url = f"https://www.bing.com/search?q={quote_plus(query)}&FORM=QBLH"
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        return self._mdconvert.convert_response(response).text_content

    def _markdown_link(self, anchor: str, href: str):
        """Create a Markdown hyperlink, escaping the URLs as appropriate.

        Args:
            anchor: The anchor text of the hyperlink
            href: The href destination of the hyperlink

        Returns:
            A correctly-formatted Markdown hyperlink
        """
        try:
            parsed_url = urlparse(href)
            # URLs provided by Bing are sometimes only partially quoted, leaving in characters
            # the conflict with Markdown. We unquote the URL, and then re-quote more completely
            href = urlunparse(parsed_url._replace(path=quote(unquote(parsed_url.path))))
            anchor = re.sub(r"[\[\]]", " ", anchor)
            return f"[{anchor}]({href})"
        except ValueError:  # It's not clear if this ever gets thrown
            return f"[{anchor}]({href})"
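A standalone sketch of the search class above; the query is only an example, and the key handling follows the constructor's documented fallback behavior:

import os

from autogen.browser_utils import BingMarkdownSearch

# With an API key, results can be grouped by category instead of interleaved.
search = BingMarkdownSearch(bing_api_key=os.environ["BING_API_KEY"], interleave_results=False)
print(search.search("autogen WebSurferAgent"))  # Markdown-formatted results

# Without a key, the constructor logs a warning and search() falls back to scraping the Bing landing page.
fallback = BingMarkdownSearch()
print(fallback.search("autogen WebSurferAgent"))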
File diff suppressed because it is too large
@@ -0,0 +1,113 @@
import io
import os
from typing import Any, Dict, Optional, Union
from urllib.parse import parse_qs, quote_plus, unquote, urljoin, urlparse

from .requests_markdown_browser import RequestsMarkdownBrowser

# Check if Playwright dependencies are installed
IS_PLAYWRIGHT_ENABLED = False
try:
    from playwright._impl._errors import TimeoutError
    from playwright.sync_api import sync_playwright

    IS_PLAYWRIGHT_ENABLED = True
except ModuleNotFoundError:
    pass


class PlaywrightMarkdownBrowser(RequestsMarkdownBrowser):
    """
    (In preview) A Playwright and Chromium powered Markdown web browser.
    PlaywrightMarkdownBrowser extends RequestsMarkdownBrowser, and replaces only the functionality of `visit_page(url)`.
    """

    def __init__(self, launch_args: Dict[str, Any] = {}, **kwargs):
        """
        Instantiate a new PlaywrightMarkdownBrowser.

        Arguments:
            **launch_args: Arguments passed to `playwright.chromium.launch`. See Playwright documentation for more details.
            **kwargs: PlaywrightMarkdownBrowser passes these arguments to the RequestsMarkdownBrowser superclass. See RequestsMarkdownBrowser documentation for more details.
        """
        super().__init__(**kwargs)
        self._playwright = None
        self._browser = None
        self._page = None

        # Raise an error if Playwright isn't available
        if not IS_PLAYWRIGHT_ENABLED:
            raise ModuleNotFoundError(
                "No module named 'playwright'. Playwright can be installed via 'pip install playwright' or 'conda install playwright' depending on your environment.\n\nOnce installed, you must also install a browser via 'playwright install --with-deps chromium'"
            )

        # Create the playwright instance
        self._playwright = sync_playwright().start()
        self._browser = self._playwright.chromium.launch(**launch_args)

        # Browser context
        self._page = self._browser.new_page()
        self.set_address(self.start_page)

    def __del__(self):
        """
        Close the Playwright session and browser when garbage-collected. Garbage collection may not always occur, or may happen at a later time. Call `close()` explicitly if you wish to free up resources used by Playwright or Chromium.
        """
        self.close()

    def close(self):
        """
        Close the Playwright session and browser used by Playwright. The session cannot be reopened without instantiating a new PlaywrightMarkdownBrowser instance.
        """
        if self._browser is not None:
            self._browser.close()
            self._browser = None
        if self._playwright is not None:
            self._playwright.stop()
            self._playwright = None

    def _fetch_page(self, url) -> None:
        """
        Fetch a page. If the page is a regular HTTP page, use Playwright to gather the HTML. If the page is a download, or a local file, rely on superclass behavior.
        """
        if url.startswith("file://"):
            super()._fetch_page(url)
        else:
            try:
                # Regular webpage
                self._page.goto(url)
                return self._process_page(url, self._page)
            except Exception as e:
                # Downloaded file
                if self.downloads_folder and "net::ERR_ABORTED" in str(e):
                    with self._page.expect_download() as download_info:
                        try:
                            self._page.goto(url)
                        except Exception as e:
                            if "net::ERR_ABORTED" in str(e):
                                pass
                            else:
                                raise e
                    download = download_info.value
                    fname = os.path.join(self.downloads_folder, download.suggested_filename)
                    download.save_as(fname)
                    self._process_download(url, fname)
                else:
                    raise e

    def _process_page(self, url, page):
        """
        Playwright fetched a regular HTTP page. Gather the document HTML, and convert it to Markdown.
        """
        html = page.evaluate("document.documentElement.outerHTML;")
        res = self._markdown_converter.convert_stream(io.StringIO(html), file_extension=".html", url=url)
        self.page_title = page.title()
        self._set_page_content(res.text_content)

    def _process_download(self, url, path):
        """
        Playwright downloaded a file. Convert it to Markdown.
        """
        res = self._markdown_converter.convert_local(path, url=url)
        self.page_title = res.title
        self._set_page_content(res.text_content)
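A brief sketch of driving the Playwright-backed browser above; `headless` is a standard Chromium launch argument and the downloads folder is illustrative:

from autogen.browser_utils import PlaywrightMarkdownBrowser

browser = PlaywrightMarkdownBrowser(
    launch_args={"headless": True},  # forwarded to playwright.chromium.launch
    downloads_folder="./downloads",  # enables the download handling shown above
)
try:
    print(browser.visit_page("https://example.com"))  # Markdown of the rendered DOM
finally:
    browser.close()  # release Chromium explicitly rather than waiting for __del__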
@@ -0,0 +1,426 @@
# ruff: noqa: E722
import datetime
import html
import io
import json
import mimetypes
import os
import pathlib
import re
import time
import traceback
import uuid
from typing import Any, Dict, List, Optional, Tuple, Union
from urllib.parse import parse_qs, unquote, urljoin, urlparse
from urllib.request import url2pathname

import pathvalidate
import requests

from .abstract_markdown_browser import AbstractMarkdownBrowser
from .markdown_search import AbstractMarkdownSearch, BingMarkdownSearch
from .mdconvert import FileConversionException, MarkdownConverter, UnsupportedFormatException


class RequestsMarkdownBrowser(AbstractMarkdownBrowser):
    """
    (In preview) An extremely simple Python requests-powered Markdown web browser.
    This browser cannot run JavaScript, compute CSS, etc. It simply fetches the HTML document, and converts it to Markdown.
    See AbstractMarkdownBrowser for more details.
    """

    def __init__(
        self,
        start_page: Optional[str] = None,
        viewport_size: Optional[int] = 1024 * 8,
        downloads_folder: Optional[Union[str, None]] = None,
        search_engine: Optional[Union[AbstractMarkdownSearch, None]] = None,
        markdown_converter: Optional[Union[MarkdownConverter, None]] = None,
        requests_session: Optional[Union[requests.Session, None]] = None,
        requests_get_kwargs: Optional[Union[Dict[str, Any], None]] = None,
    ):
        """
        Instantiate a new RequestsMarkdownBrowser.

        Arguments:
            start_page: The page on which the browser starts (default: "about:blank")
            viewport_size: Approximately how many *characters* fit in the viewport. Viewport dimensions are adjusted dynamically to avoid cutting off words (default: 8192).
            downloads_folder: Path to where downloads are saved. If None, downloads are disabled. (default: None)
            search_engine: An instance of MarkdownSearch, which handles web searches performed by this browser (default: a new `BingMarkdownSearch()` with default parameters)
            markdown_converted: An instance of a MarkdownConverter used to convert HTML pages and downloads to Markdown (default: a new `MarkdownConerter()` with default parameters)
            request_session: The session from which to issue requests (default: a new `requests.Session()` instance with default parameters)
            request_get_kwargs: Extra parameters passed to evert `.get()` call made to requests.
        """
        self.start_page: str = start_page if start_page else "about:blank"
        self.viewport_size = viewport_size  # Applies only to the standard uri types
        self.downloads_folder = downloads_folder
        self.history: List[Tuple[str, float]] = list()
        self.page_title: Optional[str] = None
        self.viewport_current_page = 0
        self.viewport_pages: List[Tuple[int, int]] = list()
        self.set_address(self.start_page)
        self._page_content: str = ""

        if search_engine is None:
            self._search_engine = BingMarkdownSearch()
        else:
            self._search_engine = search_engine

        if markdown_converter is None:
            self._markdown_converter = MarkdownConverter()
        else:
            self._markdown_converter = markdown_converter

        if requests_session is None:
            self._requests_session = requests.Session()
        else:
            self._requests_session = requests_session

        if requests_get_kwargs is None:
            self._requests_get_kwargs = {}
        else:
            self._requests_get_kwargs = requests_get_kwargs

        self._find_on_page_query: Union[str, None] = None
        self._find_on_page_last_result: Union[int, None] = None  # Location of the last result

    @property
    def address(self) -> str:
        """Return the address of the current page."""
        return self.history[-1][0]

    def set_address(self, uri_or_path: str) -> None:
        """Sets the address of the current page.
        This will result in the page being fetched via the underlying requests session.

        Arguments:
            uri_or_path: The fully-qualified URI to fetch, or the path to fetch from the current location. If the URI protocol is `search:`, the remainder of the URI is interpreted as a search query, and a web search is performed. If the URI protocol is `file://`, the remainder of the URI is interpreted as a local absolute file path.
        """
        # TODO: Handle anchors
        self.history.append((uri_or_path, time.time()))

        # Handle special URIs
        if uri_or_path == "about:blank":
            self._set_page_content("")
        elif uri_or_path.startswith("search:"):
            query = uri_or_path[len("search:") :].strip()
            results = self._search_engine.search(query)
            self.page_title = f"{query} - Search"
            self._set_page_content(results, split_pages=False)
        else:
            if (
                not uri_or_path.startswith("http:")
                and not uri_or_path.startswith("https:")
                and not uri_or_path.startswith("file:")
            ):
                if len(self.history) > 1:
                    prior_address = self.history[-2][0]
                    uri_or_path = urljoin(prior_address, uri_or_path)
                    # Update the address with the fully-qualified path
                    self.history[-1] = (uri_or_path, self.history[-1][1])
            self._fetch_page(uri_or_path)

        self.viewport_current_page = 0
        self.find_on_page_query = None
        self.find_on_page_viewport = None

    @property
    def viewport(self) -> str:
        """Return the content of the current viewport."""
        bounds = self.viewport_pages[self.viewport_current_page]
        return self.page_content[bounds[0] : bounds[1]]

    @property
    def page_content(self) -> str:
        """Return the full contents of the current page."""
        return self._page_content

    def _set_page_content(self, content: str, split_pages=True) -> None:
        """Sets the text content of the current page."""
        self._page_content = content

        if split_pages:
            self._split_pages()
        else:
            self.viewport_pages = [(0, len(self._page_content))]

        if self.viewport_current_page >= len(self.viewport_pages):
            self.viewport_current_page = len(self.viewport_pages) - 1

    def page_down(self) -> None:
        """Move the viewport down one page, if possible."""
        self.viewport_current_page = min(self.viewport_current_page + 1, len(self.viewport_pages) - 1)

    def page_up(self) -> None:
        """Move the viewport up one page, if possible."""
        self.viewport_current_page = max(self.viewport_current_page - 1, 0)

    def find_on_page(self, query: str) -> Union[str, None]:
        """Searches for the query from the current viewport forward, looping back to the start if necessary."""

        # Did we get here via a previous find_on_page search with the same query?
        # If so, map to find_next
        if query == self._find_on_page_query and self.viewport_current_page == self._find_on_page_last_result:
            return self.find_next()

        # Ok it's a new search start from the current viewport
        self._find_on_page_query = query
        viewport_match = self._find_next_viewport(query, self.viewport_current_page)
        if viewport_match is None:
            self._find_on_page_last_result = None
            return None
        else:
            self.viewport_current_page = viewport_match
            self._find_on_page_last_result = viewport_match
            return self.viewport

    def find_next(self) -> None:
        """Scroll to the next viewport that matches the query"""

        if self._find_on_page_query is None:
            return None

        starting_viewport = self._find_on_page_last_result
        if starting_viewport is None:
            starting_viewport = 0
        else:
            starting_viewport += 1
            if starting_viewport >= len(self.viewport_pages):
                starting_viewport = 0

        viewport_match = self._find_next_viewport(self._find_on_page_query, starting_viewport)
        if viewport_match is None:
            self._find_on_page_last_result = None
            return None
        else:
            self.viewport_current_page = viewport_match
            self._find_on_page_last_result = viewport_match
            return self.viewport

    def _find_next_viewport(self, query: str, starting_viewport: int) -> Union[int, None]:
        """Search for matches between the starting viewport looping when reaching the end."""

        if query is None:
            return None

        # Normalize the query, and convert to a regular expression
        nquery = re.sub(r"\*", "__STAR__", query)
        nquery = " " + (" ".join(re.split(r"\W+", nquery))).strip() + " "
        nquery = nquery.replace(" __STAR__ ", "__STAR__ ")  # Merge isolated stars with prior word
        nquery = nquery.replace("__STAR__", ".*").lower()

        if nquery.strip() == "":
            return None

        idxs = list()
        idxs.extend(range(starting_viewport, len(self.viewport_pages)))
        idxs.extend(range(0, starting_viewport))

        for i in idxs:
            bounds = self.viewport_pages[i]
            content = self.page_content[bounds[0] : bounds[1]]

            # TODO: Remove markdown links and images
            ncontent = " " + (" ".join(re.split(r"\W+", content))).strip().lower() + " "
            if re.search(nquery, ncontent):
                return i

        return None

    def visit_page(self, path_or_uri: str) -> str:
        """Update the address, visit the page, and return the content of the viewport."""
        self.set_address(path_or_uri)
        return self.viewport

    def open_local_file(self, local_path: str) -> str:
        """Convert a local file path to a file:/// URI, update the address, visit the page, and return the contents of the viewport."""
        full_path = os.path.abspath(os.path.expanduser(local_path))
        self.set_address(pathlib.Path(full_path).as_uri())
        return self.viewport

    def _split_pages(self) -> None:
        """Split the page contents into pages that are approximately the viewport size. Small deviations are permitted to ensure words are not broken."""
        # Handle empty pages
        if len(self._page_content) == 0:
            self.viewport_pages = [(0, 0)]
            return

        # Break the viewport into pages
        self.viewport_pages = []
        start_idx = 0
        while start_idx < len(self._page_content):
            end_idx = min(start_idx + self.viewport_size, len(self._page_content))  # type: ignore[operator]
            # Adjust to end on a space
            while end_idx < len(self._page_content) and self._page_content[end_idx - 1] not in [" ", "\t", "\r", "\n"]:
                end_idx += 1
            self.viewport_pages.append((start_idx, end_idx))
            start_idx = end_idx

    def _fetch_page(
        self, url: str, session: requests.Session = None, requests_get_kwargs: Dict[str, Any] = None
    ) -> None:
        """Fetch a page using the requests library. Then convert it to Markdown, and set `page_content` (which splits the content into pages as necessary.

        Arguments:
            url: The fully-qualified URL to fetch.
            session: Used to override the session used for this request. If None, use `self._requests_session` as usual.
            requests_get_kwargs: Extra arguments passes to `requests.Session.get`.
        """
        download_path = ""
        response = None
        try:
            if url.startswith("file://"):
                download_path = os.path.normcase(os.path.normpath(unquote(url[7:])))
                if os.path.isdir(download_path):
                    res = self._markdown_converter.convert_stream(
                        io.StringIO(self._fetch_local_dir(download_path)), file_extension=".html"
                    )
                    self.page_title = res.title
                    self._set_page_content(
                        res.text_content, split_pages=False
                    )  # Like search results, don't split directory listings
                else:
                    res = self._markdown_converter.convert_local(download_path)
                    self.page_title = res.title
                    self._set_page_content(res.text_content)
            else:
                # Send a HTTP request to the URL
                if session is None:
                    session = self._requests_session

                _get_kwargs = {}
                _get_kwargs.update(self._requests_get_kwargs)
                if requests_get_kwargs is not None:
                    _get_kwargs.update(requests_get_kwargs)
                _get_kwargs["stream"] = True

                response = session.get(url, **_get_kwargs)
                response.raise_for_status()

                # If the HTTP request was successful
                content_type = response.headers.get("content-type", "")

                # Text or HTML
                if "text/" in content_type.lower():
                    res = self._markdown_converter.convert_response(response)
                    self.page_title = res.title
                    self._set_page_content(res.text_content)
                # A download
                else:
                    # Try producing a safe filename
                    fname = None
                    download_path = None
                    try:
                        fname = pathvalidate.sanitize_filename(os.path.basename(urlparse(url).path)).strip()
                        download_path = os.path.abspath(os.path.join(self.downloads_folder, fname))

                        suffix = 0
                        while os.path.exists(download_path) and suffix < 1000:
                            suffix += 1
                            base, ext = os.path.splitext(fname)
                            new_fname = f"{base}__{suffix}{ext}"
                            download_path = os.path.abspath(os.path.join(self.downloads_folder, new_fname))

                    except NameError:
                        pass

                    # No suitable name, so make one
                    if fname is None:
                        extension = mimetypes.guess_extension(content_type)
                        if extension is None:
                            extension = ".download"
                        fname = str(uuid.uuid4()) + extension
                        download_path = os.path.abspath(os.path.join(self.downloads_folder, fname))

                    # Open a file for writing
                    with open(download_path, "wb") as fh:
                        for chunk in response.iter_content(chunk_size=512):
                            fh.write(chunk)

                    # Render it
                    local_uri = pathlib.Path(download_path).as_uri()
                    self.set_address(local_uri)

        except UnsupportedFormatException:
            self.page_title = ("Download complete.",)
            self._set_page_content(f"# Download complete\n\nSaved file to '{download_path}'")
        except FileConversionException:
            self.page_title = ("Download complete.",)
            self._set_page_content(f"# Download complete\n\nSaved file to '{download_path}'")
        except FileNotFoundError:
            self.page_title = "Error 404"
            self._set_page_content(f"## Error 404\n\nFile not found: {download_path}")
        except requests.exceptions.RequestException:
            if response is None:
                self.page_title = "Request Exception"
                self._set_page_content("## Unhandled Request Exception:\n\n" + traceback.format_exc())
            else:
                self.page_title = f"Error {response.status_code}"

                # If the error was rendered in HTML we might as well render it
                content_type = response.headers.get("content-type", "")
                if content_type is not None and "text/html" in content_type.lower():
                    res = self._markdown_converter.convert(response)
                    self.page_title = f"Error {response.status_code}"
                    self._set_page_content(f"## Error {response.status_code}\n\n{res.text_content}")
                else:
                    text = ""
                    for chunk in response.iter_content(chunk_size=512, decode_unicode=True):
                        text += chunk
                    self.page_title = f"Error {response.status_code}"
                    self._set_page_content(f"## Error {response.status_code}\n\n{text}")

    def _fetch_local_dir(self, local_path: str) -> str:
        """Render a local directory listing in HTML to assist with local file browsing via the "file://" protocol.
        Through rendered in HTML, later parts of the pipeline will convert the listing to Markdown.

        Arguments:
            local_path: A path to the local directory whose contents are to be listed.

        Returns:
            A directory listing, rendered in HTML.
        """
        pardir = os.path.normpath(os.path.join(local_path, os.pardir))
        pardir_uri = pathlib.Path(pardir).as_uri()
        listing = f"""
<!DOCTYPE html>
<html>
<head>
<title>Index of {html.escape(local_path)}</title>
</head>
<body>
<h1>Index of {html.escape(local_path)}</h1>

<a href="{html.escape(pardir_uri, quote=True)}">.. (parent directory)</a>

<table>
<tr>
<th>Name</th><th>Size</th><th>Date modified</th>
</tr>
"""

        for entry in os.listdir(local_path):
            full_path = os.path.normpath(os.path.join(local_path, entry))
            full_path_uri = pathlib.Path(full_path).as_uri()
            size = ""
            mtime = datetime.datetime.fromtimestamp(os.path.getmtime(full_path)).strftime("%Y-%m-%d %H:%M")

            if os.path.isdir(full_path):
                entry = entry + os.path.sep
            else:
                size = str(os.path.getsize(full_path))

            listing += (
                "<tr>\n"
                + f'<td><a href="{html.escape(full_path_uri, quote=True)}">{html.escape(entry)}</a></td>'
                + f"<td>{html.escape(size)}</td>"
                + f"<td>{html.escape(mtime)}</td>"
|
||||
+ "</tr>"
|
||||
)
|
||||
|
||||
listing += """
|
||||
</table>
|
||||
</body>
|
||||
</html>
|
||||
"""
|
||||
return listing
|
|
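For orientation, a minimal usage sketch of the requests-based browser defined above (not part of this commit). It assumes `RequestsMarkdownBrowser` is importable from `autogen.browser_utils`, as in the notebook cell later in this diff, and uses the attribute names given in the docstrings above (`page_title`, `page_content`).

```python
# Minimal sketch, not part of this commit: browsing a local directory.
# A file:// URL that points at a directory is routed through _fetch_local_dir(),
# whose HTML listing is then converted to Markdown by the mdconvert pipeline.
import os
import pathlib

from autogen.browser_utils import RequestsMarkdownBrowser

browser = RequestsMarkdownBrowser(downloads_folder=os.getcwd())
browser.set_address(pathlib.Path(os.getcwd()).as_uri())

print(browser.page_title)          # e.g. "Index of <current working directory>"
print(browser.page_content[:200])  # start of the Markdown directory listing
```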
@ -0,0 +1,80 @@
import io
import os
from typing import Dict, Optional, Union
from urllib.parse import parse_qs, quote_plus, unquote, urljoin, urlparse

from .requests_markdown_browser import RequestsMarkdownBrowser

# Check if Selenium dependencies are installed
IS_SELENIUM_ENABLED = False
try:
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    IS_SELENIUM_ENABLED = True
except ModuleNotFoundError:
    pass


class SeleniumMarkdownBrowser(RequestsMarkdownBrowser):
    """
    (In preview) A Selenium and Chromium powered Markdown web browser.
    SeleniumMarkdownBrowser extends RequestsMarkdownBrowser, and replaces only the functionality of `visit_page(url)`.
    """

    def __init__(self, **kwargs):
        """
        Instantiate a new SeleniumMarkdownBrowser.

        Arguments:
            **kwargs: SeleniumMarkdownBrowser passes all arguments to the RequestsMarkdownBrowser superclass. See RequestsMarkdownBrowser documentation for more details.
        """
        super().__init__(**kwargs)
        self._webdriver = None

        # Raise an error if Selenium isn't available
        if not IS_SELENIUM_ENABLED:
            raise ModuleNotFoundError(
                "No module named 'selenium'. Selenium can be installed via 'pip install selenium' or 'conda install selenium' depending on your environment."
            )

        chrome_options = Options()
        chrome_options.add_argument("--headless")
        chrome_options.add_argument("--disable-gpu")
        chrome_options.add_argument("--no-sandbox")
        self._webdriver = webdriver.Chrome(options=chrome_options)
        self._webdriver.implicitly_wait(99)
        self._webdriver.get(self.start_page)

    def __del__(self):
        """
        Close the Selenium session when garbage-collected. Garbage collection may not always occur, or may happen at a later time. Call `close()` explicitly if you wish to free up resources used by Selenium or Chromium.
        """
        self.close()

    def close(self):
        """
        Close the Selenium session used by this instance. The session cannot be reopened without instantiating a new SeleniumMarkdownBrowser instance.
        """
        if self._webdriver is not None:
            self._webdriver.quit()
            self._webdriver = None

    def _fetch_page(self, url) -> None:
        """
        Fetch a page. If the page is a regular HTTP page, use Selenium to gather the HTML. If the page is a download, or a local file, rely on superclass behavior.
        """
        if url.startswith("file://"):
            super()._fetch_page(url)
        else:
            self._webdriver.get(url)
            html = self._webdriver.execute_script("return document.documentElement.outerHTML;")

            if not html:  # Nothing... it's probably a download
                super()._fetch_page(url)
            else:
                self.page_title = self._webdriver.execute_script("return document.title;")
                res = self._markdown_converter.convert_stream(io.StringIO(html), file_extension=".html", url=url)
                self._set_page_content(res.text_content)
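A corresponding sketch for the Selenium-backed browser (again not part of this commit; the import path is assumed, and the `selenium` package plus a local Chrome/Chromium are required):

```python
# Minimal sketch, not part of this commit: SeleniumMarkdownBrowser is a drop-in
# replacement for RequestsMarkdownBrowser that renders pages with headless Chrome
# before handing the DOM to the Markdown converter.
from autogen.browser_utils import SeleniumMarkdownBrowser  # assumed export location

browser = SeleniumMarkdownBrowser()
try:
    browser.visit_page("https://microsoft.github.io/autogen/")
    print(browser.page_title)
finally:
    browser.close()  # quit the Chrome session explicitly instead of relying on __del__
```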
@ -129,12 +129,19 @@
"outputs": [],
"source": [
"from autogen.agentchat.contrib.web_surfer import WebSurferAgent  # noqa: E402\n",
"from autogen.browser_utils import BingMarkdownSearch, RequestsMarkdownBrowser  # noqa: E402\n",
"\n",
"browser = RequestsMarkdownBrowser(\n",
"    downloads_folder=os.getcwd(),\n",
"    search_engine=BingMarkdownSearch(bing_api_key=bing_api_key),\n",
")\n",
"\n",
"web_surfer = WebSurferAgent(\n",
"    \"web_surfer\",\n",
"    llm_config=llm_config,\n",
"    summarizer_llm_config=summarizer_llm_config,\n",
"    browser_config={\"viewport_size\": 4096, \"bing_api_key\": bing_api_key},\n",
"    is_termination_msg=lambda x: x.get(\"content\", \"\").find(\"TERMINATE\") >= 0,\n",
"    browser=browser,\n",
")\n",
"\n",
"user_proxy = autogen.UserProxyAgent(\n",
@ -179,42 +186,87 @@
|
|||
">>>>>>>> EXECUTING FUNCTION informational_web_search...\u001b[0m\n",
|
||||
"\u001b[33mweb_surfer\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"Address: bing: Microsoft AutoGen\n",
|
||||
"Address: search: Microsoft AutoGen\n",
|
||||
"Title: Microsoft AutoGen - Search\n",
|
||||
"Viewport position: Showing page 1 of 1.\n",
|
||||
"=======================\n",
|
||||
"A Bing search for 'Microsoft AutoGen' found 10 results:\n",
|
||||
"## A Bing search for 'Microsoft AutoGen' found 19 results:\n",
|
||||
"\n",
|
||||
"## Web Results\n",
|
||||
"1. [AutoGen: Enabling next-generation large language model applications](https://www.microsoft.com/en-us/research/blog/autogen-enabling-next-generation-large-language-model-applications/)\n",
|
||||
"AutoGen is a Python package that simplifies the orchestration, optimization, and automation of large language model applications. It enables customizable and conversable agents that integrate with humans, tools, and other agents to solve tasks using GPT-4 and other advanced LLMs. Learn how to use AutoGen for code-based question answering, supply-chain optimization, conversational chess, and more.\n",
|
||||
"AutoGen enables complex LLM-based workflows using multi-agent conversations. (Left) AutoGen agents are customizable and can be based on LLMs, tools, humans, and even a combination of them. (Top-right) Agents can converse to solve tasks. (Bottom-right) The framework supports many additional complex conversation patterns.\n",
|
||||
"\n",
|
||||
"2. [GitHub - microsoft/autogen: Enable Next-Gen Large Language Model ...](https://github.com/microsoft/autogen)\n",
|
||||
"AutoGen is a Python library that enables the development of large language model applications using multiple agents that can converse with each other to solve tasks. It supports various conversation patterns, enhanced LLM inference, and customizable and conversable agents based on OpenAI models.\n",
|
||||
"2. [AutoGen - Microsoft Research](https://www.microsoft.com/en-us/research/project/autogen/)\n",
|
||||
"Related projects. AutoGen is an open-source, community-driven project under active development (as a spinoff from FLAML, a fast library for automated machine learning and tuning), which encourages contributions from individuals of all backgrounds.Many Microsoft Research collaborators have made great contributions to this project, including academic contributors like Pennsylvania State ...\n",
|
||||
"\n",
|
||||
"3. [Getting Started | AutoGen](https://microsoft.github.io/autogen/docs/Getting-Started/)\n",
|
||||
"AutoGen is a framework that enables development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools. Main Features\n",
|
||||
"3. [AutoGen: Downloads - Microsoft Research](https://www.microsoft.com/en-us/research/project/autogen/downloads/)\n",
|
||||
"Enable Next-Gen Large Language Model Applications. AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They…. AutoGen allows developers to build LLM applications ...\n",
|
||||
"\n",
|
||||
"4. [AutoGen | AutoGen - microsoft.github.io](https://microsoft.github.io/autogen/)\n",
|
||||
"AutoGen is a tool that enables next-gen large language model applications by providing a high-level abstraction for building diverse and enhanced LLM workflows. It offers a collection of working systems for various domains and complexities, as well as enhanced LLM inference and optimization APIs.\n",
|
||||
"4. [Microsoft Semantic Kernel and AutoGen: Open Source Frameworks for AI ...](https://techcommunity.microsoft.com/t5/educator-developer-blog/microsoft-semantic-kernel-and-autogen-open-source-frameworks-for/ba-p/4051305)\n",
|
||||
"Microsoft AutoGen is designed for integrating and controlling multiple LLMs. It’s a research project that shows the potential of using multiple agents together. AutoGen allows for the creation of diverse teams of agents, each with their own specialized skills or goals. These agents can chat with each other, facilitating greater diversity in ...\n",
|
||||
"\n",
|
||||
"5. [AutoGen - Microsoft Research](https://www.microsoft.com/en-us/research/project/autogen/)\n",
|
||||
"AutoGen is an open-source library for building next-generation LLM applications with multiple agents, teachability and personalization. It supports agents that can be backed by various LLM configurations, code generation and execution, and human proxy agent integration.\n",
|
||||
"5. [GitHub - microsoft/autogen: A programming framework for agentic AI ...](https://github.com/microsoft/autogen)\n",
|
||||
"microsoft.github.io/autogen/ Topics chat chatbot gpt chat-application agent-based-framework agent-oriented-programming gpt-4 chatgpt llmops gpt-35-turbo llm-agent llm-inference agentic llm-framework agentic-agi\n",
|
||||
"\n",
|
||||
"6. [Installation | AutoGen](https://microsoft.github.io/autogen/docs/Installation/)\n",
|
||||
"Installation Setup Virtual Environment When not using a docker container, we recommend using a virtual environment to install AutoGen. This will ensure that the dependencies for AutoGen are isolated from the rest of your system. Option 1: venv You can create a virtual environment with venv as below: python3 -m venv pyautogen\n",
|
||||
"6. [Getting Started | AutoGen - microsoft.github.io](https://microsoft.github.io/autogen/docs/Getting-Started/)\n",
|
||||
"Getting Started. AutoGen is a framework that enables development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools.\n",
|
||||
"\n",
|
||||
"7. [AutoGen: Downloads - Microsoft Research](https://www.microsoft.com/en-us/research/project/autogen/downloads/)\n",
|
||||
"AutoGen allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks.\n",
|
||||
"7. [AutoGen | AutoGen - microsoft.github.io](https://microsoft.github.io/autogen/)\n",
|
||||
"AutoGen provides multi-agent conversation framework as a high-level abstraction. With this framework, one can conveniently build LLM workflows. Easily Build Diverse Applications. AutoGen offers a collection of working systems spanning a wide range of applications from various domains and complexities.\n",
|
||||
"\n",
|
||||
"8. [Multi-agent Conversation Framework | AutoGen - microsoft.github.io](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat/)\n",
|
||||
"AutoGen offers a unified multi-agent conversation framework as a high-level abstraction of using foundation models. It features capable, customizable and conversable agents which integrate LLMs, tools, and humans via automated agent chat.\n",
|
||||
"8. [AutoGen Studio: Interactively Explore Multi-Agent Workflows](https://microsoft.github.io/autogen/blog/2023/12/01/AutoGenStudio/)\n",
|
||||
"To help you rapidly prototype multi-agent solutions for your tasks, we are introducing AutoGen Studio, an interface powered by AutoGen. It allows you to: Declaratively define and modify agents and multi-agent workflows through a point and click, drag and drop interface (e.g., you can select the parameters of two agents that will communicate to ...\n",
|
||||
"\n",
|
||||
"9. [[2308.08155] AutoGen: Enabling Next-Gen LLM Applications via Multi ...](https://arxiv.org/abs/2308.08155)\n",
|
||||
"AutoGen is an open-source framework that allows developers to create and customize agents that can converse with each other to perform tasks using various types of language models (LLMs). The framework supports natural language and code-based conversation patterns, and is effective for diverse applications such as mathematics, coding, question answering, and more.\n",
|
||||
"9. [AutoGen | Getting Started A-Z Install & Run | Easy | Microsoft](https://www.youtube.com/watch?v=UxtJsIDTFZo)\n",
|
||||
"Getting Started with AutoGen in 10 mins. AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human ...\n",
|
||||
"Date published: 2023-10-04\n",
|
||||
"\n",
|
||||
"10. [How to setup and use the new Microsoft AutoGen AI agent](https://www.geeky-gadgets.com/microsoft-autogen/)\n",
|
||||
"Learn how to use AutoGen, a tool that simplifies the automation and optimization of complex language model applications using multiple agents that can converse with each other. AutoGen supports diverse conversation patterns, human participation, and the tuning of expensive LLMs like ChatGPT and GPT-4.\n",
|
||||
"10. [AutoGen Tutorial 🚀 Create Custom AI Agents EASILY (Incredible)](https://www.youtube.com/watch?v=vU2S6dVf79M)\n",
|
||||
"In this video, I show you how to use AutoGen, which allows anyone to use multi-agent LLMs to power their applications. First, I give an overview of what AutoGen is, and then I show you how to use it with two examples. Currently, AutoGen works with OpenAI's API, but they are already working on adding local models natively, and you can already do ...\n",
|
||||
"Date published: 2023-10-03\n",
|
||||
"\n",
|
||||
"11. [Microsoft Autogen Studio 2 - How to run an army of agents](https://www.youtube.com/watch?v=lRu_-yFY-4M)\n",
|
||||
"In this video, we’ll explore how to use and leverage Autogen Studio 2 by creating two agents: one to extract YouTube comments and another to transform those insights into fresh video content ideas. 🔧 Installation is a breeze with just two commands. 👁️🗨️ The 'Build' tab's intuitive interface to use different LLMs 🧠 'Skills ...\n",
|
||||
"Date published: 2024-02-02\n",
|
||||
"\n",
|
||||
"12. [AutoGen Studio 2.0 Advanced Tutorial | Build multi-agent GenAI Application!!](https://www.youtube.com/watch?v=MUhRP8QCb9A)\n",
|
||||
"In this tutorial we will be covering AutoGen 2.0 from Microsoft which is an open-source library, offers a high-level abstraction for multi-agent conversation frameworks, facilitating next-generation LLM applications with collaborative, teachable, and personalized features to enhance productivity. In this tutorial we will be installing AutoGen ...\n",
|
||||
"Date published: 2024-02-02\n",
|
||||
"\n",
|
||||
"13. [AutoGen Tutorial: Create GODLY Custom AI Agents EASILY (Installation Tutorial)](https://www.youtube.com/watch?v=ijYDTDR4f8k)\n",
|
||||
"In this video, we delve into the revolutionary world of AutoGen, a sophisticated framework designed to simplify and streamline the management of workflows involving large language models (LLMs). These workflows are intricate, demanding, and require expertise to design, implement, and optimize effectively. As developers explore complex ...\n",
|
||||
"Date published: 2023-10-12\n",
|
||||
"\n",
|
||||
"14. [AutoGen Tutorial 🤖 Create Collaborating AI Agent teams](https://www.youtube.com/watch?v=0GyJ3FLHR1o)\n",
|
||||
"You can now use AutoGen to create multiple AI agents to work together to complete a task that you defined. Let's take a look at what Autogen is, and then I will show you how to quickly start using it. In this video, we will go through the examples of task solving with code generation, execution and debugging and setting up a group chat of more ...\n",
|
||||
"Date published: 2023-10-30\n",
|
||||
"\n",
|
||||
"15. [Autogen - Microsoft's best AI Agent framework that is controllable?](https://www.youtube.com/watch?v=Bq-0ClZttc8)\n",
|
||||
"Microsoft just announced a multi agent framework called Autogen, which solved a few problems of existing agent frameworks; Let’s dive in 🔗 Links - Follow me on twitter: https://twitter.com/jasonzhou1993 - Join my AI email list: https://www.ai-jason.com/ - My discord: https://discord.gg/eZXprSaCDE - Github repo & blog: https://ai-jason ...\n",
|
||||
"Date published: 2023-10-03\n",
|
||||
"\n",
|
||||
"16. [Microsoft AUTOGEN STUDIO 2.0 HUGE UPDATE - Create Custom AI Agents | Microsoft AI](https://www.youtube.com/watch?v=kj8nVBI_oiM)\n",
|
||||
"Learn everything about Microsoft’s revolutionary Autogen Studio 2.0 - an intuitive graphical interface enabling anyone to build coordinated AI solutions without coding! See how pre-built skills are visually combined into flexible teams handling complex goals through specialization and conversation. From travel planning to content creation ...\n",
|
||||
"Date published: 2024-01-17\n",
|
||||
"\n",
|
||||
"17. [Microsoft's Autogen 2 - Create Custom AI Agents](https://www.youtube.com/watch?v=_LGUXoNuwOo)\n",
|
||||
"💬 Access GPT-4 ,Claude-2 and more - chat.forefront.ai/?ref=theaigrid 🎤 Use the best AI Voice Creator - elevenlabs.io/?from=partnerscott3908 ️ Join Our Weekly Newsletter - https://mailchi.mp/6cff54ad7e2e/theaigrid 🐤 Follow us on Twitter https://twitter.com/TheAiGrid 🌐 Checkout Our website - https://theaigrid.com/ https://microsoft ...\n",
|
||||
"Date published: 2024-01-14\n",
|
||||
"\n",
|
||||
"18. [arXiv:2308.08155v2 cs.AI 3 Oct 2023](https://arxiv.org/pdf/2308.08155.pdf)\n",
|
||||
"AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Qingyun Wu †, Gagan Bansal ∗, Jieyu Zhang±, Yiran Wu , Beibin Li Erkang Zhu ∗, Li Jiang , Xiaoyun Zhang , Shaokun Zhang†, Jiale Liu∓ Ahmed Awadallah ∗, Ryen W. White , Doug Burger , Chi Wang∗1 ∗Microsoft Research, †Pennsylvania State University ±University of Washington,∓Xidian University\n",
|
||||
"\n",
|
||||
"19. [Releases · microsoft/autogen · GitHub](https://github.com/microsoft/autogen/releases)\n",
|
||||
"Notebook. New feature in code execution: Support user defined functions in local CLI executor - similar functionality to the \"skills\" in AutoGen Studio. New agent capability: Vision Capability for ConversableAgents allows them to \"see\" images. New IOStream protocol and support for web sockets!\n",
|
||||
"\n",
|
||||
"## Related Searches:\n",
|
||||
"- microsoft autogen download\n",
|
||||
"- microsoft autogen examples\n",
|
||||
"- autogenai website\n",
|
||||
"- is microsoft autogen free\n",
|
||||
"- autogen install\n",
|
||||
"- how to install microsoft autogen\n",
|
||||
"- autogen microsoft tutorial\n",
|
||||
"- autogen openai\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
|
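The transcript above comes from the agent's `informational_web_search` tool; underneath, the browser simply visits the special `search:` address shown in the output, using the configured `BingMarkdownSearch` engine. A rough sketch of the same lookup made directly, without the agent (not part of this commit; it assumes `BING_API_KEY` is set in the environment and that `set_address` accepts the `search:` prefix exactly as printed above):

```python
# Rough sketch, not part of this commit: reproduce the search page shown above
# by pointing the browser at the "search:" address directly.
import os

from autogen.browser_utils import BingMarkdownSearch, RequestsMarkdownBrowser

browser = RequestsMarkdownBrowser(
    downloads_folder=os.getcwd(),
    search_engine=BingMarkdownSearch(bing_api_key=os.environ["BING_API_KEY"]),
)

browser.set_address("search: Microsoft AutoGen")
print(browser.page_title)    # e.g. "Microsoft AutoGen - Search"
print(browser.page_content)  # Markdown-formatted Bing results, as in the cell output above
```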
@ -225,7 +277,7 @@
"Search the web for information about Microsoft AutoGen\n",
"\"\"\"\n",
"\n",
"user_proxy.initiate_chat(web_surfer, message=task1)"
"user_proxy.initiate_chat(web_surfer, message=task1);"
]
},
{
@ -248,7 +300,7 @@
|
|||
">>>>>>>> EXECUTING FUNCTION summarize_page...\u001b[0m\n",
|
||||
"\u001b[33mweb_surfer\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"AutoGen is a Python package and framework developed by Microsoft that simplifies the orchestration, optimization, and automation of large language model (LLM) applications. It enables the development of customizable and conversable agents that can solve tasks using advanced LLMs like GPT-4. AutoGen supports various conversation patterns, enhanced LLM inference, and seamless integration with humans, tools, and other agents. It offers a high-level abstraction for building diverse and enhanced LLM workflows and provides a collection of working systems for different domains and complexities. AutoGen is open-source and supports natural language and code-based conversation patterns for applications such as question answering, coding, mathematics, and more.\n",
|
||||
"Microsoft AutoGen is a framework that enables the development of large language model (LLM) applications using multiple agents that can converse with each other to solve tasks. It allows for the creation of diverse teams of agents with specialized skills or goals, facilitating greater diversity in workflows. AutoGen agents are customizable, conversable, and seamlessly allow human participation. The framework supports various conversation patterns and offers a collection of working systems spanning a wide range of applications. AutoGen Studio is an interface powered by AutoGen that allows for the rapid prototyping of multi-agent solutions through a point and click, drag and drop interface. There are also tutorials and videos available to help users get started with AutoGen.\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
|
@ -256,7 +308,7 @@
],
"source": [
"task2 = \"Summarize these results\"\n",
"user_proxy.initiate_chat(web_surfer, message=task2, clear_history=False)"
"user_proxy.initiate_chat(web_surfer, message=task2, clear_history=False);"
]
},
{
@ -276,59 +328,183 @@
|
|||
"\u001b[31m\n",
|
||||
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
|
||||
"\u001b[35m\n",
|
||||
">>>>>>>> EXECUTING FUNCTION navigational_web_search...\u001b[0m\n",
|
||||
">>>>>>>> EXECUTING FUNCTION visit_page...\u001b[0m\n",
|
||||
"\u001b[33mweb_surfer\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"Address: https://microsoft.github.io/autogen/docs/Getting-Started/\n",
|
||||
"Title: Getting Started | AutoGen\n",
|
||||
"Viewport position: Showing page 1 of 2.\n",
|
||||
"Viewport position: Showing page 1 of 1.\n",
|
||||
"=======================\n",
|
||||
"Getting Started | AutoGen\n",
|
||||
"\n",
|
||||
"[Skip to main content](#)[![AutoGen](/autogen/img/ag.svg)![AutoGen](/autogen/img/ag.svg)**AutoGen**](/autogen/)[Docs](/autogen/docs/Getting-Started)[SDK](/autogen/docs/reference/agentchat/conversable_agent)[Blog](/autogen/blog)[FAQ](/autogen/docs/FAQ)[Examples](/autogen/docs/Examples)Resources* [Ecosystem](/autogen/docs/Ecosystem)\n",
|
||||
"* [Gallery](/autogen/docs/Gallery)\n",
|
||||
"[GitHub](https://github.com/microsoft/autogen)🌜🌞`ctrl``K`* [Getting Started](/autogen/docs/Getting-Started)\n",
|
||||
"* [Installation](/autogen/docs/Installation)\n",
|
||||
"* [Use Cases](#)\n",
|
||||
"[Skip to main content](#__docusaurus_skipToContent_fallback)What's new in AutoGen? Read [this blog](/autogen/blog/2024/03/03/AutoGen-Update) for an overview of updates[![AutoGen](/autogen/img/ag.svg)![AutoGen](/autogen/img/ag.svg)**AutoGen**](/autogen/)Docs* [Getting Started](/autogen/docs/Getting-Started)\n",
|
||||
"* [Installation](/autogen/docs/installation/)\n",
|
||||
"* [Tutorial](/autogen/docs/tutorial/introduction)\n",
|
||||
"* [User Guide](/autogen/docs/topics)\n",
|
||||
"* [API Reference](/autogen/docs/reference/agentchat/conversable_agent)\n",
|
||||
"* [FAQs](/autogen/docs/FAQ)\n",
|
||||
"* [Ecosystem](/autogen/docs/ecosystem)\n",
|
||||
"* [Contribute](/autogen/docs/Contribute)\n",
|
||||
"* [Research](/autogen/docs/Research)\n",
|
||||
"Examples* [Examples by Category](/autogen/docs/Examples)\n",
|
||||
"* [Examples by Notebook](/autogen/docs/notebooks)\n",
|
||||
"* [Application Gallery](/autogen/docs/Gallery)\n",
|
||||
"Other Languages* [Dotnet](https://microsoft.github.io/autogen-for-net/)\n",
|
||||
"[Blog](/autogen/blog)[GitHub](https://github.com/microsoft/autogen)[Discord](https://aka.ms/autogen-dc)[Twitter](https://twitter.com/pyautogen)`ctrl``K`* [Getting Started](/autogen/docs/Getting-Started)\n",
|
||||
"* [Installation](/autogen/docs/installation/)\n",
|
||||
"* [Tutorial](/autogen/docs/tutorial)\n",
|
||||
"\t+ [Introduction](/autogen/docs/tutorial/introduction)\n",
|
||||
"\t+ [Chat Termination](/autogen/docs/tutorial/chat-termination)\n",
|
||||
"\t+ [Human in the Loop](/autogen/docs/tutorial/human-in-the-loop)\n",
|
||||
"\t+ [Code Executors](/autogen/docs/tutorial/code-executors)\n",
|
||||
"\t+ [Tool Use](/autogen/docs/tutorial/tool-use)\n",
|
||||
"\t+ [Conversation Patterns](/autogen/docs/tutorial/conversation-patterns)\n",
|
||||
"\t+ [What Next?](/autogen/docs/tutorial/what-next)\n",
|
||||
"* [Use Cases](/autogen/docs/Use-Cases/agent_chat)\n",
|
||||
"* [User Guide](/autogen/docs/topics)\n",
|
||||
"\t+ [Code Execution](/autogen/docs/topics/code-execution/cli-code-executor)\n",
|
||||
"\t+ [Using Non-OpenAI Models](/autogen/docs/topics/non-openai-models/about-using-nonopenai-models)\n",
|
||||
"\t+ [LLM Caching](/autogen/docs/topics/llm-caching)\n",
|
||||
"\t+ [LLM Configuration](/autogen/docs/topics/llm_configuration)\n",
|
||||
"\t+ [Prompting and Reasoning](/autogen/docs/topics/prompting-and-reasoning/react)\n",
|
||||
"\t+ [Retrieval Augmentation](/autogen/docs/topics/retrieval_augmentation)\n",
|
||||
"\t+ [Task Decomposition](/autogen/docs/topics/task_decomposition)\n",
|
||||
"* [API Reference](/autogen/docs/reference/agentchat/conversable_agent)\n",
|
||||
"* [FAQs](/autogen/docs/FAQ)\n",
|
||||
"* [Ecosystem](/autogen/docs/ecosystem)\n",
|
||||
"* [Contributing](/autogen/docs/Contribute)\n",
|
||||
"* [Research](/autogen/docs/Research)\n",
|
||||
"On this pageGetting Started\n",
|
||||
"===============\n",
|
||||
"* [Migration Guide](/autogen/docs/Migration-Guide)\n",
|
||||
"*\n",
|
||||
"* Getting Started\n",
|
||||
"On this page\n",
|
||||
"# Getting Started\n",
|
||||
"\n",
|
||||
"AutoGen is a framework that enables development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools.\n",
|
||||
"AutoGen is a framework that enables development of LLM applications using\n",
|
||||
"multiple agents that can converse with each other to solve tasks. AutoGen agents\n",
|
||||
"are customizable, conversable, and seamlessly allow human participation. They\n",
|
||||
"can operate in various modes that employ combinations of LLMs, human inputs, and\n",
|
||||
"tools.\n",
|
||||
"\n",
|
||||
"![AutoGen Overview](/autogen/assets/images/autogen_agentchat-250ca64b77b87e70d34766a080bf6ba8.png)\n",
|
||||
"\n",
|
||||
"### Main Features[](#main-features \"Direct link to heading\")\n",
|
||||
"### Main Features[](#main-features \"Direct link to Main Features\")\n",
|
||||
"\n",
|
||||
"* AutoGen enables building next-gen LLM applications based on [multi-agent conversations](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) with minimal effort. It simplifies the orchestration, automation, and optimization of a complex LLM workflow. It maximizes the performance of LLM models and overcomes their weaknesses.\n",
|
||||
"* It supports [diverse conversation patterns](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#supporting-diverse-conversation-patterns) for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy,\n",
|
||||
"the number of agents, and agent conversation topology.\n",
|
||||
"* It provides a collection of working systems with different complexities. These systems span a [wide range of applications](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#diverse-applications-implemented-with-autogen) from various domains and complexities. This demonstrates how AutoGen can easily support diverse conversation patterns.\n",
|
||||
"* AutoGen provides [enhanced LLM inference](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification). It offers utilities like API unification and caching, and advanced usage patterns, such as error handling, multi-config inference, context programming, etc.\n",
|
||||
"* AutoGen enables building next-gen LLM applications based on [multi-agent\n",
|
||||
"conversations](/autogen/docs/Use-Cases/agent_chat) with minimal effort. It simplifies\n",
|
||||
"the orchestration, automation, and optimization of a complex LLM workflow. It\n",
|
||||
"maximizes the performance of LLM models and overcomes their weaknesses.\n",
|
||||
"* It supports [diverse conversation\n",
|
||||
"patterns](/autogen/docs/Use-Cases/agent_chat#supporting-diverse-conversation-patterns)\n",
|
||||
"for complex workflows. With customizable and conversable agents, developers can\n",
|
||||
"use AutoGen to build a wide range of conversation patterns concerning\n",
|
||||
"conversation autonomy, the number of agents, and agent conversation topology.\n",
|
||||
"* It provides a collection of working systems with different complexities. These\n",
|
||||
"systems span a [wide range of\n",
|
||||
"applications](/autogen/docs/Use-Cases/agent_chat#diverse-applications-implemented-with-autogen)\n",
|
||||
"from various domains and complexities. This demonstrates how AutoGen can\n",
|
||||
"easily support diverse conversation patterns.\n",
|
||||
"\n",
|
||||
"AutoGen is powered by collaborative [research studies](/autogen/docs/Research) from Microsoft, Penn State University, and University of Washington.\n",
|
||||
"AutoGen is powered by collaborative [research studies](/autogen/docs/Research) from\n",
|
||||
"Microsoft, Penn State University, and University of Washington.\n",
|
||||
"\n",
|
||||
"### Quickstart[](#quickstart \"Direct link to heading\")\n",
|
||||
"### Quickstart[](#quickstart \"Direct link to Quickstart\")\n",
|
||||
"\n",
|
||||
"Install from pip: `pip install pyautogen`. Find more options in [Installation](/autogen/docs/Installation).\n",
|
||||
"For [code execution](/autogen/docs/FAQ#code-execution), we strongly recommend installing the python docker package, and using docker.\n",
|
||||
"```\n",
|
||||
"pip install pyautogen\n",
|
||||
"\n",
|
||||
"#### Multi-Agent Conversation Framework[](#multi-agent-conversation-framework \"Direct link to heading\")\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"* No code execution\n",
|
||||
"* Local execution\n",
|
||||
"* Docker execution\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"from autogen import AssistantAgent, UserProxyAgent\n",
|
||||
"\n",
|
||||
"llm_config = {\"model\": \"gpt-4\", \"api_key\": os.environ[\"OPENAI_API_KEY\"]}\n",
|
||||
"assistant = AssistantAgent(\"assistant\", llm_config=llm_config)\n",
|
||||
"user_proxy = UserProxyAgent(\"user_proxy\", code_execution_config=False)\n",
|
||||
"\n",
|
||||
"# Start the chat\n",
|
||||
"user_proxy.initiate_chat(\n",
|
||||
" assistant,\n",
|
||||
" message=\"Tell me a joke about NVDA and TESLA stock prices.\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"warningWhen asked, be sure to check the generated code before continuing to ensure it is safe to run.\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"import autogen\n",
|
||||
"from autogen import AssistantAgent, UserProxyAgent\n",
|
||||
"\n",
|
||||
"llm_config = {\"model\": \"gpt-4\", \"api_key\": os.environ[\"OPENAI_API_KEY\"]}\n",
|
||||
"assistant = AssistantAgent(\"assistant\", llm_config=llm_config)\n",
|
||||
"\n",
|
||||
"user_proxy = UserProxyAgent(\n",
|
||||
" \"user_proxy\", code_execution_config={\"executor\": autogen.coding.LocalCommandLineCodeExecutor(work_dir=\"coding\")}\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Start the chat\n",
|
||||
"user_proxy.initiate_chat(\n",
|
||||
" assistant,\n",
|
||||
" message=\"Plot a chart of NVDA and TESLA stock price change YTD.\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"import autogen\n",
|
||||
"from autogen import AssistantAgent, UserProxyAgent\n",
|
||||
"\n",
|
||||
"llm_config = {\"model\": \"gpt-4\", \"api_key\": os.environ[\"OPENAI_API_KEY\"]}\n",
|
||||
"\n",
|
||||
"with autogen.coding.DockerCommandLineCodeExecutor(work_dir=\"coding\") as code_executor:\n",
|
||||
" assistant = AssistantAgent(\"assistant\", llm_config=llm_config)\n",
|
||||
" user_proxy = UserProxyAgent(\n",
|
||||
" \"user_proxy\", code_execution_config={\"executor\": code_executor}\n",
|
||||
" )\n",
|
||||
"\n",
|
||||
" # Start the chat\n",
|
||||
" user_proxy.initiate_chat(\n",
|
||||
" assistant,\n",
|
||||
" message=\"Plot a chart of NVDA and TESLA stock price change YTD. Save the plot to a file called plot.png\",\n",
|
||||
" )\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"Open `coding/plot.png` to see the generated plot.\n",
|
||||
"\n",
|
||||
"tipLearn more about configuring LLMs for agents [here](/autogen/docs/topics/llm_configuration).\n",
|
||||
"\n",
|
||||
"#### Multi-Agent Conversation Framework[](#multi-agent-conversation-framework \"Direct link to Multi-Agent Conversation Framework\")\n",
|
||||
"\n",
|
||||
"Autogen enables the next-gen LLM applications with a generic multi-agent conversation framework. It offers customizable and conversable agents which integrate LLMs, tools, and humans.\n",
|
||||
"By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For [example](https://github.com/microsoft/autogen/blob/main/test/twoagent.py),\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"from autogen import AssistantAgent, UserProxyAgent, config\\_list\\_from\\_json \n",
|
||||
" \n",
|
||||
"# Load LLM inference endpoints from an env variable or a file \n",
|
||||
"# See https://microsoft.github.io/autogen/docs/FAQ#set-your-api-endpoints \n",
|
||||
"# and OAI\\_CONFIG\\_LIST\\_sample.json \n",
|
||||
"config\\_list = config\\_list\\_from\\_json(env\\_or\\_file=\"OAI\\_CONFIG\\_LIST\") \n",
|
||||
"assistant = AssistantAgent(\"assistant\", llm\\_config={\"config\\_list\": config\\_list}) \n",
|
||||
"user\\_proxy = UserProxyAgent(\"user\\_proxy\", code\\_execution\\_config={\"work\\_dir\": \"coding\"}) \n",
|
||||
"user\\_proxy.initiate\\_chat(assistant, \n",
|
||||
"The figure below shows an example conversation flow with AutoGen.\n",
|
||||
"\n",
|
||||
"![Agent Chat Example](/autogen/assets/images/chat_example-da70a7420ebc817ef9826fa4b1e80951.png)\n",
|
||||
"\n",
|
||||
"### Where to Go Next?[](#where-to-go-next \"Direct link to Where to Go Next?\")\n",
|
||||
"\n",
|
||||
"* Go through the [tutorial](/autogen/docs/tutorial/introduction) to learn more about the core concepts in AutoGen\n",
|
||||
"* Read the examples and guides in the [notebooks section](/autogen/docs/notebooks)\n",
|
||||
"* Understand the use cases for [multi-agent conversation](/autogen/docs/Use-Cases/agent_chat) and [enhanced LLM inference](/autogen/docs/Use-Cases/enhanced_inference)\n",
|
||||
"* Read the [API](/autogen/docs/reference/agentchat/conversable_agent/) docs\n",
|
||||
"* Learn about [research](/autogen/docs/Research) around AutoGen\n",
|
||||
"* Chat on [Discord](https://aka.ms/autogen-dc)\n",
|
||||
"* Follow on [Twitter](https://twitter.com/pyautogen)\n",
|
||||
"* See our [roadmaps](https://aka.ms/autogen-roadmap)\n",
|
||||
"\n",
|
||||
"If you like our project, please give it a [star](https://github.com/microsoft/autogen/stargazers) on GitHub. If you are interested in contributing, please read [Contributor's Guide](/autogen/docs/Contribute).\n",
|
||||
"\n",
|
||||
"[Edit this page](https://github.com/microsoft/autogen/edit/main/website/docs/Getting-Started.mdx)[NextInstallation](/autogen/docs/installation/)* [Main Features](#main-features)\n",
|
||||
"* [Quickstart](#quickstart)\n",
|
||||
"* [Where to Go Next?](#where-to-go-next)\n",
|
||||
"Community* [Discord](https://aka.ms/autogen-dc)\n",
|
||||
"* [Twitter](https://twitter.com/pyautogen)\n",
|
||||
"Copyright © 2024 AutoGen Authors | [Privacy and Cookies](https://go.microsoft.com/fwlink/?LinkId=521839)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
|
@ -336,7 +512,7 @@
],
"source": [
"task3 = \"Click the 'Getting Started' result\"\n",
"user_proxy.initiate_chat(web_surfer, message=task3, clear_history=False)"
"user_proxy.initiate_chat(web_surfer, message=task3, clear_history=False);"
]
},
{
@ -370,74 +546,47 @@
|
|||
"\u001b[33mweb_surfer\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"Address: https://en.wikipedia.org/wiki/Microsoft\n",
|
||||
"Title: Microsoft - Wikipedia\n",
|
||||
"Viewport position: Showing page 1 of 64.\n",
|
||||
"Title: Microsoft\n",
|
||||
"Viewport position: Showing page 1 of 34.\n",
|
||||
"=======================\n",
|
||||
"# Microsoft\n",
|
||||
"\n",
|
||||
"American multinational technology corporation\n",
|
||||
"\n",
|
||||
"Microsoft Corporation| [A square divided into four sub-squares, colored red-orange, green, yellow and blue (clockwise), with the company name appearing to its right](/wiki/File:Microsoft_logo_(2012).svg) |\n",
|
||||
"| Building 92 on the [Microsoft Redmond campus](/wiki/Microsoft_Redmond_campus \"Microsoft Redmond campus\") |\n",
|
||||
"| Type | [Public](/wiki/Public_company \"Public company\") |\n",
|
||||
"| [Traded as](/wiki/Ticker_symbol \"Ticker symbol\") | * [Nasdaq](/wiki/Nasdaq \"Nasdaq\"): [MSFT](https://www.nasdaq.com/market-activity/stocks/msft)\n",
|
||||
"* [Nasdaq-100](/wiki/Nasdaq-100 \"Nasdaq-100\") component\n",
|
||||
"* [DJIA](/wiki/Dow_Jones_Industrial_Average \"Dow Jones Industrial Average\") component\n",
|
||||
"* [S&P 100](/wiki/S%26P_100 \"S&P 100\") component\n",
|
||||
"* [S&P 500](/wiki/S%26P_500 \"S&P 500\") component\n",
|
||||
" |\n",
|
||||
"Microsoft Corporation\n",
|
||||
"| [A square divided into four sub-squares, colored red-orange, green, yellow and blue (clockwise), with the company name appearing to its right](/wiki/File%3AMicrosoft_logo_%282012%29.svg) | |\n",
|
||||
"| --- | --- |\n",
|
||||
"| Aerial view of the [Microsoft Redmond campus](/wiki/Microsoft_Redmond_campus \"Microsoft Redmond campus\") | |\n",
|
||||
"| Company type | [Public](/wiki/Public_company \"Public company\") |\n",
|
||||
"| [Traded as](/wiki/Ticker_symbol \"Ticker symbol\") | * [Nasdaq](/wiki/Nasdaq \"Nasdaq\"): [MSFT](https://www.nasdaq.com/market-activity/stocks/msft) * [Nasdaq-100](/wiki/Nasdaq-100 \"Nasdaq-100\") component * [DJIA](/wiki/Dow_Jones_Industrial_Average \"Dow Jones Industrial Average\") component * [S&P 100](/wiki/S%26P_100 \"S&P 100\") component * [S&P 500](/wiki/S%26P_500 \"S&P 500\") component |\n",
|
||||
"| [ISIN](/wiki/International_Securities_Identification_Number \"International Securities Identification Number\") | [US5949181045](https://isin.toolforge.org/?language=en&isin=US5949181045) |\n",
|
||||
"| Industry | [Information technology](/wiki/Information_technology \"Information technology\") |\n",
|
||||
"| Founded | April 4, 1975; 48 years ago (1975-04-04) in [Albuquerque, New Mexico](/wiki/Albuquerque,_New_Mexico \"Albuquerque, New Mexico\"), U.S. |\n",
|
||||
"| Founders | * [Bill Gates](/wiki/Bill_Gates \"Bill Gates\")\n",
|
||||
"* [Paul Allen](/wiki/Paul_Allen \"Paul Allen\")\n",
|
||||
" |\n",
|
||||
"| Headquarters | [One Microsoft Way](/wiki/Microsoft_campus \"Microsoft campus\")[Redmond, Washington](/wiki/Redmond,_Washington \"Redmond, Washington\"), U.S. |\n",
|
||||
"| Founded | April 4, 1975; 48 years ago (1975-04-04) in [Albuquerque, New Mexico](/wiki/Albuquerque%2C_New_Mexico \"Albuquerque, New Mexico\"), U.S. |\n",
|
||||
"| Founders | * [Bill Gates](/wiki/Bill_Gates \"Bill Gates\") * [Paul Allen](/wiki/Paul_Allen \"Paul Allen\") |\n",
|
||||
"| Headquarters | [One Microsoft Way](/wiki/One_Microsoft_Way \"One Microsoft Way\"), [Redmond, Washington](/wiki/Redmond%2C_Washington \"Redmond, Washington\"), U.S. |\n",
|
||||
"| Area served | Worldwide |\n",
|
||||
"| Key people | * [Satya Nadella](/wiki/Satya_Nadella \"Satya Nadella\")([Chairman](/wiki/Chairman \"Chairman\") & [CEO](/wiki/Chief_executive_officer \"Chief executive officer\"))\n",
|
||||
"* [Brad Smith](/wiki/Brad_Smith_(American_lawyer) \"Brad Smith (American lawyer)\")([Vice Chairman](/wiki/Vice-Chairman \"Vice-Chairman\") & [President](/wiki/President_(corporate_title) \"President (corporate title)\"))\n",
|
||||
"* [Bill Gates](/wiki/Bill_Gates \"Bill Gates\")([technical adviser](/wiki/Adviser \"Adviser\"))\n",
|
||||
" |\n",
|
||||
"| Products | * [Software development](/wiki/Software_development \"Software development\")\n",
|
||||
"* [Computer hardware](/wiki/Computer_hardware \"Computer hardware\")\n",
|
||||
"* [Consumer electronics](/wiki/Consumer_electronics \"Consumer electronics\")\n",
|
||||
"* [Social networking service](/wiki/Social_networking_service \"Social networking service\")\n",
|
||||
"* [Cloud computing](/wiki/Cloud_computing \"Cloud computing\")\n",
|
||||
"* [Video games](/wiki/Video_game_industry \"Video game industry\")\n",
|
||||
"* [Internet](/wiki/Internet \"Internet\")\n",
|
||||
"* [Corporate venture capital](/wiki/Corporate_venture_capital \"Corporate venture capital\")\n",
|
||||
" |\n",
|
||||
"| Brands | \n",
|
||||
"* [Windows](/wiki/Microsoft_Windows \"Microsoft Windows\")\n",
|
||||
"* [Microsoft 365](/wiki/Microsoft_365 \"Microsoft 365\")\n",
|
||||
"* [Skype](/wiki/Skype \"Skype\")\n",
|
||||
"* [Visual Studio](/wiki/Visual_Studio \"Visual Studio\")\n",
|
||||
"* [Xbox](/wiki/Xbox \"Xbox\")\n",
|
||||
"* [Dynamics](/wiki/Microsoft_Dynamics_365 \"Microsoft Dynamics 365\")\n",
|
||||
"* [Surface](/wiki/Microsoft_Surface \"Microsoft Surface\")\n",
|
||||
"\n",
|
||||
" |\n",
|
||||
"| Services | \n",
|
||||
"* [Edge](/wiki/Microsoft_Edge \"Microsoft Edge\")\n",
|
||||
"* [Azure](/wiki/Microsoft_Azure \"Microsoft Azure\")\n",
|
||||
"* [Bing](/wiki/Microsoft_Bing \"Microsoft Bing\")\n",
|
||||
"* [LinkedIn](/wiki/LinkedIn \"LinkedIn\")\n",
|
||||
"* [Yammer](/wiki/Yammer \"Yammer\")\n",
|
||||
"* [Microsoft 365](/wiki/Microsoft_365 \"Microsoft 365\")\n",
|
||||
"* [OneDrive](/wiki/OneDrive \"OneDrive\")\n",
|
||||
"* [Outlook](/wiki/Microsoft_Outlook \"Microsoft Outlook\")\n",
|
||||
"* [GitHub](/wiki/GitHub \"GitHub\")\n",
|
||||
"* [Microsoft Store](/wiki/Microsoft_Store_(digital) \"Microsoft Store (digital)\")\n",
|
||||
"* [Windows Update](/wiki/Windows_Update \"Windows Update\")\n",
|
||||
"* [Xbox Game Pass](/wiki/Xbox_Game_Pass \"Xbox Game Pass\")\n",
|
||||
"* [Xbox network](/wiki/Xbox_network \"Xbox network\")\n",
|
||||
"\n",
|
||||
" |\n",
|
||||
"| Key people | * [Satya Nadella](/wiki/Satya_Nadella \"Satya Nadella\")([Chairman](/wiki/Chairman \"Chairman\") & [CEO](/wiki/Chief_executive_officer \"Chief executive officer\")) * [Brad Smith](/wiki/Brad_Smith_%28American_lawyer%29 \"Brad Smith (American lawyer)\")([Vice Chairman](/wiki/Vice-Chairman \"Vice-Chairman\") & [President](/wiki/President_%28corporate_title%29 \"President (corporate title)\")) * Bill Gates([technical adviser](/wiki/Adviser \"Adviser\")) |\n",
|
||||
"| Products | * [Software development](/wiki/Software_development \"Software development\") * [Computer hardware](/wiki/Computer_hardware \"Computer hardware\") * [Consumer electronics](/wiki/Consumer_electronics \"Consumer electronics\") * [Social networking service](/wiki/Social_networking_service \"Social networking service\") * [Cloud computing](/wiki/Cloud_computing \"Cloud computing\") * [Video games](/wiki/Video_game_industry \"Video game industry\") * [Internet](/wiki/Internet \"Internet\") * [Corporate venture capital](/wiki/Corporate_venture_capital \"Corporate venture capital\") |\n",
|
||||
"| Brands | * [Windows](/wiki/Microsoft_Windows \"Microsoft Windows\") * [Microsoft 365](/wiki/Microsoft_365 \"Microsoft 365\") * [Skype](/wiki/Skype \"Skype\") * [Visual Studio](/wiki/Visual_Studio \"Visual Studio\") * [Xbox](/wiki/Xbox \"Xbox\") * [Dynamics](/wiki/Microsoft_Dynamics_365 \"Microsoft Dynamics 365\") * [Surface](/wiki/Microsoft_Surface \"Microsoft Surface\") |\n",
|
||||
"| Services | * [Edge](/wiki/Microsoft_Edge \"Microsoft Edge\") * [Azure](/wiki/Microsoft_Azure \"Microsoft Azure\") * [Bing](/wiki/Microsoft_Bing \"Microsoft Bing\") * [LinkedIn](/wiki/LinkedIn \"LinkedIn\") * [Yammer](/wiki/Yammer \"Yammer\") * [Microsoft 365](/wiki/Microsoft_365 \"Microsoft 365\") * [OneDrive](/wiki/OneDrive \"OneDrive\") * [Outlook](/wiki/Microsoft_Outlook \"Microsoft Outlook\") * [GitHub](/wiki/GitHub \"GitHub\") * [Microsoft Store](/wiki/Microsoft_Store_%28digital%29 \"Microsoft Store (digital)\") * [Windows Update](/wiki/Windows_Update \"Windows Update\") * [Xbox Game Pass](/wiki/Xbox_Game_Pass \"Xbox Game Pass\") * [Xbox network](/wiki/Xbox_network \"Xbox network\") |\n",
|
||||
"| Revenue | Increase [US$](/wiki/United_States_dollar \"United States dollar\")211.9 billion (2023) |\n",
|
||||
"| [Operating income](/wiki/Earnings_before_interest_and_taxes \"Earnings before interest and taxes\") | Increase US$88.5 billion (2023) |\n",
|
||||
"| [Net income](/wiki/Net_income \"Net income\") | Increase US$73.4 billion (2023) |\n",
|
||||
"| [Total assets](/wiki/Asset \"Asset\") | Increase US$411.9 billion (2023) |\n",
|
||||
"| [Total equity](/wiki/Equity_(finance) \"Equity \n",
|
||||
"| [Total equity](/wiki/Equity_%28finance%29 \"Equity (finance)\") | Increase US$206.2 billion (2023) |\n",
|
||||
"| Number of employees | 221,000 (2023) |\n",
|
||||
"| [Divisions](/wiki/Division_%28business%29 \"Division (business)\") | * [Microsoft Engineering Groups](/wiki/Microsoft_engineering_groups \"Microsoft engineering groups\") * [Microsoft Digital Crimes Unit](/wiki/Microsoft_Digital_Crimes_Unit \"Microsoft Digital Crimes Unit\") * [Microsoft Press](/wiki/Microsoft_Press \"Microsoft Press\") * [Microsoft Gaming](/wiki/Microsoft_Gaming \"Microsoft Gaming\") * Microsoft AI |\n",
|
||||
"| [Subsidiaries](/wiki/Subsidiary \"Subsidiary\") | * [Microsoft Japan](/wiki/Microsoft_Japan \"Microsoft Japan\") * [Microsoft India](/wiki/Microsoft_India \"Microsoft India\") * [Microsoft Egypt](/wiki/Microsoft_Egypt \"Microsoft Egypt\") * [GitHub](/wiki/GitHub \"GitHub\") * [LinkedIn](/wiki/LinkedIn \"LinkedIn\") * [Metaswitch](/wiki/Metaswitch \"Metaswitch\") * [Nuance Communications](/wiki/Nuance_Communications \"Nuance Communications\") * [RiskIQ](/wiki/RiskIQ \"RiskIQ\") * [Skype Technologies](/wiki/Skype_Technologies \"Skype Technologies\") * [Xamarin](/wiki/Xamarin \"Xamarin\") * [Xandr](/wiki/Xandr \"Xandr\") |\n",
|
||||
"| | |\n",
|
||||
"| [ASN](/wiki/Autonomous_System_Number \"Autonomous System Number\") | * [8075](https://bgp.tools/as/8075) |\n",
|
||||
"| | |\n",
|
||||
"| Website | [microsoft.com](https://www.microsoft.com/) |\n",
|
||||
"| **Footnotes / references**Financials as of June 30, 2023[[update]](https://en.wikipedia.org/w/index.php?title=Microsoft&action=edit)[[1]](#cite_note-1) | |\n",
|
||||
"\n",
|
||||
"| | [Bill Gates in 2023](/wiki/File%3ABill_Gates_2017_%28cropped%29.jpg) | This article is part of a series about [Bill Gates](/wiki/Bill_Gates \"Bill Gates\") | | --- | --- | |\n",
|
||||
"| --- | --- | --- |\n",
|
||||
"| * [Awards and honors](/wiki/Bill_Gates#Recognition \"Bill Gates\") * [Philanthropy](/wiki/Bill_Gates#Philanthropy \"Bill Gates\") * [Political positions](/wiki/Bill_Gates#Political_positions \"Bill Gates\") * [Public image](/wiki/Bill_Gates#Public_image \"Bill Gates\") * [Residence](/wiki/Bill_Gates%27s_house \"Bill Gates's house\") --- Companies* [Traf-O-Data](/wiki/Traf-O-Data \"Traf-O-Data\") * Microsoft ([criticism](/wiki/Criticism_of_Microsoft \"Criticism of Microsoft\")) * [BEN](/wiki/Branded_Entertainment_Network \"Branded Entertainment Network\") * [Cascade Investment](/wiki/Cascade_Investment \"Cascade Investment\") * [TerraPower](/wiki/TerraPower \"TerraPower\") * [Gates Ventures](/wiki/Gates_Ventures \"Gates Ventures\") --- Charitable organizations* [Bill & Melinda Gates Foundation](/wiki/Bill_%26_Melinda_Gates_Foundation \"Bill & Melinda Gates Foundation\") * [Match for Africa](/wiki/Match_for_Africa \"Match for Africa\") * [The Giving Pledge](/wiki/The_Giving_Pledge \"The Giving Pledge\") * [OER Project](/wiki/OER_Project \"OER Project\") * [Breakthrough Energy](/wiki/Breakthrough_Energy \"Breakthrough Energy\") * [Mission Innovation](/wiki/Mission_Innovation \"Mission Innovation\") --- Writings* \"[An Open Letter to Hobbyists](/wiki/An_Open_Letter_to_Hobbyists \"An Open Letter to Hobbyists\")\" * *[The Road Ahead](/wiki/The_Road_Ahead_%28Gates_book%29 \"The Road Ahead (Gates book)\")* * *[Business @ the Speed of Thought](/wiki/Business_%40_the_Speed_of_Thought \"Business @ the Speed of Thought\")* * *[How to Avoid a Climate Disaster](/wiki/How_to_Avoid_a_Climate_Disaster \"How to Avoid a Climate Disaster\")* * *[How to Prevent the Next Pandemic](/wiki/How_to_Prevent_the_Next_Pandemic \"How to Prevent the Next Pandemic\")* --- Related* [Bill Gates' flower fly](/wiki/Bill_Gates%27_flower_fly \"Bill Gates' flower fly\") * [Codex Leicester](/wiki/Codex_Leicester \"Codex Leicester\") * *[Lost on the Grand Banks](/wiki/Lost_on_the_Grand_Banks \"Lost on the Grand Banks\")* * [History of Microsoft](/wiki/History_of_Microsoft \"History of Microsoft\") * [Timeline of Microsoft](/wiki/Timeline_of_Microsoft \"Timeline of Microsoft\") * [Paul Allen](/wiki/Paul_Allen \"Paul Allen\") --- |\n",
|
||||
"| * [v](/wiki/Template%3ABill_Gates_series \"Template:Bill Gates series\") * [t](/wiki/Template_talk%3ABill_Gates_series \"Template talk:Bill Gates series\") * [e](/wiki/Special%3AEditPage/Template%3ABill_Gates_series \"Special:EditPage/Template:Bill \n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
|
@ -445,7 +594,7 @@
],
"source": [
"task4 = \"\"\"Find Microsoft's Wikipedia page.\"\"\"\n",
"user_proxy.initiate_chat(web_surfer, message=task4, clear_history=False)"
"user_proxy.initiate_chat(web_surfer, message=task4, clear_history=False);"
]
},
{
@ -469,98 +618,40 @@
|
|||
"\u001b[33mweb_surfer\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"Address: https://en.wikipedia.org/wiki/Microsoft\n",
|
||||
"Title: Microsoft - Wikipedia\n",
|
||||
"Viewport position: Showing page 2 of 64.\n",
|
||||
"Title: Microsoft\n",
|
||||
"Viewport position: Showing page 2 of 34.\n",
|
||||
"=======================\n",
|
||||
"(finance)\") | Increase US$206.2 billion (2023) |\n",
|
||||
"| Number of employees | 238,000 (2023) |\n",
|
||||
"| [Divisions](/wiki/Division_(business) \"Division (business)\") | \n",
|
||||
"* [Microsoft Engineering Groups](/wiki/Microsoft_engineering_groups \"Microsoft engineering groups\")\n",
|
||||
"* [Microsoft Digital Crimes Unit](/wiki/Microsoft_Digital_Crimes_Unit \"Microsoft Digital Crimes Unit\")\n",
|
||||
"* [Microsoft Press](/wiki/Microsoft_Press \"Microsoft Press\")\n",
|
||||
"* [Microsoft Japan](/wiki/Microsoft_Japan \"Microsoft Japan\")\n",
|
||||
"* [Microsoft Gaming](/wiki/Microsoft_Gaming \"Microsoft Gaming\")\n",
|
||||
"Gates series\") |\n",
|
||||
"\n",
|
||||
" |\n",
|
||||
"| [Subsidiaries](/wiki/Subsidiary \"Subsidiary\") | \n",
|
||||
"* [GitHub](/wiki/GitHub \"GitHub\")\n",
|
||||
"* [LinkedIn](/wiki/LinkedIn \"LinkedIn\")\n",
|
||||
"* [Metaswitch](/wiki/Metaswitch \"Metaswitch\")\n",
|
||||
"* [Nuance Communications](/wiki/Nuance_Communications \"Nuance Communications\")\n",
|
||||
"* [RiskIQ](/wiki/RiskIQ \"RiskIQ\")\n",
|
||||
"* [Skype Technologies](/wiki/Skype_Technologies \"Skype Technologies\")\n",
|
||||
"* [OpenAI](/wiki/OpenAI \"OpenAI\") (49%)[[1]](#cite_note-1)\n",
|
||||
"* [Xamarin](/wiki/Xamarin \"Xamarin\")\n",
|
||||
"* [Xandr](/wiki/Xandr \"Xandr\")\n",
|
||||
"**Microsoft Corporation** is an American [multinational corporation](/wiki/Multinational_corporation \"Multinational corporation\") and [technology company](/wiki/Technology_company \"Technology company\") headquartered in [Redmond, Washington](/wiki/Redmond%2C_Washington \"Redmond, Washington\").[[2]](#cite_note-2) Microsoft's best-known [software products](/wiki/List_of_Microsoft_software \"List of Microsoft software\") are the [Windows](/wiki/Microsoft_Windows \"Microsoft Windows\") line of [operating systems](/wiki/List_of_Microsoft_operating_systems \"List of Microsoft operating systems\"), the [Microsoft 365](/wiki/Microsoft_365 \"Microsoft 365\") suite of productivity applications, and the [Edge](/wiki/Microsoft_Edge \"Microsoft Edge\") web browser. Its flagship [hardware products](/wiki/List_of_Microsoft_hardware \"List of Microsoft hardware\") are the [Xbox](/wiki/Xbox \"Xbox\") video game consoles and the [Microsoft Surface](/wiki/Microsoft_Surface \"Microsoft Surface\") lineup of [touchscreen](/wiki/Touchscreen \"Touchscreen\") personal computers. Microsoft ranked No. 14 in the 2022 [Fortune 500](/wiki/Fortune_500 \"Fortune 500\") rankings of the largest United States corporations by total revenue;[[3]](#cite_note-3) and it was the world's [largest software maker](/wiki/List_of_the_largest_software_companies \"List of the largest software companies\") by revenue in 2022 according to [Forbes Global 2000](/wiki/Forbes_Global_2000 \"Forbes Global 2000\"). It is considered one of the [Big Five](/wiki/Big_Tech \"Big Tech\") American [information technology](/wiki/Information_technology \"Information technology\") companies, alongside [Alphabet](/wiki/Alphabet_Inc. \"Alphabet Inc.\") (parent company of [Google](/wiki/Google \"Google\")), [Amazon](/wiki/Amazon_%28company%29 \"Amazon (company)\"), [Apple](/wiki/Apple_Inc. \"Apple Inc.\"), and [Meta](/wiki/Meta_Platforms \"Meta Platforms\") (parent company of [Facebook](/wiki/Facebook \"Facebook\")).\n",
|
||||
"\n",
|
||||
" |\n",
|
||||
"| |\n",
|
||||
"| [ASN](/wiki/Autonomous_System_Number \"Autonomous System Number\") | * [8075](https://bgp.tools/as/8075)\n",
|
||||
" |\n",
|
||||
"| |\n",
|
||||
"| Website | [microsoft.com](https://www.microsoft.com/) |\n",
|
||||
"| **Footnotes / references**Financials as of June 30, 2023[[update]](https://en.wikipedia.org/w/index.php?title=Microsoft&action=edit)[[2]](#cite_note-2) |\n",
|
||||
"Microsoft was founded by [Bill Gates](/wiki/Bill_Gates \"Bill Gates\") and [Paul Allen](/wiki/Paul_Allen \"Paul Allen\") on April 4, 1975, to develop and sell [BASIC interpreters](/wiki/BASIC_interpreter \"BASIC interpreter\") for the [Altair 8800](/wiki/Altair_8800 \"Altair 8800\"). It rose to dominate the personal computer operating system market with [MS-DOS](/wiki/MS-DOS \"MS-DOS\") in the mid-1980s, followed by Windows. The company's 1986 [initial public offering](/wiki/Initial_public_offering \"Initial public offering\") (IPO) and subsequent rise in its share price created three billionaires and an estimated 12,000 millionaires among Microsoft employees. Since the 1990s, it has increasingly diversified from the operating system market and has made several [corporate acquisitions](/wiki/List_of_mergers_and_acquisitions_by_Microsoft \"List of mergers and acquisitions by Microsoft\"), the largest being the [acquisition](/wiki/Acquisition_of_Activision_Blizzard_by_Microsoft \"Acquisition of Activision Blizzard by Microsoft\") of [Activision Blizzard](/wiki/Activision_Blizzard \"Activision Blizzard\") for $68.7 billion in October 2023,[[4]](#cite_note-4) followed by its acquisition of [LinkedIn](/wiki/LinkedIn \"LinkedIn\") for $26.2 billion in December 2016,[[5]](#cite_note-5) and its acquisition of [Skype Technologies](/wiki/Skype_Technologies \"Skype Technologies\") for $8.5 billion in May 2011.[[6]](#cite_note-6)\n",
|
||||
"\n",
|
||||
"| | | |\n",
|
||||
"| --- | --- | --- |\n",
|
||||
"| \n",
|
||||
"As of 2015[[update]](https://en.wikipedia.org/w/index.php?title=Microsoft&action=edit), Microsoft is market-dominant in the [IBM PC compatible](/wiki/IBM_PC_compatible \"IBM PC compatible\") operating system market and the office software suite market, although it has lost the majority of the overall operating system market to [Android](/wiki/Android_%28operating_system%29 \"Android (operating system)\").[[7]](#cite_note-7) The company also produces a wide range of other consumer and enterprise software for desktops, laptops, tabs, gadgets, and servers, including [Internet search](/wiki/Web_search_engine \"Web search engine\") (with [Bing](/wiki/Microsoft_Bing \"Microsoft Bing\")), the digital services market (through [MSN](/wiki/MSN \"MSN\")), [mixed reality](/wiki/Mixed_reality \"Mixed reality\") ([HoloLens](/wiki/Microsoft_HoloLens \"Microsoft HoloLens\")), cloud computing ([Azure](/wiki/Microsoft_Azure \"Microsoft Azure\")), and software development ([Visual Studio](/wiki/Microsoft_Visual_Studio \"Microsoft Visual Studio\")).\n",
|
||||
"\n",
|
||||
"| | |\n",
|
||||
"| --- | --- |\n",
|
||||
"| [Bill Gates in 2023](/wiki/File:Bill_Gates_2017_(cropped).jpg) | This article is part of a series about\n",
|
||||
"[Bill Gates](/wiki/Bill_Gates \"Bill Gates\") |\n",
|
||||
"[Steve Ballmer](/wiki/Steve_Ballmer \"Steve Ballmer\") replaced Gates as CEO in 2000 and later envisioned a \"devices and services\" strategy.[[8]](#cite_note-8) This unfolded with Microsoft acquiring [Danger Inc.](/wiki/Danger_Inc. \"Danger Inc.\") in 2008,[[9]](#cite_note-9) entering the personal computer production market for the first time in June 2012 with the launch of the Microsoft Surface line of [tablet computers](/wiki/Tablet_computer \"Tablet computer\"), and later forming [Microsoft Mobile](/wiki/Microsoft_Mobile \"Microsoft Mobile\") through the acquisition of [Nokia](/wiki/Nokia \"Nokia\")'s devices and services division. Since [Satya Nadella](/wiki/Satya_Nadella \"Satya Nadella\") took over as CEO in 2014, the company has scaled back on hardware and instead focused on [cloud computing](/wiki/Cloud_computing \"Cloud computing\"), a move that helped the company's [shares](/wiki/Share_%28finance%29 \"Share (finance)\") reach their highest value since December 1999.[[10]](#cite_note-10)[[11]](#cite_note-11) Under Nadella's direction, the company has also heavily expanded its gaming business to support the Xbox brand, establishing the [Microsoft Gaming](/wiki/Microsoft_Gaming \"Microsoft Gaming\") division in 2022, dedicated to operating Xbox in addition to its three subsidiaries ([publishers](/wiki/Video_game_publisher \"Video game publisher\")). Microsoft Gaming is the third-largest gaming company in the world by revenue as of 2023.[[12]](#cite_note-12)\n",
|
||||
"\n",
|
||||
" |\n",
|
||||
"| * [Awards and honors](/wiki/Bill_Gates#Recognition \"Bill Gates\")\n",
|
||||
"* [Philanthropy](/wiki/Bill_Gates#Philanthropy \"Bill Gates\")\n",
|
||||
"* [Political positions](/wiki/Bill_Gates#Political_positions \"Bill Gates\")\n",
|
||||
"* [Public image](/wiki/Bill_Gates#Public_image \"Bill Gates\")\n",
|
||||
"* [Residence](/wiki/Bill_Gates%27s_house \"Bill Gates's house\")\n",
|
||||
"Earlier dethroned by Apple in 2010, and in 2018, Microsoft reclaimed[*[when?](/wiki/Wikipedia%3AManual_of_Style/Dates_and_numbers#Chronological_items \"Wikipedia:Manual of Style/Dates and numbers\")*] its position as the most valuable publicly traded company in the world.[[13]](#cite_note-13) In April 2019, Microsoft reached a trillion-dollar [market cap](/wiki/Market_capitalization \"Market capitalization\"), becoming the third U.S. public company to be [valued at over $1 trillion](/wiki/Trillion-dollar_company \"Trillion-dollar company\") after Apple and Amazon, respectively. As of 2023[[update]](https://en.wikipedia.org/w/index.php?title=Microsoft&action=edit), Microsoft has the [third-highest](/wiki/List_of_most_valuable_brands \"List of most valuable brands\") global [brand valuation](/wiki/Brand_valuation \"Brand valuation\").\n",
|
||||
"\n",
|
||||
"---\n",
|
||||
"Microsoft [has been criticized](/wiki/Criticism_of_Microsoft \"Criticism of Microsoft\") for its monopolistic practices and the company's software has been criticized for problems with [ease of use](/wiki/Ease_of_use \"Ease of use\"), [robustness](/wiki/Robustness_%28computer_science%29 \"Robustness (computer science)\"), and [security](/wiki/Computer_security \"Computer security\").\n",
|
||||
"\n",
|
||||
"Companies* [Traf-O-Data](/wiki/Traf-O-Data \"Traf-O-Data\")\n",
|
||||
"* Microsoft ([criticism](/wiki/Criticism_of_Microsoft \"Criticism of Microsoft\"))\n",
|
||||
"* [BEN](/wiki/Branded_Entertainment_Network \"Branded Entertainment Network\")\n",
|
||||
"* [Cascade Investment](/wiki/Cascade_Investment \"Cascade Investment\")\n",
|
||||
"* [TerraPower](/wiki/TerraPower \"TerraPower\")\n",
|
||||
"* [Gates Ventures](/wiki/Gates_Ventures \"Gates Ventures\")\n",
|
||||
"## History\n",
|
||||
"\n",
|
||||
"---\n",
|
||||
"Main article: [History of Microsoft](/wiki/History_of_Microsoft \"History of Microsoft\")\n",
|
||||
"For a chronological guide, see [Timeline of Microsoft](/wiki/Timeline_of_Microsoft \"Timeline of Microsoft\").\n",
|
||||
"\n",
|
||||
"Charitable organizations* [Bill & Melinda Gates Foundation](/wiki/Bill_%26_Melinda_Gates_Foundation \"Bill & Melinda Gates Foundation\")\n",
|
||||
"* [Match for Africa](/wiki/Match_for_Africa \"Match for Africa\")\n",
|
||||
"* [The Giving Pledge](/wiki/The_Giving_Pledge \"The Giving Pledge\")\n",
|
||||
"* [OER Project](/wiki/OER_Project \"OER Project\")\n",
|
||||
"* [Breakthrough Energy](/wiki/Breakthrough_Energy \"Breakthrough Energy\")\n",
|
||||
"* [Mission Innovation](/wiki/Mission_Innovation \"Mission Innovation\")\n",
|
||||
"### 1972–1985: Founding\n",
|
||||
"\n",
|
||||
"---\n",
|
||||
"[![](//upload.wikimedia.org/wikipedia/commons/thumb/d/d7/Altair_8800_and_Model_33_ASR_Teletype_.jpg/256px-Altair_8800_and_Model_33_ASR_Teletype_.jpg)](/wiki/File%3AAltair_8800_and_Model_33_ASR_Teletype_.jpg)\n",
|
||||
"\n",
|
||||
"Writings* \"[An Open Letter to Hobbyists](/wiki/An_Open_Letter_to_Hobbyists \"An Open Letter to Hobbyists\")\"\n",
|
||||
"* *[The Road Ahead](/wiki/The_Road_Ahead_(Gates_book) \"The Road Ahead (Gates book)\")*\n",
|
||||
"* *[Business @ the Speed of Thought](/wiki/Business_@_the_Speed_of_Thought \"Business @ the Speed of Thought\")*\n",
|
||||
"* *[How to Avoid a Climate Disaster](/wiki/How_to_Avoid_a_Climate_Disaster \"How to Avoid a Climate Disaster\")*\n",
|
||||
"* *[How to Prevent the Next Pandemic](/wiki/How_to_Prevent_the_Next_Pandemic \"How to Prevent the Next Pandemic\")*\n",
|
||||
"An Altair 8800 computer (left) with the popular Model 33 ASR Teletype as terminal, paper tape reader, and paper tape punch\n",
|
||||
"\n",
|
||||
"---\n",
|
||||
"[![](//upload.wikimedia.org/wikipedia/en/thumb/4/4f/1981BillPaul.jpg/220px-1981BillPaul.jpg)](/wiki/File%3A1981BillPaul.jpg)\n",
|
||||
"\n",
|
||||
"Related* [Bill Gates' flower fly](/wiki/Bill_Gates%27_flower_fly \"Bill Gates' flower fly\")\n",
|
||||
"* [Codex Leicester](/wiki/Codex_Leicester \"Codex Leicester\")\n",
|
||||
"* *[Lost on the Grand Banks](/wiki/Lost_on_the_Grand_Banks \"Lost on the Grand Banks\")*\n",
|
||||
"* [History of Microsoft](/wiki/History_of_Microsoft \"History of Microsoft\")\n",
|
||||
"* [Timeline of Microsoft](/wiki/Timeline_of_Microsoft \"Timeline of Microsoft\")\n",
|
||||
"* [Paul Allen](/wiki/Paul_Allen \"Paul Allen\")\n",
|
||||
"[Paul Allen](/wiki/Paul_Allen \"Paul Allen\") and [Bill Gates](/wiki/Bill_Gates \"Bill Gates\") on October 19, 1981, after signing a pivotal contract with [IBM](/wiki/IBM \"IBM\")[[14]](#cite_note-Allan_2001-14): 228\n",
|
||||
"\n",
|
||||
"---\n",
|
||||
"[![](//upload.wikimedia.org/wikipedia/commons/thumb/f/f1/Bill_Gates_and_Paul_Allen_Business_Cards.jpg/220px-Bill_Gates_and_Paul_Allen_Business_Cards.jpg)](/wiki/File%3ABill_Gates_and_Paul_Allen_Business_Cards.jpg)\n",
|
||||
"\n",
|
||||
" |\n",
|
||||
"| * [v](/wiki/Template:Bill_Gates_series \"Template:Bill Gates series\")\n",
|
||||
"* [t](/wiki/Template_talk:Bill_Gates_series \"Template talk:Bill Gates series\")\n",
|
||||
"* [e](/wiki/Special:EditPage/Template:Bill_Gates_series \"Special:EditPage/Template:Bill Gates series\")\n",
|
||||
" |\n",
|
||||
"\n",
|
||||
"**Microsoft Corporation** is an American multinational [technology corporation](/wiki/Technology_company \n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
|
@ -568,7 +659,7 @@
|
|||
],
|
||||
"source": [
|
||||
"task5 = \"\"\"Scroll down.\"\"\"\n",
|
||||
"user_proxy.initiate_chat(web_surfer, message=task5, clear_history=False)"
|
||||
"user_proxy.initiate_chat(web_surfer, message=task5, clear_history=False);"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -582,24 +673,24 @@
|
|||
"text": [
|
||||
"\u001b[33muser_proxy\u001b[0m (to web_surfer):\n",
|
||||
"\n",
|
||||
"Where was the first office location, and when did they move to Redmond?\n",
|
||||
"Read the page and answer: Where was the first office location, and when did they move to Redmond?\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n",
|
||||
"\u001b[31m\n",
|
||||
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
|
||||
"\u001b[35m\n",
|
||||
">>>>>>>> EXECUTING FUNCTION answer_from_page...\u001b[0m\n",
|
||||
">>>>>>>> EXECUTING FUNCTION read_page_and_answer...\u001b[0m\n",
|
||||
"\u001b[33mweb_surfer\u001b[0m (to user_proxy):\n",
|
||||
"\n",
|
||||
"Microsoft's first office location was in Albuquerque, New Mexico, where it was founded on April 4, 1975. However, Microsoft later moved its headquarters to Redmond, Washington in January 1979. Since then, Redmond has been the main office location for Microsoft.\n",
|
||||
"Microsoft Corporation, an American multinational technology company, was founded on April 4, 1975, in Albuquerque, New Mexico, by Bill Gates and Paul Allen. The company's first office location was in Albuquerque, but they later moved their headquarters to Redmond, Washington. The move to Redmond occurred in January 1979. Since then, Microsoft has become a major player in the technology industry, developing and selling software products such as the Windows operating system, Microsoft Office suite, and Xbox video game consoles. They have also expanded into cloud computing with Microsoft Azure and have made acquisitions such as Nokia's mobile unit and LinkedIn.\n",
|
||||
"\n",
|
||||
"--------------------------------------------------------------------------------\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"task6 = \"\"\"Where was the first office location, and when did they move to Redmond?\"\"\"\n",
|
||||
"user_proxy.initiate_chat(web_surfer, message=task6, clear_history=False)"
|
||||
"task6 = \"\"\"Read the page and answer: Where was the first office location, and when did they move to Redmond?\"\"\"\n",
|
||||
"user_proxy.initiate_chat(web_surfer, message=task6, clear_history=False);"
|
||||
]
|
||||
}
|
||||
],
|
||||
|
|
|
@ -0,0 +1,45 @@
|
|||
import os
|
||||
|
||||
from autogen import UserProxyAgent, config_list_from_json
|
||||
from autogen.agentchat.contrib.web_surfer import WebSurferAgent
|
||||
from autogen.browser_utils import (
|
||||
BingMarkdownSearch,
|
||||
PlaywrightMarkdownBrowser,
|
||||
RequestsMarkdownBrowser,
|
||||
SeleniumMarkdownBrowser,
|
||||
)
|
||||
|
||||
|
||||
def main():
|
||||
# Load LLM inference endpoints from an env variable or a file
|
||||
# See https://microsoft.github.io/autogen/docs/FAQ#set-your-api-endpoints
|
||||
# and OAI_CONFIG_LIST_sample.
|
||||
# For example, if you have created a OAI_CONFIG_LIST file in the current working directory, that file will be used.
|
||||
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
|
||||
|
||||
browser = RequestsMarkdownBrowser(
|
||||
# PlaywrightMarkdownBrowser(
|
||||
viewport_size=1024 * 3,
|
||||
downloads_folder=os.getcwd(),
|
||||
search_engine=BingMarkdownSearch(bing_api_key=os.environ["BING_API_KEY"]),
|
||||
# launch_args={"channel": "msedge", "headless": False},
|
||||
)
|
||||
|
||||
web_surfer = WebSurferAgent(
|
||||
"web_surfer",
|
||||
llm_config={"config_list": config_list},
|
||||
summarizer_llm_config={"config_list": config_list},
|
||||
is_termination_msg=lambda x: x.get("content", "").rstrip().find("TERMINATE") >= 0,
|
||||
code_execution_config=False,
|
||||
browser=browser,
|
||||
)
|
||||
|
||||
# Create the agent that represents the user in the conversation.
|
||||
user_proxy = UserProxyAgent("user", code_execution_config=False)
|
||||
|
||||
# Let the assistant start the conversation. It will end when the user types exit.
|
||||
web_surfer.initiate_chat(user_proxy, message="How can I help you today?")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
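The example above runs interactively: the WebSurferAgent opens the conversation and waits for user input. For unattended runs, the same objects can be driven with task strings in the style of the notebook cells earlier in this diff. The sketch below is illustrative and not part of this commit; it reuses the browser and config_list built in main(), and the task strings and the human_input_mode="NEVER" setting are assumptions.

# Hedged sketch (not part of this commit): drive the surfer without prompting the user.
web_surfer = WebSurferAgent(
    "web_surfer",
    llm_config={"config_list": config_list},
    summarizer_llm_config={"config_list": config_list},
    browser=browser,
)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",  # assumption: run unattended instead of asking for input
    code_execution_config=False,
)
user_proxy.initiate_chat(web_surfer, message="Search the web for 'Microsoft AutoGen' and visit the first result.")
user_proxy.initiate_chat(web_surfer, message="Summarize the page.", clear_history=False)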
|
15
setup.py
|
@ -81,7 +81,20 @@ extra_require = {
|
|||
"graph": ["networkx", "matplotlib"],
|
||||
"gemini": ["google-generativeai>=0.5,<1", "google-cloud-aiplatform", "google-auth", "pillow", "pydantic"],
|
||||
"together": ["together>=1.2"],
|
||||
"websurfer": ["beautifulsoup4", "markdownify", "pdfminer.six", "pathvalidate"],
|
||||
"websurfer": [
|
||||
"beautifulsoup4",
|
||||
"markdownify",
|
||||
"pathvalidate",
|
||||
# for mdconvert
|
||||
"puremagic", # File identification
|
||||
"binaryornot", # More file identification
|
||||
"pdfminer.six", # Pdf
|
||||
"mammoth", # Docx
|
||||
"python-pptx", # Ppts
|
||||
"pandas", # Xlsx
|
||||
"openpyxl",
|
||||
"youtube_transcript_api==0.6.0", # Transcription
|
||||
],
|
||||
"redis": ["redis"],
|
||||
"cosmosdb": ["azure-cosmos>=4.2.0"],
|
||||
"websockets": ["websockets>=12.0,<13"],
|
||||
|
|
|
@ -81,16 +81,9 @@ def test_web_surfer() -> None:
|
|||
response = function_map["page_down"]()
|
||||
assert f"Viewport position: Showing page {total_pages} of {total_pages}." in response
|
||||
|
||||
# Test web search -- we don't have a key in this case, so we expect it to raise an error (but it means the code path is correct)
|
||||
with pytest.raises(ValueError, match="Missing Bing API key."):
|
||||
response = function_map["informational_web_search"](BING_QUERY)
|
||||
|
||||
with pytest.raises(ValueError, match="Missing Bing API key."):
|
||||
response = function_map["navigational_web_search"](BING_QUERY)
|
||||
|
||||
# Test Q&A and summarization -- we don't have a key so we expect it to fail (but it means the code path is correct)
|
||||
with pytest.raises(IndexError):
|
||||
response = function_map["answer_from_page"]("When was it founded?")
|
||||
response = function_map["read_page_and_answer"]("When was it founded?")
|
||||
|
||||
with pytest.raises(IndexError):
|
||||
response = function_map["summarize_page"]()
|
||||
|
@ -155,7 +148,7 @@ def test_web_surfer_bing() -> None:
|
|||
"config_list": [
|
||||
{
|
||||
"model": "gpt-3.5-turbo-16k",
|
||||
"api_key": "sk-PLACEHOLDER_KEY",
|
||||
"api_key": MOCK_OPEN_AI_API_KEY,
|
||||
}
|
||||
]
|
||||
},
|
||||
|
@ -167,7 +160,7 @@ def test_web_surfer_bing() -> None:
|
|||
|
||||
# Test informational queries
|
||||
response = function_map["informational_web_search"](BING_QUERY)
|
||||
assert f"Address: bing: {BING_QUERY}" in response
|
||||
assert f"Address: search: {BING_QUERY}" in response
|
||||
assert f"Title: {BING_QUERY} - Search" in response
|
||||
assert "Viewport position: Showing page 1 of 1." in response
|
||||
assert f"A Bing search for '{BING_QUERY}' found " in response
|
||||
|
|
|
@ -0,0 +1,49 @@
|
|||
#!/usr/bin/env python3 -m pytest
|
||||
import os
|
||||
|
||||
import pytest
|
||||
|
||||
try:
|
||||
from autogen.browser_utils import BingMarkdownSearch
|
||||
except ImportError:
|
||||
skip_all = True
|
||||
else:
|
||||
skip_all = False
|
||||
|
||||
bing_api_key = None
|
||||
if "BING_API_KEY" in os.environ:
|
||||
bing_api_key = os.environ["BING_API_KEY"]
|
||||
del os.environ["BING_API_KEY"]
|
||||
skip_api = bing_api_key is None
|
||||
|
||||
BING_QUERY = "Microsoft wikipedia"
|
||||
BING_STRING = f"A Bing search for '{BING_QUERY}' found"
|
||||
BING_EXPECTED_RESULT = "https://en.wikipedia.org/wiki/Microsoft"
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_all,
|
||||
reason="do not run if dependency is not installed",
|
||||
)
|
||||
def test_bing_markdown_search():
|
||||
search_engine = BingMarkdownSearch()
|
||||
results = search_engine.search(BING_QUERY)
|
||||
assert BING_STRING in results
|
||||
assert BING_EXPECTED_RESULT in results
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_api,
|
||||
reason="skipping tests that require a Bing API key",
|
||||
)
|
||||
def test_bing_markdown_search_api():
|
||||
search_engine = BingMarkdownSearch(bing_api_key=bing_api_key)
|
||||
results = search_engine.search(BING_QUERY)
|
||||
assert BING_STRING in results
|
||||
assert BING_EXPECTED_RESULT in results
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
"""Runs this file's tests from the command line."""
|
||||
test_bing_markdown_search()
|
||||
test_bing_markdown_search_api()
|
Binary file not shown.
|
@ -0,0 +1,3 @@
|
|||
version https://git-lfs.github.com/spec/v1
|
||||
oid sha256:9390b34525fd044df69265e022a06346abb6d203b14cbc9b2473c080c680e82e
|
||||
size 474288
|
Binary file not shown.
Binary file not shown.
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
|
@ -0,0 +1,175 @@
|
|||
#!/usr/bin/env python3 -m pytest
|
||||
import io
|
||||
import os
|
||||
import shutil
|
||||
|
||||
import pytest
|
||||
import requests
|
||||
|
||||
try:
|
||||
from autogen.browser_utils import FileConversionException, MarkdownConverter, UnsupportedFormatException
|
||||
except ImportError:
|
||||
skip_all = True
|
||||
else:
|
||||
skip_all = False
|
||||
|
||||
skip_exiftool = shutil.which("exiftool") is None
|
||||
|
||||
TEST_FILES_DIR = os.path.join(os.path.dirname(__file__), "test_files")
|
||||
|
||||
JPG_TEST_EXIFTOOL = {
|
||||
"Author": "AutoGen Authors",
|
||||
"Title": "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation",
|
||||
"Description": "AutoGen enables diverse LLM-based applications",
|
||||
"ImageSize": "1615x1967",
|
||||
"DateTimeOriginal": "2024:03:14 22:10:00",
|
||||
}
|
||||
|
||||
PDF_TEST_URL = "https://arxiv.org/pdf/2308.08155v2.pdf"
|
||||
PDF_TEST_STRINGS = ["While there is contemporaneous exploration of multi-agent approaches"]
|
||||
|
||||
YOUTUBE_TEST_URL = "https://www.youtube.com/watch?v=V2qZ_lgxTzg"
|
||||
YOUTUBE_TEST_STRINGS = [
|
||||
"## AutoGen FULL Tutorial with Python (Step-By-Step)",
|
||||
"This is an intermediate tutorial for installing and using AutoGen locally",
|
||||
"PT15M4S",
|
||||
"the model we're going to be using today is GPT 3.5 turbo", # From the transcript
|
||||
]
|
||||
|
||||
XLSX_TEST_STRINGS = [
|
||||
"## 09060124-b5e7-4717-9d07-3c046eb",
|
||||
"6ff4173b-42a5-4784-9b19-f49caff4d93d",
|
||||
"affc7dad-52dc-4b98-9b5d-51e65d8a8ad0",
|
||||
]
|
||||
|
||||
DOCX_TEST_STRINGS = [
|
||||
"314b0a30-5b04-470b-b9f7-eed2c2bec74a",
|
||||
"49e168b7-d2ae-407f-a055-2167576f39a1",
|
||||
"## d666f1f7-46cb-42bd-9a39-9a39cf2a509f",
|
||||
"# Abstract",
|
||||
"# Introduction",
|
||||
"AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation",
|
||||
]
|
||||
|
||||
PPTX_TEST_STRINGS = [
|
||||
"2cdda5c8-e50e-4db4-b5f0-9722a649f455",
|
||||
"04191ea8-5c73-4215-a1d3-1cfb43aaaf12",
|
||||
"44bf7d06-5e7a-4a40-a2e1-a2e42ef28c8a",
|
||||
"1b92870d-e3b5-4e65-8153-919f4ff45592",
|
||||
"AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation",
|
||||
]
|
||||
|
||||
BLOG_TEST_URL = "https://microsoft.github.io/autogen/blog/2023/04/21/LLM-tuning-math"
|
||||
BLOG_TEST_STRINGS = [
|
||||
"Large language models (LLMs) are powerful tools that can generate natural language texts for various applications, such as chatbots, summarization, translation, and more. GPT-4 is currently the state of the art LLM in the world. Is model selection irrelevant? What about inference parameters?",
|
||||
"an example where high cost can easily prevent a generic complex",
|
||||
]
|
||||
|
||||
WIKIPEDIA_TEST_URL = "https://en.wikipedia.org/wiki/Microsoft"
|
||||
WIKIPEDIA_TEST_STRINGS = [
|
||||
"Microsoft entered the operating system (OS) business in 1980 with its own version of [Unix]",
|
||||
'Microsoft was founded by [Bill Gates](/wiki/Bill_Gates "Bill Gates")',
|
||||
]
|
||||
WIKIPEDIA_TEST_EXCLUDES = [
|
||||
"You are encouraged to create an account and log in",
|
||||
"154 languages",
|
||||
"move to sidebar",
|
||||
]
|
||||
|
||||
SERP_TEST_URL = "https://www.bing.com/search?q=microsoft+wikipedia"
|
||||
SERP_TEST_STRINGS = [
|
||||
"](https://en.wikipedia.org/wiki/Microsoft",
|
||||
"Microsoft Corporation is **an American multinational corporation and technology company headquartered** in Redmond",
|
||||
"1995–2007: Foray into the Web, Windows 95, Windows XP, and Xbox",
|
||||
]
|
||||
SERP_TEST_EXCLUDES = [
|
||||
"https://www.bing.com/ck/a?!&&p=",
|
||||
"data:image/svg+xml,%3Csvg%20width%3D",
|
||||
]
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_all,
|
||||
reason="do not run if dependency is not installed",
|
||||
)
|
||||
def test_mdconvert_remote():
|
||||
mdconvert = MarkdownConverter()
|
||||
|
||||
# By URL
|
||||
result = mdconvert.convert(PDF_TEST_URL)
|
||||
for test_string in PDF_TEST_STRINGS:
|
||||
assert test_string in result.text_content
|
||||
|
||||
# By stream
|
||||
response = requests.get(PDF_TEST_URL)
|
||||
result = mdconvert.convert_stream(io.BytesIO(response.content), file_extension=".pdf", url=PDF_TEST_URL)
|
||||
for test_string in PDF_TEST_STRINGS:
|
||||
assert test_string in result.text_content
|
||||
|
||||
# # Youtube
|
||||
# result = mdconvert.convert(YOUTUBE_TEST_URL)
|
||||
# for test_string in YOUTUBE_TEST_STRINGS:
|
||||
# assert test_string in result.text_content
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_all,
|
||||
reason="do not run if dependency is not installed",
|
||||
)
|
||||
def test_mdconvert_local():
|
||||
mdconvert = MarkdownConverter()
|
||||
|
||||
# Test XLSX processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test.xlsx"))
|
||||
for test_string in XLSX_TEST_STRINGS:
|
||||
assert test_string in result.text_content.replace(r"\-", "-")
|
||||
|
||||
# Test DOCX processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test.docx"))
|
||||
for test_string in DOCX_TEST_STRINGS:
|
||||
assert test_string in result.text_content.replace(r"\-", "-")
|
||||
|
||||
# Test PPTX processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test.pptx"))
|
||||
for test_string in PPTX_TEST_STRINGS:
|
||||
assert test_string in result.text_content.replace(r"\-", "-")
|
||||
|
||||
# Test HTML processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test_blog.html"), url=BLOG_TEST_URL)
|
||||
for test_string in BLOG_TEST_STRINGS:
|
||||
assert test_string in result.text_content.replace(r"\-", "-")
|
||||
|
||||
# Test Wikipedia processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test_wikipedia.html"), url=WIKIPEDIA_TEST_URL)
|
||||
for test_string in WIKIPEDIA_TEST_EXCLUDES:
|
||||
assert test_string not in result.text_content.replace(r"\-", "-")
|
||||
for test_string in WIKIPEDIA_TEST_STRINGS:
|
||||
assert test_string in result.text_content.replace(r"\-", "-")
|
||||
|
||||
# Test Bing processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test_serp.html"), url=SERP_TEST_URL)
|
||||
for test_string in SERP_TEST_EXCLUDES:
|
||||
assert test_string not in result.text_content.replace(r"\-", "-")
|
||||
for test_string in SERP_TEST_STRINGS:
|
||||
assert test_string in result.text_content.replace(r"\-", "-")
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_exiftool,
|
||||
reason="do not run if exiftool is not installed",
|
||||
)
|
||||
def test_mdconvert_exiftool():
|
||||
mdconvert = MarkdownConverter()
|
||||
|
||||
# Test JPG metadata processing
|
||||
result = mdconvert.convert(os.path.join(TEST_FILES_DIR, "test.jpg"))
|
||||
for key in JPG_TEST_EXIFTOOL:
|
||||
target = f"{key}: {JPG_TEST_EXIFTOOL[key]}"
|
||||
assert target in result.text_content
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
"""Runs this file's tests from the command line."""
|
||||
test_mdconvert_remote()
|
||||
test_mdconvert_local()
|
||||
test_mdconvert_exiftool()
|
|
@ -0,0 +1,226 @@
|
|||
#!/usr/bin/env python3 -m pytest
|
||||
|
||||
import hashlib
|
||||
import math
|
||||
import os
|
||||
import pathlib
|
||||
import re
|
||||
import sys
|
||||
|
||||
import pytest
|
||||
import requests
|
||||
|
||||
BLOG_POST_URL = "https://microsoft.github.io/autogen/blog/2023/04/21/LLM-tuning-math"
|
||||
BLOG_POST_TITLE = "Does Model and Inference Parameter Matter in LLM Applications? - A Case Study for MATH | AutoGen"
|
||||
BLOG_POST_STRING = "powerful tools that can generate natural language texts for various applications"
|
||||
BLOG_POST_FIND_ON_PAGE_QUERY = "an example where high * complex"
|
||||
BLOG_POST_FIND_ON_PAGE_MATCH = "an example where high cost can easily prevent a generic complex"
|
||||
|
||||
WIKIPEDIA_URL = "https://en.wikipedia.org/wiki/Microsoft"
|
||||
WIKIPEDIA_TITLE = "Microsoft"
|
||||
WIKIPEDIA_STRING = "Redmond"
|
||||
|
||||
PLAIN_TEXT_URL = "https://raw.githubusercontent.com/microsoft/autogen/main/README.md"
|
||||
|
||||
DOWNLOAD_URL = "https://arxiv.org/src/2308.08155"
|
||||
|
||||
PDF_URL = "https://arxiv.org/pdf/2308.08155.pdf"
|
||||
PDF_STRING = "Figure 1: AutoGen enables diverse LLM-based applications using multi-agent conversations."
|
||||
|
||||
DIR_TEST_STRINGS = [
|
||||
"# Index of ",
|
||||
"[.. (parent directory)]",
|
||||
"/test/browser_utils/test_requests_markdown_browser.py",
|
||||
]
|
||||
|
||||
LOCAL_FILE_TEST_STRINGS = [
|
||||
BLOG_POST_STRING,
|
||||
BLOG_POST_FIND_ON_PAGE_MATCH,
|
||||
]
|
||||
|
||||
try:
|
||||
from autogen.browser_utils import BingMarkdownSearch, RequestsMarkdownBrowser
|
||||
except ImportError:
|
||||
skip_all = True
|
||||
else:
|
||||
skip_all = False
|
||||
|
||||
|
||||
def _rm_folder(path):
|
||||
"""Remove all the regular files in a folder, then deletes the folder. Assumes a flat file structure, with no subdirectories."""
|
||||
for fname in os.listdir(path):
|
||||
fpath = os.path.join(path, fname)
|
||||
if os.path.isfile(fpath):
|
||||
os.unlink(fpath)
|
||||
os.rmdir(path)
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_all,
|
||||
reason="do not run if dependency is not installed",
|
||||
)
|
||||
def test_requests_markdown_browser():
|
||||
# Create a downloads folder (removing any leftover ones from prior tests)
|
||||
downloads_folder = os.path.join(os.getcwd(), "downloads")
|
||||
if os.path.isdir(downloads_folder):
|
||||
_rm_folder(downloads_folder)
|
||||
os.mkdir(downloads_folder)
|
||||
|
||||
# Instantiate the browser
|
||||
viewport_size = 1024
|
||||
browser = RequestsMarkdownBrowser(
|
||||
viewport_size=viewport_size,
|
||||
downloads_folder=downloads_folder,
|
||||
search_engine=BingMarkdownSearch(),
|
||||
)
|
||||
|
||||
# Test that we can visit a page and find what we expect there
|
||||
top_viewport = browser.visit_page(BLOG_POST_URL)
|
||||
assert browser.viewport == top_viewport
|
||||
assert browser.page_title.strip() == BLOG_POST_TITLE.strip()
|
||||
assert BLOG_POST_STRING in browser.page_content
|
||||
|
||||
# Check if page splitting works
|
||||
approx_pages = math.ceil(len(browser.page_content) / viewport_size) # May be fewer, since it aligns to word breaks
|
||||
assert len(browser.viewport_pages) <= approx_pages
|
||||
assert abs(len(browser.viewport_pages) - approx_pages) <= 1 # allow only a small deviation
|
||||
assert browser.viewport_pages[0][0] == 0
|
||||
assert browser.viewport_pages[-1][1] == len(browser.page_content)
|
||||
|
||||
# Make sure we can reconstruct the full contents from the split pages
|
||||
buffer = ""
|
||||
for bounds in browser.viewport_pages:
|
||||
buffer += browser.page_content[bounds[0] : bounds[1]]
|
||||
assert buffer == browser.page_content
|
||||
|
||||
# Test scrolling (scroll all the way to the bottom)
|
||||
for i in range(1, len(browser.viewport_pages)):
|
||||
browser.page_down()
|
||||
assert browser.viewport_current_page == i
|
||||
# Test scrolling beyond the limits
|
||||
for i in range(0, 5):
|
||||
browser.page_down()
|
||||
assert browser.viewport_current_page == len(browser.viewport_pages) - 1
|
||||
|
||||
# Test scrolling (scroll all the way to the bottom)
|
||||
for i in range(len(browser.viewport_pages) - 2, 0, -1):
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page == i
|
||||
# Test scrolling beyond the limits
|
||||
for i in range(0, 5):
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page == 0
|
||||
|
||||
# Test Wikipedia handling
|
||||
assert WIKIPEDIA_STRING in browser.visit_page(WIKIPEDIA_URL)
|
||||
assert WIKIPEDIA_TITLE.strip() == browser.page_title.strip()
|
||||
|
||||
# Visit a plain-text file
|
||||
# response = requests.get(PLAIN_TEXT_URL)
|
||||
# response.raise_for_status()
|
||||
# expected_results = re.sub(r"\s+", " ", response.text, re.DOTALL).strip()
|
||||
# browser.visit_page(PLAIN_TEXT_URL)
|
||||
# assert re.sub(r"\s+", " ", browser.page_content, re.DOTALL).strip() == expected_results
|
||||
|
||||
# Directly download a ZIP file and compute its md5
|
||||
response = requests.get(DOWNLOAD_URL, stream=True)
|
||||
response.raise_for_status()
|
||||
expected_md5 = hashlib.md5(response.raw.read()).hexdigest()
|
||||
|
||||
# Download it with the browser and check for a match
|
||||
viewport = browser.visit_page(DOWNLOAD_URL)
|
||||
m = re.search(r"Saved file to '(.*?)'", viewport)
|
||||
download_loc = m.group(1)
|
||||
with open(download_loc, "rb") as fh:
|
||||
downloaded_md5 = hashlib.md5(fh.read()).hexdigest()
|
||||
|
||||
# MD5s should match
|
||||
assert expected_md5 == downloaded_md5
|
||||
|
||||
# Fetch a PDF
|
||||
viewport = browser.visit_page(PDF_URL)
|
||||
assert PDF_STRING in viewport
|
||||
|
||||
# Test find in page
|
||||
browser.visit_page(BLOG_POST_URL)
|
||||
find_viewport = browser.find_on_page(BLOG_POST_FIND_ON_PAGE_QUERY)
|
||||
assert BLOG_POST_FIND_ON_PAGE_MATCH in find_viewport
|
||||
assert find_viewport is not None
|
||||
|
||||
loc = browser.viewport_current_page
|
||||
find_viewport = browser.find_on_page("LLM app*")
|
||||
assert find_viewport is not None
|
||||
|
||||
# Find next using the same query
|
||||
for i in range(0, 10):
|
||||
find_viewport = browser.find_on_page("LLM app*")
|
||||
assert find_viewport is not None
|
||||
|
||||
new_loc = browser.viewport_current_page
|
||||
assert new_loc != loc
|
||||
loc = new_loc
|
||||
|
||||
# Find next using find_next
|
||||
for i in range(0, 10):
|
||||
find_viewport = browser.find_next()
|
||||
assert find_viewport is not None
|
||||
|
||||
new_loc = browser.viewport_current_page
|
||||
assert new_loc != loc
|
||||
loc = new_loc
|
||||
|
||||
# Bounce around
|
||||
browser.viewport_current_page = 0
|
||||
find_viewport = browser.find_on_page("For Further Reading")
|
||||
assert find_viewport is not None
|
||||
loc = browser.viewport_current_page
|
||||
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page != loc
|
||||
find_viewport = browser.find_on_page("For Further Reading")
|
||||
assert find_viewport is not None
|
||||
assert loc == browser.viewport_current_page
|
||||
|
||||
# Find something that doesn't exist
|
||||
find_viewport = browser.find_on_page("7c748f9a-8dce-461f-a092-4e8d29913f2d")
|
||||
assert find_viewport is None
|
||||
assert loc == browser.viewport_current_page # We didn't move
|
||||
|
||||
# Clean up
|
||||
_rm_folder(downloads_folder)
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
skip_all,
|
||||
reason="do not run if dependency is not installed",
|
||||
)
|
||||
def test_local_file_browsing():
|
||||
directory = os.path.dirname(__file__)
|
||||
test_file = os.path.join(directory, "test_files", "test_blog.html")
|
||||
browser = RequestsMarkdownBrowser()
|
||||
|
||||
# Directory listing via open_local_file
|
||||
viewport = browser.open_local_file(directory)
|
||||
for target_string in DIR_TEST_STRINGS:
|
||||
assert target_string in viewport
|
||||
|
||||
# Directory listing via file URI
|
||||
viewport = browser.visit_page(pathlib.Path(os.path.abspath(directory)).as_uri())
|
||||
for target_string in DIR_TEST_STRINGS:
|
||||
assert target_string in viewport
|
||||
|
||||
# File access via file open_local_file
|
||||
browser.open_local_file(test_file)
|
||||
for target_string in LOCAL_FILE_TEST_STRINGS:
|
||||
assert target_string in browser.page_content
|
||||
|
||||
# File access via file URI
|
||||
browser.visit_page(pathlib.Path(os.path.abspath(test_file)).as_uri())
|
||||
for target_string in LOCAL_FILE_TEST_STRINGS:
|
||||
assert target_string in browser.page_content
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
"""Runs this file's tests from the command line."""
|
||||
test_requests_markdown_browser()
|
||||
test_local_file_browsing()
|
|
@ -1,10 +1,9 @@
|
|||
#!/usr/bin/env python3 -m pytest
|
||||
|
||||
import hashlib
|
||||
import math
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import tempfile
|
||||
|
||||
import pytest
|
||||
import requests
|
||||
|
@ -57,96 +56,91 @@ def _rm_folder(path):
|
|||
reason="do not run if dependency is not installed",
|
||||
)
|
||||
def test_simple_text_browser():
|
||||
# Create a downloads folder (removing any leftover ones from prior tests)
|
||||
downloads_folder = os.path.join(KEY_LOC, "downloads")
|
||||
if os.path.isdir(downloads_folder):
|
||||
_rm_folder(downloads_folder)
|
||||
os.mkdir(downloads_folder)
|
||||
# Create a temp downloads folder (removing any leftover ones from prior tests)
|
||||
with tempfile.TemporaryDirectory() as downloads_folder:
|
||||
# Instantiate the browser
|
||||
user_agent = "python-requests/" + requests.__version__
|
||||
viewport_size = 1024
|
||||
browser = SimpleTextBrowser(
|
||||
downloads_folder=downloads_folder,
|
||||
viewport_size=viewport_size,
|
||||
request_kwargs={
|
||||
"headers": {"User-Agent": user_agent},
|
||||
},
|
||||
)
|
||||
|
||||
# Instantiate the browser
|
||||
user_agent = "python-requests/" + requests.__version__
|
||||
viewport_size = 1024
|
||||
browser = SimpleTextBrowser(
|
||||
downloads_folder=downloads_folder,
|
||||
viewport_size=viewport_size,
|
||||
request_kwargs={
|
||||
"headers": {"User-Agent": user_agent},
|
||||
},
|
||||
)
|
||||
# Test that we can visit a page and find what we expect there
|
||||
top_viewport = browser.visit_page(BLOG_POST_URL)
|
||||
assert browser.viewport == top_viewport
|
||||
assert browser.page_title.strip() == BLOG_POST_TITLE.strip()
|
||||
assert BLOG_POST_STRING in browser.page_content
|
||||
|
||||
# Test that we can visit a page and find what we expect there
|
||||
top_viewport = browser.visit_page(BLOG_POST_URL)
|
||||
assert browser.viewport == top_viewport
|
||||
assert browser.page_title.strip() == BLOG_POST_TITLE.strip()
|
||||
assert BLOG_POST_STRING in browser.page_content.replace("\n\n", " ").replace("\\", "")
|
||||
# Check if page splitting works
|
||||
approx_pages = math.ceil(
|
||||
len(browser.page_content) / viewport_size
|
||||
) # May be fewer, since it aligns to word breaks
|
||||
assert len(browser.viewport_pages) <= approx_pages
|
||||
assert abs(len(browser.viewport_pages) - approx_pages) <= 1 # allow only a small deviation
|
||||
assert browser.viewport_pages[0][0] == 0
|
||||
assert browser.viewport_pages[-1][1] == len(browser.page_content)
|
||||
|
||||
# Check if page splitting works
|
||||
approx_pages = math.ceil(len(browser.page_content) / viewport_size) # May be fewer, since it aligns to word breaks
|
||||
assert len(browser.viewport_pages) <= approx_pages
|
||||
assert abs(len(browser.viewport_pages) - approx_pages) <= 1 # allow only a small deviation
|
||||
assert browser.viewport_pages[0][0] == 0
|
||||
assert browser.viewport_pages[-1][1] == len(browser.page_content)
|
||||
# Make sure we can reconstruct the full contents from the split pages
|
||||
buffer = ""
|
||||
for bounds in browser.viewport_pages:
|
||||
buffer += browser.page_content[bounds[0] : bounds[1]]
|
||||
assert buffer == browser.page_content
|
||||
|
||||
# Make sure we can reconstruct the full contents from the split pages
|
||||
buffer = ""
|
||||
for bounds in browser.viewport_pages:
|
||||
buffer += browser.page_content[bounds[0] : bounds[1]]
|
||||
assert buffer == browser.page_content
|
||||
# Test scrolling (scroll all the way to the bottom)
|
||||
for i in range(1, len(browser.viewport_pages)):
|
||||
browser.page_down()
|
||||
assert browser.viewport_current_page == i
|
||||
# Test scrolling beyond the limits
|
||||
for i in range(0, 5):
|
||||
browser.page_down()
|
||||
assert browser.viewport_current_page == len(browser.viewport_pages) - 1
|
||||
|
||||
# Test scrolling (scroll all the way to the bottom)
|
||||
for i in range(1, len(browser.viewport_pages)):
|
||||
browser.page_down()
|
||||
assert browser.viewport_current_page == i
|
||||
# Test scrolling beyond the limits
|
||||
for i in range(0, 5):
|
||||
browser.page_down()
|
||||
assert browser.viewport_current_page == len(browser.viewport_pages) - 1
|
||||
# Test scrolling (scroll all the way to the bottom)
|
||||
for i in range(len(browser.viewport_pages) - 2, 0, -1):
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page == i
|
||||
# Test scrolling beyond the limits
|
||||
for i in range(0, 5):
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page == 0
|
||||
|
||||
# Test scrolling (scroll all the way to the bottom)
|
||||
for i in range(len(browser.viewport_pages) - 2, 0, -1):
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page == i
|
||||
# Test scrolling beyond the limits
|
||||
for i in range(0, 5):
|
||||
browser.page_up()
|
||||
assert browser.viewport_current_page == 0
|
||||
# Test Wikipedia handling
|
||||
assert WIKIPEDIA_STRING in browser.visit_page(WIKIPEDIA_URL)
|
||||
assert WIKIPEDIA_TITLE.strip() == browser.page_title.strip()
|
||||
|
||||
# Test Wikipedia handling
|
||||
assert WIKIPEDIA_STRING in browser.visit_page(WIKIPEDIA_URL)
|
||||
assert WIKIPEDIA_TITLE.strip() == browser.page_title.strip()
|
||||
# Visit a plain-text file
|
||||
response = requests.get(PLAIN_TEXT_URL)
|
||||
response.raise_for_status()
|
||||
expected_results = response.text
|
||||
|
||||
# Visit a plain-text file
|
||||
response = requests.get(PLAIN_TEXT_URL)
|
||||
response.raise_for_status()
|
||||
expected_results = response.text
|
||||
browser.visit_page(PLAIN_TEXT_URL)
|
||||
assert browser.page_content.strip() == expected_results.strip()
|
||||
|
||||
browser.visit_page(PLAIN_TEXT_URL)
|
||||
assert browser.page_content.strip() == expected_results.strip()
|
||||
# Directly download an image, and compute its md5
|
||||
response = requests.get(IMAGE_URL, stream=True)
|
||||
response.raise_for_status()
|
||||
expected_md5 = hashlib.md5(response.raw.read()).hexdigest()
|
||||
|
||||
# Directly download an image, and compute its md5
|
||||
response = requests.get(IMAGE_URL, stream=True)
|
||||
response.raise_for_status()
|
||||
expected_md5 = hashlib.md5(response.raw.read()).hexdigest()
|
||||
# Visit an image causing it to be downloaded by the SimpleTextBrowser, then compute its md5
|
||||
viewport = browser.visit_page(IMAGE_URL)
|
||||
m = re.search(r"Downloaded '(.*?)' to '(.*?)'", viewport)
|
||||
fetched_url = m.group(1)
|
||||
download_loc = m.group(2)
|
||||
assert fetched_url == IMAGE_URL
|
||||
|
||||
# Visit an image causing it to be downloaded by the SimpleTextBrowser, then compute its md5
|
||||
viewport = browser.visit_page(IMAGE_URL)
|
||||
m = re.search(r"Downloaded '(.*?)' to '(.*?)'", viewport)
|
||||
fetched_url = m.group(1)
|
||||
download_loc = m.group(2)
|
||||
assert fetched_url == IMAGE_URL
|
||||
with open(download_loc, "rb") as fh:
|
||||
downloaded_md5 = hashlib.md5(fh.read()).hexdigest()
|
||||
|
||||
with open(download_loc, "rb") as fh:
|
||||
downloaded_md5 = hashlib.md5(fh.read()).hexdigest()
|
||||
# MD5s should match
|
||||
assert expected_md5 == downloaded_md5
|
||||
|
||||
# MD5s should match
|
||||
assert expected_md5 == downloaded_md5
|
||||
|
||||
# Fetch a PDF
|
||||
viewport = browser.visit_page(PDF_URL)
|
||||
assert PDF_STRING in viewport
|
||||
|
||||
# Clean up
|
||||
_rm_folder(downloads_folder)
|
||||
# Fetch a PDF
|
||||
viewport = browser.visit_page(PDF_URL)
|
||||
assert PDF_STRING in viewport
|
||||
|
||||
|
||||
@pytest.mark.skipif(
|
||||
|
|