Merge branch '0.2' into feature/import-endpoint

This commit is contained in:
Ryan Sweet 2024-10-18 13:14:08 -07:00 committed by GitHub
commit a55ffe8307
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
215 changed files with 10368 additions and 1243 deletions

View File

@ -49,7 +49,7 @@ Feel free to modify these Dockerfiles for your specific project needs. Here are
- **Setting Environment Variables**: Add environment variables using the `ENV` command for any application-specific configurations. We have prestaged the line needed to inject your OpenAI key into the Docker environment as an environment variable; others can be staged in the same way. Just uncomment the line, changing
`# ENV OPENAI_API_KEY="{OpenAI-API-Key}"` to `ENV OPENAI_API_KEY="{OpenAI-API-Key}"`
- **Need a less "Advanced" AutoGen build**: If the `./full/Dockerfile` is too much, but you need more than the base image, update this line in the Dockerfile to install just what you need, for example changing
`RUN pip install pyautogen[teachable,lmm,retrievechat,mathchat,blendsearch] autogenra` to `RUN pip install pyautogen[retrievechat,blendsearch] autogenra`
`RUN pip install autogen-agentchat[teachable,lmm,retrievechat,mathchat,blendsearch]~=0.2 autogenra` to `RUN pip install autogen-agentchat[retrievechat,blendsearch]~=0.2 autogenra`
- **Can't dev without your favorite CLI tool**: If you need particular OS tools installed in your Docker container, add those packages right after the sudo setup in the `./base/Dockerfile` and `./full/Dockerfile` files. In the example below we install net-tools and vim in the environment.
```code
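# Hedged sketch (assumed install lines, not the verbatim Dockerfile contents):
# add OS tools right after the sudo setup in ./base/Dockerfile or ./full/Dockerfile.
RUN sudo apt-get update \
    && sudo apt-get install -y net-tools vim \
    && sudo rm -rf /var/lib/apt/lists/*
```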

View File

@ -22,7 +22,7 @@ WORKDIR /home/autogen
# Install Python packages
RUN pip install --upgrade pip
RUN pip install pyautogen[teachable,lmm,retrievechat,mathchat,blendsearch] autogenra
RUN pip install autogen-agentchat[teachable,lmm,retrievechat,mathchat,blendsearch]~=0.2 autogenra
RUN pip install numpy pandas matplotlib seaborn scikit-learn requests urllib3 nltk pillow pytest beautifulsoup4
# Expose port

View File

@ -1,57 +0,0 @@
### Description
<!-- A clear and concise description of the issue or feature request. -->
### Environment
- AutoGen version: <!-- Specify the AutoGen version (e.g., v0.2.0) -->
- Python version: <!-- Specify the Python version (e.g., 3.8) -->
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
### Steps to Reproduce (for bugs)
<!-- Provide detailed steps to reproduce the issue. Include code snippets, configuration files, or any other relevant information. -->
1. Step 1
2. Step 2
3. ...
### Expected Behavior
<!-- Describe what you expected to happen. -->
### Actual Behavior
<!-- Describe what actually happened. Include any error messages, stack traces, or unexpected behavior. -->
### Screenshots / Logs (if applicable)
<!-- If relevant, include screenshots or logs that help illustrate the issue. -->
### Additional Information
<!-- Include any additional information that might be helpful, such as specific configurations, data samples, or context about the environment. -->
### Possible Solution (if you have one)
<!-- If you have suggestions on how to address the issue, provide them here. -->
### Is this a Bug or Feature Request?
<!-- Choose one: Bug | Feature Request -->
### Priority
<!-- Choose one: High | Medium | Low -->
### Difficulty
<!-- Choose one: Easy | Moderate | Hard -->
### Any related issues?
<!-- If this is related to another issue, reference it here. -->
### Any relevant discussions?
<!-- If there are any discussions or forum threads related to this issue, provide links. -->
### Checklist
<!-- Please check the items that you have completed -->
- [ ] I have searched for similar issues and didn't find any duplicates.
- [ ] I have provided a clear and concise description of the issue.
- [ ] I have included the necessary environment details.
- [ ] I have outlined the steps to reproduce the issue.
- [ ] I have included any relevant logs or screenshots.
- [ ] I have indicated whether this is a bug or a feature request.
- [ ] I have set the priority and difficulty levels.
### Additional Comments
<!-- Any additional comments or context that you think would be helpful. -->

View File

@ -1,53 +1,55 @@
name: Bug Report
description: File a bug report
title: "[Bug]: "
description: Report a bug
labels: ["bug"]
body:
- type: textarea
id: description
attributes:
label: Describe the bug
description: A clear and concise description of what the bug is.
placeholder: What went wrong?
label: What happened?
description: Please provide as much information as possible, this helps us address the issue.
validations:
required: true
- type: textarea
id: reproduce
attributes:
label: Steps to reproduce
description: |
Steps to reproduce the behavior:
1. Step 1
2. Step 2
3. ...
4. See error
placeholder: How can we replicate the issue?
label: What did you expect to happen?
validations:
required: true
- type: textarea
id: modelused
attributes:
label: Model Used
description: A description of the model that was used when the error was encountered
label: How can we reproduce it (as minimally and precisely as possible)?
description: Please provide steps to reproduce. Provide code that can be run if possible.
validations:
required: true
- type: input
attributes:
label: AutoGen version
description: What version or commit of the library was used
validations:
required: true
- type: dropdown
attributes:
label: Which package was this bug in
options:
- Core
- AgentChat
- Extensions
- AutoGen Studio
- Magentic One
- AutoGen Bench
- Other
validations:
required: true
- type: input
attributes:
label: Model used
description: If a model was used, please describe it here, indicating whether it is a local model or a cloud-hosted model
placeholder: gpt-4, mistral-7B etc
- type: textarea
id: expected_behavior
- type: input
attributes:
label: Expected Behavior
description: A clear and concise description of what you expected to happen.
placeholder: What should have happened?
- type: textarea
id: screenshots
label: Python version
- type: input
attributes:
label: Screenshots and logs
description: If applicable, add screenshots and logs to help explain your problem.
placeholder: Add screenshots here
label: Operating system
- type: textarea
id: additional_information
attributes:
label: Additional Information
description: |
- AutoGen Version: <!-- Specify the AutoGen version (e.g., v0.2.0) -->
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
- Python Version: <!-- Specify the Python version (e.g., 3.8) -->
- Related Issues: <!-- Link to any related issues here (e.g., #1) -->
- Any other relevant information.
placeholder: Any additional details
label: Any additional info you think would be helpful for fixing this bug

View File

@ -1 +1,5 @@
blank_issues_enabled: true
contact_links:
- name: Questions or general help 💬
url: https://github.com/microsoft/autogen/discussions
about: Please ask and answer questions here.

View File

@ -1,26 +1,18 @@
name: Feature Request
description: File a feature request
description: Request a new feature or enhancement
labels: ["enhancement"]
title: "[Feature Request]: "
body:
- type: textarea
id: problem_description
attributes:
label: Is your feature request related to a problem? Please describe.
description: A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
placeholder: What problem are you trying to solve?
label: What feature would you like to be added?
description: Please describe the desired feature. Be descriptive, provide examples and if possible, provide a proposed solution.
validations:
required: true
- type: textarea
id: solution_description
attributes:
label: Describe the solution you'd like
description: A clear and concise description of what you want to happen.
placeholder: How do you envision the solution?
- type: textarea
id: additional_context
attributes:
label: Additional context
description: Add any other context or screenshots about the feature request here.
placeholder: Any additional information
label: Why is this needed?
description: Why is it important that this feature is implemented? What problem or need does it solve?
validations:
required: true

View File

@ -1,41 +0,0 @@
name: General Issue
description: File a general issue
title: "[Issue]: "
labels: []
body:
- type: textarea
id: description
attributes:
label: Describe the issue
description: A clear and concise description of what the issue is.
placeholder: What went wrong?
- type: textarea
id: reproduce
attributes:
label: Steps to reproduce
description: |
Steps to reproduce the behavior:
1. Step 1
2. Step 2
3. ...
4. See error
placeholder: How can we replicate the issue?
- type: textarea
id: screenshots
attributes:
label: Screenshots and logs
description: If applicable, add screenshots and logs to help explain your problem.
placeholder: Add screenshots here
- type: textarea
id: additional_information
attributes:
label: Additional Information
description: |
- AutoGen Version: <!-- Specify the AutoGen version (e.g., v0.2.0) -->
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) -->
- Python Version: <!-- Specify the Python version (e.g., 3.8) -->
- Related Issues: <!-- Link to any related issues here (e.g., #1) -->
- Any other relevant information.
placeholder: Any additional details

View File

@ -5,15 +5,13 @@ name: Build
on:
push:
branches: ["main"]
branches: ["0.2"]
pull_request:
branches: ["main"]
merge_group:
types: [checks_requested]
branches: ["0.2"]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
cancel-in-progress: ${{ github.ref != 'refs/heads/0.2' }}
permissions: {}
jobs:
paths-filter:

View File

@ -5,7 +5,7 @@ name: OpenAI4ContribTests
on:
pull_request:
branches: ["main"]
branches: ["0.2"]
paths:
- "autogen/**"
- "test/agentchat/contrib/**"

View File

@ -5,7 +5,7 @@ name: ContribTests
on:
pull_request:
branches: ["main"]
branches: ["0.2"]
paths:
- "autogen/**"
- "test/agentchat/contrib/**"
@ -16,7 +16,7 @@ on:
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
cancel-in-progress: ${{ github.ref != 'refs/heads/0.2' }}
permissions:
{}
# actions: read
@ -474,6 +474,46 @@ jobs:
file: ./coverage.xml
flags: unittests
CerebrasTest:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-2019]
python-version: ["3.9", "3.10", "3.11", "3.12"]
exclude:
- os: macos-latest
python-version: "3.9"
steps:
- uses: actions/checkout@v4
with:
lfs: true
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install packages and dependencies for all tests
run: |
python -m pip install --upgrade pip wheel
pip install pytest-cov>=5
- name: Install packages and dependencies for Cerebras
run: |
pip install -e .[cerebras_cloud_sdk,test]
- name: Set AUTOGEN_USE_DOCKER based on OS
shell: bash
run: |
if [[ ${{ matrix.os }} != ubuntu-latest ]]; then
echo "AUTOGEN_USE_DOCKER=False" >> $GITHUB_ENV
fi
- name: Coverage
run: |
pytest test/oai/test_cerebras.py --skip-openai
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
flags: unittests
MistralTest:
runs-on: ${{ matrix.os }}
strategy:
@ -669,3 +709,35 @@ jobs:
with:
file: ./coverage.xml
flags: unittests
OllamaTest:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-2019]
python-version: ["3.9", "3.10", "3.11", "3.12"]
exclude:
- os: macos-latest
python-version: "3.9"
steps:
- uses: actions/checkout@v4
with:
lfs: true
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install packages and dependencies for all tests
run: |
python -m pip install --upgrade pip wheel
pip install pytest-cov>=5
- name: Install packages and dependencies for Ollama
run: |
pip install -e .[ollama,test]
pytest test/oai/test_ollama.py --skip-openai
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
flags: unittests

View File

@ -2,26 +2,15 @@ name: docs
on:
pull_request:
branches: [main]
path:
- "autogen/*"
- "website/*"
- ".github/workflows/deploy-website.yml"
branches: ["0.2"]
push:
branches: [main]
path:
- "autogen/*"
- "website/*"
- ".github/workflows/deploy-website.yml"
branches: ["0.2"]
workflow_dispatch:
merge_group:
types: [checks_requested]
permissions:
id-token: write
pages: write
jobs:
checks:
if: github.event_name != 'push'
runs-on: ubuntu-latest
defaults:
run:
@ -67,57 +56,3 @@ jobs:
npm i --legacy-peer-deps
npm run build
fi
gh-release:
if: github.event_name != 'pull_request'
runs-on: ubuntu-latest
defaults:
run:
working-directory: website
steps:
- uses: actions/checkout@v4
with:
lfs: true
- uses: actions/setup-node@v4
with:
node-version: 18.x
- name: setup python
uses: actions/setup-python@v5
with:
python-version: "3.8"
- name: pydoc-markdown install
run: |
python -m pip install --upgrade pip
pip install pydoc-markdown pyyaml termcolor
# Pin databind packages as version 4.5.0 is not compatible with pydoc-markdown.
pip install databind.core==4.4.2 databind.json==4.4.2
- name: pydoc-markdown run
run: |
pydoc-markdown
- name: quarto install
working-directory: ${{ runner.temp }}
run: |
wget -q https://github.com/quarto-dev/quarto-cli/releases/download/v1.5.23/quarto-1.5.23-linux-amd64.tar.gz
tar -xzf quarto-1.5.23-linux-amd64.tar.gz
echo "$(pwd)/quarto-1.5.23/bin/" >> $GITHUB_PATH
- name: Process notebooks
run: |
python process_notebooks.py render
- name: Build website
run: |
if [ -e yarn.lock ]; then
yarn install --frozen-lockfile --ignore-engines
yarn build
elif [ -e package-lock.json ]; then
npm ci
npm run build
else
npm i --legacy-peer-deps
npm run build
fi
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
with:
path: "website/build"
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4

View File

@ -6,15 +6,13 @@ name: dotnet-ci
on:
workflow_dispatch:
pull_request:
branches: [ "main" ]
branches: [ "0.2" ]
push:
branches: [ "main" ]
merge_group:
types: [checks_requested]
branches: [ "0.2" ]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' || github.ref != 'refs/heads/dotnet' }}
cancel-in-progress: ${{ github.ref != 'refs/heads/0.2' || github.ref != 'refs/heads/dotnet' }}
permissions:
contents: read
@ -122,7 +120,7 @@ jobs:
defaults:
run:
working-directory: dotnet
if: success() && (github.ref == 'refs/heads/main')
if: success() && (github.ref == 'refs/heads/0.2')
needs: aot-test
steps:
- uses: actions/checkout@v4
@ -228,4 +226,4 @@ jobs:
env:
MYGET_TOKEN: ${{ secrets.MYGET_TOKEN }}
continue-on-error: true

View File

@ -0,0 +1,18 @@
name: Label issues with needs-triage
on:
issues:
types:
- reopened
- opened
jobs:
label_issues:
runs-on: ubuntu-latest
permissions:
issues: write
steps:
- run: gh issue edit "$NUMBER" --add-label "$LABELS"
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_REPO: ${{ github.repository }}
NUMBER: ${{ github.event.issue.number }}
LABELS: needs-triage

View File

@ -1,6 +1,7 @@
name: "Git LFS Check"
on: pull_request
on:
pull_request:
branches: ["0.2"]
permissions: {}
jobs:
lfs-check:

View File

@ -5,7 +5,7 @@ name: OpenAI
on:
pull_request:
branches: ["main"]
branches: ["0.2"]
paths:
- "autogen/**"
- "test/**"

View File

@ -3,8 +3,7 @@ name: Code formatting
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on: # Trigger the workflow on pull request or merge
pull_request:
merge_group:
types: [checks_requested]
branches: ["0.2"]
defaults:
run:

View File

@ -5,13 +5,10 @@
name: python-package
on:
release:
types: [published]
push:
tags:
- "0.2.*"
permissions: {}
# actions: read
# checks: read
# contents: read
# deployments: read
jobs:
deploy:
strategy:
@ -19,38 +16,18 @@ jobs:
os: ['ubuntu-latest']
python-version: [3.10]
runs-on: ${{ matrix.os }}
environment: package
environment:
name: package
url: https://pypi.org/p/autogen-agentchat
permissions:
id-token: write
steps:
- name: Checkout
uses: actions/checkout@v4
# - name: Cache conda
# uses: actions/cache@v4
# with:
# path: ~/conda_pkgs_dir
# key: conda-${{ matrix.os }}-python-${{ matrix.python-version }}-${{ hashFiles('environment.yml') }}
# - name: Setup Miniconda
# uses: conda-incubator/setup-miniconda@v2
# with:
# auto-update-conda: true
# auto-activate-base: false
# activate-environment: hcrystalball
# python-version: ${{ matrix.python-version }}
# use-only-tar-bz2: true
- name: Install from source
# This is required for the pre-commit tests
shell: pwsh
run: pip install .
# - name: Conda list
# shell: pwsh
# run: conda list
- name: Build
shell: pwsh
run: |
pip install twine
python setup.py sdist bdist_wheel
- name: Publish to PyPI
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
shell: pwsh
run: twine upload dist/*
- name: Publish package to PyPI
uses: pypa/gh-action-pypi-publish@release/v1

View File

@ -5,7 +5,7 @@ name: SamplesToolsTests
on:
pull_request:
branches: ["main"]
branches: ["0.2"]
paths:
- "autogen/**"
- "samples/tools/**"
@ -14,7 +14,7 @@ on:
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
cancel-in-progress: ${{ github.ref != 'refs/heads/0.2' }}
permissions: {}
jobs:
SamplesToolsFineTuningTests:

View File

@ -2,8 +2,8 @@ name: Type check
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on: # Trigger the workflow on pull request or merge
pull_request:
merge_group:
types: [checks_requested]
branches: ["0.2"]
defaults:
run:
shell: bash

LICENSE-CODE-KUBERNETES (new file, 201 lines)
View File

@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@ -1,5 +1,4 @@
// Please modify the content, remove these four lines of comment and rename this file to OAI_CONFIG_LIST to run the sample code.
// If using pyautogen v0.1.x with Azure OpenAI, please replace "base_url" with "api_base" (line 14 and line 21 below). Use "pip list" to check version of pyautogen installed.
//
// NOTE: This configuration lists GPT-4 as the default model, as this represents our current recommendation, and is known to work well with AutoGen. If you use a model other than GPT-4, you may need to revise various system prompts (especially if using weaker models like GPT-3.5-turbo). Moreover, if you use models other than those hosted by OpenAI or Azure, you may incur additional risks related to alignment and safety. Proceed with caution if updating this default.
[

View File

@ -3,14 +3,11 @@
<div align="center">
<img src="https://microsoft.github.io/autogen/img/ag.svg" alt="AutoGen Logo" width="100">
<img src="https://microsoft.github.io/autogen/0.2/img/ag.svg" alt="AutoGen Logo" width="100">
![Python Version](https://img.shields.io/badge/3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue) [![PyPI version](https://img.shields.io/badge/PyPI-v0.2.34-blue.svg)](https://pypi.org/project/pyautogen/)
![Python Version](https://img.shields.io/badge/3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue) [![PyPI - Version](https://img.shields.io/pypi/v/autogen-agentchat)](https://pypi.org/project/autogen-agentchat/)
[![NuGet version](https://badge.fury.io/nu/AutoGen.Core.svg)](https://badge.fury.io/nu/AutoGen.Core)
[![Downloads](https://static.pepy.tech/badge/pyautogen/week)](https://pepy.tech/project/pyautogen)
[![Discord](https://img.shields.io/discord/1153072414184452236?logo=discord&style=flat)](https://aka.ms/autogen-dc)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40pyautogen)](https://twitter.com/pyautogen)
</div>
@ -20,12 +17,16 @@
AutoGen is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks. AutoGen aims to streamline the development and research of agentic AI, much like PyTorch does for Deep Learning. It offers features such as agents capable of interacting with each other, facilitates the use of various large language models (LLMs) and tool use support, autonomous and human-in-the-loop workflows, and multi-agent conversation patterns.
> [!IMPORTANT]
> *Note for contributors and users*</b>: [microsoft/autogen](https://aka.ms/autogen-gh) is the official repository of AutoGen project and it is under active development and maintenance under MIT license. We welcome contributions from developers and organizations worldwide. Our goal is to foster a collaborative and inclusive community where diverse perspectives and expertise can drive innovation and enhance the project's capabilities. We acknowledge the invaluable contributions from our existing contributors, as listed in [contributors.md](./CONTRIBUTORS.md). Whether you are an individual contributor or represent an organization, we invite you to join us in shaping the future of this project. For further information please also see [Microsoft open-source contributing guidelines](https://github.com/microsoft/autogen?tab=readme-ov-file#contributing).
> In order to better align with a new multi-packaging structure coming very soon, AutoGen is now available on PyPI as [`autogen-agentchat`](https://pypi.org/project/autogen-agentchat/) as of version `0.2.36`. This is the official package for the AutoGen project.
> [!NOTE]
> *Note for contributors and users*</b>: [microsoft/autogen](https://aka.ms/autogen-gh) is the original repository of AutoGen project and it is under active development and maintenance under MIT license. We welcome contributions from developers and organizations worldwide. Our goal is to foster a collaborative and inclusive community where diverse perspectives and expertise can drive innovation and enhance the project's capabilities. We acknowledge the invaluable contributions from our existing contributors, as listed in [contributors.md](./CONTRIBUTORS.md). Whether you are an individual contributor or represent an organization, we invite you to join us in shaping the future of this project. For further information please also see [Microsoft open-source contributing guidelines](https://github.com/microsoft/autogen?tab=readme-ov-file#contributing).
>
> -_Maintainers (Sept 6th, 2024)_
![AutoGen Overview](https://github.com/microsoft/autogen/blob/main/website/static/img/autogen_agentchat.png)
![AutoGen Overview](https://github.com/microsoft/autogen/blob/0.2/website/static/img/autogen_agentchat.png)
- AutoGen enables building next-gen LLM applications based on [multi-agent conversations](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) with minimal effort. It simplifies the orchestration, automation, and optimization of a complex LLM workflow. It maximizes the performance of LLM models and overcomes their weaknesses.
- It supports [diverse conversation patterns](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#supporting-diverse-conversation-patterns) for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy,
@ -135,14 +136,14 @@ Find detailed instructions for users [here](https://microsoft.github.io/autogen/
AutoGen requires **Python version >= 3.8, < 3.13**. It can be installed from pip:
```bash
pip install pyautogen
pip install autogen-agentchat~=0.2
```
Minimal dependencies are installed without extra options. You can install extra options based on the feature you need.
<!-- For example, use the following to install the dependencies needed by the [`blendsearch`](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function#blendsearch-economical-hyperparameter-optimization-with-blended-search-strategy) option.
```bash
pip install "pyautogen[blendsearch]"
pip install "autogen-agentchat[blendsearch]~=0.2"
``` -->
Find more options in [Installation](https://microsoft.github.io/autogen/docs/Installation#option-2-install-autogen-locally-using-virtual-environment).
@ -170,7 +171,7 @@ Features of this use case include:
- **Customization**: AutoGen agents can be customized to meet the specific needs of an application. This includes the ability to choose the LLMs to use, the types of human input to allow, and the tools to employ.
- **Human participation**: AutoGen seamlessly allows human participation. This means that humans can provide input and feedback to the agents as needed.
For [example](https://github.com/microsoft/autogen/blob/main/test/twoagent.py),
For [example](https://github.com/microsoft/autogen/blob/0.2/test/twoagent.py),
```python
from autogen import AssistantAgent, UserProxyAgent, config_list_from_json
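# Hedged sketch of the referenced two-agent example (assumed details, not the verbatim test/twoagent.py).
# Load the LLM config list from the OAI_CONFIG_LIST file, then let a user proxy drive the assistant.
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding", "use_docker": False})
# The user proxy executes any code the assistant writes and feeds results back until the task is done.
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
```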
@ -193,9 +194,9 @@ python test/twoagent.py
After the repo is cloned.
The figure below shows an example conversation flow with AutoGen.
![Agent Chat Example](https://github.com/microsoft/autogen/blob/main/website/static/img/chat_example.png)
![Agent Chat Example](https://github.com/microsoft/autogen/blob/0.2/website/static/img/chat_example.png)
Alternatively, the [sample code](https://github.com/microsoft/autogen/blob/main/samples/simple_chat.py) here allows a user to chat with an AutoGen agent in ChatGPT style.
Alternatively, the [sample code](https://github.com/microsoft/autogen/blob/0.2/samples/simple_chat.py) here allows a user to chat with an AutoGen agent in ChatGPT style.
Please find more [code examples](https://microsoft.github.io/autogen/docs/Examples#automated-multi-agent-chat) for this feature.
<p align="right" style="font-size: 14px; color: #555; margin-top: 20px;">
@ -239,9 +240,7 @@ You can find detailed documentation about AutoGen [here](https://microsoft.githu
In addition, you can find:
- [Research](https://microsoft.github.io/autogen/docs/Research), [blogposts](https://microsoft.github.io/autogen/blog) around AutoGen, and [Transparency FAQs](https://github.com/microsoft/autogen/blob/main/TRANSPARENCY_FAQS.md)
- [Discord](https://aka.ms/autogen-dc)
- [Research](https://microsoft.github.io/autogen/docs/Research), [blogposts](https://microsoft.github.io/autogen/blog) around AutoGen, and [Transparency FAQs](https://github.com/microsoft/autogen/blob/0.2/TRANSPARENCY_FAQS.md)
- [Contributing guide](https://microsoft.github.io/autogen/docs/Contribute)

View File

@ -172,6 +172,26 @@ Match roles in the role set to each expert in expert set.
```
"""
AGENT_FUNCTION_MAP_PROMPT = """Consider the following function.
Function Name: {function_name}
Function Description: {function_description}
The agent details are given in the format: {format_agent_details}
Which one of the following agents should be able to execute this function, preferably an agent with programming background?
{agent_details}
Hint:
# Only respond with the name of the agent that is most suited to execute the function and nothing else.
"""
UPDATED_AGENT_SYSTEM_MESSAGE = """
{agent_system_message}
You have access to execute the function: {function_name}.
With following description: {function_description}
"""
def __init__(
self,
config_file_or_env: Optional[str] = "OAI_CONFIG_LIST",
@ -358,6 +378,7 @@ Match roles in the role set to each expert in expert set.
self,
building_task: str,
default_llm_config: Dict,
list_of_functions: Optional[List[Dict]] = None,
coding: Optional[bool] = None,
code_execution_config: Optional[Dict] = None,
use_oai_assistant: Optional[bool] = False,
@ -373,6 +394,7 @@ Match roles in the role set to each expert in expert set.
coding: use to identify if the user proxy (a code interpreter) should be added.
code_execution_config: specific configs for user proxy (e.g., last_n_messages, work_dir, ...).
default_llm_config: specific configs for LLM (e.g., config_list, seed, temperature, ...).
list_of_functions: list of functions to be associated with Agents
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
user_proxy: user proxy's class that can be used to replace the default user proxy.
@ -480,8 +502,9 @@ Match roles in the role set to each expert in expert set.
"code_execution_config": code_execution_config,
}
)
_config_check(self.cached_configs)
return self._build_agents(use_oai_assistant, user_proxy=user_proxy, **kwargs)
return self._build_agents(use_oai_assistant, list_of_functions, user_proxy=user_proxy, **kwargs)
def build_from_library(
self,
@ -653,13 +676,18 @@ Match roles in the role set to each expert in expert set.
return self._build_agents(use_oai_assistant, user_proxy=user_proxy, **kwargs)
def _build_agents(
self, use_oai_assistant: Optional[bool] = False, user_proxy: Optional[autogen.ConversableAgent] = None, **kwargs
self,
use_oai_assistant: Optional[bool] = False,
list_of_functions: Optional[List[Dict]] = None,
user_proxy: Optional[autogen.ConversableAgent] = None,
**kwargs,
) -> Tuple[List[autogen.ConversableAgent], Dict]:
"""
Build agents with generated configs.
Args:
use_oai_assistant: use OpenAI assistant api instead of self-constructed agent.
list_of_functions: list of functions to be associated with Agents
user_proxy: user proxy's class that can be used to replace the default user proxy.
Returns:
@ -695,6 +723,53 @@ Match roles in the role set to each expert in expert set.
)
agent_list = agent_list + [user_proxy]
agent_details = []
for agent in agent_list[:-1]:
agent_details.append({"name": agent.name, "description": agent.description})
if list_of_functions:
for func in list_of_functions:
resp = (
self.builder_model.create(
messages=[
{
"role": "user",
"content": self.AGENT_FUNCTION_MAP_PROMPT.format(
function_name=func["name"],
function_description=func["description"],
format_agent_details='[{"name": "agent_name", "description": "agent description"}, ...]',
agent_details=str(json.dumps(agent_details)),
),
}
]
)
.choices[0]
.message.content
)
autogen.agentchat.register_function(
func["function"],
caller=self.agent_procs_assign[resp][0],
executor=agent_list[0],
name=func["name"],
description=func["description"],
)
agents_current_system_message = [
agent["system_message"] for agent in agent_configs if agent["name"] == resp
][0]
self.agent_procs_assign[resp][0].update_system_message(
self.UPDATED_AGENT_SYSTEM_MESSAGE.format(
agent_system_message=agents_current_system_message,
function_name=func["name"],
function_description=func["description"],
)
)
print(f"Function {func['name']} is registered to agent {resp}.")
return agent_list, self.cached_configs.copy()
def save(self, filepath: Optional[str] = None) -> str:
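A hedged usage sketch of the new `list_of_functions` parameter added above (the task text and helper function are illustrative assumptions): each entry pairs a callable with a name and description, and the builder registers it with the generated agent its builder model judges best suited to execute it.
```python
from autogen.agentchat.contrib.agent_builder import AgentBuilder

def get_weather(city: str) -> str:
    # Toy helper used only for illustration.
    return f"It is sunny in {city}."

builder = AgentBuilder(config_file_or_env="OAI_CONFIG_LIST")
agent_list, agent_configs = builder.build(
    building_task="Answer questions that may require looking up the weather.",
    default_llm_config={"temperature": 0},
    list_of_functions=[
        {"name": "get_weather", "description": "Look up the weather for a city.", "function": get_weather},
    ],
)
```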

View File

@ -5,7 +5,7 @@ try:
import llmlingua
except ImportError:
IMPORT_ERROR = ImportError(
"LLMLingua is not installed. Please install it with `pip install pyautogen[long-context]`"
"LLMLingua is not installed. Please install it with `pip install autogen-agentchat[long-context]~=0.2`"
)
PromptCompressor = object
else:

View File

@ -9,7 +9,9 @@ from IPython import get_ipython
try:
import chromadb
except ImportError as e:
raise ImportError(f"{e}. You can try `pip install pyautogen[retrievechat]`, or install `chromadb` manually.")
raise ImportError(
f"{e}. You can try `pip install autogen-agentchat[retrievechat]~=0.2`, or install `chromadb` manually."
)
from autogen.agentchat import UserProxyAgent
from autogen.agentchat.agent import Agent
from autogen.agentchat.contrib.vectordb.base import Document, QueryResults, VectorDB, VectorDBFactory

View File

@ -56,16 +56,7 @@ class CouchbaseVectorDB(VectorDB):
wait_until_index_ready (float | None): Blocking call to wait until the database indexes are ready. None means no wait. Default is None.
wait_until_document_ready (float | None): Blocking call to wait until the database documents are ready. None means no wait. Default is None.
"""
print(
"CouchbaseVectorDB",
connection_string,
username,
password,
bucket_name,
scope_name,
collection_name,
index_name,
)
self.embedding_function = embedding_function
self.index_name = index_name
@ -119,6 +110,7 @@ class CouchbaseVectorDB(VectorDB):
try:
collection_mgr = self.bucket.collections()
collection_mgr.create_collection(self.scope.name, collection_name)
self.cluster.query(f"CREATE PRIMARY INDEX ON {self.bucket.name}.{self.scope.name}.{collection_name}")
except Exception:
if not get_or_create:
@ -287,7 +279,12 @@ class CouchbaseVectorDB(VectorDB):
[doc["content"]]
).tolist() # Gets new embedding even in case of document update
doc_content = {TEXT_KEY: doc["content"], "metadata": doc.get("metadata", {}), EMBEDDING_KEY: embedding}
doc_content = {
TEXT_KEY: doc["content"],
"metadata": doc.get("metadata", {}),
EMBEDDING_KEY: embedding,
"id": doc_id,
}
docs_to_upsert[doc_id] = doc_content
collection.upsert_multi(docs_to_upsert)

View File

@ -4,16 +4,17 @@ import urllib.parse
from typing import Callable, List, Optional, Union
import numpy as np
# try:
import pgvector
from pgvector.psycopg import register_vector
from sentence_transformers import SentenceTransformer
from .base import Document, ItemID, QueryResults, VectorDB
from .utils import get_logger
try:
import pgvector
from pgvector.psycopg import register_vector
except ImportError:
raise ImportError("Please install pgvector: `pip install pgvector`")
# except ImportError:
# raise ImportError("Please install pgvector: `pip install pgvector`")
try:
import psycopg
@ -416,6 +417,7 @@ class Collection:
results = []
for query_text in query_texts:
vector = self.embedding_function(query_text)
vector_string = "[" + ",".join([f"{x:.8f}" for x in vector]) + "]"
if distance_type.lower() == "cosine":
index_function = "<=>"
@ -428,7 +430,7 @@ class Collection:
query = (
f"SELECT id, documents, embedding, metadatas "
f"FROM {self.name} "
f"{clause} embedding {index_function} '{str(vector)}' {distance_threshold} "
f"{clause} embedding {index_function} '{vector_string}' {distance_threshold} "
f"LIMIT {n_results}"
)
cursor.execute(query)

View File

@ -7,7 +7,7 @@ import logging
import re
import warnings
from collections import defaultdict
from typing import Any, Callable, Dict, List, Literal, Optional, Tuple, Type, TypeVar, Union
from typing import Any, Callable, Coroutine, Dict, List, Literal, Optional, Tuple, Type, TypeVar, Union
from openai import BadRequestError
@ -247,10 +247,13 @@ class ConversableAgent(LLMAgent):
# Registered hooks are kept in lists, indexed by hookable method, to be called in their order of registration.
# New hookable methods should be added to this list as required to support new agent capabilities.
self.hook_lists: Dict[str, List[Callable]] = {
self.hook_lists: Dict[str, List[Union[Callable, Callable[..., Coroutine]]]] = {
"process_last_received_message": [],
"a_process_last_received_message": [],
"process_all_messages_before_reply": [],
"a_process_all_messages_before_reply": [],
"process_message_before_send": [],
"a_process_message_before_send": [],
}
def _validate_llm_config(self, llm_config):
@ -680,11 +683,24 @@ class ConversableAgent(LLMAgent):
"""Process the message before sending it to the recipient."""
hook_list = self.hook_lists["process_message_before_send"]
for hook in hook_list:
if inspect.iscoroutinefunction(hook):
continue
message = hook(
sender=self, message=message, recipient=recipient, silent=ConversableAgent._is_silent(self, silent)
)
return message
async def _a_process_message_before_send(
self, message: Union[Dict, str], recipient: Agent, silent: bool
) -> Union[Dict, str]:
"""(async) Process the message before sending it to the recipient."""
hook_list = self.hook_lists["a_process_message_before_send"]
for hook in hook_list:
if not inspect.iscoroutinefunction(hook):
continue
message = await hook(sender=self, message=message, recipient=recipient, silent=silent)
return message
def send(
self,
message: Union[Dict, str],
@ -774,7 +790,9 @@ class ConversableAgent(LLMAgent):
Raises:
ValueError: if the message can't be converted into a valid ChatCompletion message.
"""
message = self._process_message_before_send(message, recipient, ConversableAgent._is_silent(self, silent))
message = await self._a_process_message_before_send(
message, recipient, ConversableAgent._is_silent(self, silent)
)
# When the agent composes and sends the message, the role of the message is "assistant"
# unless it's "function".
valid = self._append_oai_message(message, "assistant", recipient, is_sending=True)
@ -2104,11 +2122,11 @@ class ConversableAgent(LLMAgent):
# Call the hookable method that gives registered hooks a chance to process all messages.
# Message modifications do not affect the incoming messages or self._oai_messages.
messages = self.process_all_messages_before_reply(messages)
messages = await self.a_process_all_messages_before_reply(messages)
# Call the hookable method that gives registered hooks a chance to process the last message.
# Message modifications do not affect the incoming messages or self._oai_messages.
messages = self.process_last_received_message(messages)
messages = await self.a_process_last_received_message(messages)
for reply_func_tuple in self._reply_func_list:
reply_func = reply_func_tuple["reply_func"]
@ -2786,6 +2804,19 @@ class ConversableAgent(LLMAgent):
assert hookable_method in self.hook_lists, f"{hookable_method} is not a hookable method."
hook_list = self.hook_lists[hookable_method]
assert hook not in hook_list, f"{hook} is already registered as a hook."
# async hookable checks
expected_async = hookable_method.startswith("a_")
hook_is_async = inspect.iscoroutinefunction(hook)
if expected_async != hook_is_async:
context_type = "asynchronous" if expected_async else "synchronous"
warnings.warn(
f"Hook '{hook.__name__}' is {'asynchronous' if hook_is_async else 'synchronous'}, "
f"but it's being registered in a {context_type} context ('{hookable_method}'). "
"Ensure the hook matches the expected execution context.",
UserWarning,
)
hook_list.append(hook)
def process_all_messages_before_reply(self, messages: List[Dict]) -> List[Dict]:
@ -2800,9 +2831,28 @@ class ConversableAgent(LLMAgent):
# Call each hook (in order of registration) to process the messages.
processed_messages = messages
for hook in hook_list:
if inspect.iscoroutinefunction(hook):
continue
processed_messages = hook(processed_messages)
return processed_messages
async def a_process_all_messages_before_reply(self, messages: List[Dict]) -> List[Dict]:
"""
Calls any registered capability hooks to process all messages, potentially modifying the messages.
"""
hook_list = self.hook_lists["a_process_all_messages_before_reply"]
# If no hooks are registered, or if there are no messages to process, return the original message list.
if len(hook_list) == 0 or messages is None:
return messages
# Call each hook (in order of registration) to process the messages.
processed_messages = messages
for hook in hook_list:
if not inspect.iscoroutinefunction(hook):
continue
processed_messages = await hook(processed_messages)
return processed_messages
def process_last_received_message(self, messages: List[Dict]) -> List[Dict]:
"""
Calls any registered capability hooks to use and potentially modify the text of the last message,
@ -2836,6 +2886,8 @@ class ConversableAgent(LLMAgent):
# Call each hook (in order of registration) to process the user's message.
processed_user_content = user_content
for hook in hook_list:
if inspect.iscoroutinefunction(hook):
continue
processed_user_content = hook(processed_user_content)
if processed_user_content == user_content:
@ -2846,6 +2898,51 @@ class ConversableAgent(LLMAgent):
messages[-1]["content"] = processed_user_content
return messages
async def a_process_last_received_message(self, messages: List[Dict]) -> List[Dict]:
"""
Calls any registered capability hooks to use and potentially modify the text of the last message,
as long as the last message is not a function call or exit command.
"""
# If any required condition is not met, return the original message list.
hook_list = self.hook_lists["a_process_last_received_message"]
if len(hook_list) == 0:
return messages # No hooks registered.
if messages is None:
return None # No message to process.
if len(messages) == 0:
return messages # No message to process.
last_message = messages[-1]
if "function_call" in last_message:
return messages # Last message is a function call.
if "context" in last_message:
return messages # Last message contains a context key.
if "content" not in last_message:
return messages # Last message has no content.
user_content = last_message["content"]
if not isinstance(user_content, str) and not isinstance(user_content, list):
# if the user_content is a string, it is for regular LLM
# if the user_content is a list, it should follow the multimodal LMM format.
return messages
if user_content == "exit":
return messages # Last message is an exit command.
# Call each hook (in order of registration) to process the user's message.
processed_user_content = user_content
for hook in hook_list:
if not inspect.iscoroutinefunction(hook):
continue
processed_user_content = await hook(processed_user_content)
if processed_user_content == user_content:
return messages # No hooks actually modified the user's message.
# Replace the last user message with the expanded one.
messages = messages.copy()
messages[-1]["content"] = processed_user_content
return messages
def print_usage_summary(self, mode: Union[str, List[str]] = ["actual", "total"]) -> None:
"""Print the usage summary."""
iostream = IOStream.get_default()
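A hedged sketch of the new asynchronous hook path added above (the hook body and agent are illustrative assumptions): a coroutine registered under an `a_`-prefixed hookable method is awaited by `a_process_last_received_message`, while the synchronous paths now skip coroutine hooks.
```python
from autogen import ConversableAgent

async def a_redact_last_message(content):
    # Illustrative async hook: rewrite the last received message before the agent replies.
    return content.replace("secret", "[REDACTED]") if isinstance(content, str) else content

agent = ConversableAgent("assistant", llm_config=False)
agent.register_hook("a_process_last_received_message", a_redact_last_message)
```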

View File

@ -12,9 +12,9 @@ from ..exception_utils import AgentNameConflict, NoEligibleSpeaker, UndefinedNex
from ..formatting_utils import colored
from ..graph_utils import check_graph_validity, invert_disallowed_to_allowed
from ..io.base import IOStream
from ..oai.client import ModelClient
from ..runtime_logging import log_new_agent, logging_enabled
from .agent import Agent
from .chat import ChatResult
from .conversable_agent import ConversableAgent
try:
@ -105,6 +105,8 @@ class GroupChat:
"clear history" phrase in user prompt. This is experimental feature.
See description of GroupChatManager.clear_agents_history function for more info.
- send_introductions: send a round of introductions at the start of the group chat, so agents know who they can speak to (default: False)
- select_speaker_auto_model_client_cls: Custom model client class for the internal speaker select agent used during 'auto' speaker selection (optional)
- select_speaker_auto_llm_config: LLM config for the internal speaker select agent used during 'auto' speaker selection (optional)
- role_for_select_speaker_messages: sets the role name for speaker selection when in 'auto' mode, typically 'user' or 'system'. (default: 'system')
"""
@ -142,6 +144,8 @@ class GroupChat:
Respond with ONLY the name of the speaker and DO NOT provide a reason."""
select_speaker_transform_messages: Optional[Any] = None
select_speaker_auto_verbose: Optional[bool] = False
select_speaker_auto_model_client_cls: Optional[Union[ModelClient, List[ModelClient]]] = None
select_speaker_auto_llm_config: Optional[Union[Dict, Literal[False]]] = None
role_for_select_speaker_messages: Optional[str] = "system"
_VALID_SPEAKER_SELECTION_METHODS = ["auto", "manual", "random", "round_robin"]
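A hedged sketch of the new speaker-selection options documented above (agent roles, model names, and credentials are illustrative assumptions): `select_speaker_auto_llm_config` lets the internal speaker-selection agent run on a different, for example cheaper, model than the conversing agents.
```python
from autogen import ConversableAgent, GroupChat, GroupChatManager

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "sk-..."}]}  # placeholder credentials

coder = ConversableAgent("coder", system_message="You write code.", llm_config=llm_config)
critic = ConversableAgent("critic", system_message="You review code.", llm_config=llm_config)

groupchat = GroupChat(
    agents=[coder, critic],
    messages=[],
    speaker_selection_method="auto",
    # Dedicated config used only by the internal speaker-selection agent.
    select_speaker_auto_llm_config={"config_list": [{"model": "gpt-3.5-turbo", "api_key": "sk-..."}]},
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)
```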
@ -591,6 +595,79 @@ class GroupChat:
agent = self.agent_by_name(name)
return agent if agent else self.next_agent(last_speaker, agents)
def _register_client_from_config(self, agent: Agent, config: Dict):
model_client_cls_to_match = config.get("model_client_cls")
if model_client_cls_to_match:
if not self.select_speaker_auto_model_client_cls:
raise ValueError(
"A custom model was detected in the config but no 'model_client_cls' "
"was supplied for registration in GroupChat."
)
if isinstance(self.select_speaker_auto_model_client_cls, list):
# Register the first custom model client class matching the name specified in the config
matching_model_cls = [
client_cls
for client_cls in self.select_speaker_auto_model_client_cls
if client_cls.__name__ == model_client_cls_to_match
]
if len(set(matching_model_cls)) > 1:
raise RuntimeError(
f"More than one unique 'model_client_cls' with __name__ '{model_client_cls_to_match}'."
)
if not matching_model_cls:
raise ValueError(
"No model's __name__ matches the model client class "
f"'{model_client_cls_to_match}' specified in select_speaker_auto_llm_config."
)
select_speaker_auto_model_client_cls = matching_model_cls[0]
else:
# Register the only custom model client
select_speaker_auto_model_client_cls = self.select_speaker_auto_model_client_cls
agent.register_model_client(select_speaker_auto_model_client_cls)
def _register_custom_model_clients(self, agent: ConversableAgent):
if not self.select_speaker_auto_llm_config:
return
config_format_is_list = "config_list" in self.select_speaker_auto_llm_config.keys()
if config_format_is_list:
for config in self.select_speaker_auto_llm_config["config_list"]:
self._register_client_from_config(agent, config)
elif not config_format_is_list:
self._register_client_from_config(agent, self.select_speaker_auto_llm_config)
def _create_internal_agents(
self, agents, max_attempts, messages, validate_speaker_name, selector: Optional[ConversableAgent] = None
):
checking_agent = ConversableAgent("checking_agent", default_auto_reply=max_attempts)
# Register the speaker validation function with the checking agent
checking_agent.register_reply(
[ConversableAgent, None],
reply_func=validate_speaker_name, # Validate each response
remove_other_reply_funcs=True,
)
# Override the selector's config if one was passed as a parameter to this class
speaker_selection_llm_config = self.select_speaker_auto_llm_config or selector.llm_config
# Agent for selecting a single agent name from the response
speaker_selection_agent = ConversableAgent(
"speaker_selection_agent",
system_message=self.select_speaker_msg(agents),
chat_messages={checking_agent: messages},
llm_config=speaker_selection_llm_config,
human_input_mode="NEVER",
# Suppresses some extra terminal outputs, outputs will be handled by select_speaker_auto_verbose
)
# Register any custom model passed in select_speaker_auto_llm_config with the speaker_selection_agent
self._register_custom_model_clients(speaker_selection_agent)
return checking_agent, speaker_selection_agent
def _auto_select_speaker(
self,
last_speaker: Agent,
@ -644,28 +721,8 @@ class GroupChat:
# Two-agent chat for speaker selection
# Agent for checking the response from the speaker_select_agent
checking_agent = ConversableAgent("checking_agent", default_auto_reply=max_attempts)
# Register the speaker validation function with the checking agent
checking_agent.register_reply(
[ConversableAgent, None],
reply_func=validate_speaker_name, # Validate each response
remove_other_reply_funcs=True,
)
# NOTE: Do we have a speaker prompt (select_speaker_prompt_template is not None)? If we don't, we need to feed in the last message to start the nested chat
# Agent for selecting a single agent name from the response
speaker_selection_agent = ConversableAgent(
"speaker_selection_agent",
system_message=self.select_speaker_msg(agents),
chat_messages=(
{checking_agent: messages}
if self.select_speaker_prompt_template is not None
else {checking_agent: messages[:-1]}
),
llm_config=selector.llm_config,
human_input_mode="NEVER", # Suppresses some extra terminal outputs, outputs will be handled by select_speaker_auto_verbose
checking_agent, speaker_selection_agent = self._create_internal_agents(
agents, max_attempts, messages, validate_speaker_name, selector
)
# Create the starting message
@ -747,24 +804,8 @@ class GroupChat:
# Two-agent chat for speaker selection
# Agent for checking the response from the speaker_select_agent
checking_agent = ConversableAgent("checking_agent", default_auto_reply=max_attempts)
# Register the speaker validation function with the checking agent
checking_agent.register_reply(
[ConversableAgent, None],
reply_func=validate_speaker_name, # Validate each response
remove_other_reply_funcs=True,
)
# NOTE: Do we have a speaker prompt (select_speaker_prompt_template is not None)? If we don't, we need to feed in the last message to start the nested chat
# Agent for selecting a single agent name from the response
speaker_selection_agent = ConversableAgent(
"speaker_selection_agent",
system_message=self.select_speaker_msg(agents),
chat_messages={checking_agent: messages},
llm_config=selector.llm_config,
human_input_mode="NEVER", # Suppresses some extra terminal outputs, outputs will be handled by select_speaker_auto_verbose
checking_agent, speaker_selection_agent = self._create_internal_agents(
agents, max_attempts, messages, validate_speaker_name, selector
)
# Create the starting message

View File

@ -0,0 +1,5 @@
from .pod_commandline_code_executor import PodCommandLineCodeExecutor
__all__ = [
"PodCommandLineCodeExecutor",
]

View File

@ -0,0 +1,323 @@
from __future__ import annotations
import atexit
import importlib
import sys
import textwrap
import uuid
from hashlib import md5
from pathlib import Path
from time import sleep
from types import TracebackType
from typing import Any, ClassVar, Dict, List, Optional, Type, Union
client = importlib.import_module("kubernetes.client")
config = importlib.import_module("kubernetes.config")
ApiException = importlib.import_module("kubernetes.client.rest").ApiException
stream = importlib.import_module("kubernetes.stream").stream
from ...code_utils import TIMEOUT_MSG, _cmd
from ..base import CodeBlock, CodeExecutor, CodeExtractor, CommandLineCodeResult
from ..markdown_code_extractor import MarkdownCodeExtractor
from ..utils import _get_file_name_from_content, silence_pip
if sys.version_info >= (3, 11):
from typing import Self
else:
from typing_extensions import Self
class PodCommandLineCodeExecutor(CodeExecutor):
DEFAULT_EXECUTION_POLICY: ClassVar[Dict[str, bool]] = {
"bash": True,
"shell": True,
"sh": True,
"pwsh": False,
"powershell": False,
"ps1": False,
"python": True,
"javascript": False,
"html": False,
"css": False,
}
LANGUAGE_ALIASES: ClassVar[Dict[str, str]] = {
"py": "python",
"js": "javascript",
}
LANGUAGE_FILE_EXTENSION: ClassVar[Dict[str, str]] = {
"python": "py",
"javascript": "js",
"bash": "sh",
"shell": "sh",
"sh": "sh",
}
def __init__(
self,
image: str = "python:3-slim",
pod_name: Optional[str] = None,
namespace: Optional[str] = None,
pod_spec: Optional[client.V1Pod] = None, # type: ignore
container_name: Optional[str] = "autogen-code-exec",
timeout: int = 60,
work_dir: Union[Path, str] = Path("/workspace"),
kube_config_file: Optional[str] = None,
stop_container: bool = True,
execution_policies: Optional[Dict[str, bool]] = None,
):
"""(Experimental) A code executor class that executes code through
a command line environment in a kubernetes pod.
The executor first saves each code block in a file in the working
directory, and then executes the code file in the container.
The executor executes the code blocks in the order they are received.
Currently, the executor only supports Python and shell scripts.
For Python code, use the language "python" for the code block.
For shell scripts, use the language "bash", "shell", or "sh" for the code
block.
Args:
image (_type_, optional): Docker image to use for code execution.
Defaults to "python:3-slim".
pod_name (Optional[str], optional): Name of the kubernetes pod
which is created. If None, will autogenerate a name. Defaults to None.
namespace (Optional[str], optional): Namespace of the kubernetes pod
which is created. If None, will use the current namespace of this instance.
pod_spec (Optional[client.V1Pod], optional): Specification of the kubernetes pod.
A custom pod spec can be provided with this param.
If pod_spec is provided, the params above (image, pod_name, namespace) are ignored.
container_name (Optional[str], optional): Name of the container where code blocks will be
executed. If the pod_spec param is provided, container_name must be provided as well.
timeout (int, optional): The timeout for code execution. Defaults to 60.
work_dir (Union[Path, str], optional): The working directory for the code
execution. Defaults to Path("/workspace").
kube_config_file (Optional[str], optional): kubernetes configuration file path.
If None, will use the KUBECONFIG environment variable or the service account token (in-cluster config).
stop_container (bool, optional): If true, will automatically stop the
container when stop is called, when the context manager exits or when
the Python process exits via atexit. Defaults to True.
execution_policies (dict[str, bool], optional): defines which execution languages are allowed.
Raises:
ValueError: On argument error, or if the container fails to start.
"""
if kube_config_file is None:
config.load_config()
else:
config.load_config(config_file=kube_config_file)
self._api_client = client.CoreV1Api()
if timeout < 1:
raise ValueError("Timeout must be greater than or equal to 1.")
self._timeout = timeout
if isinstance(work_dir, str):
work_dir = Path(work_dir)
self._work_dir: Path = work_dir
if container_name is None:
container_name = "autogen-code-exec"
self._container_name = container_name
# Start a container from the image, ready to exec commands later
if pod_spec:
pod = pod_spec
else:
if pod_name is None:
pod_name = f"autogen-code-exec-{uuid.uuid4()}"
if namespace is None:
namespace_path = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"
if not Path(namespace_path).is_file():
raise ValueError("Namespace where the pod will be launched must be provided")
with open(namespace_path, "r") as f:
namespace = f.read()
pod = client.V1Pod(
metadata=client.V1ObjectMeta(name=pod_name, namespace=namespace),
spec=client.V1PodSpec(
restart_policy="Never",
containers=[
client.V1Container(
args=["-c", "while true;do sleep 5; done"],
command=["/bin/sh"],
name=container_name,
image=image,
)
],
),
)
try:
pod_name = pod.metadata.name
namespace = pod.metadata.namespace
self._pod = self._api_client.create_namespaced_pod(namespace=namespace, body=pod)
except ApiException as e:
raise ValueError(f"Creating pod failed: {e}")
self._wait_for_ready()
def cleanup() -> None:
try:
self._api_client.delete_namespaced_pod(pod_name, namespace)
except ApiException:
pass
atexit.unregister(cleanup)
self._cleanup = cleanup
if stop_container:
atexit.register(cleanup)
self.execution_policies = self.DEFAULT_EXECUTION_POLICY.copy()
if execution_policies is not None:
self.execution_policies.update(execution_policies)
def _wait_for_ready(self, stop_time: float = 0.1) -> None:
elapsed_time = 0.0
name = self._pod.metadata.name
namespace = self._pod.metadata.namespace
while True:
sleep(stop_time)
elapsed_time += stop_time
if elapsed_time > self._timeout:
raise ValueError(
f"pod name {name} on namespace {namespace} is not Ready after timeout {self._timeout} seconds"
)
try:
pod_status = self._api_client.read_namespaced_pod_status(name, namespace)
if pod_status.status.phase == "Running":
break
except ApiException as e:
raise ValueError(f"reading pod status failed: {e}")
@property
def timeout(self) -> int:
"""(Experimental) The timeout for code execution."""
return self._timeout
@property
def work_dir(self) -> Path:
"""(Experimental) The working directory for the code execution."""
return self._work_dir
@property
def code_extractor(self) -> CodeExtractor:
"""(Experimental) Export a code extractor that can be used by an agent."""
return MarkdownCodeExtractor()
def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CommandLineCodeResult:
"""(Experimental) Execute the code blocks and return the result.
Args:
code_blocks (List[CodeBlock]): The code blocks to execute.
Returns:
CommandLineCodeResult: The result of the code execution."""
if len(code_blocks) == 0:
raise ValueError("No code blocks to execute.")
outputs = []
files = []
last_exit_code = 0
for code_block in code_blocks:
lang = self.LANGUAGE_ALIASES.get(code_block.language.lower(), code_block.language.lower())
if lang not in self.DEFAULT_EXECUTION_POLICY:
outputs.append(f"Unsupported language {lang}\n")
last_exit_code = 1
break
execute_code = self.execution_policies.get(lang, False)
code = silence_pip(code_block.code, lang)
if lang in ["bash", "shell", "sh"]:
code = "\n".join(["#!/bin/bash", code])
try:
filename = _get_file_name_from_content(code, self._work_dir)
except ValueError:
outputs.append("Filename is not in the workspace")
last_exit_code = 1
break
if not filename:
extension = self.LANGUAGE_FILE_EXTENSION.get(lang, lang)
filename = f"tmp_code_{md5(code.encode()).hexdigest()}.{extension}"
code_path = self._work_dir / filename
exec_script = textwrap.dedent(
"""
if [ ! -d "{workspace}" ]; then
mkdir {workspace}
fi
cat <<EOM >{code_path}\n
{code}
EOM
chmod +x {code_path}"""
)
exec_script = exec_script.format(workspace=str(self._work_dir), code_path=code_path, code=code)
stream(
self._api_client.connect_get_namespaced_pod_exec,
self._pod.metadata.name,
self._pod.metadata.namespace,
command=["/bin/sh", "-c", exec_script],
container=self._container_name,
stderr=True,
stdin=False,
stdout=True,
tty=False,
)
files.append(code_path)
if not execute_code:
outputs.append(f"Code saved to {str(code_path)}\n")
continue
resp = stream(
self._api_client.connect_get_namespaced_pod_exec,
self._pod.metadata.name,
self._pod.metadata.namespace,
command=["timeout", str(self._timeout), _cmd(lang), str(code_path)],
container=self._container_name,
stderr=True,
stdin=False,
stdout=True,
tty=False,
_preload_content=False,
)
stdout_messages = []
stderr_messages = []
while resp.is_open():
resp.update(timeout=1)
if resp.peek_stderr():
stderr_messages.append(resp.read_stderr())
if resp.peek_stdout():
stdout_messages.append(resp.read_stdout())
outputs.extend(stdout_messages + stderr_messages)
exit_code = resp.returncode
resp.close()
if exit_code == 124:
outputs.append("\n" + TIMEOUT_MSG)
last_exit_code = exit_code
if exit_code != 0:
break
code_file = str(files[0]) if files else None
return CommandLineCodeResult(exit_code=last_exit_code, output="".join(outputs), code_file=code_file)
def stop(self) -> None:
"""(Experimental) Stop the code executor."""
self._cleanup()
def __enter__(self) -> Self:
return self
def __exit__(
self, exc_type: Optional[Type[BaseException]], exc_val: Optional[BaseException], exc_tb: Optional[TracebackType]
) -> None:
self.stop()
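A minimal usage sketch of the executor above, assuming the package path of the `__init__.py` shown earlier resolves to `autogen.coding.kubernetes`, a cluster reachable via `KUBECONFIG` (or an in-cluster service account), and a pullable `python:3-slim` image:

```python
from autogen.coding import CodeBlock
from autogen.coding.kubernetes import PodCommandLineCodeExecutor

# The context manager stops (deletes) the pod on exit when stop_container=True.
with PodCommandLineCodeExecutor(namespace="default", timeout=60) as executor:
    result = executor.execute_code_blocks(
        [CodeBlock(language="python", code="print('hello from the pod')")]
    )
    print(result.exit_code, result.output, result.code_file)
```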

View File

@ -19,10 +19,12 @@ if TYPE_CHECKING:
from autogen import Agent, ConversableAgent, OpenAIWrapper
from autogen.oai.anthropic import AnthropicClient
from autogen.oai.bedrock import BedrockClient
from autogen.oai.cerebras import CerebrasClient
from autogen.oai.cohere import CohereClient
from autogen.oai.gemini import GeminiClient
from autogen.oai.groq import GroqClient
from autogen.oai.mistral import MistralAIClient
from autogen.oai.ollama import OllamaClient
from autogen.oai.together import TogetherClient
logger = logging.getLogger(__name__)
@ -210,12 +212,14 @@ class FileLogger(BaseLogger):
client: (
AzureOpenAI
| OpenAI
| CerebrasClient
| GeminiClient
| AnthropicClient
| MistralAIClient
| TogetherClient
| GroqClient
| CohereClient
| OllamaClient
| BedrockClient
),
wrapper: OpenAIWrapper,

View File

@ -20,10 +20,12 @@ if TYPE_CHECKING:
from autogen import Agent, ConversableAgent, OpenAIWrapper
from autogen.oai.anthropic import AnthropicClient
from autogen.oai.bedrock import BedrockClient
from autogen.oai.cerebras import CerebrasClient
from autogen.oai.cohere import CohereClient
from autogen.oai.gemini import GeminiClient
from autogen.oai.groq import GroqClient
from autogen.oai.mistral import MistralAIClient
from autogen.oai.ollama import OllamaClient
from autogen.oai.together import TogetherClient
logger = logging.getLogger(__name__)
@ -397,12 +399,14 @@ class SqliteLogger(BaseLogger):
client: Union[
AzureOpenAI,
OpenAI,
CerebrasClient,
GeminiClient,
AnthropicClient,
MistralAIClient,
TogetherClient,
GroqClient,
CohereClient,
OllamaClient,
BedrockClient,
],
wrapper: OpenAIWrapper,

270
autogen/oai/cerebras.py Normal file
View File

@ -0,0 +1,270 @@
"""Create an OpenAI-compatible client using Cerebras's API.
Example:
llm_config={
"config_list": [{
"api_type": "cerebras",
"model": "llama3.1-8b",
"api_key": os.environ.get("CEREBRAS_API_KEY")
}]
}
agent = autogen.AssistantAgent("my_agent", llm_config=llm_config)
Install Cerebras's python library using: pip install --upgrade cerebras_cloud_sdk
Resources:
- https://inference-docs.cerebras.ai/quickstart
"""
from __future__ import annotations
import copy
import os
import time
import warnings
from typing import Any, Dict, List
from cerebras.cloud.sdk import Cerebras, Stream
from openai.types.chat import ChatCompletion, ChatCompletionMessageToolCall
from openai.types.chat.chat_completion import ChatCompletionMessage, Choice
from openai.types.completion_usage import CompletionUsage
from autogen.oai.client_utils import should_hide_tools, validate_parameter
CEREBRAS_PRICING_1K = {
# Convert pricing per million to per thousand tokens.
"llama3.1-8b": (0.10 / 1000, 0.10 / 1000),
"llama3.1-70b": (0.60 / 1000, 0.60 / 1000),
}
class CerebrasClient:
"""Client for Cerebras's API."""
def __init__(self, api_key=None, **kwargs):
"""Requires api_key or environment variable to be set
Args:
api_key (str): The API key for using Cerebras (or environment variable CEREBRAS_API_KEY needs to be set)
"""
# Ensure we have the api_key upon instantiation
self.api_key = api_key
if not self.api_key:
self.api_key = os.getenv("CEREBRAS_API_KEY")
assert (
self.api_key
), "Please include the api_key in your config list entry for Cerebras or set the CEREBRAS_API_KEY env variable."
def message_retrieval(self, response: ChatCompletion) -> List:
"""
Retrieve and return a list of strings or a list of Choice.Message from the response.
NOTE: if a list of Choice.Message is returned, it currently needs to contain the fields of OpenAI's ChatCompletion Message object,
since that is expected for function or tool calling in the rest of the codebase at the moment, unless a custom agent is being used.
"""
return [choice.message for choice in response.choices]
def cost(self, response: ChatCompletion) -> float:
# Note: This field isn't explicitly in `ChatCompletion`, but is injected during chat creation.
return response.cost
@staticmethod
def get_usage(response: ChatCompletion) -> Dict:
"""Return usage summary of the response using RESPONSE_USAGE_KEYS."""
# ... # pragma: no cover
return {
"prompt_tokens": response.usage.prompt_tokens,
"completion_tokens": response.usage.completion_tokens,
"total_tokens": response.usage.total_tokens,
"cost": response.cost,
"model": response.model,
}
def parse_params(self, params: Dict[str, Any]) -> Dict[str, Any]:
"""Loads the parameters for Cerebras API from the passed in parameters and returns a validated set. Checks types, ranges, and sets defaults"""
cerebras_params = {}
# Check that we have what we need to use Cerebras's API
# We won't enforce the available models as they are likely to change
cerebras_params["model"] = params.get("model", None)
assert cerebras_params[
"model"
], "Please specify the 'model' in your config list entry to nominate the Cerebras model to use."
# Validate allowed Cerebras parameters
# https://inference-docs.cerebras.ai/api-reference/chat-completions
cerebras_params["max_tokens"] = validate_parameter(params, "max_tokens", int, True, None, (0, None), None)
cerebras_params["seed"] = validate_parameter(params, "seed", int, True, None, None, None)
cerebras_params["stream"] = validate_parameter(params, "stream", bool, True, False, None, None)
cerebras_params["temperature"] = validate_parameter(
params, "temperature", (int, float), True, 1, (0, 1.5), None
)
cerebras_params["top_p"] = validate_parameter(params, "top_p", (int, float), True, None, None, None)
return cerebras_params
def create(self, params: Dict) -> ChatCompletion:
messages = params.get("messages", [])
# Convert AutoGen messages to Cerebras messages
cerebras_messages = oai_messages_to_cerebras_messages(messages)
# Parse parameters to the Cerebras API's parameters
cerebras_params = self.parse_params(params)
# Add tools to the call if we have them and aren't hiding them
if "tools" in params:
hide_tools = validate_parameter(
params, "hide_tools", str, False, "never", None, ["if_all_run", "if_any_run", "never"]
)
if not should_hide_tools(cerebras_messages, params["tools"], hide_tools):
cerebras_params["tools"] = params["tools"]
cerebras_params["messages"] = cerebras_messages
# We use the chat model by default and set max_retries to 5 (in line with the typical retry loop)
client = Cerebras(api_key=self.api_key, max_retries=5)
# Token counts will be returned
prompt_tokens = 0
completion_tokens = 0
total_tokens = 0
# Streaming tool call recommendations
streaming_tool_calls = []
ans = None
try:
response = client.chat.completions.create(**cerebras_params)
except Exception as e:
raise RuntimeError(f"Cerebras exception occurred: {e}")
else:
if cerebras_params["stream"]:
# Read in the chunks as they stream, taking in tool_calls which may be across
# multiple chunks if more than one suggested
ans = ""
for chunk in response:
# Grab first choice, which _should_ always be generated.
ans = ans + (chunk.choices[0].delta.content or "")
if chunk.choices[0].delta.tool_calls:
# We have a tool call recommendation
for tool_call in chunk.choices[0].delta.tool_calls:
streaming_tool_calls.append(
ChatCompletionMessageToolCall(
id=tool_call.id,
function={
"name": tool_call.function.name,
"arguments": tool_call.function.arguments,
},
type="function",
)
)
if chunk.choices[0].finish_reason:
prompt_tokens = chunk.x_cerebras.usage.prompt_tokens
completion_tokens = chunk.x_cerebras.usage.completion_tokens
total_tokens = chunk.x_cerebras.usage.total_tokens
else:
# Non-streaming finished
ans: str = response.choices[0].message.content
prompt_tokens = response.usage.prompt_tokens
completion_tokens = response.usage.completion_tokens
total_tokens = response.usage.total_tokens
if response is not None:
if isinstance(response, Stream):
# Streaming response
if chunk.choices[0].finish_reason == "tool_calls":
cerebras_finish = "tool_calls"
tool_calls = streaming_tool_calls
else:
cerebras_finish = "stop"
tool_calls = None
response_content = ans
response_id = chunk.id
else:
# Non-streaming response
# If we have tool calls as the response, populate completed tool calls for our return OAI response
if response.choices[0].finish_reason == "tool_calls":
cerebras_finish = "tool_calls"
tool_calls = []
for tool_call in response.choices[0].message.tool_calls:
tool_calls.append(
ChatCompletionMessageToolCall(
id=tool_call.id,
function={"name": tool_call.function.name, "arguments": tool_call.function.arguments},
type="function",
)
)
else:
cerebras_finish = "stop"
tool_calls = None
response_content = response.choices[0].message.content
response_id = response.id
else:
raise RuntimeError("Failed to get response from Cerebras after retrying 5 times.")
# 3. convert output
message = ChatCompletionMessage(
role="assistant",
content=response_content,
function_call=None,
tool_calls=tool_calls,
)
choices = [Choice(finish_reason=cerebras_finish, index=0, message=message)]
response_oai = ChatCompletion(
id=response_id,
model=cerebras_params["model"],
created=int(time.time()),
object="chat.completion",
choices=choices,
usage=CompletionUsage(
prompt_tokens=prompt_tokens,
completion_tokens=completion_tokens,
total_tokens=total_tokens,
),
# Note: This seems to be a field that isn't in the schema of `ChatCompletion`, so Pydantic
# just adds it dynamically.
cost=calculate_cerebras_cost(prompt_tokens, completion_tokens, cerebras_params["model"]),
)
return response_oai
def oai_messages_to_cerebras_messages(messages: list[Dict[str, Any]]) -> list[dict[str, Any]]:
"""Convert messages from OAI format to Cerebras's format.
We correct for any specific role orders and types.
"""
cerebras_messages = copy.deepcopy(messages)
# Remove the name field
for message in cerebras_messages:
if "name" in message:
message.pop("name", None)
return cerebras_messages
def calculate_cerebras_cost(input_tokens: int, output_tokens: int, model: str) -> float:
"""Calculate the cost of the completion using the Cerebras pricing."""
total = 0.0
if model in CEREBRAS_PRICING_1K:
input_cost_per_k, output_cost_per_k = CEREBRAS_PRICING_1K[model]
input_cost = (input_tokens / 1000) * input_cost_per_k
output_cost = (output_tokens / 1000) * output_cost_per_k
total = input_cost + output_cost
else:
warnings.warn(f"Cost calculation not available for model {model}", UserWarning)
return total
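A quick sanity check of the per-thousand conversion above: llama3.1-8b is priced at $0.10 per million tokens for both input and output, so 10,000 prompt tokens plus 2,000 completion tokens should come to $0.0012.

```python
from autogen.oai.cerebras import calculate_cerebras_cost

cost = calculate_cerebras_cost(10_000, 2_000, "llama3.1-8b")
# (10_000 / 1000) * 0.0001 + (2_000 / 1000) * 0.0001 == 0.0012
assert abs(cost - 0.0012) < 1e-9
```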

View File

@ -6,7 +6,6 @@ import sys
import uuid
from typing import Any, Callable, Dict, List, Optional, Protocol, Tuple, Union
from flaml.automl.logger import logger_formatter
from pydantic import BaseModel
from autogen.cache import Cache
@ -16,6 +15,7 @@ from autogen.oai.openai_utils import OAI_PRICE1K, get_key
from autogen.runtime_logging import log_chat_completion, log_new_client, log_new_wrapper, logging_enabled
from autogen.token_count_utils import count_token
from .client_utils import logger_formatter
from .rate_limiters import RateLimiter, TimeRateLimiter
TOOL_ENABLED = False
@ -44,6 +44,13 @@ else:
TOOL_ENABLED = True
ERROR = None
try:
from autogen.oai.cerebras import CerebrasClient
cerebras_import_exception: Optional[ImportError] = None
except ImportError as e:
cerebras_import_exception = e
try:
from autogen.oai.gemini import GeminiClient
@ -86,6 +93,13 @@ try:
except ImportError as e:
cohere_import_exception = e
try:
from autogen.oai.ollama import OllamaClient
ollama_import_exception: Optional[ImportError] = None
except ImportError as e:
ollama_import_exception = e
try:
from autogen.oai.bedrock import BedrockClient
@ -165,10 +179,6 @@ class OpenAIClient:
def __init__(self, client: Union[OpenAI, AzureOpenAI]):
self._oai_client = client
if not isinstance(client, openai.AzureOpenAI) and str(client.base_url).startswith(OPEN_API_BASE_URL_PREFIX):
logger.warning(
"The API key specified is not a valid OpenAI format; it won't work with the OpenAI-hosted model."
)
def message_retrieval(
self, response: Union[ChatCompletion, Completion]
@ -505,6 +515,11 @@ class OpenAIWrapper:
self._configure_azure_openai(config, openai_config)
client = AzureOpenAI(**openai_config)
self._clients.append(OpenAIClient(client))
elif api_type is not None and api_type.startswith("cerebras"):
if cerebras_import_exception:
raise ImportError("Please install `cerebras_cloud_sdk` to use Cerebras OpenAI API.")
client = CerebrasClient(**openai_config)
self._clients.append(client)
elif api_type is not None and api_type.startswith("google"):
if gemini_import_exception:
raise ImportError("Please install `google-generativeai` to use Google OpenAI API.")
@ -537,6 +552,11 @@ class OpenAIWrapper:
raise ImportError("Please install `cohere` to use the Cohere API.")
client = CohereClient(**openai_config)
self._clients.append(client)
elif api_type is not None and api_type.startswith("ollama"):
if ollama_import_exception:
raise ImportError("Please install with `[ollama]` option to use the Ollama API.")
client = OllamaClient(**openai_config)
self._clients.append(client)
elif api_type is not None and api_type.startswith("bedrock"):
self._configure_openai_config_for_bedrock(config, openai_config)
if bedrock_import_exception:

View File

@ -1,8 +1,13 @@
"""Utilities for client classes"""
import logging
import warnings
from typing import Any, Dict, List, Optional, Tuple
logger_formatter = logging.Formatter(
"[%(name)s: %(asctime)s] {%(lineno)d} %(levelname)s - %(message)s", "%m-%d %H:%M:%S"
)
def validate_parameter(
params: Dict[str, Any],

View File

@ -31,12 +31,11 @@ from typing import Any, Dict, List
from cohere import Client as Cohere
from cohere.types import ToolParameterDefinitionsValue, ToolResult
from flaml.automl.logger import logger_formatter
from openai.types.chat import ChatCompletion, ChatCompletionMessageToolCall
from openai.types.chat.chat_completion import ChatCompletionMessage, Choice
from openai.types.completion_usage import CompletionUsage
from autogen.oai.client_utils import validate_parameter
from .client_utils import logger_formatter, validate_parameter
logger = logging.getLogger(__name__)
if not logger.handlers:

View File

@ -8,9 +8,9 @@ from typing import Callable, Dict, List, Optional, Union
import numpy as np
from flaml import BlendSearch, tune
from flaml.automl.logger import logger_formatter
from flaml.tune.space import is_constant
from .client_utils import logger_formatter
from .openai_utils import get_key
try:

View File

@ -32,6 +32,8 @@ Resources:
from __future__ import annotations
import base64
import copy
import json
import logging
import os
import random
@ -39,24 +41,39 @@ import re
import time
import warnings
from io import BytesIO
from typing import Any, Dict, List, Mapping, Union
from typing import Any, Dict, List, Union
import google.generativeai as genai
import requests
import vertexai
from google.ai.generativelanguage import Content, Part
from google.ai.generativelanguage import Content, FunctionCall, FunctionDeclaration, FunctionResponse, Part, Tool
from google.api_core.exceptions import InternalServerError
from google.auth.credentials import Credentials
from openai.types.chat import ChatCompletion
from openai.types.chat import ChatCompletion, ChatCompletionMessageToolCall
from openai.types.chat.chat_completion import ChatCompletionMessage, Choice
from openai.types.chat.chat_completion_message_tool_call import Function
from openai.types.completion_usage import CompletionUsage
from PIL import Image
from vertexai.generative_models import Content as VertexAIContent
from vertexai.generative_models import (
Content as VertexAIContent,
)
from vertexai.generative_models import (
FunctionDeclaration as VertexAIFunctionDeclaration,
)
from vertexai.generative_models import (
GenerationConfig as VertexAIGenerationConfig,
)
from vertexai.generative_models import GenerativeModel
from vertexai.generative_models import HarmBlockThreshold as VertexAIHarmBlockThreshold
from vertexai.generative_models import HarmCategory as VertexAIHarmCategory
from vertexai.generative_models import Part as VertexAIPart
from vertexai.generative_models import SafetySetting as VertexAISafetySetting
from vertexai.generative_models import (
Tool as VertexAITool,
)
from vertexai.generative_models import (
ToolConfig as VertexAIToolConfig,
)
logger = logging.getLogger(__name__)
@ -107,7 +124,7 @@ class GeminiClient:
Args:
api_key (str): The API key for using Gemini.
credentials (google.auth.credentials.Credentials): credentials to be used for authentication with vertexai.
google_application_credentials (str): Path to the JSON service account key file of the service account.
Alternatively, the GOOGLE_APPLICATION_CREDENTIALS environment variable
can also be set instead of using this argument.
@ -171,6 +188,8 @@ class GeminiClient:
params.get("api_type", "google") # not used
messages = params.get("messages", [])
tools = params.get("tools", [])
tool_config = params.get("tool_config", {})
stream = params.get("stream", False)
n_response = params.get("n", 1)
system_instruction = params.get("system_instruction", None)
@ -183,6 +202,7 @@ class GeminiClient:
}
if self.use_vertexai:
safety_settings = GeminiClient._to_vertexai_safety_settings(params.get("safety_settings", {}))
tool_config = GeminiClient._to_vertexai_tool_config(tool_config, tools)
else:
safety_settings = params.get("safety_settings", {})
@ -198,12 +218,15 @@ class GeminiClient:
if "vision" not in model_name:
# A. create and call the chat model.
gemini_messages = self._oai_messages_to_gemini_messages(messages)
gemini_tools = self._oai_tools_to_gemini_tools(tools)
if self.use_vertexai:
model = GenerativeModel(
model_name,
generation_config=generation_config,
safety_settings=safety_settings,
system_instruction=system_instruction,
tools=gemini_tools,
tool_config=tool_config,
)
chat = model.start_chat(history=gemini_messages[:-1], response_validation=response_validation)
else:
@ -213,12 +236,13 @@ class GeminiClient:
generation_config=generation_config,
safety_settings=safety_settings,
system_instruction=system_instruction,
tools=gemini_tools,
)
genai.configure(api_key=self.api_key)
chat = model.start_chat(history=gemini_messages[:-1])
max_retries = 5
for attempt in range(max_retries):
ans = None
ans: Union[Content, VertexAIContent] = None
try:
response = chat.send_message(
gemini_messages[-1].parts, stream=stream, safety_settings=safety_settings
@ -234,7 +258,7 @@ class GeminiClient:
raise RuntimeError(f"Google GenAI exception occurred while calling Gemini API: {e}")
else:
# `ans = response.text` is unstable. Use the following code instead.
ans: str = chat.history[-1].parts[0].text
ans: Union[Content, VertexAIContent] = chat.history[-1]
break
if ans is None:
@ -262,7 +286,7 @@ class GeminiClient:
# Gemini's vision model does not support chat history yet
# chat = model.start_chat(history=gemini_messages[:-1])
# response = chat.send_message(gemini_messages[-1].parts)
user_message = self._oai_content_to_gemini_content(messages[-1]["content"])
user_message = self._oai_content_to_gemini_content(messages[-1])
if len(messages) > 2:
warnings.warn(
"Warning: Gemini's vision model does not support chat history yet.",
@ -273,16 +297,14 @@ class GeminiClient:
response = model.generate_content(user_message, stream=stream)
# ans = response.text
if self.use_vertexai:
ans: str = response.candidates[0].content.parts[0].text
ans: VertexAIContent = response.candidates[0].content
else:
ans: str = response._result.candidates[0].content.parts[0].text
ans: Content = response._result.candidates[0].content
prompt_tokens = model.count_tokens(user_message).total_tokens
completion_tokens = model.count_tokens(ans).total_tokens
completion_tokens = model.count_tokens(ans.parts[0].text).total_tokens
# 3. convert output
message = ChatCompletionMessage(role="assistant", content=ans, function_call=None, tool_calls=None)
choices = [Choice(finish_reason="stop", index=0, message=message)]
choices = self._gemini_content_to_oai_choices(ans)
response_oai = ChatCompletion(
id=str(random.randint(0, 1000)),
@ -295,31 +317,87 @@ class GeminiClient:
completion_tokens=completion_tokens,
total_tokens=prompt_tokens + completion_tokens,
),
cost=calculate_gemini_cost(prompt_tokens, completion_tokens, model_name),
cost=self._calculate_gemini_cost(prompt_tokens, completion_tokens, model_name),
)
return response_oai
def _oai_content_to_gemini_content(self, content: Union[str, List]) -> List:
# If str is not a JSON string, return str as is
def _to_json(self, str) -> dict:
try:
return json.loads(str)
except ValueError:
return str
def _oai_content_to_gemini_content(self, message: Dict[str, Any]) -> List:
"""Convert content from OAI format to Gemini format"""
rst = []
if isinstance(content, str):
if content == "":
content = "empty" # Empty content is not allowed.
if isinstance(message["content"], str):
if message["content"] == "":
message["content"] = "empty" # Empty content is not allowed.
if self.use_vertexai:
rst.append(VertexAIPart.from_text(content))
rst.append(VertexAIPart.from_text(message["content"]))
else:
rst.append(Part(text=content))
rst.append(Part(text=message["content"]))
return rst
assert isinstance(content, list)
if "tool_calls" in message:
if self.use_vertexai:
for tool_call in message["tool_calls"]:
rst.append(
VertexAIPart.from_dict(
{
"functionCall": {
"name": tool_call["function"]["name"],
"args": json.loads(tool_call["function"]["arguments"]),
}
}
)
)
else:
for tool_call in message["tool_calls"]:
rst.append(
Part(
function_call=FunctionCall(
name=tool_call["function"]["name"],
args=json.loads(tool_call["function"]["arguments"]),
)
)
)
return rst
for msg in content:
if message["role"] == "tool":
if self.use_vertexai:
rst.append(
VertexAIPart.from_function_response(
name=message["name"], response={"result": self._to_json(message["content"])}
)
)
else:
rst.append(
Part(
function_response=FunctionResponse(
name=message["name"], response={"result": self._to_json(message["content"])}
)
)
)
return rst
if isinstance(message["content"], str):
if self.use_vertexai:
rst.append(VertexAIPart.from_text(message["content"]))
else:
rst.append(Part(text=message["content"]))
return rst
assert isinstance(message["content"], list)
for msg in message["content"]:
if isinstance(msg, dict):
assert "type" in msg, f"Missing 'type' field in message: {msg}"
if msg["type"] == "text":
if self.use_vertexai:
rst.append(VertexAIPart.from_text(text=msg["text"]))
rst.append(VertexAIPart.from_text(msg["text"]))
else:
rst.append(Part(text=msg["text"]))
elif msg["type"] == "image_url":
@ -340,34 +418,32 @@ class GeminiClient:
raise ValueError(f"Unsupported message type: {type(msg)}")
return rst
def _concat_parts(self, parts: List[Part]) -> List:
"""Concatenate parts with the same type.
If two adjacent parts both have the "text" attribute, then it will be joined into one part.
"""
if not parts:
return []
def _calculate_gemini_cost(self, input_tokens: int, output_tokens: int, model_name: str) -> float:
if "1.5-pro" in model_name:
if (input_tokens + output_tokens) <= 128000:
# "gemini-1.5-pro"
# When total tokens are at most 128K, cost is $3.5 per million input tokens and $10.5 per million output tokens
return 3.5 * input_tokens / 1e6 + 10.5 * output_tokens / 1e6
# "gemini-1.5-pro"
# Cost is $7 per million input tokens and $21 per million output tokens
return 7.0 * input_tokens / 1e6 + 21.0 * output_tokens / 1e6
concatenated_parts = []
previous_part = parts[0]
if "1.5-flash" in model_name:
if (input_tokens + output_tokens) <= 128000:
# "gemini-1.5-flash"
# Cost is $0.35 per million input tokens and $1.05 per million output tokens
return 0.35 * input_tokens / 1e6 + 1.05 * output_tokens / 1e6
# "gemini-1.5-flash"
# When total tokens exceed 128K, cost is $0.70 per million input tokens and $2.10 per million output tokens
return 0.70 * input_tokens / 1e6 + 2.10 * output_tokens / 1e6
for current_part in parts[1:]:
if previous_part.text != "":
if self.use_vertexai:
previous_part = VertexAIPart.from_text(previous_part.text + current_part.text)
else:
previous_part.text += current_part.text
else:
concatenated_parts.append(previous_part)
previous_part = current_part
if "gemini-pro" not in model_name and "gemini-1.0-pro" not in model_name:
warnings.warn(
f"Cost calculation is not implemented for model {model_name}. Using Gemini-1.0-Pro.", UserWarning
)
if previous_part.text == "":
if self.use_vertexai:
previous_part = VertexAIPart.from_text("empty")
else:
previous_part.text = "empty" # Empty content is not allowed.
concatenated_parts.append(previous_part)
return concatenated_parts
# Cost is $0.5 per million input tokens and $1.5 per million output tokens
return 0.5 * input_tokens / 1e6 + 1.5 * output_tokens / 1e6
def _oai_messages_to_gemini_messages(self, messages: list[Dict[str, Any]]) -> list[dict[str, Any]]:
"""Convert messages from OAI format to Gemini format.
@ -376,38 +452,154 @@ class GeminiClient:
"""
prev_role = None
rst = []
curr_parts = []
for i, message in enumerate(messages):
parts = self._oai_content_to_gemini_content(message["content"])
role = "user" if message["role"] in ["user", "system"] else "model"
if (prev_role is None) or (role == prev_role):
curr_parts += parts
elif role != prev_role:
if self.use_vertexai:
rst.append(VertexAIContent(parts=curr_parts, role=prev_role))
else:
rst.append(Content(parts=curr_parts, role=prev_role))
curr_parts = parts
prev_role = role
# handle the last message
if self.use_vertexai:
rst.append(VertexAIContent(parts=curr_parts, role=role))
else:
rst.append(Content(parts=curr_parts, role=role))
def append_parts(parts, role):
if self.use_vertexai:
rst.append(VertexAIContent(parts=parts, role=role))
else:
rst.append(Content(parts=parts, role=role))
def append_text_to_last(text):
if self.use_vertexai:
rst[-1] = VertexAIContent(parts=[*rst[-1].parts, VertexAIPart.from_text(text)], role=rst[-1].role)
else:
rst[-1] = Content(parts=[*rst[-1].parts, Part(text=text)], role=rst[-1].role)
def is_function_call(parts):
return self.use_vertexai and parts[0].function_call or not self.use_vertexai and "function_call" in parts[0]
for i, message in enumerate(messages):
# Since the tool call message does not have the "name" field, we need to find the corresponding tool message.
if message["role"] == "tool":
message["name"] = [
m["tool_calls"][i]["function"]["name"]
for m in messages
if "tool_calls" in m
for i, tc in enumerate(m["tool_calls"])
if tc["id"] == message["tool_call_id"]
][0]
parts = self._oai_content_to_gemini_content(message)
role = "user" if message["role"] in ["user", "system"] else "model"
# In Gemini, if the current message is a function call then the previous message should not be a model message.
if is_function_call(parts):
# If the previous message is a model message then add a dummy "continue" user message before the function call
if prev_role == "model":
append_parts(self._oai_content_to_gemini_content({"content": "continue"}), "user")
append_parts(parts, role)
# In Gemini, if the current message is a function response then the next message should be a model message.
elif role == "function":
append_parts(parts, "function")
# If the next message is not a model message then add a dummy "continue" model message after the function response
if len(messages) > (i + 1) and messages[i + 1]["role"] in ["user", "system"]:
append_parts(self._oai_content_to_gemini_content({"content": "continue"}), "model")
# If the role is the same as the previous role and both are text messages then concatenate the text
elif role == prev_role:
append_text_to_last(parts[0].text)
# If this is the first message or the role is different from the previous role then append the parts
else:
# If the previous text message is empty then update the text to "empty" as Gemini does not support empty messages
if (
(len(rst) > 0)
and hasattr(rst[-1].parts[0], "_raw_part")
and hasattr(rst[-1].parts[0]._raw_part, "text")
and (rst[-1].parts[0]._raw_part.text == "")
):
append_text_to_last("empty")
append_parts(parts, role)
prev_role = role
# Gemini is strict about the order of roles, such that
# 1. The messages should be interleaved between user and model.
# 2. The last message must be from the user role.
# We add a dummy message "continue" if the last role is not the user.
if rst[-1].role != "user":
if rst[-1].role != "user" and rst[-1].role != "function":
if self.use_vertexai:
rst.append(VertexAIContent(parts=self._oai_content_to_gemini_content("continue"), role="user"))
rst.append(
VertexAIContent(parts=self._oai_content_to_gemini_content({"content": "continue"}), role="user")
)
else:
rst.append(Content(parts=self._oai_content_to_gemini_content("continue"), role="user"))
rst.append(Content(parts=self._oai_content_to_gemini_content({"content": "continue"}), role="user"))
return rst
def _oai_tools_to_gemini_tools(self, tools: List[Dict[str, Any]]) -> List[Tool]:
"""Convert tools from OAI format to Gemini format."""
if len(tools) == 0:
return None
function_declarations = []
for tool in tools:
if self.use_vertexai:
function_declaration = VertexAIFunctionDeclaration(
name=tool["function"]["name"],
description=tool["function"]["description"],
parameters=tool["function"]["parameters"],
)
else:
function_declaration = FunctionDeclaration(
name=tool["function"]["name"],
description=tool["function"]["description"],
parameters=self._oai_function_parameters_to_gemini_function_parameters(
copy.deepcopy(tool["function"]["parameters"])
),
)
function_declarations.append(function_declaration)
if self.use_vertexai:
return [VertexAITool(function_declarations=function_declarations)]
else:
return [Tool(function_declarations=function_declarations)]
def _oai_function_parameters_to_gemini_function_parameters(
self, function_definition: dict[str, any]
) -> dict[str, any]:
"""
Convert OpenAPI function definition parameters to Gemini function parameters definition.
The type key is renamed to type_ and the value is capitalized.
"""
assert "anyOf" not in function_definition, "Union types are not supported for function parameter in Gemini."
# Delete the default key as it is not supported in Gemini
if "default" in function_definition:
del function_definition["default"]
function_definition["type_"] = function_definition["type"].upper()
del function_definition["type"]
if "properties" in function_definition:
for key in function_definition["properties"]:
function_definition["properties"][key] = self._oai_function_parameters_to_gemini_function_parameters(
function_definition["properties"][key]
)
if "items" in function_definition:
function_definition["items"] = self._oai_function_parameters_to_gemini_function_parameters(
function_definition["items"]
)
return function_definition
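To illustrate what the helper above does on the non-Vertex path, here is a hypothetical OpenAPI-style parameter schema and the shape it is converted into (the schema itself is an illustration, not part of this change):

```python
oai_parameters = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name", "default": "Paris"},
        "days": {"type": "integer"},
    },
    "required": ["city"],
}
# After conversion: "type" becomes "type_" with an upper-cased value,
# "default" keys are dropped, and all other keys pass through unchanged:
# {
#     "type_": "OBJECT",
#     "properties": {
#         "city": {"type_": "STRING", "description": "City name"},
#         "days": {"type_": "INTEGER"},
#     },
#     "required": ["city"],
# }
```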
def _gemini_content_to_oai_choices(self, response: Union[Content, VertexAIContent]) -> List[Choice]:
"""Convert response from Gemini format to OAI format."""
text = None
tool_calls = []
for part in response.parts:
if part.function_call:
if self.use_vertexai:
arguments = VertexAIPart.to_dict(part)["function_call"]["args"]
else:
arguments = Part.to_dict(part)["function_call"]["args"]
tool_calls.append(
ChatCompletionMessageToolCall(
id=str(random.randint(0, 1000)),
type="function",
function=Function(name=part.function_call.name, arguments=json.dumps(arguments)),
)
)
elif part.text:
text = part.text
message = ChatCompletionMessage(
role="assistant", content=text, function_call=None, tool_calls=tool_calls if len(tool_calls) > 0 else None
)
return [Choice(finish_reason="tool_calls" if tool_calls else "stop", index=0, message=message)]
@staticmethod
def _to_vertexai_safety_settings(safety_settings):
"""Convert safety settings to VertexAI format if needed,
@ -437,6 +629,49 @@ class GeminiClient:
else:
return safety_settings
@staticmethod
def _to_vertexai_tool_config(tool_config, tools):
"""Convert tool config to VertexAI format,
like when specifying them in the OAI_CONFIG_LIST
"""
if (
isinstance(tool_config, dict)
and (len(tool_config) > 0)
and all([isinstance(tool_config[tool_config_entry], dict) for tool_config_entry in tool_config])
):
if (
tool_config["function_calling_config"]["mode"]
not in VertexAIToolConfig.FunctionCallingConfig.Mode.__members__
):
invalid_mode = tool_config["function_calling_config"]
logger.error(f"Function calling mode {invalid_mode} is invalid")
return None
else:
# Currently, there is only function calling config
func_calling_config_params = {}
func_calling_config_params["mode"] = VertexAIToolConfig.FunctionCallingConfig.Mode[
tool_config["function_calling_config"]["mode"]
]
if (
(func_calling_config_params["mode"] == VertexAIToolConfig.FunctionCallingConfig.Mode.ANY)
and (len(tools) > 0)
and all(["function_name" in tool for tool in tools])
):
# The function names are not yet known when parsing the OAI_CONFIG_LIST
func_calling_config_params["allowed_function_names"] = [tool["function_name"] for tool in tools]
vertexai_tool_config = VertexAIToolConfig(
function_calling_config=VertexAIToolConfig.FunctionCallingConfig(**func_calling_config_params)
)
return vertexai_tool_config
elif isinstance(tool_config, VertexAIToolConfig):
return tool_config
elif len(tool_config) == 0 and len(tools) == 0:
logger.debug("VertexAI tool config is empty!")
return None
else:
logger.error("Invalid VertexAI tool config!")
return None
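As a sketch, a tool_config dict accepted by the conversion above could look like the following; the mode string must be one of the VertexAI FunctionCallingConfig modes (e.g. AUTO, ANY, NONE), and the exact value here is only an example:

```python
tool_config = {
    "function_calling_config": {
        "mode": "ANY",  # force the model to call one of the supplied functions
    }
}
```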
def _to_pil(data: str) -> Image.Image:
"""
@ -470,16 +705,3 @@ def get_image_data(image_file: str, use_b64=True) -> bytes:
return base64.b64encode(content).decode("utf-8")
else:
return content
def calculate_gemini_cost(input_tokens: int, output_tokens: int, model_name: str) -> float:
if "1.5" in model_name or "gemini-experimental" in model_name:
# "gemini-1.5-pro-preview-0409"
# Cost is $7 per million input tokens and $21 per million output tokens
return 7.0 * input_tokens / 1e6 + 21.0 * output_tokens / 1e6
if "gemini-pro" not in model_name and "gemini-1.0-pro" not in model_name:
warnings.warn(f"Cost calculation is not implemented for model {model_name}. Using Gemini-1.0-Pro.", UserWarning)
# Cost is $0.5 per million input tokens and $1.5 per million output tokens
return 0.5 * input_tokens / 1e6 + 1.5 * output_tokens / 1e6

579
autogen/oai/ollama.py Normal file
View File

@ -0,0 +1,579 @@
"""Create an OpenAI-compatible client using Ollama's API.
Example:
llm_config={
"config_list": [{
"api_type": "ollama",
"model": "mistral:7b-instruct-v0.3-q6_K"
}
]}
agent = autogen.AssistantAgent("my_agent", llm_config=llm_config)
Install Ollama's python library using: pip install --upgrade ollama
Resources:
- https://github.com/ollama/ollama-python
"""
from __future__ import annotations
import copy
import json
import random
import re
import time
import warnings
from typing import Any, Dict, List, Tuple
import ollama
from fix_busted_json import repair_json
from ollama import Client
from openai.types.chat import ChatCompletion, ChatCompletionMessageToolCall
from openai.types.chat.chat_completion import ChatCompletionMessage, Choice
from openai.types.completion_usage import CompletionUsage
from autogen.oai.client_utils import should_hide_tools, validate_parameter
class OllamaClient:
"""Client for Ollama's API."""
# Defaults for manual tool calling
# Instruction is added to the first system message and provides directions to follow a two step
# process
# 1. (before tools have been called) Return JSON with the functions to call
# 2. (directly after tools have been called) Return Text describing the results of the function calls in text format
# Override using "manual_tool_call_instruction" config parameter
TOOL_CALL_MANUAL_INSTRUCTION = (
"You are to follow a strict two step process that will occur over "
"a number of interactions, so pay attention to what step you are in based on the full "
"conversation. We will be taking turns so only do one step at a time so don't perform step "
"2 until step 1 is complete and I've told you the result. The first step is to choose one "
"or more functions based on the request given and return only JSON with the functions and "
"arguments to use. The second step is to analyse the given output of the function and summarise "
"it returning only TEXT and not Python or JSON. "
"For argument values, be sure numbers aren't strings, they should not have double quotes around them. "
"In terms of your response format, for step 1 return only JSON and NO OTHER text, "
"for step 2 return only text and NO JSON/Python/Markdown. "
'The format for running a function is [{"name": "function_name1", "arguments":{"argument_name": "argument_value"}},{"name": "function_name2", "arguments":{"argument_name": "argument_value"}}] '
'Make sure the keys "name" and "arguments" are as described. '
"If you don't get the format correct, try again. "
"The following functions are available to you:[FUNCTIONS_LIST]"
)
# Appended to the last user message if no tools have been called
# Override using "manual_tool_call_step1" config parameter
TOOL_CALL_MANUAL_STEP1 = " (proceed with step 1)"
# Appended to the user message after tools have been executed. Will create a 'user' message if one doesn't exist.
# Override using "manual_tool_call_step2" config parameter
TOOL_CALL_MANUAL_STEP2 = " (proceed with step 2)"
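A sketch of opting into the manual two-step flow described above (the model name and host below are placeholders): setting native_tool_calls to False in the config entry makes the client inject the instruction text into the prompt instead of using Ollama's native tool calling.

```python
llm_config = {
    "config_list": [
        {
            "api_type": "ollama",
            "model": "llama3.1:8b",                    # placeholder model name
            "client_host": "http://localhost:11434",   # read in create() when present
            "native_tool_calls": False,                 # use the prompt-based two-step flow
        }
    ]
}
```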
def __init__(self, **kwargs):
"""Note that no api_key or environment variable is required for Ollama.
Args:
None
"""
def message_retrieval(self, response) -> List:
"""
Retrieve and return a list of strings or a list of Choice.Message from the response.
NOTE: if a list of Choice.Message is returned, it currently needs to contain the fields of OpenAI's ChatCompletion Message object,
since that is expected for function or tool calling in the rest of the codebase at the moment, unless a custom agent is being used.
"""
return [choice.message for choice in response.choices]
def cost(self, response) -> float:
return response.cost
@staticmethod
def get_usage(response) -> Dict:
"""Return usage summary of the response using RESPONSE_USAGE_KEYS."""
# ... # pragma: no cover
return {
"prompt_tokens": response.usage.prompt_tokens,
"completion_tokens": response.usage.completion_tokens,
"total_tokens": response.usage.total_tokens,
"cost": response.cost,
"model": response.model,
}
def parse_params(self, params: Dict[str, Any]) -> Dict[str, Any]:
"""Loads the parameters for Ollama API from the passed in parameters and returns a validated set. Checks types, ranges, and sets defaults"""
ollama_params = {}
# Check that we have what we need to use Ollama's API
# https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion
# The main parameters are model, prompt, stream, and options
# Options is a dictionary of parameters for the model
# There are other, advanced, parameters such as format, system (to override system message), template, raw, etc. - not used
# We won't enforce the available models
ollama_params["model"] = params.get("model", None)
assert ollama_params[
"model"
], "Please specify the 'model' in your config list entry to nominate the Ollama model to use."
ollama_params["stream"] = validate_parameter(params, "stream", bool, True, False, None, None)
# Build up the options dictionary
# https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
options_dict = {}
if "num_predict" in params:
# Maximum number of tokens to predict, note: -1 is infinite, -2 is fill context, 128 is default
options_dict["num_predict"] = validate_parameter(params, "num_predict", int, False, 128, None, None)
if "repeat_penalty" in params:
options_dict["repeat_penalty"] = validate_parameter(
params, "repeat_penalty", (int, float), False, 1.1, None, None
)
if "seed" in params:
options_dict["seed"] = validate_parameter(params, "seed", int, False, 42, None, None)
if "temperature" in params:
options_dict["temperature"] = validate_parameter(
params, "temperature", (int, float), False, 0.8, None, None
)
if "top_k" in params:
options_dict["top_k"] = validate_parameter(params, "top_k", int, False, 40, None, None)
if "top_p" in params:
options_dict["top_p"] = validate_parameter(params, "top_p", (int, float), False, 0.9, None, None)
if self._native_tool_calls and self._tools_in_conversation and not self._should_hide_tools:
ollama_params["tools"] = params["tools"]
# Ollama doesn't support streaming with tools natively
if ollama_params["stream"] and self._native_tool_calls:
warnings.warn(
"Streaming is not supported when using tools and 'Native' tool calling, streaming will be disabled.",
UserWarning,
)
ollama_params["stream"] = False
if not self._native_tool_calls and self._tools_in_conversation:
# For manual tool calling we have injected the available tools into the prompt
# and we don't want to force JSON mode
ollama_params["format"] = "" # Don't force JSON for manual tool calling mode
if len(options_dict) != 0:
ollama_params["options"] = options_dict
return ollama_params
def create(self, params: Dict) -> ChatCompletion:
messages = params.get("messages", [])
# Are tools involved in this conversation?
self._tools_in_conversation = "tools" in params
# We provide second-level filtering out of tools to avoid LLMs re-calling tools continuously
if self._tools_in_conversation:
hide_tools = validate_parameter(
params, "hide_tools", str, False, "never", None, ["if_all_run", "if_any_run", "never"]
)
self._should_hide_tools = should_hide_tools(messages, params["tools"], hide_tools)
else:
self._should_hide_tools = False
# Are we using native Ollama tool calling, otherwise we're doing manual tool calling
# We allow the user to decide if they want to use Ollama's tool calling
# or for tool calling to be handled manually through text messages
# Default is True = Ollama's tool calling
self._native_tool_calls = validate_parameter(params, "native_tool_calls", bool, False, True, None, None)
if not self._native_tool_calls:
# Load defaults
self._manual_tool_call_instruction = validate_parameter(
params, "manual_tool_call_instruction", str, False, self.TOOL_CALL_MANUAL_INSTRUCTION, None, None
)
self._manual_tool_call_step1 = validate_parameter(
params, "manual_tool_call_step1", str, False, self.TOOL_CALL_MANUAL_STEP1, None, None
)
self._manual_tool_call_step2 = validate_parameter(
params, "manual_tool_call_step2", str, False, self.TOOL_CALL_MANUAL_STEP2, None, None
)
# Convert AutoGen messages to Ollama messages
ollama_messages = self.oai_messages_to_ollama_messages(
messages,
(
params["tools"]
if (not self._native_tool_calls and self._tools_in_conversation) and not self._should_hide_tools
else None
),
)
# Parse parameters to the Ollama API's parameters
ollama_params = self.parse_params(params)
ollama_params["messages"] = ollama_messages
# Token counts will be returned
prompt_tokens = 0
completion_tokens = 0
total_tokens = 0
ans = None
try:
if "client_host" in params:
client = Client(host=params["client_host"])
response = client.chat(**ollama_params)
else:
response = ollama.chat(**ollama_params)
except Exception as e:
raise RuntimeError(f"Ollama exception occurred: {e}")
else:
if ollama_params["stream"]:
# Read in the chunks as they stream, taking in tool_calls which may be across
# multiple chunks if more than one suggested
ans = ""
for chunk in response:
ans = ans + (chunk["message"]["content"] or "")
if "done_reason" in chunk:
prompt_tokens = chunk["prompt_eval_count"] if "prompt_eval_count" in chunk else 0
completion_tokens = chunk["eval_count"] if "eval_count" in chunk else 0
total_tokens = prompt_tokens + completion_tokens
else:
# Non-streaming finished
ans: str = response["message"]["content"]
prompt_tokens = response["prompt_eval_count"] if "prompt_eval_count" in response else 0
completion_tokens = response["eval_count"] if "eval_count" in response else 0
total_tokens = prompt_tokens + completion_tokens
if response is not None:
# Defaults
ollama_finish = "stop"
tool_calls = None
# Id and streaming text into response
if ollama_params["stream"]:
response_content = ans
response_id = chunk["created_at"]
else:
response_content = response["message"]["content"]
response_id = response["created_at"]
# Process tools in the response
if self._tools_in_conversation:
if self._native_tool_calls:
if not ollama_params["stream"]:
response_content = response["message"]["content"]
# Native tool calling
if "tool_calls" in response["message"]:
ollama_finish = "tool_calls"
tool_calls = []
random_id = random.randint(0, 10000)
for tool_call in response["message"]["tool_calls"]:
tool_calls.append(
ChatCompletionMessageToolCall(
id="ollama_func_{}".format(random_id),
function={
"name": tool_call["function"]["name"],
"arguments": json.dumps(tool_call["function"]["arguments"]),
},
type="function",
)
)
random_id += 1
elif not self._native_tool_calls:
# Try to convert the response to a tool call object
response_toolcalls = response_to_tool_call(ans)
# If we can, then we've got tool call(s)
if response_toolcalls is not None:
ollama_finish = "tool_calls"
tool_calls = []
random_id = random.randint(0, 10000)
for json_function in response_toolcalls:
tool_calls.append(
ChatCompletionMessageToolCall(
id="ollama_manual_func_{}".format(random_id),
function={
"name": json_function["name"],
"arguments": (
json.dumps(json_function["arguments"])
if "arguments" in json_function
else "{}"
),
},
type="function",
)
)
random_id += 1
# Blank the message content
response_content = ""
else:
raise RuntimeError("Failed to get response from Ollama.")
# Convert response to AutoGen response
message = ChatCompletionMessage(
role="assistant",
content=response_content,
function_call=None,
tool_calls=tool_calls,
)
choices = [Choice(finish_reason=ollama_finish, index=0, message=message)]
response_oai = ChatCompletion(
id=response_id,
model=ollama_params["model"],
created=int(time.time()),
object="chat.completion",
choices=choices,
usage=CompletionUsage(
prompt_tokens=prompt_tokens,
completion_tokens=completion_tokens,
total_tokens=total_tokens,
),
cost=0, # Local models, FREE!
)
return response_oai
def oai_messages_to_ollama_messages(self, messages: list[Dict[str, Any]], tools: list) -> list[dict[str, Any]]:
"""Convert messages from OAI format to Ollama's format.
We correct for any specific role orders and types, and convert tools to messages (as Ollama can't use tool messages)
"""
ollama_messages = copy.deepcopy(messages)
# Remove the name field
for message in ollama_messages:
if "name" in message:
message.pop("name", None)
# Having a 'system' message on the end does not work well with Ollama, so we change it to 'user'
# 'system' messages on the end are typical of the summarisation message: summary_method="reflection_with_llm"
if len(ollama_messages) > 1 and ollama_messages[-1]["role"] == "system":
ollama_messages[-1]["role"] = "user"
# Process messages for tool calling manually
if tools is not None and not self._native_tool_calls:
# 1. We need to append instructions to the starting system message on function calling
# 2. If we have not yet called tools we append "step 1 instruction" to the latest user message
# 3. If we have already called tools we append "step 2 instruction" to the latest user message
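# (The step instruction texts are self._manual_tool_call_step1 / _step2, set from params earlier.)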
have_tool_calls = False
have_tool_results = False
last_tool_result_index = -1
for i, message in enumerate(ollama_messages):
if "tool_calls" in message:
have_tool_calls = True
if "tool_call_id" in message:
have_tool_results = True
last_tool_result_index = i
tool_result_is_last_msg = have_tool_results and last_tool_result_index == len(ollama_messages) - 1
if ollama_messages[0]["role"] == "system":
manual_instruction = self._manual_tool_call_instruction
# Build a string of the functions available
functions_string = ""
for function in tools:
functions_string += f"""\n{function}\n"""
# Replace single quotes with double quotes - Not sure why this helps the LLM perform
# better, but it seems to. Monitor and remove if not necessary.
functions_string = functions_string.replace("'", '"')
manual_instruction = manual_instruction.replace("[FUNCTIONS_LIST]", functions_string)
# Update the system message with the instructions and functions
ollama_messages[0]["content"] = ollama_messages[0]["content"] + manual_instruction.rstrip()
# If we are still in the function calling or evaluating process, append the steps instruction
if not have_tool_calls or tool_result_is_last_msg:
if ollama_messages[0]["role"] == "system":
# NOTE: we require a system message to exist for the manual steps texts
# Append the manual step instructions
content_to_append = (
self._manual_tool_call_step1 if not have_tool_results else self._manual_tool_call_step2
)
if content_to_append != "":
# Append the relevant tool call instruction to the latest user message
if ollama_messages[-1]["role"] == "user":
ollama_messages[-1]["content"] = ollama_messages[-1]["content"] + content_to_append
else:
ollama_messages.append({"role": "user", "content": content_to_append})
# Convert tool call and tool result messages to normal text messages for Ollama
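# Illustrative example: a tool result message such as
#   {"role": "tool", "tool_call_id": "ollama_func_100", "content": "Sunny, 22C"}
# becomes a plain user message:
#   {"role": "user", "content": "The following function was run: {'tool_call_id': 'ollama_func_100', 'result': 'Sunny, 22C'}"}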
for i, message in enumerate(ollama_messages):
if "tool_calls" in message:
# Recommended tool calls
content = "Run the following function(s):"
for tool_call in message["tool_calls"]:
content = content + "\n" + str(tool_call)
ollama_messages[i] = {"role": "assistant", "content": content}
if "tool_call_id" in message:
# Executed tool results
message["result"] = message["content"]
del message["content"]
del message["role"]
content = "The following function was run: " + str(message)
ollama_messages[i] = {"role": "user", "content": content}
# As we are changing messages, merge the last two user messages if the final one is just the tool call step instructions
if (
len(ollama_messages) >= 2
and not self._native_tool_calls
and ollama_messages[-2]["role"] == "user"
and ollama_messages[-1]["role"] == "user"
and (
ollama_messages[-1]["content"] == self._manual_tool_call_step1
or ollama_messages[-1]["content"] == self._manual_tool_call_step2
)
):
ollama_messages[-2]["content"] = ollama_messages[-2]["content"] + ollama_messages[-1]["content"]
del ollama_messages[-1]
# Ensure the last message is a user / system message, if not, add a user message
if ollama_messages[-1]["role"] != "user" and ollama_messages[-1]["role"] != "system":
ollama_messages.append({"role": "user", "content": "Please continue."})
return ollama_messages
def response_to_tool_call(response_string: str) -> Any:
"""Attempts to convert the response to an object, aimed to align with function format [{},{}]"""
# We try and detect the list[dict] format:
# Pattern 1 is [{},{}]
# Pattern 2 is {} (without the [], so could be a single function call)
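# e.g. an LLM reply containing [{"name": "weather_forecast", "arguments": {"city": "Paris"}}]
# matches pattern 1 and is parsed into a single-item tool call list (illustrative example)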
patterns = [r"\[[\s\S]*?\]", r"\{[\s\S]*\}"]
for i, pattern in enumerate(patterns):
# Search for the pattern in the input string
matches = re.findall(pattern, response_string.strip())
for match in matches:
# It has matched, extract it and load it
json_str = match.strip()
data_object = None
try:
# Attempt to convert it as is
data_object = json.loads(json_str)
except Exception:
try:
# If that fails, attempt to repair it
if i == 0:
# Wrap in a JSON object for repair; the original value is restored after the fix
fixed_json = repair_json("{'temp':" + json_str + "}")
data_object = json.loads(fixed_json)
data_object = data_object["temp"]
else:
fixed_json = repair_json(json_str)
data_object = json.loads(fixed_json)
except json.JSONDecodeError as e:
if e.msg == "Invalid \\escape":
# Handle Mistral/Mixtral trying to escape underscores with \\
try:
json_str = json_str.replace("\\_", "_")
if i == 0:
fixed_json = repair_json("{'temp':" + json_str + "}")
data_object = json.loads(fixed_json)
data_object = data_object["temp"]
else:
fixed_json = repair_json(json_str)
data_object = json.loads(fixed_json)
except Exception:
pass
except Exception:
pass
if data_object is not None:
data_object = _object_to_tool_call(data_object)
if data_object is not None:
return data_object
# There's no tool call in the response
return None
def _object_to_tool_call(data_object: Any) -> List[Dict]:
"""Attempts to convert an object to a valid tool call object List[Dict] and returns it, if it can, otherwise None"""
# If it's a dictionary and not a list then wrap in a list
if isinstance(data_object, dict):
data_object = [data_object]
# Validate that the data is a list of dictionaries
if isinstance(data_object, list) and all(isinstance(item, dict) for item in data_object):
# Perfect format, a list of dictionaries
# Check that each dictionary has at least 'name', optionally 'arguments' and no other keys
is_invalid = False
for item in data_object:
if not is_valid_tool_call_item(item):
is_invalid = True
break
# All passed, name and (optionally) arguments exist for all entries.
if not is_invalid:
return data_object
elif isinstance(data_object, list):
# If it's a list but the items are not dictionaries, check if they are strings that can be converted to dictionaries
data_copy = data_object.copy()
is_invalid = False
for i, item in enumerate(data_copy):
try:
new_item = eval(item)
if isinstance(new_item, dict):
if is_valid_tool_call_item(new_item):
data_object[i] = new_item
else:
is_invalid = True
break
else:
is_invalid = True
break
except Exception:
is_invalid = True
break
if not is_invalid:
return data_object
return None
def is_valid_tool_call_item(call_item: dict) -> bool:
"""Check that a dictionary item has at least 'name', optionally 'arguments' and no other keys to match a tool call JSON"""
if "name" not in call_item or not isinstance(call_item["name"], str):
return False
if set(call_item.keys()) - {"name", "arguments"}:
return False
return True
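# Illustrative examples: {"name": "weather_forecast", "arguments": {"city": "Paris"}} and {"name": "ping"}
# pass this check, while {"arguments": {"city": "Paris"}} and {"name": "ping", "extra": 1} do not.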

View File

@ -21,6 +21,7 @@ NON_CACHE_KEY = [
"azure_ad_token",
"azure_ad_token_provider",
"credentials",
"tool_config",
]
DEFAULT_AZURE_API_VERSION = "2024-02-01"
OAI_PRICE1K = {

View File

@ -15,10 +15,12 @@ if TYPE_CHECKING:
from autogen import Agent, ConversableAgent, OpenAIWrapper
from autogen.oai.anthropic import AnthropicClient
from autogen.oai.bedrock import BedrockClient
from autogen.oai.cerebras import CerebrasClient
from autogen.oai.cohere import CohereClient
from autogen.oai.gemini import GeminiClient
from autogen.oai.groq import GroqClient
from autogen.oai.mistral import MistralAIClient
from autogen.oai.ollama import OllamaClient
from autogen.oai.together import TogetherClient
logger = logging.getLogger(__name__)
@ -116,12 +118,14 @@ def log_new_client(
client: Union[
AzureOpenAI,
OpenAI,
CerebrasClient,
GeminiClient,
AnthropicClient,
MistralAIClient,
TogetherClient,
GroqClient,
CohereClient,
OllamaClient,
BedrockClient,
],
wrapper: OpenAIWrapper,

View File

@ -1 +1 @@
__version__ = "0.2.35"
__version__ = "0.2.36"

View File

@ -2,7 +2,6 @@
`AutoGen for .NET` is the official .NET SDK for [AutoGen](https://github.com/microsoft/autogen). It enables you to create LLM agents and construct multi-agent workflows with ease. It also provides integration with popular platforms like OpenAI, Semantic Kernel, and LM Studio.
### Getting started
- Find documents and examples on our [document site](https://microsoft.github.io/autogen-for-net/)
- Join our [Discord channel](https://discord.gg/pAbnFJrkgZ) to get help and discuss with the community
- Find documents and examples on our [document site](https://microsoft.github.io/autogen-for-net/)
- Report a bug or request a feature by creating a new issue in our [github repo](https://github.com/microsoft/autogen)
- Consume the nightly build package from one of the [nightly build feeds](https://microsoft.github.io/autogen-for-net/articles/Installation.html#nighly-build)

View File

@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@ -10,179 +10,9 @@
"id": "tLIs1YRdr8jM",
"outputId": "909c1c70-1a22-4e9d-b7f4-a40e2d737fb0"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Defaulting to user installation because normal site-packages is not writeable\n",
"Requirement already satisfied: pyautogen>=0.2.3 in /home/vscode/.local/lib/python3.10/site-packages (0.2.3)\n",
"Requirement already satisfied: openai>=1.3 in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (1.6.1)\n",
"Requirement already satisfied: diskcache in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (5.6.3)\n",
"Requirement already satisfied: termcolor in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (2.4.0)\n",
"Requirement already satisfied: flaml in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (2.1.1)\n",
"Requirement already satisfied: python-dotenv in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (1.0.0)\n",
"Requirement already satisfied: tiktoken in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (0.5.2)\n",
"Requirement already satisfied: pydantic<3,>=1.10 in /home/vscode/.local/lib/python3.10/site-packages (from pyautogen>=0.2.3) (1.10.9)\n",
"Requirement already satisfied: anyio<5,>=3.5.0 in /home/vscode/.local/lib/python3.10/site-packages (from openai>=1.3->pyautogen>=0.2.3) (4.2.0)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /home/vscode/.local/lib/python3.10/site-packages (from openai>=1.3->pyautogen>=0.2.3) (1.9.0)\n",
"Requirement already satisfied: httpx<1,>=0.23.0 in /home/vscode/.local/lib/python3.10/site-packages (from openai>=1.3->pyautogen>=0.2.3) (0.26.0)\n",
"Requirement already satisfied: sniffio in /home/vscode/.local/lib/python3.10/site-packages (from openai>=1.3->pyautogen>=0.2.3) (1.3.0)\n",
"Requirement already satisfied: tqdm>4 in /home/vscode/.local/lib/python3.10/site-packages (from openai>=1.3->pyautogen>=0.2.3) (4.66.1)\n",
"Requirement already satisfied: typing-extensions<5,>=4.7 in /home/vscode/.local/lib/python3.10/site-packages (from openai>=1.3->pyautogen>=0.2.3) (4.9.0)\n",
"Requirement already satisfied: NumPy>=1.17.0rc1 in /home/vscode/.local/lib/python3.10/site-packages (from flaml->pyautogen>=0.2.3) (1.26.3)\n",
"Requirement already satisfied: regex>=2022.1.18 in /home/vscode/.local/lib/python3.10/site-packages (from tiktoken->pyautogen>=0.2.3) (2023.12.25)\n",
"Requirement already satisfied: requests>=2.26.0 in /usr/local/lib/python3.10/site-packages (from tiktoken->pyautogen>=0.2.3) (2.31.0)\n",
"Requirement already satisfied: idna>=2.8 in /usr/local/lib/python3.10/site-packages (from anyio<5,>=3.5.0->openai>=1.3->pyautogen>=0.2.3) (3.6)\n",
"Requirement already satisfied: exceptiongroup>=1.0.2 in /home/vscode/.local/lib/python3.10/site-packages (from anyio<5,>=3.5.0->openai>=1.3->pyautogen>=0.2.3) (1.2.0)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/site-packages (from httpx<1,>=0.23.0->openai>=1.3->pyautogen>=0.2.3) (2023.11.17)\n",
"Requirement already satisfied: httpcore==1.* in /home/vscode/.local/lib/python3.10/site-packages (from httpx<1,>=0.23.0->openai>=1.3->pyautogen>=0.2.3) (1.0.2)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /home/vscode/.local/lib/python3.10/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->openai>=1.3->pyautogen>=0.2.3) (0.14.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/site-packages (from requests>=2.26.0->tiktoken->pyautogen>=0.2.3) (3.3.2)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /home/vscode/.local/lib/python3.10/site-packages (from requests>=2.26.0->tiktoken->pyautogen>=0.2.3) (1.26.18)\n",
"Defaulting to user installation because normal site-packages is not writeable\n",
"Requirement already satisfied: chromadb in /home/vscode/.local/lib/python3.10/site-packages (0.4.22)\n",
"Requirement already satisfied: build>=1.0.3 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.0.3)\n",
"Requirement already satisfied: requests>=2.28 in /usr/local/lib/python3.10/site-packages (from chromadb) (2.31.0)\n",
"Requirement already satisfied: pydantic>=1.9 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.10.9)\n",
"Requirement already satisfied: chroma-hnswlib==0.7.3 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (0.7.3)\n",
"Requirement already satisfied: fastapi>=0.95.2 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (0.108.0)\n",
"Requirement already satisfied: uvicorn>=0.18.3 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.25.0)\n",
"Requirement already satisfied: numpy>=1.22.5 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.26.3)\n",
"Requirement already satisfied: posthog>=2.4.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (3.1.0)\n",
"Requirement already satisfied: typing-extensions>=4.5.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (4.9.0)\n",
"Requirement already satisfied: pulsar-client>=3.1.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (3.4.0)\n",
"Requirement already satisfied: onnxruntime>=1.14.1 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.16.3)\n",
"Requirement already satisfied: opentelemetry-api>=1.2.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.22.0)\n",
"Requirement already satisfied: opentelemetry-exporter-otlp-proto-grpc>=1.2.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.22.0)\n",
"Requirement already satisfied: opentelemetry-instrumentation-fastapi>=0.41b0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (0.43b0)\n",
"Requirement already satisfied: opentelemetry-sdk>=1.2.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.22.0)\n",
"Requirement already satisfied: tokenizers>=0.13.2 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (0.15.0)\n",
"Requirement already satisfied: pypika>=0.48.9 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (0.48.9)\n",
"Requirement already satisfied: tqdm>=4.65.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (4.66.1)\n",
"Requirement already satisfied: overrides>=7.3.1 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (7.4.0)\n",
"Requirement already satisfied: importlib-resources in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (6.1.1)\n",
"Requirement already satisfied: grpcio>=1.58.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (1.60.0)\n",
"Requirement already satisfied: bcrypt>=4.0.1 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (4.1.2)\n",
"Requirement already satisfied: typer>=0.9.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (0.9.0)\n",
"Requirement already satisfied: kubernetes>=28.1.0 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (28.1.0)\n",
"Requirement already satisfied: tenacity>=8.2.3 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (8.2.3)\n",
"Requirement already satisfied: PyYAML>=6.0.0 in /usr/local/lib/python3.10/site-packages (from chromadb) (6.0.1)\n",
"Requirement already satisfied: mmh3>=4.0.1 in /home/vscode/.local/lib/python3.10/site-packages (from chromadb) (4.0.1)\n",
"Requirement already satisfied: packaging>=19.0 in /usr/local/lib/python3.10/site-packages (from build>=1.0.3->chromadb) (23.2)\n",
"Requirement already satisfied: pyproject_hooks in /home/vscode/.local/lib/python3.10/site-packages (from build>=1.0.3->chromadb) (1.0.0)\n",
"Requirement already satisfied: tomli>=1.1.0 in /usr/local/lib/python3.10/site-packages (from build>=1.0.3->chromadb) (2.0.1)\n",
"Requirement already satisfied: starlette<0.33.0,>=0.29.0 in /home/vscode/.local/lib/python3.10/site-packages (from fastapi>=0.95.2->chromadb) (0.32.0.post1)\n",
"Requirement already satisfied: certifi>=14.05.14 in /usr/local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (2023.11.17)\n",
"Requirement already satisfied: six>=1.9.0 in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (1.16.0)\n",
"Requirement already satisfied: python-dateutil>=2.5.3 in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (2.8.2)\n",
"Requirement already satisfied: google-auth>=1.0.1 in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (2.26.1)\n",
"Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (1.7.0)\n",
"Requirement already satisfied: requests-oauthlib in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (1.3.1)\n",
"Requirement already satisfied: oauthlib>=3.2.2 in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (3.2.2)\n",
"Requirement already satisfied: urllib3<2.0,>=1.24.2 in /home/vscode/.local/lib/python3.10/site-packages (from kubernetes>=28.1.0->chromadb) (1.26.18)\n",
"Requirement already satisfied: coloredlogs in /home/vscode/.local/lib/python3.10/site-packages (from onnxruntime>=1.14.1->chromadb) (15.0.1)\n",
"Requirement already satisfied: flatbuffers in /home/vscode/.local/lib/python3.10/site-packages (from onnxruntime>=1.14.1->chromadb) (23.5.26)\n",
"Requirement already satisfied: protobuf in /home/vscode/.local/lib/python3.10/site-packages (from onnxruntime>=1.14.1->chromadb) (4.25.1)\n",
"Requirement already satisfied: sympy in /home/vscode/.local/lib/python3.10/site-packages (from onnxruntime>=1.14.1->chromadb) (1.12)\n",
"Requirement already satisfied: deprecated>=1.2.6 in /usr/local/lib/python3.10/site-packages (from opentelemetry-api>=1.2.0->chromadb) (1.2.14)\n",
"Requirement already satisfied: importlib-metadata<7.0,>=6.0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-api>=1.2.0->chromadb) (6.11.0)\n",
"Requirement already satisfied: backoff<3.0.0,>=1.10.0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (2.2.1)\n",
"Requirement already satisfied: googleapis-common-protos~=1.52 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (1.62.0)\n",
"Requirement already satisfied: opentelemetry-exporter-otlp-proto-common==1.22.0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (1.22.0)\n",
"Requirement already satisfied: opentelemetry-proto==1.22.0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb) (1.22.0)\n",
"Requirement already satisfied: opentelemetry-instrumentation-asgi==0.43b0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.43b0)\n",
"Requirement already satisfied: opentelemetry-instrumentation==0.43b0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.43b0)\n",
"Requirement already satisfied: opentelemetry-semantic-conventions==0.43b0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.43b0)\n",
"Requirement already satisfied: opentelemetry-util-http==0.43b0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (0.43b0)\n",
"Requirement already satisfied: setuptools>=16.0 in /usr/local/lib/python3.10/site-packages (from opentelemetry-instrumentation==0.43b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (69.0.2)\n",
"Requirement already satisfied: wrapt<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/site-packages (from opentelemetry-instrumentation==0.43b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (1.16.0)\n",
"Requirement already satisfied: asgiref~=3.0 in /home/vscode/.local/lib/python3.10/site-packages (from opentelemetry-instrumentation-asgi==0.43b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb) (3.7.2)\n",
"Requirement already satisfied: monotonic>=1.5 in /home/vscode/.local/lib/python3.10/site-packages (from posthog>=2.4.0->chromadb) (1.6)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/site-packages (from requests>=2.28->chromadb) (3.3.2)\n",
"Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests>=2.28->chromadb) (3.6)\n",
"Requirement already satisfied: huggingface_hub<1.0,>=0.16.4 in /home/vscode/.local/lib/python3.10/site-packages (from tokenizers>=0.13.2->chromadb) (0.20.2)\n",
"Requirement already satisfied: click<9.0.0,>=7.1.1 in /usr/local/lib/python3.10/site-packages (from typer>=0.9.0->chromadb) (8.1.7)\n",
"Requirement already satisfied: h11>=0.8 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn>=0.18.3->uvicorn[standard]>=0.18.3->chromadb) (0.14.0)\n",
"Requirement already satisfied: httptools>=0.5.0 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.6.1)\n",
"Requirement already satisfied: python-dotenv>=0.13 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (1.0.0)\n",
"Requirement already satisfied: uvloop!=0.15.0,!=0.15.1,>=0.14.0 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.19.0)\n",
"Requirement already satisfied: watchfiles>=0.13 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.21.0)\n",
"Requirement already satisfied: websockets>=10.4 in /home/vscode/.local/lib/python3.10/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (12.0)\n",
"Requirement already satisfied: cachetools<6.0,>=2.0.0 in /home/vscode/.local/lib/python3.10/site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (5.3.2)\n",
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /home/vscode/.local/lib/python3.10/site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (0.3.0)\n",
"Requirement already satisfied: rsa<5,>=3.1.4 in /home/vscode/.local/lib/python3.10/site-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (4.9)\n",
"Requirement already satisfied: filelock in /home/vscode/.local/lib/python3.10/site-packages (from huggingface_hub<1.0,>=0.16.4->tokenizers>=0.13.2->chromadb) (3.13.1)\n",
"Requirement already satisfied: fsspec>=2023.5.0 in /home/vscode/.local/lib/python3.10/site-packages (from huggingface_hub<1.0,>=0.16.4->tokenizers>=0.13.2->chromadb) (2023.12.2)\n",
"Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.10/site-packages (from importlib-metadata<7.0,>=6.0->opentelemetry-api>=1.2.0->chromadb) (3.17.0)\n",
"Requirement already satisfied: anyio<5,>=3.4.0 in /home/vscode/.local/lib/python3.10/site-packages (from starlette<0.33.0,>=0.29.0->fastapi>=0.95.2->chromadb) (4.2.0)\n",
"Requirement already satisfied: humanfriendly>=9.1 in /home/vscode/.local/lib/python3.10/site-packages (from coloredlogs->onnxruntime>=1.14.1->chromadb) (10.0)\n",
"Requirement already satisfied: mpmath>=0.19 in /home/vscode/.local/lib/python3.10/site-packages (from sympy->onnxruntime>=1.14.1->chromadb) (1.3.0)\n",
"Requirement already satisfied: sniffio>=1.1 in /home/vscode/.local/lib/python3.10/site-packages (from anyio<5,>=3.4.0->starlette<0.33.0,>=0.29.0->fastapi>=0.95.2->chromadb) (1.3.0)\n",
"Requirement already satisfied: exceptiongroup>=1.0.2 in /home/vscode/.local/lib/python3.10/site-packages (from anyio<5,>=3.4.0->starlette<0.33.0,>=0.29.0->fastapi>=0.95.2->chromadb) (1.2.0)\n",
"Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /home/vscode/.local/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth>=1.0.1->kubernetes>=28.1.0->chromadb) (0.5.1)\n",
"Defaulting to user installation because normal site-packages is not writeable\n",
"Requirement already satisfied: sentence_transformers in /home/vscode/.local/lib/python3.10/site-packages (2.2.2)\n",
"Requirement already satisfied: transformers<5.0.0,>=4.6.0 in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (4.36.2)\n",
"Requirement already satisfied: tqdm in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (4.66.1)\n",
"Requirement already satisfied: torch>=1.6.0 in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (2.1.2)\n",
"Requirement already satisfied: torchvision in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (0.16.2)\n",
"Requirement already satisfied: numpy in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (1.26.3)\n",
"Requirement already satisfied: scikit-learn in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (1.3.2)\n",
"Requirement already satisfied: scipy in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (1.11.4)\n",
"Requirement already satisfied: nltk in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (3.8.1)\n",
"Requirement already satisfied: sentencepiece in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (0.1.99)\n",
"Requirement already satisfied: huggingface-hub>=0.4.0 in /home/vscode/.local/lib/python3.10/site-packages (from sentence_transformers) (0.20.2)\n",
"Requirement already satisfied: filelock in /home/vscode/.local/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence_transformers) (3.13.1)\n",
"Requirement already satisfied: fsspec>=2023.5.0 in /home/vscode/.local/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence_transformers) (2023.12.2)\n",
"Requirement already satisfied: requests in /usr/local/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence_transformers) (2.31.0)\n",
"Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence_transformers) (6.0.1)\n",
"Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/vscode/.local/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence_transformers) (4.9.0)\n",
"Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.10/site-packages (from huggingface-hub>=0.4.0->sentence_transformers) (23.2)\n",
"Requirement already satisfied: sympy in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (1.12)\n",
"Requirement already satisfied: networkx in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (3.2.1)\n",
"Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (3.1.2)\n",
"Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (12.1.105)\n",
"Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (12.1.105)\n",
"Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (12.1.105)\n",
"Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (8.9.2.26)\n",
"Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (12.1.3.1)\n",
"Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (11.0.2.54)\n",
"Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (10.3.2.106)\n",
"Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (11.4.5.107)\n",
"Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (12.1.0.106)\n",
"Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (2.18.1)\n",
"Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (12.1.105)\n",
"Requirement already satisfied: triton==2.1.0 in /home/vscode/.local/lib/python3.10/site-packages (from torch>=1.6.0->sentence_transformers) (2.1.0)\n",
"Requirement already satisfied: nvidia-nvjitlink-cu12 in /home/vscode/.local/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch>=1.6.0->sentence_transformers) (12.3.101)\n",
"Requirement already satisfied: regex!=2019.12.17 in /home/vscode/.local/lib/python3.10/site-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (2023.12.25)\n",
"Requirement already satisfied: tokenizers<0.19,>=0.14 in /home/vscode/.local/lib/python3.10/site-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (0.15.0)\n",
"Requirement already satisfied: safetensors>=0.3.1 in /home/vscode/.local/lib/python3.10/site-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (0.4.1)\n",
"Requirement already satisfied: click in /usr/local/lib/python3.10/site-packages (from nltk->sentence_transformers) (8.1.7)\n",
"Requirement already satisfied: joblib in /home/vscode/.local/lib/python3.10/site-packages (from nltk->sentence_transformers) (1.3.2)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /home/vscode/.local/lib/python3.10/site-packages (from scikit-learn->sentence_transformers) (3.2.0)\n",
"Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /home/vscode/.local/lib/python3.10/site-packages (from torchvision->sentence_transformers) (10.2.0)\n",
"Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/site-packages (from jinja2->torch>=1.6.0->sentence_transformers) (2.1.3)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence_transformers) (3.3.2)\n",
"Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence_transformers) (3.6)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /home/vscode/.local/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence_transformers) (1.26.18)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests->huggingface-hub>=0.4.0->sentence_transformers) (2023.11.17)\n",
"Requirement already satisfied: mpmath>=0.19 in /home/vscode/.local/lib/python3.10/site-packages (from sympy->torch>=1.6.0->sentence_transformers) (1.3.0)\n",
"Defaulting to user installation because normal site-packages is not writeable\n",
"Requirement already satisfied: tiktoken in /home/vscode/.local/lib/python3.10/site-packages (0.5.2)\n",
"Requirement already satisfied: regex>=2022.1.18 in /home/vscode/.local/lib/python3.10/site-packages (from tiktoken) (2023.12.25)\n",
"Requirement already satisfied: requests>=2.26.0 in /usr/local/lib/python3.10/site-packages (from tiktoken) (2.31.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/site-packages (from requests>=2.26.0->tiktoken) (3.3.2)\n",
"Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests>=2.26.0->tiktoken) (3.6)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /home/vscode/.local/lib/python3.10/site-packages (from requests>=2.26.0->tiktoken) (1.26.18)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests>=2.26.0->tiktoken) (2023.11.17)\n",
"Defaulting to user installation because normal site-packages is not writeable\n",
"Requirement already satisfied: pypdf in /home/vscode/.local/lib/python3.10/site-packages (3.17.4)\n"
]
}
],
"outputs": [],
"source": [
"!pip install \"pyautogen>=0.2.3\"\n",
"!pip install \"autogen-agentchat~=0.2\"\n",
"!pip install chromadb\n",
"!pip install sentence_transformers\n",
"!pip install tiktoken\n",

View File

@ -29,7 +29,7 @@
"JSON mode is a feature of OpenAI API, however strong models (such as Claude 3 Opus), can generate appropriate json as well.\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```"
]
},
@ -40,7 +40,7 @@
"outputs": [],
"source": [
"%%capture --no-stderr\n",
"# %pip install \"pyautogen>=0.2.3\"\n",
"# %pip install \"autogen-agentchat~=0.2.3\"\n",
"\n",
"# In Your OAI_CONFIG_LIST file, you must have two configs,\n",
"# one with: \"response_format\": { \"type\": \"text\" }\n",

View File

@ -0,0 +1,532 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Building an Agent with Long-term Memory using Autogen and Zep\n",
"\n",
"This notebook walks through how to build an Autogen Agent with long-term memory. Zep builds a knowledge graph from user interactions with the agent, enabling the agent to recall relevant facts from previous conversations or user interactions.\n",
"\n",
"In this notebook we will:\n",
"- Create an Autogen Agent class that extends `ConversableAgent` by adding long-term memory\n",
"- Create a Mental Health Assistant Agent, CareBot, that acts as a counselor and coach.\n",
"- Create a user Agent, Cathy, who stands in for our expected user.\n",
"- Demonstrate preloading chat history into Zep.\n",
"- Demonstrate the agents in conversation, with CareBot recalling facts from previous conversations with Cathy.\n",
"- Inspect Facts within Zep, and demonstrate how to use Zep's Fact Ratings to improve the quality of returned facts.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Requirements\n",
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install autogen~=0.3 zep-cloud python-dotenv\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
":::\n",
"````"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"flaml.automl is not available. Please install flaml[automl] to enable AutoML functionalities.\n"
]
}
],
"source": [
"import os\n",
"import uuid\n",
"from typing import Dict, Union\n",
"\n",
"from dotenv import load_dotenv\n",
"\n",
"from autogen import Agent, ConversableAgent\n",
"\n",
"load_dotenv()\n",
"\n",
"config_list = [\n",
" {\n",
" \"model\": \"gpt-4o-mini\",\n",
" \"api_key\": os.environ.get(\"OPENAI_API_KEY\"),\n",
" \"max_tokens\": 1024,\n",
" }\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize the Zep Client\n",
"\n",
"You can sign up for a Zep account here: https://www.getzep.com/"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from zep_cloud import FactRatingExamples, FactRatingInstruction, Message\n",
"from zep_cloud.client import AsyncZep\n",
"\n",
"MIN_FACT_RATING = 0.3\n",
"\n",
"# Configure Zep\n",
"zep = AsyncZep(api_key=os.environ.get(\"ZEP_API_KEY\"))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def convert_to_zep_messages(chat_history: list[dict[str, str | None]]) -> list[Message]:\n",
" \"\"\"\n",
" Convert chat history to Zep messages.\n",
"\n",
" Args:\n",
" chat_history (list): List of dictionaries containing chat messages.\n",
"\n",
" Returns:\n",
" list: List of Zep Message objects.\n",
" \"\"\"\n",
" return [\n",
" Message(\n",
" role_type=msg[\"role\"],\n",
" role=msg.get(\"name\", None),\n",
" content=msg[\"content\"],\n",
" )\n",
" for msg in chat_history\n",
" ]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## ZepConversableAgent\n",
"\n",
"The `ZepConversableAgent` is a custom implementation of the `ConversableAgent` that integrates with Zep for long-term memory management. This class extends the functionality of the base `ConversableAgent` by adding Zep-specific features for persisting and retrieving facts from long-term memory."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"class ZepConversableAgent(ConversableAgent):\n",
" \"\"\"\n",
" A custom ConversableAgent that integrates with Zep for long-term memory.\n",
" \"\"\"\n",
"\n",
" def __init__(\n",
" self,\n",
" name: str,\n",
" system_message: str,\n",
" llm_config: dict,\n",
" function_map: dict,\n",
" human_input_mode: str,\n",
" zep_session_id: str,\n",
" ):\n",
" super().__init__(\n",
" name=name,\n",
" system_message=system_message,\n",
" llm_config=llm_config,\n",
" function_map=function_map,\n",
" human_input_mode=human_input_mode,\n",
" )\n",
" self.zep_session_id = zep_session_id\n",
" # store the original system message as we will update it with relevant facts from Zep\n",
" self.original_system_message = system_message\n",
" self.register_hook(\"a_process_last_received_message\", self.persist_user_messages)\n",
" self.register_hook(\"a_process_message_before_send\", self.persist_assistant_messages)\n",
"\n",
" async def persist_assistant_messages(\n",
" self, sender: Agent, message: Union[Dict, str], recipient: Agent, silent: bool\n",
" ):\n",
" \"\"\"Agent sends a message to the user. Add the message to Zep.\"\"\"\n",
"\n",
" # Assume message is a string\n",
" zep_messages = convert_to_zep_messages([{\"role\": \"assistant\", \"name\": self.name, \"content\": message}])\n",
" await zep.memory.add(session_id=self.zep_session_id, messages=zep_messages)\n",
"\n",
" return message\n",
"\n",
" async def persist_user_messages(self, messages: list[dict[str, str]] | str):\n",
" \"\"\"\n",
" User sends a message to the agent. Add the message to Zep and\n",
" update the system message with relevant facts from Zep.\n",
" \"\"\"\n",
" # Assume messages is a string\n",
" zep_messages = convert_to_zep_messages([{\"role\": \"user\", \"content\": messages}])\n",
" await zep.memory.add(session_id=self.zep_session_id, messages=zep_messages)\n",
"\n",
" memory = await zep.memory.get(self.zep_session_id, min_rating=MIN_FACT_RATING)\n",
"\n",
" # Update the system message with the relevant facts retrieved from Zep\n",
" self.update_system_message(\n",
" self.original_system_message\n",
" + f\"\\n\\nRelevant facts about the user and their prior conversation:\\n{memory.relevant_facts}\"\n",
" )\n",
"\n",
" return messages"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Zep User and Session Management\n",
"\n",
"### Zep User\n",
"A Zep User represents an individual interacting with your application. Each User can have multiple Sessions associated with them, allowing you to track and manage interactions over time. The unique identifier for each user is their `UserID`, which can be any string value (e.g., username, email address, or UUID).\n",
"\n",
"### Zep Session\n",
"A Session represents a conversation and can be associated with Users in a one-to-many relationship. Chat messages are added to Sessions, with each session having many messages.\n",
"\n",
"### Fact Rating\n",
" \n",
"Fact Rating is a feature in Zep that allows you to rate the importance or relevance of facts extracted from conversations. This helps in prioritizing and filtering information when retrieving memory artifacts. Here, we rate facts based on poignancy. We provide a definition of poignancy and several examples of highly poignant and low-poignancy facts. When retrieving memory, you can use the `min_rating` parameter to filter facts based on their importance.\n",
" \n",
"Fact Rating helps ensure the most relevant information, especially in long or complex conversations, is used to ground the agent.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Session(classifications=None, created_at='2024-10-07T21:12:13.952672Z', deleted_at=None, ended_at=None, fact_rating_instruction=FactRatingInstruction(examples=FactRatingExamples(high=\"The user received news of a family member's serious illness.\", low='The user bought a new brand of toothpaste.', medium='The user completed a challenging marathon.'), instruction='Rate the facts by poignancy. Highly poignant \\nfacts have a significant emotional impact or relevance to the user. \\nLow poignant facts are minimally relevant or of little emotional \\nsignificance.'), fact_version_uuid=None, facts=None, id=774, metadata=None, project_uuid='00000000-0000-0000-0000-000000000000', session_id='f3854ad0-5bd4-4814-a814-ec0880817953', updated_at='2024-10-07T21:12:13.952672Z', user_id='Cathy1023', uuid_='31ab3314-5ac8-4361-ad11-848fb7befedf')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bot_name = \"CareBot\"\n",
"user_name = \"Cathy\"\n",
"\n",
"user_id = user_name + str(uuid.uuid4())[:4]\n",
"session_id = str(uuid.uuid4())\n",
"\n",
"await zep.user.add(user_id=user_id)\n",
"\n",
"fact_rating_instruction = \"\"\"Rate the facts by poignancy. Highly poignant\n",
" facts have a significant emotional impact or relevance to the user.\n",
" Low poignant facts are minimally relevant or of little emotional significance.\n",
"\"\"\"\n",
"\n",
"fact_rating_examples = FactRatingExamples(\n",
" high=\"The user received news of a family member's serious illness.\",\n",
" medium=\"The user completed a challenging marathon.\",\n",
" low=\"The user bought a new brand of toothpaste.\",\n",
")\n",
"\n",
"await zep.memory.add_session(\n",
" user_id=user_id,\n",
" session_id=session_id,\n",
" fact_rating_instruction=FactRatingInstruction(\n",
" instruction=fact_rating_instruction,\n",
" examples=fact_rating_examples,\n",
" ),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preload a prior conversation into Zep\n",
"\n",
"We'll load a prior conversation into long-term memory. We'll use facts derived from this conversation when Cathy restarts the conversation with CareBot, ensuring CareBot has context."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"SuccessResponse(message='OK')"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_history = [\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"name\": \"carebot\",\n",
" \"content\": \"Hi Cathy, how are you doing today?\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"name\": \"Cathy\",\n",
" \"content\": \"To be honest, I've been feeling a bit down and demotivated lately. It's been tough.\",\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"name\": \"CareBot\",\n",
" \"content\": \"I'm sorry to hear that you're feeling down and demotivated, Cathy. It's understandable given the challenges you're facing. Can you tell me more about what's been going on?\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"name\": \"Cathy\",\n",
" \"content\": \"Well, I'm really struggling to process the passing of my mother.\",\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"name\": \"CareBot\",\n",
" \"content\": \"I'm deeply sorry for your loss, Cathy. Losing a parent is incredibly difficult. It's normal to struggle with grief, and there's no 'right' way to process it. Would you like to talk about your mother or how you're coping?\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"name\": \"Cathy\",\n",
" \"content\": \"Yes, I'd like to talk about my mother. She was a kind and loving person.\",\n",
" },\n",
"]\n",
"\n",
"# Convert chat history to Zep messages\n",
"zep_messages = convert_to_zep_messages(chat_history)\n",
"\n",
"await zep.memory.add(session_id=session_id, messages=zep_messages)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Review all facts in Zep\n",
"\n",
"We query all session facts for this user session. Only facts that meet the `MIN_FACT_RATING` threshold are returned."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"created_at='2024-10-07T21:12:15.96584Z' fact='Cathy describes her mother as a kind and loving person.' rating=0.5 uuid_='6a086a73-d4b8-4c1b-9b2f-08d5d326d813'\n",
"created_at='2024-10-07T21:12:15.96584Z' fact='Cathy has been feeling down and demotivated lately.' rating=0.5 uuid_='e19d959c-2a01-4cc7-9d49-108719f1a749'\n",
"created_at='2024-10-07T21:12:15.96584Z' fact='Cathy is struggling to process the passing of her mother.' rating=0.75 uuid_='d6c12a5d-d2a0-486e-b25d-3d4bdc5ff466'\n"
]
}
],
"source": [
"response = await zep.memory.get_session_facts(session_id=session_id, min_rating=MIN_FACT_RATING)\n",
"\n",
"for r in response.facts:\n",
" print(r)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Autogen agent, CareBot, an instance of `ZepConversableAgent`\n",
"\n",
"We pass in the current `session_id` into the CareBot agent which allows it to retrieve relevant facts related to the conversation with Cathy."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"carebot_system_message = \"\"\"\n",
"You are a compassionate mental health bot and caregiver. Review information about the user and their prior conversation below and respond accordingly.\n",
"Keep responses empathetic and supportive. And remember, always prioritize the user's well-being and mental health. Keep your responses very concise and to the point.\n",
"\"\"\"\n",
"\n",
"agent = ZepConversableAgent(\n",
" bot_name,\n",
" system_message=carebot_system_message,\n",
" llm_config={\"config_list\": config_list},\n",
" function_map=None, # No registered functions, by default it is None.\n",
" human_input_mode=\"NEVER\", # Never ask for human input.\n",
" zep_session_id=session_id,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Autogen agent, Cathy\n",
"\n",
"Cathy is a stand-in for a human. When building a production application, you'd replace Cathy with a human-in-the-loop pattern.\n",
"\n",
"**Note** that we're instructing Cathy to start the conversation with CareBot by asking about her previous session. This is an opportunity for us to test whether fact retrieval from Zep's long-term memory is working. "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"cathy = ConversableAgent(\n",
" user_name,\n",
" system_message=\"You are returning to your conversation with CareBot, a mental health bot. Ask the bot about your previous session.\",\n",
" llm_config={\"config_list\": config_list},\n",
" human_input_mode=\"NEVER\", # Never ask for human input.\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Start the conversation\n",
"\n",
"We use Autogen's `a_initiate_chat` method to get the two agents conversing. CareBot is the primary agent.\n",
"\n",
"**NOTE** how CareBot is able to recall the past conversation about Cathy's mother in detail, having had relevant facts from Zep added to its system prompt."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = await agent.a_initiate_chat(\n",
" cathy,\n",
" message=\"Hi Cathy, nice to see you again. How are you doing today?\",\n",
" max_turns=3,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Review current facts in Zep\n",
"\n",
"Let's see how the facts have evolved as the conversation has progressed."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"created_at='2024-10-07T20:04:28.397184Z' fact=\"Cathy wants to reflect on a previous conversation about her mother and explore the topic of her mother's passing further.\" rating=0.75 uuid_='56488eeb-d8ac-4b2f-8acc-75f71b56ad76'\n",
"created_at='2024-10-07T20:04:28.397184Z' fact='Cathy is struggling to process the passing of her mother and has been feeling down and demotivated lately.' rating=0.75 uuid_='0fea3f05-ed1a-4e39-a092-c91f8af9e501'\n",
"created_at='2024-10-07T20:04:28.397184Z' fact='Cathy describes her mother as a kind and loving person.' rating=0.5 uuid_='131de203-2984-4cba-9aef-e500611f06d9'\n"
]
}
],
"source": [
"response = await zep.memory.get_session_facts(session_id, min_rating=MIN_FACT_RATING)\n",
"\n",
"for r in response.facts:\n",
" print(r)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search over Facts in Zep's long-term memory\n",
"\n",
"In addition to the `memory.get` method which uses the current conversation to retrieve facts, we can also search Zep with our own keywords. Here, we retrieve facts using a query. Again, we use fact ratings to limit the returned facts to only those with a high poignancy rating.\n",
"\n",
"The `memory.search_sessions` API may be used as an Agent tool, enabling an agent to search across user memory for relevant facts."
]
},
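{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a minimal sketch of exposing this search as an AutoGen tool. The helper name `search_facts` and its description are illustrative assumptions rather than part of the Zep or AutoGen APIs, and the sketch reuses the `zep`, `user_id`, `MIN_FACT_RATING`, `agent` and `cathy` objects defined above. The cell after the sketch runs a direct search with `memory.search_sessions`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from autogen import register_function\n",
"\n",
"\n",
"async def search_facts(query: str) -> list[str]:\n",
"    \"\"\"Search the user's long-term memory in Zep and return matching facts as strings.\"\"\"\n",
"    results = await zep.memory.search_sessions(\n",
"        text=query,\n",
"        user_id=user_id,\n",
"        search_scope=\"facts\",\n",
"        min_fact_rating=MIN_FACT_RATING,\n",
"    )\n",
"    return [str(r.fact) for r in results.results]\n",
"\n",
"\n",
"# Register the tool: the CareBot agent suggests calls, Cathy's agent executes them.\n",
"register_function(\n",
"    search_facts,\n",
"    caller=agent,\n",
"    executor=cathy,\n",
"    name=\"search_facts\",\n",
"    description=\"Search the user's long-term memory for relevant facts.\",\n",
")"
]
},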
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"created_at='2024-10-07T20:04:28.397184Z' fact='Cathy describes her mother as a kind and loving person.' rating=0.5 uuid_='131de203-2984-4cba-9aef-e500611f06d9'\n",
"created_at='2024-10-07T20:04:28.397184Z' fact='Cathy is struggling to process the passing of her mother and has been feeling down and demotivated lately.' rating=0.75 uuid_='0fea3f05-ed1a-4e39-a092-c91f8af9e501'\n",
"created_at='2024-10-07T20:04:28.397184Z' fact=\"Cathy wants to reflect on a previous conversation about her mother and explore the topic of her mother's passing further.\" rating=0.75 uuid_='56488eeb-d8ac-4b2f-8acc-75f71b56ad76'\n"
]
}
],
"source": [
"response = await zep.memory.search_sessions(\n",
" text=\"What do you know about Cathy's family?\",\n",
" user_id=user_id,\n",
" search_scope=\"facts\",\n",
" min_fact_rating=MIN_FACT_RATING,\n",
")\n",
"\n",
"for r in response.results:\n",
" print(r.fact)"
]
}
],
"metadata": {
"front_matter": {
"tags": [
"memory"
],
"description": "Agent Memory with Zep."
},
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@ -17,14 +17,14 @@
"\n",
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
"MathChat is an experimental conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://microsoft.github.io/autogen/blog/2023/06/28/MathChat).\n",
"MathChat is an experimental conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://microsoft.github.io/autogen/blog/2023/06/28/MathChat).\n",
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[mathchat]\n",
"pip install autogen-agentchat[mathchat]~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -57,9 +57,7 @@
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\n",
" \"model\": {\n",
" \"gpt-4-1106-preview\",\n",
" \"gpt-3.5-turbo\",\n",
" \"gpt-35-turbo\",\n",
" \"gpt-4o\",\n",
" }\n",
" },\n",
")"

View File

@ -10,7 +10,7 @@
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
"Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"\n",
"## Table of Contents\n",
"We'll demonstrate six examples of using RetrieveChat for code generation and question answering:\n",
@ -28,7 +28,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[retrievechat] flaml[automl]\n",
"pip install autogen-agentchat[retrievechat]~=0.2 flaml[automl]\n",
"```\n",
"\n",
"*You'll need to install `chromadb<=0.5.0` if you see issue like [#3551](https://github.com/microsoft/autogen/issues/3551).*\n",

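Before the store-specific walkthroughs that follow, a minimal sketch of a default RetrieveChat setup may be useful (no `vector_db` specified, so a local Chroma store is used); the model name and `docs_path` below are placeholders, not prescribed values:

```python
import os

from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={"config_list": config_list},
)

# Without an explicit vector_db, RetrieveUserProxyAgent falls back to a local Chroma store.
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    retrieve_config={
        "task": "qa",
        "docs_path": "https://raw.githubusercontent.com/microsoft/FLAML/main/README.md",
    },
    code_execution_config=False,
)

assistant.reset()
ragproxyagent.initiate_chat(
    assistant, message=ragproxyagent.message_generator, problem="What is FLAML?"
)
```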
View File

@ -0,0 +1,579 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using RetrieveChat Powered by Couchbase Capella for Retrieve Augmented Code Generation and Question Answering\n",
"\n",
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
"Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"\n",
"## Table of Contents\n",
"We'll demonstrate six examples of using RetrieveChat for code generation and question answering:\n",
"\n",
"- [Example 1: Generate code based off docstrings w/o human feedback](#example-1)\n",
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[retrievechat-couchbase] flaml[automl]\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
":::\n",
"````\n",
"\n",
"Ensure you have a Couchbase Capella cluster running. Read more on how to get started [here](https://docs.couchbase.com/cloud/get-started/intro.html)"
]
},
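Before running the cells below, it can help to verify that the credentials this notebook later reads from the environment are set (these variable names come from the configuration used further down in this notebook; they are not library defaults):

```python
import os

# The LLM config uses OPENAI_API_KEY; the Couchbase Capella vector store uses the CB_* values.
required = ("OPENAI_API_KEY", "CB_CONN_STR", "CB_USERNAME", "CB_PASSWORD")
missing = [name for name in required if not os.environ.get(name)]
if missing:
    raise EnvironmentError(f"Set these environment variables before running the notebook: {missing}")
```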
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set your API Endpoint\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"models to use: ['gpt-4o-mini']\n"
]
}
],
"source": [
"import os\n",
"import sys\n",
"\n",
"from autogen import AssistantAgent\n",
"\n",
"sys.path.append(os.path.abspath(\"/workspaces/autogen/autogen/agentchat/contrib\"))\n",
"\n",
"from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent\n",
"\n",
"# Accepted file formats for that can be stored in\n",
"# a vector database instance\n",
"from autogen.retrieve_utils import TEXT_FORMATS\n",
"\n",
"config_list = [{\"model\": \"gpt-4o-mini\", \"api_key\": os.environ[\"OPENAI_API_KEY\"], \"api_type\": \"openai\"}]\n",
"assert len(config_list) > 0\n",
"print(\"models to use: \", [config_list[i][\"model\"] for i in range(len(config_list))])"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"````{=mdx}\n",
":::tip\n",
"Learn more about configuring LLMs for agents [here](/docs/topics/llm_configuration).\n",
":::\n",
"````\n",
"\n",
"## Construct agents for RetrieveChat\n",
"\n",
"We start by initializing the `AssistantAgent` and `RetrieveUserProxyAgent`. The system message needs to be set to \"You are a helpful assistant.\" for AssistantAgent. The detailed instructions are given in the user message. Later we will use the `RetrieveUserProxyAgent.message_generator` to combine the instructions and a retrieval augmented generation task for an initial prompt to be sent to the LLM assistant."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accepted file formats for `docs_path`:\n",
"['txt', 'json', 'csv', 'tsv', 'md', 'html', 'htm', 'rtf', 'rst', 'jsonl', 'log', 'xml', 'yaml', 'yml', 'pdf']\n"
]
}
],
"source": [
"print(\"Accepted file formats for `docs_path`:\")\n",
"print(TEXT_FORMATS)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. create an AssistantAgent instance named \"assistant\"\n",
"assistant = AssistantAgent(\n",
" name=\"assistant\",\n",
" system_message=\"You are a helpful assistant.\",\n",
" llm_config={\n",
" \"timeout\": 600,\n",
" \"cache_seed\": 42,\n",
" \"config_list\": config_list,\n",
" },\n",
")\n",
"\n",
"# 2. create the RetrieveUserProxyAgent instance named \"ragproxyagent\"\n",
"# Refer to https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/retrieve_user_proxy_agent\n",
"# and https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/vectordb/couchbase\n",
"# for more information on the RetrieveUserProxyAgent and CouchbaseVectorDB\n",
"ragproxyagent = RetrieveUserProxyAgent(\n",
" name=\"ragproxyagent\",\n",
" human_input_mode=\"NEVER\",\n",
" max_consecutive_auto_reply=3,\n",
" retrieve_config={\n",
" \"task\": \"code\",\n",
" \"docs_path\": [\n",
" \"https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md\",\n",
" \"https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md\",\n",
" ],\n",
" \"chunk_token_size\": 2000,\n",
" \"model\": config_list[0][\"model\"],\n",
" \"vector_db\": \"couchbase\", # Couchbase Capella VectorDB\n",
" \"collection_name\": \"demo_collection\", # Couchbase Capella collection name to be utilized/created\n",
" \"db_config\": {\n",
" \"connection_string\": os.environ[\"CB_CONN_STR\"], # Couchbase Capella connection string\n",
" \"username\": os.environ[\"CB_USERNAME\"], # Couchbase Capella username\n",
" \"password\": os.environ[\"CB_PASSWORD\"], # Couchbase Capella password\n",
" \"bucket_name\": \"test_db\", # Couchbase Capella bucket name\n",
" \"scope_name\": \"test_scope\", # Couchbase Capella scope name\n",
" \"index_name\": \"vector_index\", # Couchbase Capella index name to be created\n",
" },\n",
" \"get_or_create\": True, # set to False if you don't want to reuse an existing collection\n",
" \"overwrite\": False, # set to True if you want to overwrite an existing collection, each overwrite will force a index creation and reupload of documents\n",
" },\n",
" code_execution_config=False, # set to False if you don't want to execute the code\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 1\n",
"\n",
"[Back to top](#table-of-contents)\n",
"\n",
"Use RetrieveChat to help generate sample code and automatically run the code and fix errors if there is any.\n",
"\n",
"Problem: Which API should I use if I want to use FLAML for a classification task and I want to train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.\n",
"\n",
"Note: You may need to create an index on the cluster to query"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-10-16 12:08:07,062 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - \u001b[32mUse the existing collection `demo_collection`.\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Trying to create collection.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-10-16 12:08:07,953 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 2 chunks.\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"VectorDB returns doc_ids: [['bdfbc921', '7968cf3c']]\n",
"\u001b[32mAdding content of doc bdfbc921 to context.\u001b[0m\n",
"\u001b[32mAdding content of doc 7968cf3c to context.\u001b[0m\n",
"\u001b[33mragproxyagent\u001b[0m (to assistant):\n",
"\n",
"You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the\n",
"context provided by the user.\n",
"If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.\n",
"For code generation, you must obey the following rules:\n",
"Rule 1. You MUST NOT install any packages because all the packages needed are already installed.\n",
"Rule 2. You must follow the formats below to write your code:\n",
"```language\n",
"# your code\n",
"```\n",
"\n",
"User's question is: How can I use FLAML to perform a classification task and use spark to do parallel training. Train 30 seconds and force cancel jobs if time limit is reached.\n",
"\n",
"Context is: # Integrate - Spark\n",
"\n",
"FLAML has integrated Spark for distributed training. There are two main aspects of integration with Spark:\n",
"\n",
"- Use Spark ML estimators for AutoML.\n",
"- Use Spark to run training in parallel spark jobs.\n",
"\n",
"## Spark ML Estimators\n",
"\n",
"FLAML integrates estimators based on Spark ML models. These models are trained in parallel using Spark, so we called them Spark estimators. To use these models, you first need to organize your data in the required format.\n",
"\n",
"### Data\n",
"\n",
"For Spark estimators, AutoML only consumes Spark data. FLAML provides a convenient function `to_pandas_on_spark` in the `flaml.automl.spark.utils` module to convert your data into a pandas-on-spark (`pyspark.pandas`) dataframe/series, which Spark estimators require.\n",
"\n",
"This utility function takes data in the form of a `pandas.Dataframe` or `pyspark.sql.Dataframe` and converts it into a pandas-on-spark dataframe. It also takes `pandas.Series` or `pyspark.sql.Dataframe` and converts it into a [pandas-on-spark](https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/index.html) series. If you pass in a `pyspark.pandas.Dataframe`, it will not make any changes.\n",
"\n",
"This function also accepts optional arguments `index_col` and `default_index_type`.\n",
"\n",
"- `index_col` is the column name to use as the index, default is None.\n",
"- `default_index_type` is the default index type, default is \"distributed-sequence\". More info about default index type could be found on Spark official [documentation](https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/options.html#default-index-type)\n",
"\n",
"Here is an example code snippet for Spark Data:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from flaml.automl.spark.utils import to_pandas_on_spark\n",
"\n",
"# Creating a dictionary\n",
"data = {\n",
" \"Square_Feet\": [800, 1200, 1800, 1500, 850],\n",
" \"Age_Years\": [20, 15, 10, 7, 25],\n",
" \"Price\": [100000, 200000, 300000, 240000, 120000],\n",
"}\n",
"\n",
"# Creating a pandas DataFrame\n",
"dataframe = pd.DataFrame(data)\n",
"label = \"Price\"\n",
"\n",
"# Convert to pandas-on-spark dataframe\n",
"psdf = to_pandas_on_spark(dataframe)\n",
"```\n",
"\n",
"To use Spark ML models you need to format your data appropriately. Specifically, use [`VectorAssembler`](https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.feature.VectorAssembler.html) to merge all feature columns into a single vector column.\n",
"\n",
"Here is an example of how to use it:\n",
"\n",
"```python\n",
"from pyspark.ml.feature import VectorAssembler\n",
"\n",
"columns = psdf.columns\n",
"feature_cols = [col for col in columns if col != label]\n",
"featurizer = VectorAssembler(inputCols=feature_cols, outputCol=\"features\")\n",
"psdf = featurizer.transform(psdf.to_spark(index_col=\"index\"))[\"index\", \"features\"]\n",
"```\n",
"\n",
"Later in conducting the experiment, use your pandas-on-spark data like non-spark data and pass them using `X_train, y_train` or `dataframe, label`.\n",
"\n",
"### Estimators\n",
"\n",
"#### Model List\n",
"\n",
"- `lgbm_spark`: The class for fine-tuning Spark version LightGBM models, using [SynapseML](https://microsoft.github.io/SynapseML/docs/features/lightgbm/about/) API.\n",
"\n",
"#### Usage\n",
"\n",
"First, prepare your data in the required format as described in the previous section.\n",
"\n",
"By including the models you intend to try in the `estimators_list` argument to `flaml.automl`, FLAML will start trying configurations for these models. If your input is Spark data, FLAML will also use estimators with the `_spark` postfix by default, even if you haven't specified them.\n",
"\n",
"Here is an example code snippet using SparkML models in AutoML:\n",
"\n",
"```python\n",
"import flaml\n",
"\n",
"# prepare your data in pandas-on-spark format as we previously mentioned\n",
"\n",
"automl = flaml.AutoML()\n",
"settings = {\n",
" \"time_budget\": 30,\n",
" \"metric\": \"r2\",\n",
" \"estimator_list\": [\"lgbm_spark\"], # this setting is optional\n",
" \"task\": \"regression\",\n",
"}\n",
"\n",
"automl.fit(\n",
" dataframe=psdf,\n",
" label=label,\n",
" **settings,\n",
")\n",
"```\n",
"\n",
"[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/automl_bankrupt_synapseml.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/automl_bankrupt_synapseml.ipynb)\n",
"\n",
"## Parallel Spark Jobs\n",
"\n",
"You can activate Spark as the parallel backend during parallel tuning in both [AutoML](/docs/Use-Cases/Task-Oriented-AutoML#parallel-tuning) and [Hyperparameter Tuning](/docs/Use-Cases/Tune-User-Defined-Function#parallel-tuning), by setting the `use_spark` to `true`. FLAML will dispatch your job to the distributed Spark backend using [`joblib-spark`](https://github.com/joblib/joblib-spark).\n",
"\n",
"Please note that you should not set `use_spark` to `true` when applying AutoML and Tuning for Spark Data. This is because only SparkML models will be used for Spark Data in AutoML and Tuning. As SparkML models run in parallel, there is no need to distribute them with `use_spark` again.\n",
"\n",
"All the Spark-related arguments are stated below. These arguments are available in both Hyperparameter Tuning and AutoML:\n",
"\n",
"- `use_spark`: boolean, default=False | Whether to use spark to run the training in parallel spark jobs. This can be used to accelerate training on large models and large datasets, but will incur more overhead in time and thus slow down training in some cases. GPU training is not supported yet when use_spark is True. For Spark clusters, by default, we will launch one trial per executor. However, sometimes we want to launch more trials than the number of executors (e.g., local mode). In this case, we can set the environment variable `FLAML_MAX_CONCURRENT` to override the detected `num_executors`. The final number of concurrent trials will be the minimum of `n_concurrent_trials` and `num_executors`.\n",
"- `n_concurrent_trials`: int, default=1 | The number of concurrent trials. When n_concurrent_trials > 1, FLAML performes parallel tuning.\n",
"- `force_cancel`: boolean, default=False | Whether to forcely cancel Spark jobs if the search time exceeded the time budget. Spark jobs include parallel tuning jobs and Spark-based model training jobs.\n",
"\n",
"An example code snippet for using parallel Spark jobs:\n",
"\n",
"```python\n",
"import flaml\n",
"\n",
"automl_experiment = flaml.AutoML()\n",
"automl_settings = {\n",
" \"time_budget\": 30,\n",
" \"metric\": \"r2\",\n",
" \"task\": \"regression\",\n",
" \"n_concurrent_trials\": 2,\n",
" \"use_spark\": True,\n",
" \"force_cancel\": True, # Activating the force_cancel option can immediately halt Spark jobs once they exceed the allocated time_budget.\n",
"}\n",
"\n",
"automl.fit(\n",
" dataframe=dataframe,\n",
" label=label,\n",
" **automl_settings,\n",
")\n",
"```\n",
"\n",
"[Link to notebook](https://github.com/microsoft/FLAML/blob/main/notebook/integrate_spark.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/integrate_spark.ipynb)\n",
"# Research\n",
"\n",
"For technical details, please check our research publications.\n",
"\n",
"- [FLAML: A Fast and Lightweight AutoML Library](https://www.microsoft.com/en-us/research/publication/flaml-a-fast-and-lightweight-automl-library/). Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys 2021.\n",
"\n",
"```bibtex\n",
"@inproceedings{wang2021flaml,\n",
" title={FLAML: A Fast and Lightweight AutoML Library},\n",
" author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},\n",
" year={2021},\n",
" booktitle={MLSys},\n",
"}\n",
"```\n",
"\n",
"- [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.\n",
"\n",
"```bibtex\n",
"@inproceedings{wu2021cfo,\n",
" title={Frugal Optimization for Cost-related Hyperparameters},\n",
" author={Qingyun Wu and Chi Wang and Silu Huang},\n",
" year={2021},\n",
" booktitle={AAAI},\n",
"}\n",
"```\n",
"\n",
"- [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.\n",
"\n",
"```bibtex\n",
"@inproceedings{wang2021blendsearch,\n",
" title={Economical Hyperparameter Optimization With Blended Search Strategy},\n",
" author={Chi Wang and Qingyun Wu and Silu Huang and Amin Saied},\n",
" year={2021},\n",
" booktitle={ICLR},\n",
"}\n",
"```\n",
"\n",
"- [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://aclanthology.org/2021.acl-long.178.pdf). Susan Xueqing Liu, Chi Wang. ACL 2021.\n",
"\n",
"```bibtex\n",
"@inproceedings{liuwang2021hpolm,\n",
" title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},\n",
" author={Susan Xueqing Liu and Chi Wang},\n",
" year={2021},\n",
" booktitle={ACL},\n",
"}\n",
"```\n",
"\n",
"- [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.\n",
"\n",
"```bibtex\n",
"@inproceedings{wu2021chacha,\n",
" title={ChaCha for Online AutoML},\n",
" author={Qingyun Wu and Chi Wang and John Langford and Paul Mineiro and Marco Rossi},\n",
" year={2021},\n",
" booktitle={ICML},\n",
"}\n",
"```\n",
"\n",
"- [Fair AutoML](https://arxiv.org/abs/2111.06495). Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2111.06495 (2021).\n",
"\n",
"```bibtex\n",
"@inproceedings{wuwang2021fairautoml,\n",
" title={Fair AutoML},\n",
" author={Qingyun Wu and Chi Wang},\n",
" year={2021},\n",
" booktitle={ArXiv preprint arXiv:2111.06495},\n",
"}\n",
"```\n",
"\n",
"- [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. ArXiv preprint arXiv:2202.09927 (2022).\n",
"\n",
"```bibtex\n",
"@inproceedings{kayaliwang2022default,\n",
" title={Mining Robust Default Configurations for Resource-constrained AutoML},\n",
" author={Moe Kayali and Chi Wang},\n",
" year={2022},\n",
" booktitle={ArXiv preprint arXiv:2202.09927},\n",
"}\n",
"```\n",
"\n",
"- [Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives](https://openreview.net/forum?id=0Ij9_q567Ma). Shaokun Zhang, Feiran Jia, Chi Wang, Qingyun Wu. ICLR 2023 (notable-top-5%).\n",
"\n",
"```bibtex\n",
"@inproceedings{zhang2023targeted,\n",
" title={Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives},\n",
" author={Shaokun Zhang and Feiran Jia and Chi Wang and Qingyun Wu},\n",
" booktitle={International Conference on Learning Representations},\n",
" year={2023},\n",
" url={https://openreview.net/forum?id=0Ij9_q567Ma},\n",
"}\n",
"```\n",
"\n",
"- [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673). Chi Wang, Susan Xueqing Liu, Ahmed H. Awadallah. ArXiv preprint arXiv:2303.04673 (2023).\n",
"\n",
"```bibtex\n",
"@inproceedings{wang2023EcoOptiGen,\n",
" title={Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference},\n",
" author={Chi Wang and Susan Xueqing Liu and Ahmed H. Awadallah},\n",
" year={2023},\n",
" booktitle={ArXiv preprint arXiv:2303.04673},\n",
"}\n",
"```\n",
"\n",
"- [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337). Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2306.01337 (2023).\n",
"\n",
"```bibtex\n",
"@inproceedings{wu2023empirical,\n",
" title={An Empirical Study on Challenging Math Problem Solving with GPT-4},\n",
" author={Yiran Wu and Feiran Jia and Shaokun Zhang and Hangyu Li and Erkang Zhu and Yue Wang and Yin Tat Lee and Richard Peng and Qingyun Wu and Chi Wang},\n",
" year={2023},\n",
" booktitle={ArXiv preprint arXiv:2306.01337},\n",
"}\n",
"```\n",
"\n",
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001b[33massistant\u001b[0m (to ragproxyagent):\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from pyspark.ml.feature import VectorAssembler\n",
"import flaml\n",
"from flaml.automl.spark.utils import to_pandas_on_spark\n",
"\n",
"# Creating a dictionary for the example data\n",
"data = {\n",
" \"Square_Feet\": [800, 1200, 1800, 1500, 850],\n",
" \"Age_Years\": [20, 15, 10, 7, 25],\n",
" \"Price\": [100000, 200000, 300000, 240000, 120000],\n",
"}\n",
"\n",
"# Creating a pandas DataFrame\n",
"dataframe = pd.DataFrame(data)\n",
"label = \"Price\"\n",
"\n",
"# Convert to pandas-on-spark dataframe\n",
"psdf = to_pandas_on_spark(dataframe)\n",
"\n",
"# Prepare features using VectorAssembler\n",
"columns = psdf.columns\n",
"feature_cols = [col for col in columns if col != label]\n",
"featurizer = VectorAssembler(inputCols=feature_cols, outputCol=\"features\")\n",
"psdf = featurizer.transform(psdf.to_spark(index_col=\"index\"))[[\"index\", \"features\"]]\n",
"\n",
"# Setting up and running FLAML for AutoML with Spark\n",
"automl = flaml.AutoML()\n",
"automl_settings = {\n",
" \"time_budget\": 30, # Set the time budget to 30 seconds\n",
" \"metric\": \"r2\", # Performance metric\n",
" \"task\": \"regression\", # Problem type\n",
" \"n_concurrent_trials\": 2, # Number of concurrent trials\n",
" \"use_spark\": True, # Use Spark for parallel jobs\n",
" \"force_cancel\": True, # Force cancel jobs if time limit is reached\n",
"}\n",
"\n",
"automl.fit(\n",
" dataframe=psdf,\n",
" label=label,\n",
" **automl_settings\n",
")\n",
"```\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001b[33mragproxyagent\u001b[0m (to assistant):\n",
"\n",
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001b[33massistant\u001b[0m (to ragproxyagent):\n",
"\n",
"UPDATE CONTEXT\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001b[32mUpdating context and resetting conversation.\u001b[0m\n",
"VectorDB returns doc_ids: [['bdfbc921', '7968cf3c']]\n",
"\u001b[32mNo more context, will terminate.\u001b[0m\n",
"\u001b[33mragproxyagent\u001b[0m (to assistant):\n",
"\n",
"TERMINATE\n",
"\n",
"--------------------------------------------------------------------------------\n"
]
}
],
"source": [
"# reset the assistant. Always reset the assistant before starting a new conversation.\n",
"assistant.reset()\n",
"\n",
"# given a problem, we use the ragproxyagent to generate a prompt to be sent to the assistant as the initial message.\n",
"# the assistant receives the message and generates a response. The response will be sent back to the ragproxyagent for processing.\n",
"# The conversation continues until the termination condition is met, in RetrieveChat, the termination condition when no human-in-loop is no code block detected.\n",
"# With human-in-loop, the conversation will continue until the user says \"exit\".\n",
"code_problem = \"How can I use FLAML to perform a classification task and use spark to do parallel training. Train 30 seconds and force cancel jobs if time limit is reached.\"\n",
"chat_result = ragproxyagent.initiate_chat(assistant, message=ragproxyagent.message_generator, problem=code_problem)"
]
}
],
"metadata": {
"front_matter": {
"description": "Explore the use of AutoGen's RetrieveChat for tasks like code generation from docstrings, answering complex questions with human feedback, and exploiting features like Update Context, custom prompts, and few-shot learning.",
"tags": [
"RAG"
]
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
},
"skip_test": "Requires interactive usage"
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@ -10,7 +10,7 @@
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
"Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"\n",
"## Table of Contents\n",
"We'll demonstrate six examples of using RetrieveChat for code generation and question answering:\n",
@ -22,7 +22,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[retrievechat-mongodb] flaml[automl]\n",
"pip install autogen-agentchat[retrievechat-mongodb]~=0.2 flaml[automl]\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -10,7 +10,7 @@
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework allows tool use and human participation through multi-agent conversation.\n",
"Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"\n",
"## Table of Contents\n",
"We'll demonstrate six examples of using RetrieveChat for code generation and question answering:\n",
@ -24,7 +24,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[retrievechat-pgvector] flaml[automl]\n",
"pip install autogen-agentchat[retrievechat-pgvector]~=0.2 flaml[automl]\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -12,7 +12,7 @@
"This notebook demonstrates the usage of Qdrant for RAG, based on [agentchat_RetrieveChat.ipynb](https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_RetrieveChat.ipynb).\n",
"\n",
"\n",
"RetrieveChat is a conversational system for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)).\n",
"RetrieveChat is a conversational system for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `AssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_auto_feedback_from_code_execution.ipynb)).\n",
"\n",
"We'll demonstrate usage of RetrieveChat with Qdrant for code generation and question answering w/ human feedback.\n",
"\n",
@ -21,7 +21,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install \"pyautogen[retrievechat-qdrant]\" \"flaml[automl]\"\n",
"pip install \"autogen-agentchat[retrievechat-qdrant]~=0.2\" \"flaml[automl]\"\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -43,7 +43,7 @@
}
],
"source": [
"%pip install \"pyautogen[retrievechat-qdrant]\" \"flaml[automl]\" -q"
"%pip install \"autogen-agentchat[retrievechat-qdrant]~=0.2\" \"flaml[automl]\" -q"
]
},
{

View File

@ -55,7 +55,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen agentops\n",
"pip install autogen-agentchat~=0.2 agentops\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -53,7 +53,7 @@
"source": [
"# MathUserProxy with function_call\n",
"\n",
"This agent is a customized MathUserProxy inherits from its [parent class](https://github.com/microsoft/autogen/blob/main/autogen/agentchat/contrib/math_user_proxy_agent.py).\n",
"This agent is a customized MathUserProxy inherits from its [parent class](https://github.com/microsoft/autogen/blob/0.2/autogen/agentchat/contrib/math_user_proxy_agent.py).\n",
"\n",
"It supports using both function_call and python to solve math problems.\n"
]
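The customized agent itself lives in the linked file; as a rough illustration of the underlying mechanism it builds on (not the notebook's actual implementation), a plain Python function can be exposed to the assistant through function calling like this, assuming a placeholder `config_list`:

```python
import os

from typing_extensions import Annotated

from autogen import AssistantAgent, UserProxyAgent

config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]

assistant = AssistantAgent(name="assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent(
    name="user_proxy", human_input_mode="NEVER", code_execution_config=False
)


# The assistant may request this tool via function_call; the user proxy executes it.
@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Evaluate a short arithmetic expression.")
def evaluate(expression: Annotated[str, "A Python arithmetic expression"]) -> str:
    return str(eval(expression))  # illustration only; never eval untrusted input


user_proxy.initiate_chat(assistant, message="What is 32477 * 5377?")
```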

View File

@ -16,7 +16,7 @@
":::info Requirements\n",
"Install the following packages before running the code below:\n",
"```bash\n",
"pip install pyautogen matplotlib yfinance\n",
"pip install autogen-agentchat~=0.2 matplotlib yfinance\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -37,10 +37,10 @@
"\n",
"config_list = autogen.config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\"tags\": [\"gpt-4\"]}, # comment out to get all\n",
" filter_dict={\"tags\": [\"gpt-4o\"]}, # comment out to get all\n",
")\n",
"# When using a single openai endpoint, you can use the following:\n",
"# config_list = [{\"model\": \"gpt-4\", \"api_key\": os.getenv(\"OPENAI_API_KEY\")}]"
"# config_list = [{\"model\": \"gpt-4o\", \"api_key\": os.getenv(\"OPENAI_API_KEY\")}]"
]
},
{

View File

@ -84,9 +84,8 @@
"metadata": {},
"outputs": [],
"source": [
"!pip3 install pyautogen==0.2.16\n",
"!pip3 install autogen-agentchat[graph]~=0.2\n",
"!pip3 install python-dotenv==1.0.1\n",
"!pip3 install pyautogen[graph]>=0.2.11\n",
"!pip3 install azure-search-documents==11.4.0b8\n",
"!pip3 install azure-identity==1.12.0"
]

View File

@ -54,7 +54,7 @@
"\n",
"AutoGen requires `Python>=3.8`:\n",
"```bash\n",
"pip install \"pyautogen\"\n",
"pip install \"autogen-agentchat~=0.2\"\n",
"```"
]
},
@ -79,7 +79,7 @@
"config_list = autogen.config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\n",
" \"model\": [\"gpt-3.5-turbo\", \"gpt-3.5-turbo-16k\"], # comment out to get all\n",
" \"model\": [\"gpt-3.5-turbo\"], # comment out to get all\n",
" },\n",
")"
]
@ -109,7 +109,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
]
},
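As one concrete illustration of such an alternative (the model name and key source below are placeholders), the list can be defined inline instead of being read from the `OAI_CONFIG_LIST` file:

```python
import os

# Inline equivalent of a one-entry OAI_CONFIG_LIST; adjust the model and key source as needed.
config_list = [
    {
        "model": "gpt-4o",
        "api_key": os.environ["OPENAI_API_KEY"],
    }
]
llm_config = {"config_list": config_list, "cache_seed": 42}
```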
{

View File

@ -22,7 +22,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen torch transformers sentencepiece\n",
"pip install autogen-agentchat~=0.2 torch transformers sentencepiece\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -238,7 +238,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/notebook/oai_openai_utils.ipynb) for full code examples of the different methods."
]
},
{

View File

@ -17,7 +17,7 @@
"source": [
"### Before everything starts, install AutoGen with the `lmm` option\n",
"```bash\n",
"pip install \"pyautogen[lmm]>=0.2.3\"\n",
"pip install \"autogen-agentchat[lmm]~=0.2\"\n",
"```"
]
},

View File

@ -15,7 +15,7 @@
"This notebook will demonstrate a few basic examples of Autogen with DBRX, including the use of `AssistantAgent`, `UserProxyAgent`, and `ConversableAgent`. These demos are not intended to be exhaustive - feel free to use them as a base to build upon!\n",
"\n",
"## Requirements\n",
"AutoGen must be installed on your Databricks cluster, and requires `Python>=3.8`. This example includes the `%pip` magic command to install: `%pip install pyautogen`, as well as other necessary libraries. \n",
"AutoGen must be installed on your Databricks cluster, and requires `Python>=3.8`. This example includes the `%pip` magic command to install: `%pip install autogen-agentchat~=0.2`, as well as other necessary libraries. \n",
"\n",
"This code has been tested on: \n",
"* [Serverless Notebooks](https://docs.databricks.com/en/compute/serverless.html) (in public preview as of Apr 18, 2024)\n",
@ -47,13 +47,11 @@
{
"name": "stdout",
"output_type": "stream",
"text": [
""
]
"text": []
}
],
"source": [
"%pip install pyautogen==0.2.25 openai==1.21.2 typing_extensions==4.11.0 --upgrade"
"%pip install autogen-agentchat~=0.2.25 openai==1.21.2 typing_extensions==4.11.0 --upgrade"
]
},
{

View File

@ -23,9 +23,9 @@
"\n",
"## Requirements\n",
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install `pyautogen`:\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```"
]
},
@ -36,7 +36,7 @@
"metadata": {},
"outputs": [],
"source": [
"# %pip install \"pyautogen>=0.2.3\""
"# %pip install \"autogen-agentchat~=0.2\""
]
},
{
@ -104,7 +104,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
]
},
{

View File

@ -20,9 +20,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -28,7 +28,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install pyautogen"
"! pip install autogen-agentchat~=0.2"
]
},
{

View File

@ -21,9 +21,9 @@
"\n",
"## Requirements\n",
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install `pyautogen`:\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```"
]
},
@ -34,7 +34,7 @@
"metadata": {},
"outputs": [],
"source": [
"# %pip install \"pyautogen>=0.2.3\""
"# %pip install \"autogen-agentchat~=0.2\""
]
},
{
@ -65,7 +65,7 @@
"\n",
"config_list = autogen.config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\"tags\": [\"3.5-tool\"]}, # comment out to get all\n",
" filter_dict={\"tags\": [\"tool\"]}, # comment out to get all\n",
")"
]
},
@ -104,7 +104,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
]
},
{

View File

@ -0,0 +1,421 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "aRDQQophCLQs"
},
"source": [
"# AI email agent using Composio"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bZVu2lrqCLQu"
},
"source": [
"This notebook demonstrates how to create an AI Email agent using Composios Gmail tool with autogen to create an agent that will automatically respond to emails based on provided instructions.\n",
"\n",
"[Composio](https://composio.dev/) allows an AI agent or LLM to easily connect to apps like Gmail, Slack, Trello etc. The key features of Composio are:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UI78uqxyCLQu"
},
"source": [
"- Repository of Tools: Composio allows LLMs and agents to integrate with 100+ apps (Github, Salesforce, File Manager, Code Execution & More) to perform actions & subscribe to triggers(events).\n",
"\n",
"- Frameworks & LLM Agnostic: Composio provides out of box support for 10+ popular agentic frameworks and works with all the LLM providers using function calling.\n",
"\n",
"- Managed Auth: Composio helps manage authentication for all users/agents from a single dashboard."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Visit [Composio Docs](https://docs.composio.dev/introduction/intro/overview) to learn more."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "16kFQX0kCLQv"
},
"source": [
"The notebook demonstrates how to create a Gmail integration with Composio, set up a trigger for new emails, initialize agents with tools and finally we'll see the agent in action."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ixe3kpQrCLQv"
},
"source": [
"````{=mdx}\n",
":::info Requirements\n",
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install autogen-agentchat~=0.2 composio-autogen\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
":::\n",
"````"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XK6_H749CLQv"
},
"source": [
"## Composio Setup\n",
"\n",
"To get started with using Composio's Gmail tool, we need to create an integration between Composio and Gmail. This can be done using a simple command -"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "-1XLfYJRCLQv"
},
"outputs": [],
"source": [
"!composio add gmail"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T-YxALkYCLQw"
},
"source": [
"To set up a trigger(basically a listener) for new emails -"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YUzcCGaCCLQw"
},
"outputs": [],
"source": [
"!composio triggers enable gmail_new_gmail_message"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yTM8eqE2CLQw"
},
"source": [
"This enables the `gmail_new_gmail_message` trigger, which is fired when a new email is received in the connected account."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GqlJ06y8CLQw"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"from composio_autogen import Action, ComposioToolSet\n",
"\n",
"from autogen.agentchat import AssistantAgent, UserProxyAgent\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"YOUR_API_KEY\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BHuqMynBCLQw"
},
"source": [
"## Initialize agents"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VzEYr6uuCLQw"
},
"outputs": [],
"source": [
"llm_config = {\"config_list\": [{\"model\": \"gpt-4o\", \"api_key\": os.environ.get(\"OPENAI_API_KEY\")}]}\n",
"\n",
"# Prompt for email assistant\n",
"email_assistant_prompt = \"\"\"\n",
" You are an AI email assistant specialized in drafting replies to emails.\n",
" You create appropriate and professional replies to emails based on the content of the email.\n",
" After executing the GMAIL_REPLY_TO_THREAD action and sending the email to the user, respond with TERMINATE.\n",
"\"\"\"\n",
"\n",
"# Initialize AssistantAgent\n",
"chatbot = AssistantAgent(\n",
" \"chatbot\",\n",
" system_message=email_assistant_prompt,\n",
" llm_config=llm_config,\n",
")\n",
"\n",
"# Initialize UserProxyAgent\n",
"user_proxy = UserProxyAgent(\n",
" \"user_proxy\",\n",
" is_termination_msg=lambda x: x.get(\"content\", \"\") and \"TERMINATE\" in x.get(\"content\", \"\"),\n",
" human_input_mode=\"NEVER\",\n",
" code_execution_config=False,\n",
" llm_config=llm_config,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6SB0HFZ7CLQw"
},
"source": [
"## Initialize Composio's Toolset\n",
"\n",
"Now, we initialize Composio's toolset and get the tools and actions we need for the agent. Then, we register the tools with the `UserProxyAgent`.\n",
"\n",
"The agent can then call the tools using function calling."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_nosEvgxCLQx"
},
"outputs": [],
"source": [
"# Initialize Composio Toolset\n",
"composio_toolset = ComposioToolSet()\n",
"\n",
"# Get the required tools and register them with the agents\n",
"email_tools = composio_toolset.register_tools(\n",
" caller=user_proxy,\n",
" executor=chatbot,\n",
" actions=[\n",
" Action.GMAIL_REPLY_TO_THREAD,\n",
" ],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kFkkMIIeCLQx"
},
"source": [
"Here, we get the `GMAIL_REPLY_TO_THREAD` action, which is just a function that can be used to reply to an email. We'll be using this action to reply to emails automatically when they arrive."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZsgE3qm9CLQx"
},
"source": [
"## Create trigger listener"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fU6TmawGCLQx"
},
"source": [
"Now, we create a listener for the trigger that we created above. This listener will listen for new emails and when a new email arrives, it'll provide data associated with the email like the sender email, email content etc. This data will be used by the attached callback function to invoke the agent and to send a reply to the email."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `@listener.callback` decorator registers the function it decorates as a callback for a specific event trigger, in this case, when a new Gmail message is received (`GMAIL_NEW_GMAIL_MESSAGE`). It listens for the specified trigger and invokes the decorated function (`callback_new_message`) when the event occurs.\n",
"\n",
"After extracting the relevant data from the trigger payload, we start a conversation between `user_proxy` and `chatbot` to send a reply to the received email."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "aDTm1tQECLQx",
"outputId": "8aa5ab9a-9526-4287-e8f1-7b8ac9cfb0b3"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:composio.utils.shared:Creating trigger subscription\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Subscribed to triggers!\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:composio.utils.shared:Received trigger event with trigger ID: ea36d63f-5cc9-4581-9a19-b647e7468697 and trigger name: GMAIL_NEW_GMAIL_MESSAGE\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"user_proxy (to chatbot):\n",
"\n",
"\n",
" Analyze the email content and create an appropriate reply. \n",
" a. The email was received from John Doe <example_email@gmail.com> \n",
" b. The content of the email is: hey, how are you?\n",
" \n",
" c. The thread id is: 1922811a78db4....\n",
" \n",
"\n",
"--------------------------------------------------------------------------------\n",
"chatbot (to user_proxy):\n",
"\n",
"GMAIL_REPLY_TO_THREAD thread_id: 1922811a78db4... message: \n",
"Hi John,\n",
"\n",
"I'm doing well, thank you! How about you?\n",
"\n",
"Best,\n",
"[Your Name]\n",
"\n",
"--------------------------------------------------------------------------------\n",
"user_proxy (to chatbot):\n",
"\n",
"***** Suggested tool call (call_qGQzJ6XgyO8LKSSFnwkQhSCz): GMAIL_REPLY_TO_THREAD_8c4b19f45c *****\n",
"Arguments: \n",
"{\"thread_id\":\"1922811a78db4...\",\"message_body\":\"Hi John,\\n\\nI'm doing well, thank you! How about you?\\n\\nBest,\\n[Your Name]\",\"recipient_email\":\"example_email@gmail.com\"}\n",
"*************************************************************************************************\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\n",
">>>>>>>> EXECUTING FUNCTION GMAIL_REPLY_TO_THREAD_8c4b19f45c...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:composio.utils.shared:Executing `GMAIL_REPLY_TO_THREAD` with params={'thread_id': '1922811a78db4...', 'message_body': \"Hi John,\\n\\nI'm doing well, thank you! How about you?\\n\\nBest,\\n[Your Name]\", 'recipient_email': 'example_email@gmail.com'} and metadata={} connected_account_i...\n",
"INFO:composio.utils.shared:Got response={'successfull': True, 'data': {'response_data': {'id': '1922811c1b3ed...', 'threadId': '1922811a78db4...', 'labelIds': ['SENT']}}, 'error': None} from action=<composio.client.enums._action.Action object at 0x7d50554c4310> with params={'thread_...\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"chatbot (to user_proxy):\n",
"\n",
"chatbot (to user_proxy):\n",
"\n",
"***** Response from calling tool (call_qGQzJ6XgyO8LKSSFnwkQhSCz) *****\n",
"{\"successfull\": true, \"data\": {\"response_data\": {\"id\": \"1922811c1b3ed...\", \"threadId\": \"1922811a78db4...\", \"labelIds\": [\"SENT\"]}}, \"error\": null}\n",
"**********************************************************************\n",
"\n",
"--------------------------------------------------------------------------------\n",
"user_proxy (to chatbot):\n",
"\n",
"I've replied to the email with the following message:\n",
"\n",
"Hi John,\n",
"\n",
"I'm doing well, thank you! How about you?\n",
"\n",
"Best,\n",
"[Your Name]\n",
"\n",
"Is there anything else you need?\n",
"\n",
"--------------------------------------------------------------------------------\n",
"chatbot (to user_proxy):\n",
"\n",
"TERMINATE\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\n"
]
}
],
"source": [
"# Create a trigger listener\n",
"listener = composio_toolset.create_trigger_listener()\n",
"\n",
"\n",
"@listener.callback(filters={\"trigger_name\": \"GMAIL_NEW_GMAIL_MESSAGE\"})\n",
"def callback_new_message(event) -> None:\n",
" # Get the payload and extract relevant information\n",
" payload = event.payload # Email payload\n",
" thread_id = payload.get(\"threadId\")\n",
" message = payload.get(\"messageText\")\n",
" sender_mail = payload.get(\"sender\")\n",
" if sender_mail is None:\n",
" print(\"No sender email found\")\n",
" return\n",
"\n",
" analyze_email_task = f\"\"\"\n",
" Analyze the email content and create an appropriate reply.\n",
" a. The email was received from {sender_mail}\n",
" b. The content of the email is: {message}\n",
" c. The thread id is: {thread_id}.\n",
" \"\"\"\n",
" # Initiate the conversation\n",
" res = user_proxy.initiate_chat(chatbot, message=analyze_email_task)\n",
" print(res.summary)\n",
"\n",
"\n",
"print(\"Subscribed to triggers!\")\n",
"# Start listening\n",
"listener.listen()"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"front_matter": {
"description": "Use Composio to create AI agents that seamlessly connect with external tools, Apps, and APIs to perform actions and receive triggers. With built-in support for AutoGen, Composio enables the creation of highly capable and adaptable AI agents that can autonomously execute complex tasks and deliver personalized experiences.",
"tags": [
"agents",
"tool use"
]
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

File diff suppressed because one or more lines are too long

View File

@ -26,7 +26,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install pyautogen llama-index llama-index-tools-wikipedia llama-index-readers-wikipedia wikipedia"
"%pip install autogen-agentchat~=0.2 llama-index llama-index-tools-wikipedia llama-index-readers-wikipedia wikipedia"
]
},
{

View File

@ -14,9 +14,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -218,8 +218,11 @@
],
"metadata": {
"front_matter": {
"tags": ["orchestration", "group chat"],
"description": "Explore the utilization of large language models in automated group chat scenarios, where agents perform tasks collectively, demonstrating how they can be configured, interact with each other, and retrieve specific information from external resources."
"description": "Explore the utilization of large language models in automated group chat scenarios, where agents perform tasks collectively, demonstrating how they can be configured, interact with each other, and retrieve specific information from external resources.",
"tags": [
"orchestration",
"group chat"
]
},
"kernelspec": {
"display_name": "flaml",

View File

@ -15,7 +15,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[retrievechat]\n",
"pip install autogen-agentchat[retrievechat]~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -39,9 +39,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -18,9 +18,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -35,7 +35,7 @@
"outputs": [],
"source": [
"%%capture --no-stderr\n",
"%pip install pyautogen[graph]>=0.2.11"
"%pip install autogen-agentchat[graph]~=0.2.11"
]
},
{
@ -94,7 +94,7 @@
" \"cache_seed\": 44, # change the seed for different trials\n",
" \"config_list\": autogen.config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\"tags\": [\"gpt-4\", \"gpt-4-32k\"]}, # comment out to get all\n",
" filter_dict={\"tags\": [\"gpt-4o\"]}, # comment out to get all\n",
" ),\n",
" \"temperature\": 0,\n",
"}"

View File

@ -14,9 +14,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -515,8 +515,10 @@
],
"metadata": {
"front_matter": {
"tags": ["group chat"],
"description": "Perform research using a group chat with a number of specialized agents"
"description": "Perform research using a group chat with a number of specialized agents",
"tags": [
"group chat"
]
},
"kernelspec": {
"display_name": "flaml",

View File

@ -12,9 +12,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -43,7 +43,7 @@
"config_list = autogen.config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\n",
" \"tags\": [\"gpt-4\", \"gpt-4-32k\"],\n",
" \"tags\": [\"gpt-4o\"],\n",
" },\n",
")"
]

View File

@ -12,9 +12,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -976,8 +976,10 @@
],
"metadata": {
"front_matter": {
"tags": ["group chat"],
"description": "Explore a group chat example using agents such as a coder and visualization agent."
"description": "Explore a group chat example using agents such as a coder and visualization agent.",
"tags": [
"group chat"
]
},
"kernelspec": {
"display_name": "flaml",

View File

@ -0,0 +1,866 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Using AutoGen AgentChat with LangChain-based Custom Client and Hugging Face Models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how you can use LangChain's extensive support for LLMs to enable flexible use of various Language Models (LLMs) in agent-based conversations in AutoGen.\n",
"\n",
"What we'll cover:\n",
"\n",
"1. Creating a custom model client that uses LangChain to load and interact with LLMs\n",
"2. Configuring AutoGen to use our custom LangChain-based model\n",
"3. Setting up AutoGen agents with the custom model\n",
"4. Demonstrating a simple conversation using this setup\n",
"\n",
"While we used a Hugging Face model in this example, the same approach can be applied to any LLM supported by LangChain, including models from OpenAI, Anthropic, or custom models. This integration opens up a wide range of possibilities for creating sophisticated, multi-model conversational agents using AutoGen\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Requirements\n",
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen torch transformers sentencepiece langchain-huggingface \n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
":::\n",
"````\n",
"\n",
"**NOTE: Depending on what model you use, you may need to play with the default prompts of the Agent's**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup and Imports\n",
"\n",
"First, let's import the necessary libraries and define our custom model client."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"from types import SimpleNamespace\n",
"\n",
"from langchain_core.messages import AIMessage, HumanMessage, SystemMessage\n",
"from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline\n",
"\n",
"from autogen import AssistantAgent, UserProxyAgent, config_list_from_json"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create and configure the custom model\n",
"\n",
"A custom model class can be created in many ways, but needs to adhere to the `ModelClient` protocol and response structure which is defined in client.py and shown below.\n",
"\n",
"The response protocol has some minimum requirements, but can be extended to include any additional information that is needed.\n",
"Message retrieval therefore can be customized, but needs to return a list of strings or a list of `ModelClientResponseProtocol.Choice.Message` objects.\n",
"\n",
"\n",
"```python\n",
"class ModelClient(Protocol):\n",
" \"\"\"\n",
" A client class must implement the following methods:\n",
" - create must return a response object that implements the ModelClientResponseProtocol\n",
" - cost must return the cost of the response\n",
" - get_usage must return a dict with the following keys:\n",
" - prompt_tokens\n",
" - completion_tokens\n",
" - total_tokens\n",
" - cost\n",
" - model\n",
"\n",
" This class is used to create a client that can be used by OpenAIWrapper.\n",
" The response returned from create must adhere to the ModelClientResponseProtocol but can be extended however needed.\n",
" The message_retrieval method must be implemented to return a list of str or a list of messages from the response.\n",
" \"\"\"\n",
"\n",
" RESPONSE_USAGE_KEYS = [\"prompt_tokens\", \"completion_tokens\", \"total_tokens\", \"cost\", \"model\"]\n",
"\n",
" class ModelClientResponseProtocol(Protocol):\n",
" class Choice(Protocol):\n",
" class Message(Protocol):\n",
" content: Optional[str]\n",
"\n",
" message: Message\n",
"\n",
" choices: List[Choice]\n",
" model: str\n",
"\n",
" def create(self, params) -> ModelClientResponseProtocol:\n",
" ...\n",
"\n",
" def message_retrieval(\n",
" self, response: ModelClientResponseProtocol\n",
" ) -> Union[List[str], List[ModelClient.ModelClientResponseProtocol.Choice.Message]]:\n",
" \"\"\"\n",
" Retrieve and return a list of strings or a list of Choice.Message from the response.\n",
"\n",
" NOTE: if a list of Choice.Message is returned, it currently needs to contain the fields of OpenAI's ChatCompletion Message object,\n",
" since that is expected for function or tool calling in the rest of the codebase at the moment, unless a custom agent is being used.\n",
" \"\"\"\n",
" ...\n",
"\n",
" def cost(self, response: ModelClientResponseProtocol) -> float:\n",
" ...\n",
"\n",
" @staticmethod\n",
" def get_usage(response: ModelClientResponseProtocol) -> Dict:\n",
" \"\"\"Return usage summary of the response using RESPONSE_USAGE_KEYS.\"\"\"\n",
" ...\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example of simple custom client\n",
"\n",
"Following the huggingface example for using [Mistral's Open-Orca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)\n",
"\n",
"For the response object, python's `SimpleNamespace` is used to create a simple object that can be used to store the response data, but any object that follows the `ClientResponseProtocol` can be used.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# custom client with custom model loader\n",
"\n",
"\n",
"class CustomModelClient:\n",
" \"\"\"Custom model client implementation for LangChain integration with AutoGen.\"\"\"\n",
"\n",
" def __init__(self, config, **kwargs):\n",
" \"\"\"Initialize the CustomModelClient.\"\"\"\n",
" print(f\"CustomModelClient config: {config}\")\n",
" self.device = config.get(\"device\", \"cpu\")\n",
"\n",
" gen_config_params = config.get(\"params\", {})\n",
" self.model_name = config[\"model\"]\n",
" pipeline = HuggingFacePipeline.from_model_id(\n",
" model_id=self.model_name,\n",
" task=\"text-generation\",\n",
" pipeline_kwargs=gen_config_params,\n",
" device=self.device,\n",
" )\n",
" self.model = ChatHuggingFace(llm=pipeline)\n",
" print(f\"Loaded model {config['model']} to {self.device}\")\n",
"\n",
" def _to_chatml_format(self, message):\n",
" \"\"\"Convert message to ChatML format.\"\"\"\n",
" if message[\"role\"] == \"system\":\n",
" return SystemMessage(content=message[\"content\"])\n",
" if message[\"role\"] == \"assistant\":\n",
" return AIMessage(content=message[\"content\"])\n",
" if message[\"role\"] == \"user\":\n",
" return HumanMessage(content=message[\"content\"])\n",
" raise ValueError(f\"Unknown message type: {type(message)}\")\n",
"\n",
" def create(self, params):\n",
" \"\"\"Create a response using the model.\"\"\"\n",
" if params.get(\"stream\", False) and \"messages\" in params:\n",
" raise NotImplementedError(\"Local models do not support streaming.\")\n",
"\n",
" num_of_responses = params.get(\"n\", 1)\n",
" response = SimpleNamespace()\n",
" inputs = [self._to_chatml_format(m) for m in params[\"messages\"]]\n",
" response.choices = []\n",
" response.model = self.model_name\n",
"\n",
" for _ in range(num_of_responses):\n",
" outputs = self.model.invoke(inputs)\n",
" text = outputs.content\n",
" choice = SimpleNamespace()\n",
" choice.message = SimpleNamespace()\n",
" choice.message.content = text\n",
" choice.message.function_call = None\n",
" response.choices.append(choice)\n",
"\n",
" return response\n",
"\n",
" def message_retrieval(self, response):\n",
" \"\"\"Retrieve messages from the response.\"\"\"\n",
" return [choice.message.content for choice in response.choices]\n",
"\n",
" def cost(self, response) -> float:\n",
" \"\"\"Calculate the cost of the response.\"\"\"\n",
" response.cost = 0\n",
" return 0\n",
"\n",
" @staticmethod\n",
" def get_usage(response):\n",
" \"\"\"Get usage statistics.\"\"\"\n",
" return {}"
]
},
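{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the client only talks to LangChain's chat-model interface (a list of messages passed to `invoke`), the Hugging Face pipeline above can be swapped for any other LangChain chat model. The snippet below is a minimal sketch rather than part of the runnable example: it assumes the optional `langchain-openai` package is installed and an `OPENAI_API_KEY` is available, and the model name is purely illustrative.\n",
"\n",
"```python\n",
"# Sketch: reuse CustomModelClient, but back it with a different LangChain chat model.\n",
"# Assumes `pip install langchain-openai` and an OPENAI_API_KEY in the environment.\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"\n",
"class OpenAIBackedClient(CustomModelClient):\n",
"    def __init__(self, config, **kwargs):\n",
"        self.model_name = config[\"model\"]  # e.g. \"gpt-4o-mini\" (illustrative)\n",
"        # create/message_retrieval/cost/get_usage are inherited unchanged.\n",
"        self.model = ChatOpenAI(model=self.model_name, temperature=0)\n",
"        print(f\"Loaded LangChain ChatOpenAI model {self.model_name}\")\n",
"```"
]
},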
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set your API Endpoint\n",
"\n",
"The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.\n",
"\n",
"It first looks for an environment variable of a specified name (\"OAI_CONFIG_LIST\" in this example), which needs to be a valid json string. If that variable is not found, it looks for a json file with the same name. It filters the configs by models (you can filter by other keys as well).\n",
"\n",
"The json looks like the following:\n",
"```json\n",
"[\n",
" {\n",
" \"model\": \"gpt-4\",\n",
" \"api_key\": \"<your OpenAI API key here>\"\n",
" },\n",
" {\n",
" \"model\": \"gpt-4\",\n",
" \"api_key\": \"<your Azure OpenAI API key here>\",\n",
" \"base_url\": \"<your Azure OpenAI API base here>\",\n",
" \"api_type\": \"azure\",\n",
" \"api_version\": \"2024-02-01\"\n",
" },\n",
" {\n",
" \"model\": \"gpt-4-32k\",\n",
" \"api_key\": \"<your Azure OpenAI API key here>\",\n",
" \"base_url\": \"<your Azure OpenAI API base here>\",\n",
" \"api_type\": \"azure\",\n",
" \"api_version\": \"2024-02-01\"\n",
" }\n",
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) for full code examples of the different methods."
]
},
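{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, a minimal call that keeps only the `gpt-4` entries from the sample JSON above could look like the following (the filter values here are purely illustrative; the filter actually used in this notebook selects on `model_client_cls` a couple of cells below):\n",
"\n",
"```python\n",
"# Illustrative only: load OAI_CONFIG_LIST and keep just the gpt-4 entries.\n",
"config_list = config_list_from_json(\n",
"    \"OAI_CONFIG_LIST\",\n",
"    filter_dict={\"model\": [\"gpt-4\"]},\n",
")\n",
"```"
]
},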
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set the config for the custom model\n",
"\n",
"You can add any paramteres that are needed for the custom model loading in the same configuration list.\n",
"\n",
"It is important to add the `model_client_cls` field and set it to a string that corresponds to the class name: `\"CustomModelClient\"`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OAI_CONFIG_LIST\"] = json.dumps(\n",
" [\n",
" {\n",
" \"model\": \"mistralai/Mistral-7B-Instruct-v0.2\",\n",
" \"model_client_cls\": \"CustomModelClient\",\n",
" \"device\": 0,\n",
" \"n\": 1,\n",
" \"params\": {\n",
" \"max_new_tokens\": 500,\n",
" \"top_k\": 50,\n",
" \"temperature\": 0.1,\n",
" \"do_sample\": True,\n",
" \"return_full_text\": False,\n",
" },\n",
" }\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"config_list_custom = config_list_from_json(\n",
" \"OAI_CONFIG_LIST\",\n",
" filter_dict={\"model_client_cls\": [\"CustomModelClient\"]},\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"\n",
"from huggingface_hub import login\n",
"\n",
"# The Mistral-7B-Instruct-v0.2 is a gated model which requires API token to access\n",
"login(token=getpass.getpass(\"Enter your HuggingFace API Token\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Construct Agents\n",
"\n",
"Consturct a simple conversation between a User proxy and an Assistent agent"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[autogen.oai.client: 09-01 12:53:51] {484} INFO - Detected custom model client in config: CustomModelClient, model client can not be used until register_model_client is called.\n"
]
}
],
"source": [
"assistant = AssistantAgent(\"assistant\", llm_config={\"config_list\": config_list_custom})\n",
"user_proxy = UserProxyAgent(\"user_proxy\", code_execution_config=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Register the custom client class to the assistant agent"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CustomModelClient config: {'model': 'microsoft/Phi-3.5-mini-instruct', 'model_client_cls': 'CustomModelClient', 'device': 0, 'n': 1, 'params': {'max_new_tokens': 100, 'top_k': 50, 'temperature': 0.1, 'do_sample': True, 'return_full_text': False}}\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:07<00:00, 3.51s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loaded model microsoft/Phi-3.5-mini-instruct to 0\n"
]
}
],
"source": [
"assistant.register_model_client(model_client_cls=CustomModelClient)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33muser_proxy\u001b[0m (to assistant):\n",
"\n",
"Write python code to print Hello World!\n",
"\n",
"--------------------------------------------------------------------------------\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"You are not running the flash-attention implementation, expect numerical differences.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33massistant\u001b[0m (to user_proxy):\n",
"\n",
" ```python\n",
"# filename: hello_world.py\n",
"\n",
"print(\"Hello World!\")\n",
"```\n",
"\n",
"To execute this code, save it in a file named `hello_world.py`. Then, open your terminal or command prompt, navigate to the directory containing the file, and run the following command:\n",
"\n",
"```\n",
"python hello_world.py\n",
"```\n",
"\n",
"The output should be:\n",
"\n",
"```\n",
"Hello World!\n",
"```\n",
"\n",
"If you encounter any errors,\n",
"\n",
"--------------------------------------------------------------------------------\n"
]
},
{
"data": {
"text/plain": [
"ChatResult(chat_id=None, chat_history=[{'content': 'Write python code to print Hello World!', 'role': 'assistant', 'name': 'user_proxy'}, {'content': ' ```python\\n# filename: hello_world.py\\n\\nprint(\"Hello World!\")\\n```\\n\\nTo execute this code, save it in a file named `hello_world.py`. Then, open your terminal or command prompt, navigate to the directory containing the file, and run the following command:\\n\\n```\\npython hello_world.py\\n```\\n\\nThe output should be:\\n\\n```\\nHello World!\\n```\\n\\nIf you encounter any errors,', 'role': 'user', 'name': 'assistant'}], summary=' ```python\\n# filename: hello_world.py\\n\\nprint(\"Hello World!\")\\n```\\n\\nTo execute this code, save it in a file named `hello_world.py`. Then, open your terminal or command prompt, navigate to the directory containing the file, and run the following command:\\n\\n```\\npython hello_world.py\\n```\\n\\nThe output should be:\\n\\n```\\nHello World!\\n```\\n\\nIf you encounter any errors,', cost={'usage_including_cached_inference': {'total_cost': 0}, 'usage_excluding_cached_inference': {'total_cost': 0}}, human_input=['exit'])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"user_proxy.initiate_chat(assistant, message=\"Write python code to print Hello World!\")"
]
}
],
"metadata": {
"front_matter": {
"description": "Define and laod a custom model",
"tags": [
"custom model"
]
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"vscode": {
"interpreter": {
"hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
}
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {
"2d910cfd2d2a4fc49fc30fbbdc5576a7": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"454146d0f7224f038689031002906e6f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_e4ae2b6f5a974fd4bafb6abb9d12ff26",
"IPY_MODEL_577e1e3cc4db4942b0883577b3b52755",
"IPY_MODEL_b40bdfb1ac1d4cffb7cefcb870c64d45"
],
"layout": "IPY_MODEL_dc83c7bff2f241309537a8119dfc7555",
"tabbable": null,
"tooltip": null
}
},
"577e1e3cc4db4942b0883577b3b52755": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_2d910cfd2d2a4fc49fc30fbbdc5576a7",
"max": 1,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_74a6ba0c3cbc4051be0a83e152fe1e62",
"tabbable": null,
"tooltip": null,
"value": 1
}
},
"6086462a12d54bafa59d3c4566f06cb2": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"74a6ba0c3cbc4051be0a83e152fe1e62": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"7d3f3d9e15894d05a4d188ff4f466554": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"b40bdfb1ac1d4cffb7cefcb870c64d45": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_f1355871cc6f4dd4b50d9df5af20e5c8",
"placeholder": "",
"style": "IPY_MODEL_ca245376fd9f4354af6b2befe4af4466",
"tabbable": null,
"tooltip": null,
"value": " 1/1 [00:00&lt;00:00, 44.69it/s]"
}
},
"ca245376fd9f4354af6b2befe4af4466": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"dc83c7bff2f241309537a8119dfc7555": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e4ae2b6f5a974fd4bafb6abb9d12ff26": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_6086462a12d54bafa59d3c4566f06cb2",
"placeholder": "",
"style": "IPY_MODEL_7d3f3d9e15894d05a4d188ff4f466554",
"tabbable": null,
"tooltip": null,
"value": "100%"
}
},
"f1355871cc6f4dd4b50d9df5af20e5c8": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
}
},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@ -28,7 +28,7 @@
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```"
]
},
@ -45,7 +45,7 @@
},
"outputs": [],
"source": [
"# %pip install \"pyautogen>=0.2.3\""
"# %pip install \"autogen-agentchat~=0.2\""
]
},
{
@ -102,7 +102,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
]
},
{

View File

@ -20,7 +20,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[lmm]\n",
"pip install autogen-agentchat[lmm]~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -28,9 +28,9 @@
"\n",
"## Requirements\n",
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install `pyautogen` and `Langchain`:\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install `autogen-agentchat` and `Langchain`:\n",
"```bash\n",
"pip install pyautogen Langchain\n",
"pip install autogen-agentchat~=0.2 Langchain\n",
"```"
]
},
@ -47,7 +47,7 @@
},
"outputs": [],
"source": [
"%pip install \"pyautogen>=0.2.3\" Langchain"
"%pip install \"autogen-agentchat~=0.2\" Langchain"
]
},
{
@ -139,7 +139,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
]
},
{

View File

@ -21,9 +21,9 @@
"source": [
"### Before everything starts, install AutoGen with the `lmm` option\n",
"\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install \"pyautogen[lmm]>=0.2.17\"\n",
"pip install \"autogen-agentchat[lmm]~=0.2\"\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n"

View File

@ -26,7 +26,7 @@
"source": [
"### Before everything starts, install AutoGen with the `lmm` option\n",
"```bash\n",
"pip install \"pyautogen[lmm]>=0.2.3\"\n",
"pip install \"autogen-agentchat[lmm]~=0.2\"\n",
"```"
]
},

View File

@ -17,7 +17,7 @@
"source": [
"This notebook demonstrates an intelligent customer service chatbot system that combines:\n",
"\n",
"- PyAutoGen for conversational agents\n",
"- AutoGen for conversational agents\n",
"- Mem0 for memory management\n",
"\n",
"[Mem0](https://www.mem0.ai/) provides a smart, self-improving memory layer for Large Language Models (LLMs), enabling developers to create personalized AI experiences that evolve with each user interaction. Refer [docs](https://docs.mem0.ai/overview) for more information.\n",
@ -50,7 +50,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen mem0ai\n",
"pip install autogen-agentchat~=0.2 mem0ai\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -225,8 +225,8 @@
"metadata": {},
"outputs": [],
"source": [
"# pyautogen>0.1.14 supports openai>=1\n",
"%pip install \"pyautogen>0.2\" \"openai>1\" -q"
"# autogen-agentchat>0.1.14 supports openai>=1\n",
"%pip install \"autogen-agentchat~=0.2\" \"openai>1\" -q"
]
},
{
@ -418,7 +418,7 @@
},
"outputs": [],
"source": [
"%pip install \"pyautogen[retrievechat,lmm]>=0.2.28\" -q"
"%pip install \"autogen-agentchat[retrievechat,lmm]~=0.2\" -q"
]
},
{

View File

@ -15,9 +15,9 @@
"\n",
"\\:\\:\\:info Requirements\n",
"\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -15,9 +15,9 @@
"\n",
"\\:\\:\\:info Requirements\n",
"\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -39,7 +39,7 @@
"source": [
"## Installation\n",
"\n",
"First you need to install the `pyautogen` and `chess` packages to use AutoGen."
"First you need to install the `autogen-agentchat~=0.2` and `chess` packages to use AutoGen."
]
},
{
@ -48,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -qqq pyautogen chess"
"! pip install -qqq autogen-agentchat~=0.2 chess"
]
},
{

View File

@ -40,7 +40,7 @@
"source": [
"## Installation\n",
"\n",
"First, you need to install the `pyautogen` and `chess` packages to use AutoGen. We'll include Anthropic and Together.AI libraries."
"First, you need to install the `autogen-agentchat~=0.2` and `chess` packages to use AutoGen. We'll include Anthropic and Together.AI libraries."
]
},
{
@ -49,7 +49,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -qqq pyautogen[anthropic,together] chess"
"! pip install -qqq autogen-agentchat[anthropic,together]~=0.2 chess"
]
},
{

View File

@ -15,9 +15,9 @@
"\n",
"\\:\\:\\:info Requirements\n",
"\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -15,9 +15,9 @@
"\n",
"\\:\\:\\:info Requirements\n",
"\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -21,7 +21,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen eventlet gurobipy\n",
"pip install autogen-agentchat~=0.2 eventlet gurobipy\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -19,9 +19,9 @@
"AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -36,7 +36,7 @@
"outputs": [],
"source": [
"%%capture --no-stderr\n",
"# %pip install \"pyautogen>=0.2.3\""
"# %pip install \"autogen-agentchat~=0.2\""
]
},
{

View File

@ -16,9 +16,9 @@
"AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -6,7 +6,7 @@
"source": [
"## RAG OpenAI Assistants in AutoGen\n",
"\n",
"This notebook shows an example of the [`GPTAssistantAgent`](https://github.com/microsoft/autogen/blob/main/autogen/agentchat/contrib/gpt_assistant_agent.py#L16C43-L16C43) with retrieval augmented generation. `GPTAssistantAgent` is an experimental AutoGen agent class that leverages the [OpenAI Assistant API](https://platform.openai.com/docs/assistants/overview) for conversational capabilities, working with\n",
"This notebook shows an example of the [`GPTAssistantAgent`](https://github.com/microsoft/autogen/blob/0.2/autogen/agentchat/contrib/gpt_assistant_agent.py#L16C43-L16C43) with retrieval augmented generation. `GPTAssistantAgent` is an experimental AutoGen agent class that leverages the [OpenAI Assistant API](https://platform.openai.com/docs/assistants/overview) for conversational capabilities, working with\n",
"`UserProxyAgent` in AutoGen."
]
},

View File

@ -6,7 +6,7 @@
"source": [
"## OpenAI Assistants in AutoGen\n",
"\n",
"This notebook shows a very basic example of the [`GPTAssistantAgent`](https://github.com/microsoft/autogen/blob/main/autogen/agentchat/contrib/gpt_assistant_agent.py#L16C43-L16C43), which is an experimental AutoGen agent class that leverages the [OpenAI Assistant API](https://platform.openai.com/docs/assistants/overview) for conversational capabilities, working with\n",
"This notebook shows a very basic example of the [`GPTAssistantAgent`](https://github.com/microsoft/autogen/blob/0.2/autogen/agentchat/contrib/gpt_assistant_agent.py#L16C43-L16C43), which is an experimental AutoGen agent class that leverages the [OpenAI Assistant API](https://platform.openai.com/docs/assistants/overview) for conversational capabilities, working with\n",
"`UserProxyAgent` in AutoGen."
]
},

View File

@ -12,9 +12,9 @@
"AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",

View File

@ -26,9 +26,9 @@
"\n",
"## Requirements\n",
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install pyautogen and docker:\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install autogen-agentchat and docker:\n",
"```bash\n",
"pip install pyautogen docker\n",
"pip install autogen-agentchat~=0.2 docker\n",
"```"
]
},
@ -45,7 +45,7 @@
},
"outputs": [],
"source": [
"# %pip install \"pyautogen>=0.2.3\" docker"
"# %pip install \"autogen-agentchat~=0.2\" docker"
]
},
{
@ -105,7 +105,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) for full code examples of the different methods.\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/notebook/oai_openai_utils.ipynb) for full code examples of the different methods.\n",
"\n",
"## Construct Agents\n",
"\n",

View File

@ -15,9 +15,9 @@
"\n",
"````{=mdx}\n",
":::info Requirements\n",
"Install `pyautogen`:\n",
"Install `autogen-agentchat`:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -357,8 +357,11 @@
],
"metadata": {
"front_matter": {
"tags": ["orchestration", "nested chat"],
"description": "Explore the demonstration of the SocietyOfMindAgent in the AutoGen library, which runs a group chat as an internal monologue, but appears to the external world as a single agent, offering a structured way to manage complex interactions among multiple agents and handle issues such as extracting responses from complex dialogues and dealing with context window constraints."
"description": "Explore the demonstration of the SocietyOfMindAgent in the AutoGen library, which runs a group chat as an internal monologue, but appears to the external world as a single agent, offering a structured way to manage complex interactions among multiple agents and handle issues such as extracting responses from complex dialogues and dealing with context window constraints.",
"tags": [
"orchestration",
"nested chat"
]
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",

View File

@ -28,7 +28,7 @@
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
"```bash\n",
"pip install pyautogen\n",
"pip install autogen-agentchat~=0.2\n",
"```"
]
},
@ -45,7 +45,7 @@
},
"outputs": [],
"source": [
"# %pip install \"pyautogen>=0.2.3\""
"# %pip install \"autogen-agentchat~=0.2\""
]
},
{
@ -102,7 +102,7 @@
"]\n",
"```\n",
"\n",
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/main/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
"You can set the value of config_list in any way you prefer. Please refer to this [notebook](https://github.com/microsoft/autogen/blob/0.2/website/docs/topics/llm_configuration.ipynb) for full code examples of the different methods."
]
},
{

View File

@ -15,7 +15,7 @@
"\n",
"AutoGen requires `Python>=3.8`. To run this notebook example, please install AutoGen with the optional `websurfer` dependencies:\n",
"```bash\n",
"pip install \"pyautogen[websurfer]\"\n",
"pip install \"autogen-agentchat[websurfer]~=0.2\"\n",
"```"
]
},
@ -25,7 +25,7 @@
"metadata": {},
"outputs": [],
"source": [
"# %pip install --quiet \"pyautogen[websurfer]\""
"# %pip install --quiet \"autogen-agentchat[websurfer]~=0.2\""
]
},
{
@ -479,7 +479,7 @@
"#### Multi-Agent Conversation Framework[](#multi-agent-conversation-framework \"Direct link to Multi-Agent Conversation Framework\")\n",
"\n",
"Autogen enables the next-gen LLM applications with a generic multi-agent conversation framework. It offers customizable and conversable agents which integrate LLMs, tools, and humans.\n",
"By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For [example](https://github.com/microsoft/autogen/blob/main/test/twoagent.py),\n",
"By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For [example](https://github.com/microsoft/autogen/blob/0.2/test/twoagent.py),\n",
"\n",
"The figure below shows an example conversation flow with AutoGen.\n",
"\n",

View File

@ -13,7 +13,7 @@
"\n",
"In making decisions about memo storage and retrieval, `Teachability` calls an instance of `TextAnalyzerAgent` to analyze pieces of text in several different ways. This adds extra LLM calls involving a relatively small number of tokens. These calls can add a few seconds to the time a user waits for a response.\n",
"\n",
"This notebook demonstrates how `Teachability` can be added to an agent so that it can learn facts, preferences, and skills from users. To chat with a teachable agent yourself, run [chat_with_teachable_agent.py](https://github.com/microsoft/autogen/blob/main/test/agentchat/contrib/capabilities/chat_with_teachable_agent.py).\n",
"This notebook demonstrates how `Teachability` can be added to an agent so that it can learn facts, preferences, and skills from users. To chat with a teachable agent yourself, run [chat_with_teachable_agent.py](https://github.com/microsoft/autogen/blob/0.2/test/agentchat/contrib/capabilities/chat_with_teachable_agent.py).\n",
"\n",
"## Requirements\n",
"\n",
@ -22,7 +22,7 @@
"Some extra dependencies are needed for this notebook, which can be installed via pip:\n",
"\n",
"```bash\n",
"pip install pyautogen[teachable]\n",
"pip install autogen-agentchat[teachable]~=0.2\n",
"```\n",
"\n",
"For more information, please refer to the [installation guide](/docs/installation/).\n",
@ -99,8 +99,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[92m\n",
"CLEARING MEMORY\u001B[0m\n"
"\u001b[92m\n",
"CLEARING MEMORY\u001b[0m\n"
]
}
],
@ -152,14 +152,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"What is the Vicuna model?\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"The term \"Vicuna model\" does not point to a well-known concept or framework in the realms of science, technology, or social sciences as of my last knowledge update in early 2023. It's possible that the term could be a reference to a proprietary model or a concept that has emerged after my last update or it might be a misspelling or a misunderstanding.\n",
"\n",
@ -185,14 +185,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Vicuna is a 13B-parameter language model released by Meta.\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"My apologies for the confusion. As of my last update, the Vicuna model had not been part of my database. If Vicuna is indeed a 13-billion-parameter language model developed by Meta (formerly Facebook Inc.), then it would be one of the large-scale transformer-based models akin to those like GPT-3 by OpenAI.\n",
"\n",
@ -222,14 +222,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"What is the Orca model?\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"As of my last update, the Orca model appears to reference a new development that I do not have extensive information on, similar to the earlier reference to the Vicuna model.\n",
"\n",
@ -255,14 +255,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Orca is a 13B-parameter language model developed by Microsoft. It outperforms Vicuna on most tasks.\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"Thank you for providing the context about the Orca model. Based on the new information you've given, Orca is a language model with 13 billion parameters, similar in size to Meta's Vicuna model, but developed by Microsoft. If it outperforms Vicuna on most tasks, it suggests that it could have been trained on a more diverse dataset, use a more advanced architecture, have more effective training techniques, or some combination of these factors.\n",
"\n",
@ -297,14 +297,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"How does the Vicuna model compare to the Orca model?\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"The Vicuna model and the Orca model are both large-scale language models with a significant number of parameters—13 billion, to be exact.\n",
"\n",
@ -340,7 +340,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Please summarize this abstract.\n",
"\n",
@ -350,9 +350,9 @@
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"AutoGen is an open-source framework designed to facilitate the creation of applications using large language models (LLMs) through the use of multiple conversational agents. These agents can be tailored to users' needs and are capable of interaction in multiple modes, including with other LLMs, human input, and additional tools. With AutoGen, developers have the flexibility to program agent interactions using both natural language and code, enabling the creation of complex patterns suitable for a wide range of applications. The framework has been proven effective across various fields, such as math, coding, question answering, and entertainment, based on empirical studies conducted to test its capabilities.\n",
"\n",
@ -386,7 +386,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Please summarize this abstract. \n",
"When I'm summarizing an abstract, I try to make the summary contain just three short bullet points: the title, the innovation, and the key empirical results.\n",
@ -397,9 +397,9 @@
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"- Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation\n",
"- Innovation: AutoGen, an open-source framework that supports building large language model (LLM) applications by enabling conversation among multiple customizable and conversable agents.\n",
@ -436,7 +436,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Please summarize this abstract.\n",
"\n",
@ -445,9 +445,9 @@
"Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions.\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"- Title: Sparks of Artificial General Intelligence: Early experiments with GPT-4\n",
"\n",
@ -487,7 +487,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Consider the identity: \n",
"9 * 4 + 6 * 6 = 72\n",
@ -496,9 +496,9 @@
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"To solve this problem, we need to find a way to add exactly 27 (since 99 - 72 = 27) to the left hand side of the equation by modifying only one of the integers in the equation. \n",
"\n",
@ -563,7 +563,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Consider the identity: \n",
"9 * 4 + 6 * 6 = 72\n",
@ -584,9 +584,9 @@
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"Given the new set of instructions and the correction that according to a past memory, the solution is \"9 * 1 + 6 * 9\", let's follow the steps carefully to arrive at the correct modified equation.\n",
"\n",
@ -668,7 +668,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Consider the identity: \n",
"9 * 4 + 6 * 6 = 72\n",
@ -677,9 +677,9 @@
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"Let's apply the steps you've provided to solve the problem at hand:\n",
"\n",
@ -740,7 +740,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u001B[33muser\u001B[0m (to teachable_agent):\n",
"\u001b[33muser\u001b[0m (to teachable_agent):\n",
"\n",
"Consider the identity: \n",
"8 * 3 + 7 * 9 = 87\n",
@ -749,9 +749,9 @@
"\n",
"\n",
"--------------------------------------------------------------------------------\n",
"\u001B[31m\n",
">>>>>>>> USING AUTO REPLY...\u001B[0m\n",
"\u001B[33mteachable_agent\u001B[0m (to user):\n",
"\u001b[31m\n",
">>>>>>>> USING AUTO REPLY...\u001b[0m\n",
"\u001b[33mteachable_agent\u001b[0m (to user):\n",
"\n",
"Let's apply the plan step-by-step to find the correct modification:\n",
"\n",

Some files were not shown because too many files have changed in this diff.