Support #425. Because of a problem in my local environment (the `vearch`
Docker container fails to start on Apple M1), I ran the integration tests
against a remote `vearch` instance (a `vearch` container started with
Docker on a remote host), and they pass. (I have not verified starting
`vearch` with `Testcontainers`.)
Two more things need discussion and your opinion:
1. The `findRelevant` method converts between `RelevanceScore` and
`CosineSimilarity`. I am not sure this is correct: `vearch` does not
support cosine similarity, so I use the inner product instead (which
equals cosine similarity when the vectors are normalized). Should we
normalize each vector before adding it to the embedding store?
2. Creating a `vearch` space comes with many constraints (each retrieval
type has different parameters). Should we validate them, or leave
validation to users? (See [Create
Space](https://vearch.readthedocs.io/en/latest/use_op/op_space.html#create-space).)
Currently I model this with several static nested classes (see
`RetrievalParam` and `RetrievalType`); `SpaceEngine` performs some
constraint checks.
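To make point 1 concrete, here is a minimal, self-contained sketch (not the code in this PR) showing that the inner product of L2-normalized vectors matches cosine similarity. The class and method names are hypothetical, and the linear mapping `(cosine + 1) / 2` to a relevance score in `[0, 1]` is an assumption made for illustration.

```java
final class NormalizedInnerProduct {

    // Scales a vector to unit Euclidean length (L2 normalization).
    static float[] normalize(float[] v) {
        double norm = Math.sqrt(innerProduct(v, v));
        if (norm == 0.0) {
            throw new IllegalArgumentException("Cannot normalize a zero vector");
        }
        float[] result = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = (float) (v[i] / norm);
        }
        return result;
    }

    // Plain inner (dot) product, accumulated in double precision.
    static double innerProduct(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // Cosine similarity: inner product divided by the product of the lengths.
    static double cosineSimilarity(float[] a, float[] b) {
        return innerProduct(a, b)
                / (Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b)));
    }

    // Assumed mapping from cosine similarity in [-1, 1] to a score in [0, 1].
    static double relevanceScore(double cosine) {
        return (cosine + 1) / 2;
    }

    public static void main(String[] args) {
        float[] a = {3f, 4f};
        float[] b = {4f, 3f};
        double cosine = cosineSimilarity(a, b);
        double innerOfNormalized = innerProduct(normalize(a), normalize(b));
        // The two values agree up to float rounding.
        System.out.println("cosine similarity:        " + cosine);
        System.out.println("inner product (unit vecs): " + innerOfNormalized);
        System.out.println("relevance score:           " + relevanceScore(cosine));
    }
}
```

If vectors are normalized once at insertion time and the query vector is normalized at search time, the inner product returned by the store can be treated as a cosine similarity. Without normalization the inner product is unbounded, so the mapping above would no longer yield values in `[0, 1]`.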
Public Ollama Client
- list model method
- get model details method
### Motivation
In my research project, I'm using Langchain4j, as anyone should :)
From my research, it seems that this client code is in sync with the
Ollama API, and it is the easiest and most maintainable code. So, in my
project, I use Langchain4j, and it's backed by the Ollama provider. In
my use case, I need to be able to list models and, in the future, even
create one. Is it possible to make the OllamaClient code public?
I wrote some thoughts about OpenAPI in Ollama in this issue.
https://github.com/jmorganca/ollama/issues/716#issuecomment-1904415711
So, if Ollama developers consider adding an OpenAPI endpoint, I will be
the first to make the OllamaClient package-private again.
Adds support for Baidu's ERNIE Bot (Wenxin) large language model,
currently one of the leading models in China and, as far as I know,
widely used. Because the parameters of the Wenxin model differ from those
of ChatGPT, I have added a new set of request and response parameters.
`OpenSearchEmbeddingStoreAwsIT` uses `latest`, but this is a development
tag that was used to unblock the IT; the fix is now part of version
`3.1.0`. Also, `AmazonS3DocumentLoaderIT` uses an old version. Let's
unify them and use the most recent LocalStack version.
Until now, LangChain4j had only a simple (a.k.a. naive) RAG
implementation: a single `Retriever` was invoked on each interaction
with the LLM, and all retrieved `TextSegments` were appended to the end
of the `UserMessage`. This approach was very limiting.
This PR introduces support for much more advanced RAG use cases. The
design and mental model are inspired by [this
article](https://blog.langchain.dev/deconstructing-rag/) and [this
paper](https://arxiv.org/abs/2312.10997), making it advisable to read
the article.
This PR introduces a `RetrievalAugmentor` interface responsible for
augmenting a `UserMessage` with relevant content before sending it to
the LLM. The `RetrievalAugmentor` can be used with both `AiServices` and
`ConversationalRetrievalChain`, as well as stand-alone.
A default implementation of `RetrievalAugmentor`
(`DefaultRetrievalAugmentor`) is provided with the library and is
suggested as a good starting point. However, users are not limited to it
and can have more freedom with their own custom implementations.
`DefaultRetrievalAugmentor` decomposes the entire RAG flow into more
granular steps and base components:
- `QueryTransformer`
- `QueryRouter`
- `ContentRetriever` (the old `Retriever` is now deprecated)
- `ContentAggregator`
- `ContentInjector`
This modular design aims to separate concerns and simplify development,
testing, and evaluation. Most (if not all) currently known and proven
RAG techniques can be represented as one or multiple base components
listed above.
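The decomposition above could be sketched roughly as follows. This is a deliberately simplified illustration, not the interfaces introduced by this PR: every type is a plain `java.util.function` stand-in operating on strings, and the class name `RagFlowSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

final class RagFlowSketch {

    // query -> one or more transformed queries (e.g. expansion, compression)
    private final Function<String, List<String>> queryTransformer;
    // query -> the retrievers that should handle it
    private final Function<String, List<Function<String, List<String>>>> queryRouter;
    // all retrieved result lists -> one final list (e.g. fusion, re-ranking)
    private final Function<List<List<String>>, List<String>> contentAggregator;
    // (final contents, original user message) -> augmented user message
    private final BiFunction<List<String>, String, String> contentInjector;

    RagFlowSketch(Function<String, List<String>> queryTransformer,
                  Function<String, List<Function<String, List<String>>>> queryRouter,
                  Function<List<List<String>>, List<String>> contentAggregator,
                  BiFunction<List<String>, String, String> contentInjector) {
        this.queryTransformer = queryTransformer;
        this.queryRouter = queryRouter;
        this.contentAggregator = contentAggregator;
        this.contentInjector = contentInjector;
    }

    // Runs the steps in order: transform, route, retrieve, aggregate, inject.
    String augment(String userMessage) {
        List<List<String>> retrieved = new ArrayList<>();
        for (String query : queryTransformer.apply(userMessage)) {
            for (Function<String, List<String>> retriever : queryRouter.apply(query)) {
                retrieved.add(retriever.apply(query));
            }
        }
        List<String> contents = contentAggregator.apply(retrieved);
        return contentInjector.apply(contents, userMessage);
    }

    public static void main(String[] args) {
        Function<String, List<String>> retriever =
                q -> Collections.singletonList("doc about " + q);
        RagFlowSketch flow = new RagFlowSketch(
                q -> Collections.singletonList(q),          // no-op transformer
                q -> Collections.singletonList(retriever),  // always the same retriever
                lists -> {                                  // simple concatenation
                    List<String> all = new ArrayList<>();
                    for (List<String> list : lists) {
                        all.addAll(list);
                    }
                    return all;
                },
                (contents, msg) -> msg + "\n" + String.join("\n", contents));
        System.out.println(flow.augment("cats"));
    }
}
```

Because each step is an independent function, any one of them can be swapped or tested in isolation, which is the separation of concerns the modular design is aiming for.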
Here is how the decomposed RAG flow can be visualized:
![advanced-rag](https://github.com/langchain4j/langchain4j/assets/132277850/b699077d-dabf-4768-a241-3fcd9ab0286c)
This mental and software model aims to simplify the thinking, reasoning,
and implementation of advanced RAG flows.
Each base component listed above has a simple, sensible default
implementation preconfigured in `DefaultRetrievalAugmentor`. Each default
can be overridden by a more sophisticated implementation provided by the
library out-of-the-box, or by a custom one. The list of implementations
is expected to grow over time as we discover new techniques and implement
existing proven ones.
This PR also introduces out-of-the-box support for the following proven
RAG techniques:
- Query expansion
- Query compression
- Query routing using LLM
- [Reciprocal Rank
Fusion](https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking)
- Re-ranking ([Cohere Rerank](https://docs.cohere.com/docs/reranking)
integration is coming in a [separate
PR](https://github.com/langchain4j/langchain4j/pull/539)).
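As an illustration of one technique from the list above, here is a minimal sketch of Reciprocal Rank Fusion. It is not the implementation from this PR: the class name is hypothetical and documents are represented by plain string IDs. Each document receives a score of `1 / (k + rank)` from every ranking it appears in (rank starting at 1), and the scores are summed; `k = 60` is a commonly used constant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReciprocalRankFusionSketch {

    // Fuses several ranked lists of document IDs into a single ranked list.
    static List<String> fuse(List<List<String>> rankings, int k) {
        // LinkedHashMap keeps first-seen order, which the stable sort below
        // preserves for documents with equal scores.
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                int rank = i + 1; // ranks are 1-based
                scores.merge(ranking.get(i), 1.0 / (k + rank), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first.
        fused.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return fused;
    }

    public static void main(String[] args) {
        List<String> fromFirstRetriever = Arrays.asList("a", "b", "c");
        List<String> fromSecondRetriever = Arrays.asList("b", "c", "d");
        List<String> fused = fuse(Arrays.asList(fromFirstRetriever, fromSecondRetriever), 60);
        System.out.println(fused); // prints [b, c, a, d]
    }
}
```

Documents that appear in several rankings ("b" and "c" here) outrank documents that appear in only one, even when the latter were ranked first by their retriever.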
1. Add an implementation of the Qwen multimodal models, supporting image
and text input.
Example of key input fragment:
```json
"messages": [
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"text": "What animal is in the picture?"}
]
}
]
```
Example of key output fragment:
```json
"output": {
"choices": [
{
"finish_reason": null,
"message": {
"role": "assistant",
"content": [
{
"text": "The picture shows a dog sitting on the beach with its owner."
}
]
}
}
]
}
```
Note: The DashScope SDK supports local file URLs starting with `file://...`
2. Add new optional parameters: `baseUrl`, `maxTokens`
Bumps [com.google.guava:guava](https://github.com/google/guava) from
30.1-jre to 32.0.0-jre.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/google/guava/releases">com.google.guava:guava's
releases</a>.</em></p>
<blockquote>
<h2>32.0.0</h2>
<h3>Maven</h3>
<pre lang="xml"><code>&lt;dependency&gt;
  &lt;groupId&gt;com.google.guava&lt;/groupId&gt;
  &lt;artifactId&gt;guava&lt;/artifactId&gt;
  &lt;version&gt;32.0.0-jre&lt;/version&gt;
  &lt;!-- or, for Android: --&gt;
  &lt;version&gt;32.0.0-android&lt;/version&gt;
&lt;/dependency&gt;
</code></pre>
<h3>Jar files</h3>
<ul>
<li><a
href="https://repo1.maven.org/maven2/com/google/guava/guava/32.0.0-jre/guava-32.0.0-jre.jar">32.0.0-jre.jar</a></li>
<li><a
href="https://repo1.maven.org/maven2/com/google/guava/guava/32.0.0-android/guava-32.0.0-android.jar">32.0.0-android.jar</a></li>
</ul>
<p>Guava requires <a
href="https://github.com/google/guava/wiki/UseGuavaInYourBuild#what-about-guavas-own-dependencies">one
runtime dependency</a>, which you can download here:</p>
<ul>
<li><a
href="https://repo1.maven.org/maven2/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar">failureaccess-1.0.1.jar</a></li>
</ul>
<h3>Javadoc</h3>
<ul>
<li><a
href="http://guava.dev/releases/32.0.0-jre/api/docs/">32.0.0-jre</a></li>
<li><a
href="http://guava.dev/releases/32.0.0-android/api/docs/">32.0.0-android</a></li>
</ul>
<h3>JDiff</h3>
<ul>
<li><a href="http://guava.dev/releases/32.0.0-jre/api/diffs/">32.0.0-jre
vs. 31.1-jre</a></li>
<li><a
href="http://guava.dev/releases/32.0.0-android/api/diffs/">32.0.0-android
vs. 31.1-android</a></li>
<li><a
href="http://guava.dev/releases/32.0.0-android/api/androiddiffs/">32.0.0-android
vs. 32.0.0-jre</a></li>
</ul>
<h3>Changelog</h3>
<h4>Security fixes</h4>
<ul>
<li>Reimplemented <code>Files.createTempDir</code> and
<code>FileBackedOutputStream</code> to further address CVE-2020-8908 (<a
href="https://redirect.github.com/google/guava/issues/4011">#4011</a>)
and CVE-2023-2976 (<a
href="https://redirect.github.com/google/guava/issues/2575">#2575</a>).
(feb83a1c8f)</li>
</ul>
<p>While CVE-2020-8908 was officially closed when we deprecated
<code>Files.createTempDir</code> in <a
href="https://github.com/google/guava/releases/tag/v30.0">Guava
30.0</a>, we've heard from users that even recent versions of Guava have
been listed as vulnerable in <em>other</em> databases of security
vulnerabilities. In response, we've reimplemented the method (and the
very rarely used <code>FileBackedOutputStream</code> class, which had a
similar issue) to eliminate the insecure behavior entirely. This change
could technically affect users in a number of different ways (discussed
under "Incompatible changes" below), but in practice, the only
problem users are likely to encounter is with Windows. If you are using
those APIs under Windows, you should skip 32.0.0 and go straight to <a
href="https://github.com/google/guava/releases/tag/v32.0.1">32.0.1</a>
which fixes the problem. (Unfortunately, we didn't think of the Windows
problem until after the release. And while we <a
href="https://github.com/google/guava#important-warnings">warn that
<code>common.io</code> in particular may not work under Windows</a>, we
didn't intend to regress support.) Sorry for the trouble.</p>
<h4>Incompatible changes</h4>
<p>Although this release bumps Guava's major version number, it makes
<strong>no binary-incompatible changes to the <code>guava</code>
artifact</strong>.</p>
<p>One change could cause issues for Windows users, and a few other
changes could cause issues for users in more unusual situations:</p>
<ul>
<li><strong>The new implementations of <code>Files.createTempDir</code>
and <code>FileBackedOutputStream</code> <a
href="https://redirect.github.com/google/guava/issues/6535">throw an
exception under Windows</a>.</strong> This is fixed in <a
href="https://github.com/google/guava/releases/tag/v32.0.1">32.0.1</a>.
Sorry for the trouble.</li>
<li><code>guava-gwt</code> now <a
href="https://redirect.github.com/google/guava/issues/6627">requires</a>
GWT <a
href="https://github.com/gwtproject/gwt/releases/tag/2.10.0">2.10.0</a>.</li>
<li>This release makes a binary-incompatible change to a
<code>@Beta</code> API in the <strong>separate artifact</strong>
<code>guava-testlib</code>. Specifically, we changed the return type of
<code>TestingExecutors.sameThreadScheduledExecutor</code> to
<code>ListeningScheduledExecutorService</code>. The old return type was
a package-private class, which caused the Kotlin compiler to produce
warnings. (dafaa3e435)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/google/guava/commits">compare view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=com.google.guava:guava&package-manager=maven&previous-version=30.1-jre&new-version=32.0.0-jre)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
Following [this answer](https://stackoverflow.com/a/20002683/1490806),
`Matcher.quoteReplacement` has to be used to avoid
`java.lang.IllegalArgumentException: Illegal group reference` whenever
the content contains `$$`.
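A minimal sketch of the problem and the fix, assuming a `{{name}}`-style placeholder syntax; the class and method names are hypothetical and not taken from this PR. In a `replaceAll` replacement string, `$` introduces a group reference, so a literal `$$` must be escaped with `Matcher.quoteReplacement`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteReplacementSketch {

    // Replaces every {{variable}} placeholder with the given value, literally.
    static String fill(String template, String variable, String value) {
        // Pattern.quote treats the braces as literal characters;
        // Matcher.quoteReplacement escapes '$' and '\' in the replacement.
        return template.replaceAll(
                Pattern.quote("{{" + variable + "}}"),
                Matcher.quoteReplacement(value));
    }

    // Same as fill, but without escaping the replacement (the buggy variant).
    static String fillUnsafe(String template, String variable, String value) {
        return template.replaceAll(Pattern.quote("{{" + variable + "}}"), value);
    }

    public static void main(String[] args) {
        System.out.println(fill("Price: {{price}}", "price", "$$100"));
        try {
            fillUnsafe("Price: {{price}}", "price", "$$100");
        } catch (IllegalArgumentException e) {
            // "$$" is parsed as a group reference and rejected.
            System.out.println("Unescaped replacement failed: " + e.getMessage());
        }
    }
}
```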
This PR introduces a new module, `langchain4j-qdrant`, which enables
[Qdrant](https://qdrant.tech/) to be used as a vector database in
LangChain4j.
Integration tests have been implemented by extending `EmbeddingStoreIT`
and using `org.testcontainers` to spawn a local Qdrant instance.
Mistral AI continues its mission to deliver the best open models to the
developer community. They are [growing
quickly](https://mistral.ai/news/mixtral-of-experts/), and it would be
very helpful to add support for Mistral AI models.
This is a small prototype based on discussions originating from
https://github.com/ai-for-java/openai4j/pull/13
The approach I took here is to allow for decorating the models/builders
with additional functionality without having to extend model classes or
builders. I did it for a single model in this prototype - the
`OpenAiChatModel`, but this pattern could be applied to all of the other
models across Langchain4J.
That doesn't mean you couldn't extend the model classes if you wanted to
use inheritance. I just try to avoid it and use composition instead.
I also added a test which shows how it would be used. Downstream
libraries (like Spring Boot or Quarkus) could use this mechanism to
extend/enhance with their own capabilities which aren't necessarily part
of the model.
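To illustrate the composition-over-inheritance idea in general terms, here is a minimal decorator sketch. It does not reproduce the prototype's API: `SimpleChatModel` and `RecordingChatModel` are hypothetical stand-ins, not types from this PR or from LangChain4j.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class DecoratorSketch {
    public static void main(String[] args) {
        SimpleChatModel base = message -> "echo: " + message;
        RecordingChatModel decorated = new RecordingChatModel(base);
        System.out.println(decorated.generate("hello"));
        System.out.println(decorated.recordedPrompts());
    }
}

// Hypothetical minimal model abstraction, standing in for a real chat model.
interface SimpleChatModel {
    String generate(String userMessage);
}

// Decorator: adds behavior around any SimpleChatModel without subclassing it.
final class RecordingChatModel implements SimpleChatModel {

    private final SimpleChatModel delegate;
    private final List<String> recordedPrompts = new ArrayList<>();

    RecordingChatModel(SimpleChatModel delegate) {
        this.delegate = delegate;
    }

    @Override
    public String generate(String userMessage) {
        recordedPrompts.add(userMessage);      // the added capability
        return delegate.generate(userMessage); // the original behavior
    }

    List<String> recordedPrompts() {
        return Collections.unmodifiableList(recordedPrompts);
    }
}
```

Because the decorator depends only on the interface, a downstream library could stack several such wrappers (logging, metrics, retries) around any model without touching the model class or its builder.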
Let me know what you think @geoand / @langchain4j !
Happy to continue the conversation and see where we can take this!
Using Azure OpenAI, on some Jackson mappings I'm getting the following
error:
```
Caused by: java.lang.NoClassDefFoundError: com/fasterxml/jackson/annotation/JsonTypeInfo$Value
```
This is because not all our Jackson dependencies are using the same
version. The intent of this commit is to fix this, by using the Jackson
BOM.