Adding an optional dependency on the
`spring-boot-configuration-processor` generates a
`META-INF/spring-configuration-metadata.json` file for all
`@ConfigurationProperties`-annotated types, which supports autocompletion
within IDEs. Since nested beans are used, the bound fields for those
beans need to be annotated with `@NestedConfigurationProperty`. This is
purely for the annotation processor and does not affect runtime behavior
in any way.
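As a rough illustration of where the annotation goes, the sketch below uses hypothetical property classes (`DemoProperties`, `ChatModelProperties`) and a hypothetical `demo` prefix; none of these names come from this PR. So that it compiles without Spring on the classpath, it declares local stand-ins for the two annotations. In a real starter they would be imported from `org.springframework.boot.context.properties`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-ins so the sketch compiles on its own (assumption for illustration).
// In a real project, import the Spring Boot annotations of the same names from
// org.springframework.boot.context.properties instead of declaring these.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ConfigurationProperties {
    String prefix() default "";
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface NestedConfigurationProperty {
}

// Hypothetical nested bean holding the settings of one model.
class ChatModelProperties {
    private String apiKey;
    private Double temperature;

    public String getApiKey() { return apiKey; }
    public void setApiKey(String apiKey) { this.apiKey = apiKey; }
    public Double getTemperature() { return temperature; }
    public void setTemperature(Double temperature) { this.temperature = temperature; }
}

// Hypothetical top-level properties class bound to the "demo" prefix.
@ConfigurationProperties(prefix = "demo")
class DemoProperties {

    // Marks the field as a nested properties object so the processor
    // describes its fields (e.g. demo.chat-model.api-key) in the metadata.
    @NestedConfigurationProperty
    private ChatModelProperties chatModel = new ChatModelProperties();

    public ChatModelProperties getChatModel() { return chatModel; }
    public void setChatModel(ChatModelProperties chatModel) { this.chatModel = chatModel; }
}
```

The annotation sits on the field of the outer class, not on the nested class itself.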
Adds two separate loaders that load a single document or multiple
documents from S3, respectively. They take different parameters
to support different configurations. The supported document types are
limited to those handled by the parsers langchain4j currently provides,
but I am planning to help add more parsers in the future.
Add the Spring Boot Configuration Processor dependency to the Spring
Boot starter to generate metadata about the LangChain4J custom
properties.
Fixes gh-250
This PR contains the implementation of an integration with
[OpenSearch](https://opensearch.org/). As one of the growing vector
databases in the open source world, adding support for it to this
project makes total sense. This implementation includes:
1. A complete implementation of the `EmbeddingStore` interface.
2. Unit tests for the major use cases a store must implement correctly.
3. Usage of [TestContainers](https://testcontainers.com/) to automate
the execution of backends.
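To make the store contract concrete, here is a minimal in-memory sketch of the two operations such a store has to get right: adding an embedding and finding the most similar ones. The `SimpleEmbeddingStore` interface, the `Match` class, and the method signatures are simplified assumptions for illustration; they are not the actual `EmbeddingStore` interface and have nothing to do with how the OpenSearch implementation talks to its backend.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical, simplified contract (NOT the real EmbeddingStore interface).
interface SimpleEmbeddingStore {
    String add(float[] embedding, String text);
    List<Match> findRelevant(float[] query, int maxResults);
}

// One search hit: the stored text plus its cosine similarity to the query.
final class Match {
    final String id;
    final String text;
    final double score;

    Match(String id, String text, double score) {
        this.id = id;
        this.text = text;
        this.score = score;
    }
}

class InMemoryEmbeddingStore implements SimpleEmbeddingStore {

    private static final class Entry {
        final String id;
        final float[] embedding;
        final String text;

        Entry(String id, float[] embedding, String text) {
            this.id = id;
            this.embedding = embedding;
            this.text = text;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    @Override
    public String add(float[] embedding, String text) {
        String id = UUID.randomUUID().toString();
        entries.add(new Entry(id, embedding.clone(), text)); // defensive copy
        return id;
    }

    @Override
    public List<Match> findRelevant(float[] query, int maxResults) {
        List<Match> matches = new ArrayList<>();
        for (Entry e : entries) {
            matches.add(new Match(e.id, e.text, cosine(query, e.embedding)));
        }
        // Highest similarity first, then keep at most maxResults hits.
        matches.sort(Comparator.comparingDouble((Match m) -> m.score).reversed());
        int limit = Math.max(0, Math.min(maxResults, matches.size()));
        return new ArrayList<>(matches.subList(0, limit));
    }

    // Cosine similarity; yields NaN if either vector is all zeros.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Embedding dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A real backend would delegate the similarity search to the database rather than scanning every entry as this sketch does.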
Using 1.8.28 results in
` java: java.lang.NoSuchFieldError: Class
com.sun.tools.javac.tree.JCTree$JCImport does not have member field
'com.sun.tools.javac.tree.JCTree qualid'`
I've implemented the integration with Elastic and run some local tests
to make sure it works (some of the logic is translated from LangChain
Python to Java).
Elasticsearch does not support `Gson`, so we must have a `Jackson`
dependency.
Note that the declaration of license in dashscope-sdk-java:pom.xml is:
<licenses>
<license>
<name>The Apache License, Version 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
<distribution>repo</distribution>
</license>
</licenses>
(There is a "The " prefix in its name.)
Qwen series models are provided by Alibaba Cloud. They are much better
at Asian languages than other LLMs.
DashScope is a model service platform. Qwen models are its primary
supported models, but it also supports other series such as LLaMA2,
Dolly, ChatGLM, and BiLLa (based on LLaMA). These may be integrated
sometime in the future.
- all-minilm-l6-v2
- all-minilm-l6-v2-q
- e5-small-v2
- e5-small-v2-q
The idea is to give users an option to embed documents/texts in the same
Java process without any external dependencies.
ONNX Runtime is used to run the models inside the JVM.
Each model resides in its own Maven module (inside the jar).
Now, the StreamingChatLanguageModel can be used in conjunction with
tools.
One can send tool specifications along with a message to the LLM, and
the LLM can either stream a response or initiate a request to execute a
tool (also as a stream of tokens).
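The two possible outcomes can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in: `StreamHandler`, `FakeStreamingModel`, `CollectingHandler`, and the `calculator` tool are invented for illustration and are not the API introduced by this PR. The fake model simply hard-codes what it streams.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface (an assumption, not the library's handler type).
interface StreamHandler {
    void onToken(String token);                                 // part of a text answer
    void onToolRequest(String toolName, String argumentsJson);  // model wants a tool run
    void onComplete();
}

// Stand-in for a streaming model; a real one would call a remote LLM.
class FakeStreamingModel {

    void generate(String userMessage, List<String> toolNames, StreamHandler handler) {
        boolean wantsTool = toolNames.contains("calculator") && userMessage.contains("2+2");
        if (wantsTool) {
            // The tool request also arrives in fragments, which are accumulated
            // before the complete request is handed to the caller.
            StringBuilder arguments = new StringBuilder();
            for (String fragment : new String[] {"{\"expression\":", "\"2+2\"", "}"}) {
                arguments.append(fragment);
            }
            handler.onToolRequest("calculator", arguments.toString());
        } else {
            // Plain answer: forward each token as soon as it is available.
            for (String token : new String[] {"Hello", ", ", "world"}) {
                handler.onToken(token);
            }
        }
        handler.onComplete();
    }
}

// Collects whatever the model streams so the caller can inspect it afterwards.
class CollectingHandler implements StreamHandler {
    final StringBuilder text = new StringBuilder();
    final List<String> toolRequests = new ArrayList<>();
    boolean completed;

    @Override
    public void onToken(String token) { text.append(token); }

    @Override
    public void onToolRequest(String toolName, String argumentsJson) {
        toolRequests.add(toolName + " " + argumentsJson);
    }

    @Override
    public void onComplete() { completed = true; }
}
```

In this sketch the caller decides what to do with a tool request, for example executing the tool and sending the result back to the model in a follow-up call.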
Adds a parent pom that contains most/all common things for the
sub-projects.
Note that it is separate from the root aggregator pom: not mixing the
aggregator and the parent makes things slightly easier.
If this change makes it harder to do releases, there might be a
possibility to generate the effective poms for each subproject, but on
the other hand releasing everything should not be too problematic.