If the text is longer than 510 tokens, it is automatically partitioned
into parts of at most 510 tokens each. These partitions are embedded
independently, and a weighted average embedding is then calculated,
where each partition's embedding is weighted by its token count.
Related to https://github.com/langchain4j/langchain4j-embeddings/pull/4
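For illustration, the token-weighted averaging could look roughly like the sketch below (class, method, and variable names are illustrative, not the actual implementation):

```java
import java.util.List;

final class EmbeddingAveraging {

    // Sketch: combine partition embeddings into a single vector,
    // weighting each partition by its token count.
    static float[] weightedAverage(List<float[]> partitionEmbeddings, List<Integer> tokenCounts) {
        int dimension = partitionEmbeddings.get(0).length;
        float[] result = new float[dimension];
        int totalTokens = 0;
        for (int i = 0; i < partitionEmbeddings.size(); i++) {
            int weight = tokenCounts.get(i);
            totalTokens += weight;
            float[] embedding = partitionEmbeddings.get(i);
            for (int d = 0; d < dimension; d++) {
                result[d] += embedding[d] * weight;
            }
        }
        for (int d = 0; d < dimension; d++) {
            result[d] /= totalTokens;
        }
        return result;
    }
}
```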
Note that the license declaration in dashscope-sdk-java's pom.xml is:

    <licenses>
        <license>
            <name>The Apache License, Version 2.0</name>
            <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
            <distribution>repo</distribution>
        </license>
    </licenses>

(Note the "The " prefix in its name.)
The Qwen series models are provided by Alibaba Cloud. They perform
considerably better on Asian languages than many other LLMs.
DashScope is a model service platform. Qwen models are its primary
supported models, but it also supports other series such as LLaMA2,
Dolly, ChatGLM, and BiLLa (based on LLaMA). These may be integrated
sometime in the future.
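Using a Qwen model through this integration could look roughly like the sketch below (the builder parameters, model name, and the `generate` call are assumptions for illustration, not the confirmed API):

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.dashscope.QwenChatModel;

public class QwenExample {

    public static void main(String[] args) {
        // Assumed builder API; actual parameter names may differ.
        ChatLanguageModel model = QwenChatModel.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY")) // DashScope API key
                .modelName("qwen-plus")                      // hypothetical model name
                .build();

        // Assumed convenience method taking a plain user message.
        String answer = model.generate("Tell me a short joke.");
        System.out.println(answer);
    }
}
```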
- Added ChatMemoryStore and InMemoryChatMemoryStore
- Changed MessageWindowChatMemory and TokenWindowChatMemory to use
ChatMemoryStore (see the sketch after this list)
- Changed Supplier<ChatMemory> into ChatMemoryProvider
- Small improvements
- Did some refactoring
- Added Javadoc
- Fixed NPE in PineconeEmbeddingStoreImpl when adding an embedding by id
- PineconeEmbeddingStoreImpl now takes into account minScore and returns
score in EmbeddingMatch
- InMemoryEmbeddingStore now returns score instead of cosine similarity
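As a rough illustration of the ChatMemoryStore change, wiring a store into a message-window memory might look like the following (package, builder, and method names follow the library's conventions but are assumptions in this sketch):

```java
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.store.memory.chat.InMemoryChatMemoryStore;

public class ChatMemoryStoreExample {

    public static void main(String[] args) {
        // InMemoryChatMemoryStore keeps messages in memory; a custom
        // ChatMemoryStore could persist them to a database instead.
        InMemoryChatMemoryStore store = new InMemoryChatMemoryStore();

        // Assumed builder API: the memory delegates persistence to the store.
        ChatMemory chatMemory = MessageWindowChatMemory.builder()
                .id("user-42")            // hypothetical memory id
                .maxMessages(10)
                .chatMemoryStore(store)
                .build();

        chatMemory.add(UserMessage.from("Hello!"));
        System.out.println(chatMemory.messages());
    }
}
```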
These are leftovers from an earlier (and now incorrect) CI configuration.
Modules that don't need to comply with the license check have to
deactivate the relevant plugin on a case-by-case basis.
In-process embeddings are moved to their own repository
(https://github.com/langchain4j/langchain4j-embeddings) due to the size
of the involved files.
Note that once this PR is merged, we would ideally need a release so
that the related `langchain4j-embeddings` project can depend on that
version.
- all-minilm-l6-v2
- all-minilm-l6-v2-q
- e5-small-v2
- e5-small-v2-q
The idea is to give users an option to embed documents/texts in the same
Java process, without any external dependencies.
ONNX Runtime is used to run the models inside the JVM.
Each model resides in its own Maven module (packaged inside the jar).
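Usage could look roughly like the sketch below (the class name is derived from the module naming above and is an assumption here, as is the return type of `embed`):

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

public class InProcessEmbeddingExample {

    public static void main(String[] args) {
        // Loads the ONNX model bundled in the langchain4j-embeddings-all-minilm-l6-v2 jar
        // and runs it inside the JVM via ONNX Runtime - no external service needed.
        EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();

        // Assumed to return the Embedding directly.
        Embedding embedding = embeddingModel.embed("Hello, in-process embeddings!");
        System.out.println(embedding.vector().length);
    }
}
```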
Now, the StreamingChatLanguageModel can be used in conjunction with
tools.
One can send tool specifications along with a message to the LLM, and
the LLM can either stream a response or initiate a request to execute a
tool (also as a stream of tokens).
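A rough usage sketch follows (the `generate` signature and handler callbacks are assumptions and may differ between versions):

```java
import dev.langchain4j.agent.tool.ToolSpecification;
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.StreamingResponseHandler;
import dev.langchain4j.model.chat.StreamingChatLanguageModel;
import dev.langchain4j.model.output.Response;

import java.util.List;

public class StreamingToolsExample {

    static void chatWithTools(StreamingChatLanguageModel model) {
        // Hypothetical tool the LLM may decide to call.
        ToolSpecification calculator = ToolSpecification.builder()
                .name("calculator")
                .description("Evaluates simple arithmetic expressions")
                .build();

        // Assumed signature: messages + tool specifications + streaming handler.
        model.generate(
                List.of(UserMessage.from("What is 2 + 2?")),
                List.of(calculator),
                new StreamingResponseHandler<AiMessage>() {
                    @Override
                    public void onNext(String token) {
                        // Tokens of a normal response, or of a tool execution request.
                        System.out.print(token);
                    }

                    @Override
                    public void onComplete(Response<AiMessage> response) {
                        // The complete message, possibly containing a tool execution request.
                        System.out.println("\nDone: " + response.content());
                    }

                    @Override
                    public void onError(Throwable error) {
                        error.printStackTrace();
                    }
                });
    }
}
```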