Commit Graph

248 Commits

Author SHA1 Message Date
LangChain4j 271fdd61e2
Added overlap to splitters (#181) 2023-09-19 22:20:11 +02:00
LangChain4j bc6ca31579
Removed dynamic loading for elasticsearch (#182) 2023-09-19 20:57:13 +02:00
LangChain4j 118306fdba
Moved Chroma integration to a separate module (#178) 2023-09-17 16:51:11 +02:00
LangChain4j 21f744868b
Return Response<T> from models (#159) 2023-09-15 14:52:22 +02:00
ZYinNJU cb6bc141d0
fix error message version (#167)
Sorry, I made a mistake: I forgot to change the version in the error message
shown when no `langchain4j-elasticsearch` dependency is found.
2023-09-14 17:02:45 +02:00
Heezer 70c6965f74
Vespa support (#132)
* PR contains small fixes for Weaviate too.
* Example: https://github.com/langchain4j/langchain4j-examples/pull/8
2023-09-14 16:58:47 +02:00
LangChain4j b804d03ca8
Fixed relevance score calculation (#164) 2023-09-07 19:19:20 +02:00
jiangsier-xyz f2bb6f992e
Refactor the implementation for Qwen series models using the new DashScope SDK APIs. (#155)
The design of the DashScope SDK is converging on OpenAI's, offering new
fields and specifications. This change uses these latest features to
refactor the implementation of the Qwen models.
2023-09-03 13:21:15 +02:00
ZYinNJU 3bffc971df
Integration with Elastic (#95)
I've integrated Elastic and run some local tests to make sure it works
(some of the logic is translated from LangChain Python to Java).

Elasticsearch does not support `Gson`, so we must add a `Jackson`
dependency.
2023-09-02 20:32:46 +02:00
jiangsier-xyz 80c3880062
Wrap a method to return original LLM result (#147)
Add a wrapper method that returns the original LLM result, so that
subclasses can obtain more information such as usage (input/output
tokens, finish reason, etc.).
2023-08-30 21:14:26 +02:00
LangChain4j 105eb459af
Update README.md 2023-08-30 18:09:40 +02:00
deep-learning-dynamo c1cc5be1c7 released 0.22.0 2023-08-29 19:21:56 +02:00
LangChain4j 1e8c5a226b
added extra checks to constructors (#143) 2023-08-29 17:32:10 +02:00
LangChain4j bebfc78ee1
Re-implemented document splitters (#130) 2023-08-28 21:33:48 +02:00
kuraleta 88b56778f4
Integration with Google Vertex AI (#135) 2023-08-28 21:30:18 +02:00
LangChain4j 20753a980a
Added EmbeddingModelTextClassifier (#139) 2023-08-28 21:19:11 +02:00
LangChain4j 6f5845ff4e
Text segment improvements (#134) 2023-08-28 20:35:57 +02:00
LangChain4j eb729971c2
Fixed tool execution in a loop, now multiple tools can be called in a row (#137) 2023-08-28 19:54:53 +02:00
LangChain4j ad97638840
Fixed IndexOutOfBoundsException when streaming (#138) 2023-08-28 19:45:14 +02:00
LangChain4j 6f41fa80bf
Added configurable metadata and link extraction into HtmlTextExtractor (#133) 2023-08-28 19:20:24 +02:00
deep-learning-dynamo be8eae1b9c changed log level to trace in chat memories 2023-08-27 11:57:27 +02:00
LangChain4j 5905750060
Fixed logStreaming config parameter for *Streaming*Model (#131) 2023-08-25 18:56:47 +02:00
LangChain4j 97d7d0c755
Added an option to serialize and deserialize InMemoryEmbeddingStore to/from json or file (#120) 2023-08-22 18:40:15 +02:00
LangChain4j 8274ad6db6
Now in-process embedding models can embed texts longer than 510 tokens (#119)
If the text is longer than 510 tokens, it is automatically partitioned
into parts of at most 510 tokens each. These partitions are embedded
independently, and then a weighted average embedding is calculated
(weighted by the number of tokens in each partition).

Related to https://github.com/langchain4j/langchain4j-embeddings/pull/4
2023-08-22 18:38:09 +02:00
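
The token-weighted averaging described in 8274ad6db6 (#119) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from that commit: the class name `WeightedAverageEmbedding`, the method `combine`, and the use of `float[]` vectors are all hypothetical, and the partitioning and embedding steps are assumed to have happened already.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not LangChain4j's actual implementation.
final class WeightedAverageEmbedding {

    /**
     * Combines per-partition embeddings into a single vector, weighting each
     * partition's embedding by the number of tokens in that partition.
     */
    static float[] combine(List<float[]> embeddings, List<Integer> tokenCounts) {
        if (embeddings.isEmpty() || embeddings.size() != tokenCounts.size()) {
            throw new IllegalArgumentException("need exactly one token count per embedding");
        }
        int dimension = embeddings.get(0).length;
        double[] weightedSum = new double[dimension];
        long totalTokens = 0;
        for (int p = 0; p < embeddings.size(); p++) {
            float[] embedding = embeddings.get(p);
            int weight = tokenCounts.get(p);
            if (embedding.length != dimension || weight <= 0) {
                throw new IllegalArgumentException("bad partition at index " + p);
            }
            for (int i = 0; i < dimension; i++) {
                weightedSum[i] += (double) embedding[i] * weight;
            }
            totalTokens += weight;
        }
        float[] result = new float[dimension];
        for (int i = 0; i < dimension; i++) {
            result[i] = (float) (weightedSum[i] / totalTokens);
        }
        return result;
    }

    public static void main(String[] args) {
        // A 680-token text split into partitions of 510 and 170 tokens.
        float[] combined = combine(
                List.of(new float[] {1f, 0f}, new float[] {0f, 1f}),
                List.of(510, 170));
        System.out.println(Arrays.toString(combined));
    }
}
```

In the example in `main`, the first partition carries three quarters of the tokens, so the combined vector is 0.75 in the first component and 0.25 in the second. Whether the result is re-normalized afterwards is not stated in the commit message, so the sketch leaves it as is.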
LangChain4j 7307f43d98
Update README.md 2023-08-20 22:39:19 +02:00
LangChain4j 407db96385
Added more javadoc for Azure + embed max 16 segments at a time (#115)
Azure OpenAI can embed only 16 items at a time, so I have added an
internal loop in AzureOpenAiEmbeddingModel to embed in batches of 16.
2023-08-20 20:11:47 +02:00
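
The batching loop mentioned in 407db96385 (#115) can be sketched as follows. This is a minimal illustration, not the code inside AzureOpenAiEmbeddingModel: the class `BatchedEmbedding`, the method `embedInBatches`, and the `embedBatch` callback (a stand-in for one remote embedding call) are hypothetical names. Only the limit of 16 items per call comes from the commit message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; not the actual AzureOpenAiEmbeddingModel code.
final class BatchedEmbedding {

    // Limit taken from the commit message: at most 16 items per request.
    static final int MAX_BATCH_SIZE = 16;

    /**
     * Splits the items into consecutive batches of at most batchSize, calls
     * embedBatch once per batch, and returns the results in input order.
     */
    static <T, R> List<R> embedInBatches(List<T> items,
                                         int batchSize,
                                         Function<List<T>, List<R>> embedBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<R> results = new ArrayList<>(items.size());
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            results.addAll(embedBatch.apply(items.subList(start, end)));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> segments = new ArrayList<>();
        for (int i = 0; i < 40; i++) {
            segments.add("segment " + i);
        }
        // Stand-in "embedding": the length of each text.
        List<Integer> lengths = embedInBatches(segments, MAX_BATCH_SIZE, batch -> {
            System.out.println("embedding a batch of " + batch.size());
            List<Integer> out = new ArrayList<>();
            for (String text : batch) {
                out.add(text.length());
            }
            return out;
        });
        System.out.println(lengths.size() + " results");
    }
}
```

With 40 segments, the callback is invoked for batches of 16, 16, and 8 items, and the 40 results come back in the same order as the input.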
jiangsier-xyz e0487baaa2
Make the langchain4j-dashscope pass the license compliance check (#111)
Note that the declaration of license in dashscope-sdk-java:pom.xml is:

<licenses>
    <license>
        <name>The Apache License, Version 2.0</name>
        <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
        <distribution>repo</distribution>
    </license>
</licenses>

(There is a "The " prefix in its name.)
2023-08-20 20:09:06 +02:00
deep-learning-dynamo db1f236ed2 released 0.21.0 2023-08-19 15:57:39 +02:00
deep-learning-dynamo ed6158e8e5 ChatMemoryStore: renamed UserId to MemoryId 2023-08-19 13:46:54 +02:00
deep-learning-dynamo d013fb0aaa removed unnecessary code 2023-08-18 21:01:25 +02:00
kuraleta da45b7e259
Integration with Azure (#107) 2023-08-18 20:55:51 +02:00
jiangsier-xyz d908f5158a
Integrate the Qwen series models via dashscope-sdk. (#99)
Qwen series models are provided by Alibaba Cloud. They are much better
at Asian languages than other LLMs.

DashScope is a model service platform, and Qwen models are its primary
supported models. It also supports other series such as LLaMA2, Dolly,
ChatGLM, BiLLa (based on LLaMA), and others. These may be integrated
sometime in the future.
2023-08-18 20:49:50 +02:00
Iurii Koval ec4a673b52
Add Milvus support (#58)
Authored-by: iurii.koval <koval.iurii@protonmail.com>
2023-08-18 20:38:45 +02:00
kuraleta 18dffd3521
Integration with Chroma, doc (#100) 2023-08-18 20:33:21 +02:00
LangChain4j ba7fc4def6
Added an option to store ChatMemory anywhere (in memory, DB, etc) (#106)
- Added ChatMemoryStore and InMemoryChatMemoryStore
- Changed MessageWindowChatMemory and TokenWindowChatMemory to use
ChatMemoryStore
- Changed Supplier<ChatMemory> into ChatMemoryProvider
- Small improvements
2023-08-18 20:31:22 +02:00
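
The idea behind ba7fc4def6 (#106), keeping the message window logic separate from where messages are stored, can be illustrated with the sketch below. It is a simplified stand-in, not LangChain4j's API: the names `MessageStore`, `InMemoryMessageStore`, and `add`, the method signatures, and the use of plain strings as messages are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the real ChatMemoryStore interface may differ.
final class ChatMemoryStoreSketch {

    /** Storage abstraction: could be backed by memory, a database, etc. */
    interface MessageStore {
        List<String> getMessages(Object memoryId);
        void updateMessages(Object memoryId, List<String> messages);
        void deleteMessages(Object memoryId);
    }

    /** Simplest possible backend: a map from memory id to messages. */
    static final class InMemoryMessageStore implements MessageStore {
        private final Map<Object, List<String>> messagesById = new ConcurrentHashMap<>();

        @Override
        public List<String> getMessages(Object memoryId) {
            // Return a copy so callers cannot mutate stored state directly.
            return new ArrayList<>(messagesById.getOrDefault(memoryId, List.of()));
        }

        @Override
        public void updateMessages(Object memoryId, List<String> messages) {
            messagesById.put(memoryId, new ArrayList<>(messages));
        }

        @Override
        public void deleteMessages(Object memoryId) {
            messagesById.remove(memoryId);
        }
    }

    /**
     * Message-window behaviour on top of any store: append the new message,
     * then evict the oldest ones until at most maxMessages remain.
     */
    static void add(MessageStore store, Object memoryId, String message, int maxMessages) {
        List<String> messages = new ArrayList<>(store.getMessages(memoryId));
        messages.add(message);
        while (messages.size() > maxMessages) {
            messages.remove(0);
        }
        store.updateMessages(memoryId, messages);
    }
}
```

Because the window logic only talks to the `MessageStore` interface, swapping the in-memory map for a database-backed implementation would not require changing `add`.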
kuraleta 57a84d9a46
Integration with Chroma (#92) 2023-08-15 14:38:00 +02:00
deep-learning-dynamo d7b96ca9a6 released 0.20.0 2023-08-14 00:44:07 +02:00
LangChain4j 7b69a1691d
Added an option to set up a proxy for OpenAI models (#93) 2023-08-13 20:34:09 +02:00
LangChain4j 3179b1b64c
Added more pre-packaged in-process embedding models (#91) 2023-08-13 19:19:56 +02:00
LangChain4j 0632618adf
InMemoryEmbeddingStore: return matches from highest to lowest (#90) 2023-08-13 19:08:59 +02:00
deep-learning-dynamo 0e93deed77 released 0.19.0 2023-08-10 17:55:45 +02:00
deep-learning-dynamo 1541f214c1 released 0.19.0 2023-08-10 14:34:21 +02:00
LangChain4j f3757b8e18
[Snyk] Upgrade io.netty:netty-codec from 4.1.93.Final to 4.1.94.Final (#76)
authored-by: snyk-bot <snyk-bot@snyk.io>
2023-08-10 08:53:07 +02:00
LangChain4j 16a84a9074
[Snyk] Upgrade org.projectlombok:lombok from 1.18.26 to 1.18.28 (#74)
authored-by: snyk-bot <snyk-bot@snyk.io>
2023-08-10 08:49:04 +02:00
LangChain4j edc8400dff
[Snyk] Upgrade org.apache.pdfbox:pdfbox from 2.0.28 to 2.0.29 (#73)
Co-authored-by: snyk-bot <snyk-bot@snyk.io>
2023-08-10 08:46:16 +02:00
LangChain4j 93173f20d7
Added ability to estimate token count for tools (#81)
Also refactored MessageWindowChatMemory and TokenWindowChatMemory
2023-08-09 22:01:52 +02:00
LangChain4j 7497191bf9
Improvements (#79)
- did some refactorings
- added javadoc
- fixed NPE in PineconeEmbeddingStoreImpl when adding embedding by id
- PineconeEmbeddingStoreImpl now takes into account minScore and returns
score in EmbeddingMatch
- InMemoryEmbeddingStore now returns score instead of cosine similarity
2023-08-09 07:58:55 +02:00
Julien Perrochet 5cb371d7bf
[ci] let the compliance check run on all modules (#75)
Some leftovers from an earlier (and now incorrect) CI configuration.

Modules that don't need to comply with the licenses need to deactivate
the relevant plugin on a case-by-case basis.
2023-08-06 21:22:26 +02:00
Julien Perrochet 5659e8e2ba
[misc] migrate embeddings-related projects out of project (#72)
In-process embeddings are moved to their own repository
(https://github.com/langchain4j/langchain4j-embeddings) due to the size
of the involved files.

Note that once this PR is merged we'd ideally need a release so as to
have the `langchain4j-embeddings` related project depend on that
version.
2023-08-06 20:02:28 +02:00
LangChain4j 9a5da28cc4
Added support for a separate chat memory for each user to AI Services (#63) 2023-08-06 17:54:04 +02:00