
40 Commits

Author SHA1 Message Date
deep-learning-dynamo 0fd7805546 bedrock: minor 2023-11-10 13:47:13 +01:00
Pascal Vantrepote 286e95c047
Adding support for AWS Bedrock (#269) 2023-11-10 13:11:32 +01:00
ZYinNJU 677cf26bca
Ollama integration (#249)
@langchain4j Hi! I have made some progress on the Ollama integration, see
#244. It uses `retrofit` to define the REST API described in the [Ollama
doc](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion).

Local tests need a local deployment of `Ollama`. Looking forward to
your code review!
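
As a rough illustration of the kind of call this integration wraps, the sketch below builds (but does not send) a request to the generate endpoint from the linked doc. It deliberately swaps `retrofit` for the JDK's `java.net.http` (Java 11+) so that it stays self-contained. The class name, the helper methods, the `llama2` model name and the 60 second timeout are all hypothetical and not taken from this PR.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper; not part of the Ollama module added by this PR. */
final class OllamaRequestSketch {

    /** Escapes the characters that are not allowed raw inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body; "stream": false asks for one complete response. */
    static String generateBody(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model)
                + "\",\"prompt\":\"" + jsonEscape(prompt)
                + "\",\"stream\":false}";
    }

    /** Builds the POST request without sending it. */
    static HttpRequest generateRequest(String baseUrl, String model, String prompt) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/generate"))
                .timeout(Duration.ofSeconds(60)) // assumed value
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(generateBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) {
        // 11434 is the default port of a local Ollama deployment.
        HttpRequest request =
                generateRequest("http://localhost:11434", "llama2", "Why is the sky blue?");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would still need a running local `Ollama`, as noted above.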

---------

Co-authored-by: Heezer <33568148+Heezer@users.noreply.github.com>
2023-11-07 21:39:40 +01:00
kevin-wu-os dbc141a10f
PGVector: First E2E implementation (Java 8 Implementation) (#236)
# Summary

I use https://github.com/pgvector/pgvector-java/ to implement
`PgVectorEmbeddingStore`.

1. Insertion works as regular SQL, as long as the table (named after the
namespace) has a column of type `vector`.
2. Queries use the SQL operators added by the `vector` extension. `minScore`
and `maxResults` are applied inside the SQL, so the rows it returns are
already the final result.

# Caveat
* The implementation leaves the responsibility of installing the `vector`
extension to users.
* Only cosine similarity is used.
* Only the ivfflat index is used.
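
To make the query semantics concrete, here is a minimal in-memory sketch of what a cosine-similarity search with `minScore` and `maxResults` computes. It is not the `PgVectorEmbeddingStore` code: in the real store this work happens in SQL (pgvector's `<=>` operator returns cosine distance, that is, 1 minus cosine similarity). The class and method names are hypothetical, and using the raw cosine similarity as the score is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory stand-in for the SQL query; illustration only. */
final class CosineSearchSketch {

    /** Cosine similarity of two non-zero vectors of equal dimension. */
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /**
     * Returns the indexes of the stored vectors whose score is at least
     * minScore, best match first, truncated to maxResults entries.
     */
    static List<Integer> findRelevant(List<float[]> stored, float[] query,
                                      int maxResults, double minScore) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            if (cosineSimilarity(stored.get(i), query) >= minScore) {
                hits.add(i);
            }
        }
        // Sort by descending similarity (negated key gives descending order).
        hits.sort(Comparator.comparingDouble(
                (Integer i) -> -cosineSimilarity(stored.get(i), query)));
        return new ArrayList<>(hits.subList(0, Math.min(maxResults, hits.size())));
    }

    public static void main(String[] args) {
        List<float[]> stored = new ArrayList<>();
        stored.add(new float[] {1f, 0f});
        stored.add(new float[] {0f, 1f});
        stored.add(new float[] {1f, 1f});
        System.out.println(findRelevant(stored, new float[] {1f, 0f}, 5, 0.5));
    }
}
```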
2023-10-27 16:45:57 +02:00
ZYinNJU a62c302464
add BOM to manage artifacts (#227)
See #217; a BOM is a good way to manage artifacts.

Users can now import the dependency versions with the code below:

```xml
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>dev.langchain4j</groupId>
                <artifactId>langchain4j-bom</artifactId>
                <version>${langchain4j.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
```
2023-10-27 14:18:04 +02:00
deep-learning-dynamo fb8ec688ea OpenSearch: added Java 11 requirement, slight refactoring 2023-10-09 12:09:24 +02:00
Ricardo Ferreira 79b825df63
Add Support for OpenSearch as Embedding Store (#208)
This PR contains the implementation of an integration with
[OpenSearch](https://opensearch.org/). As one of the growing vector
databases in the open source world, adding support for it to this
project makes total sense. This implementation includes:

1. A complete implementation of the `EmbeddingStore` interface.
2. Unit tests for the major use cases a store must implement correctly.
3. Usage of [TestContainers](https://testcontainers.com/) to automate
the execution of backends.
2023-10-09 11:45:27 +02:00
deep-learning-dynamo 315eab8641 released 0.23.0 2023-09-29 14:27:51 +02:00
Cedrick Lunven c632322493
Cassandra and Astra (dbaas) as VectorStore and ChatMemoryStore (#162)
#### Context

Apache Cassandra is a popular open-source database created back in 2008.
This year, with
[CEP30](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes),
support for vectors and similarity search has been introduced.
Cassandra is very fast for reads and writes and is used as a cache by many
companies, so it is also an opportunity to implement the ChatMemoryStore. This
feature is expected in Cassandra 5 at the end of the year, but some
Docker images are already available.

DataStax AstraDb is a distribution of Apache Cassandra available as SaaS,
with a free tier (free forever) of 80 million queries/month.
[Registration](https://astra.datastax.com). The vector capability is
production ready there.

#### Data Modelling

With the proper data model in Cassandra we can perform similarity
search, keyword search, and metadata search.

```sql
CREATE TABLE sample_vector_table (
    row_id text PRIMARY KEY,
    attributes_blob text,
    body_blob text,
    metadata_s map<text, text>,
    vector vector<float, 1536>
);
```

#### Implementation Thoughts

- The **configurations** for connecting to Astra and Cassandra are not
exactly the same, so 2 different classes with associated builders are
provided:
[Astra](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/cassandra/AstraDbEmbeddingConfiguration.java)
and [OSS
Cassandra](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/cassandra/CassandraEmbeddingConfiguration.java).
A couple of fields are shared, but creating a superclass to inherit
from led to the use of Lombok `@SuperBuilder`, and Javadoc was not
able to figure out what to do with it.

- Instead of passing a large number of arguments like other stores, I
prefer to wrap them in a bean. With this trick you can add or remove
attributes and make them optional or mandatory at will. If you need to add
a new attribute to the configuration, you do not have to change the
implementation of `XXXStore` and `XXXStoreImpl`.

- I created an
[AbstractEmbeddingStore<T>](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/AbstractEmbeddingStore.java)
that could very well become the superclass for any store. It forwards
each call to the real concrete implementation (_delegate
pattern_). Some default implementations can be provided there:

```java
/**
 * Add a list of embeddings to the store.
 *
 * @param embeddings
 *      list of embeddings (hold vector)
 * @return
 *      list of ids
*/
@Override
public List<String> addAll(List<Embedding> embeddings) {
   Objects.requireNonNull(embeddings, "embeddings must not be null");
   return embeddings.stream().map(this::add).collect(Collectors.toList());
}
```

The only method to implement at the Store level is:

```java
/**
* Initialize the concrete implementation.
* @return create implementation class for the store
*/
protected abstract EmbeddingStore<T> loadImplementation()
throws ClassNotFoundException, NoSuchMethodException, InstantiationException,
       IllegalAccessException, InvocationTargetException;
```
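
The delegate pattern described above can be sketched end to end as follows. This is a simplified illustration, not the code from this PR: `SimpleStore` is a hypothetical stand-in for `EmbeddingStore<T>`, all class names are invented, and loading the implementation lazily by class name is an assumption suggested by the reflective exceptions in the signature above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical, simplified stand-in for the EmbeddingStore interface. */
interface SimpleStore {
    String add(float[] embedding);

    // Default behaviour shared by every store: add the embeddings one by one.
    default List<String> addAll(List<float[]> embeddings) {
        Objects.requireNonNull(embeddings, "embeddings must not be null");
        return embeddings.stream().map(this::add).collect(Collectors.toList());
    }
}

/** Forwards every call to a concrete implementation that is loaded on first use. */
abstract class AbstractDelegatingStore implements SimpleStore {
    private SimpleStore delegate;

    /** The only method a concrete store has to provide. */
    protected abstract SimpleStore loadImplementation() throws ReflectiveOperationException;

    private SimpleStore delegate() {
        if (delegate == null) {
            try {
                delegate = loadImplementation();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot load store implementation", e);
            }
        }
        return delegate;
    }

    @Override
    public String add(float[] embedding) {
        return delegate().add(embedding);
    }
}

/** Toy implementation that keeps embeddings in a list and uses the index as id. */
final class InMemoryStoreImpl implements SimpleStore {
    private final List<float[]> rows = new ArrayList<>();

    public InMemoryStoreImpl() {
    }

    @Override
    public String add(float[] embedding) {
        rows.add(embedding);
        return String.valueOf(rows.size() - 1);
    }
}

/** Concrete store that instantiates its implementation class by name. */
final class ReflectiveStore extends AbstractDelegatingStore {
    private final String implClassName;

    ReflectiveStore(String implClassName) {
        this.implClassName = implClassName;
    }

    @Override
    protected SimpleStore loadImplementation() throws ReflectiveOperationException {
        return (SimpleStore) Class.forName(implClassName)
                .getDeclaredConstructor()
                .newInstance();
    }
}
```

A caller would create `new ReflectiveStore(InMemoryStoreImpl.class.getName())` and use it like any other store. Passing a different class name swaps the implementation without touching the calling code.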

-
[CassandraEmbeddingStore](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/cassandra/CassandraEmbeddingStore.java#L30)
provides 2 constructors; one of them lets callers override the implementation class if
they want (extension point).
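
The configuration-bean idea from the earlier bullet can be sketched in plain Java (without Lombok) as below. Every name here is hypothetical: the field list, the defaults and both classes are illustrative and do not mirror `CassandraEmbeddingConfiguration`. The point is only that the store's constructor keeps a single parameter however many attributes the bean gains.

```java
import java.util.Objects;

/** Hypothetical configuration bean; field names are illustrative only. */
final class StoreConfigurationSketch {
    private final String host;      // optional, defaults to localhost
    private final int port;         // optional, defaults to 9042
    private final String keyspace;  // mandatory
    private final String table;     // mandatory
    private final int dimension;    // mandatory, must be positive

    private StoreConfigurationSketch(Builder b) {
        this.host = b.host;
        this.port = b.port;
        this.keyspace = Objects.requireNonNull(b.keyspace, "keyspace is mandatory");
        this.table = Objects.requireNonNull(b.table, "table is mandatory");
        if (b.dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        this.dimension = b.dimension;
    }

    static Builder builder() { return new Builder(); }

    String host() { return host; }
    int port() { return port; }
    String keyspace() { return keyspace; }
    String table() { return table; }
    int dimension() { return dimension; }

    static final class Builder {
        private String host = "localhost";
        private int port = 9042; // Cassandra's default native protocol port
        private String keyspace;
        private String table;
        private int dimension;

        Builder host(String host) { this.host = host; return this; }
        Builder port(int port) { this.port = port; return this; }
        Builder keyspace(String keyspace) { this.keyspace = keyspace; return this; }
        Builder table(String table) { this.table = table; return this; }
        Builder dimension(int dimension) { this.dimension = dimension; return this; }

        // Validation of mandatory attributes happens in the bean's constructor.
        StoreConfigurationSketch build() { return new StoreConfigurationSketch(this); }
    }
}

/** Hypothetical store whose constructor signature never changes when the bean grows. */
final class ConfiguredStoreSketch {
    private final StoreConfigurationSketch config;

    ConfiguredStoreSketch(StoreConfigurationSketch config) {
        this.config = Objects.requireNonNull(config, "config must not be null");
    }

    String describe() {
        return config.host() + ":" + config.port() + "/" + config.keyspace()
                + "." + config.table() + " (dimension " + config.dimension() + ")";
    }
}
```

Adding a new optional attribute then means one new field with a default and one new builder method; existing callers keep compiling.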

#### Tests

- Test classes are provided, including some long-form examples based on
classes found in `langchain4j-examples`, but the tests are disabled.

- To start a local Cassandra, use Docker and the
[docker-compose](https://github.com/clun/langchain4j/blob/main/langchain4j-cassandra/src/test/resources/docker-compose.yml)

```
docker compose up -d
```

- To run the tests with Astra, sign in with your GitHub account and create a token
(API key) with the role `Organization Administrator` following this
[procedure](https://awesome-astra.github.io/docs/pages/astra/create-token/#c-procedure)

<img width="926" alt="Screenshot 2023-09-06 at 18 14 12"
src="https://github.com/langchain4j/langchain4j/assets/726536/dfd2d9e5-09c9-4504-bfaa-31cfd87704a1">

- Copy the full value of the `token` field from the JSON

<img width="713" alt="Screenshot 2023-09-06 at 18 15 53"
src="https://github.com/langchain4j/langchain4j/assets/726536/1be56234-dd98-4f59-af71-03df42ed6997">

- Create the environment variable `ASTRA_DB_APPLICATION_TOKEN`

```console
export ASTRA_DB_APPLICATION_TOKEN=AstraCS:....<your_token>
```
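
On the Java side, a test could check for the variable before talking to Astra. The sketch below is an assumption about how such a guard might look, not code from this PR: the class and method names are hypothetical, and the `AstraCS:` prefix check is inferred only from the example value above.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical guard for Astra tests; not part of this PR. */
final class AstraTokenSketch {
    static final String ENV_NAME = "ASTRA_DB_APPLICATION_TOKEN";

    /** Returns the token if it is set and looks like an Astra token. */
    static Optional<String> resolveToken(Map<String, String> env) {
        String value = env.get(ENV_NAME);
        if (value == null || !value.startsWith("AstraCS:")) {
            return Optional.empty();
        }
        return Optional.of(value);
    }

    public static void main(String[] args) {
        // System.getenv() returns the process environment as a map.
        boolean canRun = resolveToken(System.getenv()).isPresent();
        System.out.println(canRun
                ? "Astra tests can run"
                : "Astra tests skipped: " + ENV_NAME + " is not set");
    }
}
```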
2023-09-27 15:50:04 +02:00
ZYinNJU 887120b409
Redis integration as embedding store (#174) 2023-09-26 09:29:08 +02:00
LangChain4j ed854871d4
Extracted model providers to separate modules (#190) 2023-09-24 20:11:09 +02:00
LangChain4j 118306fdba
Moved Chroma integration to a separate module (#178) 2023-09-17 16:51:11 +02:00
Heezer 70c6965f74
Vespa support (#132)
* PR contains small fixes for Weaviate too.
* Example: https://github.com/langchain4j/langchain4j-examples/pull/8
2023-09-14 16:58:47 +02:00
ZYinNJU 3bffc971df
Integration with Elastic (#95)
I've done the integration with Elastic and ran some local tests to ensure it
works (some logic is translated from LangChain Python to Java).

Elasticsearch does not support `Gson`, so we must have a `Jackson`
dependency.
2023-09-02 20:32:46 +02:00
deep-learning-dynamo c1cc5be1c7 released 0.22.0 2023-08-29 19:21:56 +02:00
kuraleta 88b56778f4
Integration with Google Vertex AI (#135) 2023-08-28 21:30:18 +02:00
deep-learning-dynamo db1f236ed2 released 0.21.0 2023-08-19 15:57:39 +02:00
jiangsier-xyz d908f5158a
Integrate the Qwen series models via dashscope-sdk. (#99)
Qwen series models are provided by Alibaba Cloud. They are much better
at Asian languages than other LLMs.

DashScope is a model service platform. Qwen models are its primary
supported models, but it also supports other series such as LLaMA2, Dolly,
ChatGLM, and BiLLa (based on LLaMA). These may be integrated sometime in the
future.
2023-08-18 20:49:50 +02:00
Iurii Koval ec4a673b52
Add Milvus support (#58)
Authored-by: iurii.koval <koval.iurii@protonmail.com>
2023-08-18 20:38:45 +02:00
deep-learning-dynamo d7b96ca9a6 released 0.20.0 2023-08-14 00:44:07 +02:00
deep-learning-dynamo 1541f214c1 released 0.19.0 2023-08-10 14:34:21 +02:00
Julien Perrochet 5659e8e2ba
[misc] migrate embeddings-related projects out of project (#72)
In-process embeddings are moved to their own repository
(https://github.com/langchain4j/langchain4j-embeddings) due to the size
of the involved files.

Note that once this PR is merged we'd ideally need a release so as to
have the `langchain4j-embeddings` related project depend on that
version.
2023-08-06 20:02:28 +02:00
Heezer d45ddbfc7c
Weaviate support (#57)
Authored-by: Titov, Alexey <alexey.titov@adesso.de>
2023-08-06 17:03:39 +02:00
deep-learning-dynamo d4fca658c1 released 0.18.0 2023-07-26 21:19:24 +02:00
LangChain4j 529ef6b647
Added in-process embedding models (#41)
- all-minilm-l6-v2
- all-minilm-l6-v2-q
- e5-small-v2
- e5-small-v2-q

The idea is to give users an option to embed documents/texts in the same
Java process without any external dependencies.
ONNX Runtime is used to run the models inside the JVM.
Each model resides in its own Maven module (inside the jar).
2023-07-23 19:05:13 +02:00
deep-learning-dynamo 1976560aeb released 0.16.0 2023-07-18 10:49:43 +02:00
deep-learning-dynamo e439f96466 released 0.15.0 2023-07-18 00:13:08 +02:00
deep-learning-dynamo 14185653c7 released 0.14.0 2023-07-16 12:15:31 +02:00
deep-learning-dynamo 120c6a01d8 released 0.13.0 2023-07-15 17:53:10 +02:00
deep-learning-dynamo 52b7c3b441 released a hotfix for https://github.com/langchain4j/langchain4j/issues/23 2023-07-14 19:18:47 +02:00
Julien Perrochet c451a220d9
[build] Introduce a parent pom (#15)
Have a parent pom that contains most/all common things for the
sub-projects.

Note that it is separate from the root aggregator pom: not mixing the
aggregator and the parents makes things slightly easier.

If this change makes it harder to do releases, there might be a
possibility to generate the effective poms for each subproject, but on
the other hand releasing everything should not be too problematic.
2023-07-13 22:59:25 +02:00
deep-learning-dynamo 17654e31d0 released 0.11.0 2023-07-11 20:50:57 +02:00
deep-learning-dynamo d645a8d5c7 released 0.10.0 2023-07-05 18:55:20 +02:00
deep-learning-dynamo 721a330228 released 0.9.0 2023-07-03 15:12:43 +02:00
deep-learning-dynamo acb1e641c0 released 0.8.0 2023-07-02 23:13:13 +02:00
deep-learning-dynamo 1805278e11 added spring boot starter 2023-07-02 22:07:27 +02:00
deep-learning-dynamo 6e85f7f06c - added support for tools
- released 0.7.0
2023-07-01 18:33:50 +02:00
deep-learning-dynamo fa9646145d Released 0.6.0 2023-06-29 22:15:38 +02:00
deep-learning-dynamo 7a349d0045 Released 0.5.0 2023-06-26 20:20:57 +02:00
DeepLearning Dynamo 0df4ec6345 added parent pom 2023-06-24 09:07:23 +02:00