<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-aggregator</artifactId>
<version>0.24.0</version>
<packaging>pom</packaging>
<modules>
<module>langchain4j-parent</module>
<module>langchain4j-bom</module>
<module>langchain4j-core</module>
<module>langchain4j</module>
<module>langchain4j-spring-boot-starter</module>
<!-- model providers -->
<module>langchain4j-azure-open-ai</module>
<module>langchain4j-quarkus</module>
<module>langchain4j-bedrock</module>
<module>langchain4j-dashscope</module>
<module>langchain4j-hugging-face</module>
<module>langchain4j-local-ai</module>
<module>langchain4j-open-ai</module>
<module>langchain4j-vertex-ai</module>
<module>langchain4j-ollama</module>
<!-- embedding stores -->
Cassandra and Astra (dbaas) as VectorStore and ChatMemoryStore (#162)
#### Context
Apache Cassandra is a popular open-source database created back in 2008.
This year,
[CEP-30](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes)
introduced support for vector and similarity searches. The feature is
expected in Cassandra 5 at the end of the year, but some Docker images
are already available. Cassandra is also very fast for reads and writes
and is used as a cache by many companies, which makes it a good
opportunity to implement the ChatMemoryStore as well.
DataStax AstraDb is a distribution of Apache Cassandra available as SaaS,
with a free tier (free forever) of 80 million queries/month
([registration](https://astra.datastax.com)). The vector capability there
is already production ready.
#### Data Modelling
With the proper data model in Cassandra we can perform similarity
search, keyword search, and metadata search.
```sql
CREATE TABLE sample_vector_table (
    row_id          text PRIMARY KEY,
    attributes_blob text,
    body_blob       text,
    metadata_s      map<text, text>,
    vector          vector<float, 1536>
);
```
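As a rough illustration of the similarity search this table is meant to support, the sketch below renders an ANN query string in Java. It assumes the `ORDER BY ... ANN OF` syntax described in CEP-30 and an existing storage-attached index on the `vector` column; the class and method names are hypothetical and are not part of this PR.

```java
import java.util.StringJoiner;

/** Hypothetical helper (not part of this PR): renders CQL for the sample table. */
final class VectorQuerySketch {

    private VectorQuerySketch() {
    }

    /**
     * Builds an approximate-nearest-neighbour query.
     * Assumes a storage-attached index already exists on the "vector" column.
     */
    static String annQueryCql(String table, float[] embedding, int limit) {
        if (embedding.length == 0 || limit <= 0) {
            throw new IllegalArgumentException("embedding must not be empty and limit must be positive");
        }
        // Render the embedding as a CQL vector literal, e.g. [0.5, 1.0]
        StringJoiner literal = new StringJoiner(", ", "[", "]");
        for (float value : embedding) {
            literal.add(Float.toString(value));
        }
        return "SELECT row_id, body_blob FROM " + table
                + " ORDER BY vector ANN OF " + literal
                + " LIMIT " + limit;
    }
}
```

In real code the vector would normally be passed as a bound parameter of a prepared statement rather than concatenated into the query text.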
#### Implementation Thoughts
- The **configuration** needed to connect to Astra and to Cassandra is not
exactly the same, so two different classes, each with its own builder, are
provided:
[Astra](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/cassandra/AstraDbEmbeddingConfiguration.java)
and [OSS
Cassandra](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/cassandra/CassandraEmbeddingConfiguration.java).
A couple of fields are shared, but creating a superclass to inherit
from led to the use of Lombok `@SuperBuilder`, which the Javadoc tool
was not able to process.
- Instead of passing a large number of arguments like other stores, I
prefer to wrap them in a bean. With this trick you can add or remove
attributes and make them optional or mandatory at will. If you need to add
a new attribute to the configuration, you do not have to change the
implementation of `XXXStore` and `XXXStoreImpl`.
- I created an
[AbstractEmbeddingStore<T>](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/AbstractEmbeddingStore.java)
that could very well become the superclass for any store. It forwards
each call to the real concrete implementation (_delegate
pattern_). Some default implementations can be provided there:
```java
/**
 * Add a list of embeddings to the store.
 *
 * @param embeddings
 *      list of embeddings (hold vector)
 * @return
 *      list of ids
 */
@Override
public List<String> addAll(List<Embedding> embeddings) {
    Objects.requireNonNull(embeddings, "embeddings must not be null");
    return embeddings.stream().map(this::add).collect(Collectors.toList());
}
```
The only method to implement at the Store level is:
```java
/**
 * Initialize the concrete implementation.
 * @return create implementation class for the store
 */
protected abstract EmbeddingStore<T> loadImplementation()
        throws ClassNotFoundException, NoSuchMethodException, InstantiationException,
               IllegalAccessException, InvocationTargetException;
```
-
[CassandraEmbeddingStore](https://github.com/clun/langchain4j/blob/main/langchain4j/src/main/java/dev/langchain4j/store/embedding/cassandra/CassandraEmbeddingStore.java#L30)
provides two constructors; one of them lets you override the
implementation class if you want (extension point).
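The points above (configuration bean, delegating abstract store, overridable implementation class) can be sketched together as follows. This is a simplified, self-contained illustration, not the code from this PR: `SimpleStore`, `StoreConfig`, `InMemoryStoreImpl`, `AbstractDelegatingStore` and `ReflectiveStore` are hypothetical stand-ins, and vectors are plain `float[]` instead of the library's `Embedding` type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Simplified stand-in for a store interface (hypothetical, not the langchain4j API). */
interface SimpleStore {
    String add(float[] vector);
}

/** Configuration bean: new settings are added here, store signatures stay unchanged. */
final class StoreConfig {
    final String table;
    final int dimension;

    StoreConfig(String table, int dimension) {
        this.table = Objects.requireNonNull(table, "table must not be null");
        this.dimension = dimension;
    }
}

/** Default concrete implementation, created reflectively from its class name. */
final class InMemoryStoreImpl implements SimpleStore {
    private final StoreConfig config;
    private final List<float[]> rows = new ArrayList<>();

    public InMemoryStoreImpl(StoreConfig config) {
        this.config = config;
    }

    @Override
    public String add(float[] vector) {
        if (vector.length != config.dimension) {
            throw new IllegalArgumentException("expected dimension " + config.dimension);
        }
        rows.add(vector.clone());
        return config.table + "-" + rows.size();
    }
}

/** Forwards every call to a lazily loaded delegate and hosts default implementations. */
abstract class AbstractDelegatingStore implements SimpleStore {
    private SimpleStore delegate;

    /** The only method a concrete store has to provide. */
    protected abstract SimpleStore loadImplementation() throws ReflectiveOperationException;

    private SimpleStore delegate() {
        if (delegate == null) {
            try {
                delegate = loadImplementation();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot load store implementation", e);
            }
        }
        return delegate;
    }

    @Override
    public String add(float[] vector) {
        return delegate().add(vector);
    }

    /** Default implementation shared by all stores, built on top of add(). */
    public List<String> addAll(List<float[]> vectors) {
        Objects.requireNonNull(vectors, "vectors must not be null");
        return vectors.stream().map(this::add).collect(Collectors.toList());
    }
}

/** Two constructors: the second one is the extension point for a custom implementation. */
final class ReflectiveStore extends AbstractDelegatingStore {
    private final StoreConfig config;
    private final String implClassName;

    ReflectiveStore(StoreConfig config) {
        this(config, InMemoryStoreImpl.class.getName());
    }

    ReflectiveStore(StoreConfig config, String implClassName) {
        this.config = Objects.requireNonNull(config, "config must not be null");
        this.implClassName = Objects.requireNonNull(implClassName, "implClassName must not be null");
    }

    @Override
    protected SimpleStore loadImplementation() throws ReflectiveOperationException {
        // Requires a public constructor taking the configuration bean.
        return (SimpleStore) Class.forName(implClassName)
                .getConstructor(StoreConfig.class)
                .newInstance(config);
    }
}
```

The five checked exceptions listed on `loadImplementation()` in the PR are all subclasses of `ReflectiveOperationException`, which is why the sketch declares only that one type.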
#### Tests
- Test classes are provided, including some long-form examples based on
classes found in `langchain4j-examples`, but the tests are disabled.
- To start a local Cassandra, use Docker and the
[docker-compose](https://github.com/clun/langchain4j/blob/main/langchain4j-cassandra/src/test/resources/docker-compose.yml) file:
```
docker compose up -d
```
- To run the tests with Astra, sign in with your GitHub account and create a token
(API key) with the role `Organization Administrator` following this
[procedure](https://awesome-astra.github.io/docs/pages/astra/create-token/#c-procedure)
<img width="926" alt="Screenshot 2023-09-06 at 18 14 12"
src="https://github.com/langchain4j/langchain4j/assets/726536/dfd2d9e5-09c9-4504-bfaa-31cfd87704a1">
- Pick the full value of the `token` field from the JSON
<img width="713" alt="Screenshot 2023-09-06 at 18 15 53"
src="https://github.com/langchain4j/langchain4j/assets/726536/1be56234-dd98-4f59-af71-03df42ed6997">
- Create the environment variable `ASTRA_DB_APPLICATION_TOKEN`
```console
export ASTRA_DB_APPLICATION_TOKEN=AstraCS:....<your_token>
```
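On the Java side, a test could read this variable and skip itself when it is missing. The sketch below is one possible way to do that; the class name is hypothetical, and the `AstraCS:` prefix check is an assumption based only on the example value shown above.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of this PR) for locating the Astra token. */
final class AstraTokenSketch {

    static final String TOKEN_VARIABLE = "ASTRA_DB_APPLICATION_TOKEN";

    private AstraTokenSketch() {
    }

    /**
     * Returns the token if the variable is set and looks like an Astra token.
     * The "AstraCS:" prefix check is an assumption taken from the example above.
     */
    static Optional<String> resolveToken(Map<String, String> environment) {
        String value = environment.get(TOKEN_VARIABLE);
        if (value == null || !value.startsWith("AstraCS:")) {
            return Optional.empty();
        }
        return Optional.of(value);
    }
}
```

Calling `AstraTokenSketch.resolveToken(System.getenv())` at the start of an Astra test lets the test be skipped, rather than fail, when no token is configured.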
<module>langchain4j-cassandra</module>
<module>langchain4j-chroma</module>
<module>langchain4j-elasticsearch</module>
<module>langchain4j-milvus</module>
<module>langchain4j-opensearch</module>
<module>langchain4j-pgvector</module>
<module>langchain4j-pinecone</module>
<module>langchain4j-redis</module>
<module>langchain4j-vespa</module>
<module>langchain4j-weaviate</module>
<!-- other -->
<module>langchain4j-graal</module>
</modules>
</project>