POC: Easy RAG (#686)
Implementing RAG applications is hard, especially for those who are just
getting started exploring LLMs and RAG.
This PR introduces an "Easy RAG" feature that should help developers get
started with RAG as easily as possible.
With it, there is no need to learn about
chunking/splitting/segmentation, embeddings, embedding models, vector
databases, retrieval techniques, and other RAG-related concepts.
This is similar to how one can simply upload one or multiple files into
[OpenAI Assistants
API](https://platform.openai.com/docs/assistants/overview) and the LLM
will automagically know about their contents when answering questions.
Easy RAG uses a local embedding model running on your CPU (GPU support
can be added later).
Your files are ingested into an in-memory embedding store.
Please note that "Easy RAG" is not meant to replace manual RAG setups,
and especially not [advanced RAG
techniques](https://github.com/langchain4j/langchain4j/pull/538); rather,
it provides an easier way to get started with RAG.
The quality of "Easy RAG" should be sufficient for demos, proofs of
concept, and getting started.
To use "Easy RAG", simply import `langchain4j-easy-rag` dependency that
includes everything needed to do RAG:
- Apache Tika document loader (to parse all document types
automatically)
- Quantized [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) in-process embedding model, which has an impressive (for its size) retrieval [score](https://huggingface.co/spaces/mteb/leaderboard) of 51.68
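If needed, the bundled model can also be instantiated directly. A minimal sketch, assuming the `BgeSmallEnV15QuantizedEmbeddingModel` class from the `langchain4j-embeddings-bge-small-en-v15-q` module (the package layout may differ between versions):
```java
import dev.langchain4j.model.embedding.BgeSmallEnV15QuantizedEmbeddingModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

// Runs fully in-process on the CPU; no API key or network access required.
EmbeddingModel embeddingModel = new BgeSmallEnV15QuantizedEmbeddingModel();
```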
Here is the proposed API:
```java
List<Document> documents = FileSystemDocumentLoader.loadDocuments(directoryPath); // one can also load documents recursively and filter with glob/regex
EmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>(); // we will use an in-memory embedding store for simplicity
EmbeddingStoreIngestor.ingest(documents, embeddingStore);
Assistant assistant = AiServices.builder(Assistant.class)
.chatLanguageModel(model)
.contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
.build();
String answer = assistant.chat("Who is Charlie?"); // Charlie is a carrot...
```
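The `Assistant` interface and `model` are not shown above; `model` is assumed to be any configured `ChatLanguageModel`, and a minimal `Assistant` could look like this:
```java
// Any single-method interface works with AiServices; the shape below is just an example.
interface Assistant {

    String chat(String userMessage);
}
```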
`FileSystemDocumentLoader` in the above code loads documents using a
`DocumentParser` available on the classpath via SPI, in this case the
`ApacheTikaDocumentParser` imported with the `langchain4j-easy-rag`
dependency.
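As the first comment in the snippet notes, documents can also be loaded recursively and filtered with a glob or regex. A hedged sketch of that usage (method names assumed from `FileSystemDocumentLoader`; exact signatures may vary between versions):
```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;

// Traverse subdirectories and keep only PDF files; "glob:" can be swapped for "regex:".
PathMatcher onlyPdfs = FileSystems.getDefault().getPathMatcher("glob:**.pdf");
List<Document> documents = FileSystemDocumentLoader.loadDocumentsRecursively(directoryPath, onlyPdfs);
```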
The `EmbeddingStoreIngestor` in the above code:
- splits documents into smaller text segments using a `DocumentSplitter`
loaded via SPI from the `langchain4j-easy-rag` dependency; currently it
uses `DocumentSplitters.recursive(300, 30, new HuggingFaceTokenizer())`
- embeds text segments using the quantized bge-small-en-v1.5 embedding
model (`BgeSmallEnV15QuantizedEmbeddingModel`, matching the dependency
above) loaded via SPI from the `langchain4j-easy-rag` dependency
- stores text segments and their embeddings into the specified embedding
store
When using `InMemoryEmbeddingStore`, one can serialize/persist it into a
JSON string or into a file.
This way one can skip loading documents and embedding them on each
application run.
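A minimal sketch of that round trip, assuming the `serializeToFile`/`fromFile` helpers on `InMemoryEmbeddingStore`:
```java
// Declare the store as InMemoryEmbeddingStore (not the EmbeddingStore interface)
// so the serialization helpers are visible.
InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, store);
store.serializeToFile(filePath); // persist after the first run

// On subsequent runs, restore the store instead of re-ingesting the documents:
InMemoryEmbeddingStore<TextSegment> restored = InMemoryEmbeddingStore.fromFile(filePath);
```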
It is easy to customize the ingestion in the above code: just change
```java
EmbeddingStoreIngestor.ingest(documents, embeddingStore);
```
into
```java
EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
//.documentTransformer(...) // you can optionally transform (clean, enrich, etc) documents before splitting
//.documentSplitter(...) // you can optionally specify another splitter
//.textSegmentTransformer(...) // you can optionally transform (clean, enrich, etc) segments before embedding
//.embeddingModel(...) // you can optionally specify another embedding model to use for embedding
.embeddingStore(embeddingStore)
.build();
ingestor.ingest(documents);
```
Over time, we can add an auto-eval feature that will find the most
suitable hyperparameters for the given documents (e.g. which embedding
model to use, which splitting method, possibly advanced RAG techniques,
etc.) so that "Easy RAG" can be comparable to advanced RAG.
Related:
https://github.com/langchain4j/langchain4j-embeddings/pull/16
---------
Co-authored-by: dliubars <dliubars@redhat.com>
The `pom.xml` of the new `langchain4j-easy-rag` module:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-parent</artifactId>
        <version>0.36.0-SNAPSHOT</version>
        <relativePath>../langchain4j-parent/pom.xml</relativePath>
    </parent>

    <artifactId>langchain4j-easy-rag</artifactId>
    <packaging>jar</packaging>

    <name>LangChain4j :: Easy RAG</name>

    <properties>
        <!-- TODO: remove enforcer.skipRules -->
        <enforcer.skipRules>dependencyConvergence</enforcer.skipRules>
    </properties>
    <dependencies>

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-tika</artifactId>
            <version>${project.version}</version>
        </dependency>

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-embeddings-bge-small-en-v15-q</artifactId>
            <version>${project.version}</version>
        </dependency>

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-open-ai</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-engine</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-params</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.assertj</groupId>
            <artifactId>assertj-core</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.tinylog</groupId>
            <artifactId>tinylog-impl</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.tinylog</groupId>
            <artifactId>slf4j-tinylog</artifactId>
            <scope>test</scope>
        </dependency>

    </dependencies>

</project>
```