## Issue
#1506
## Change
Enabled the Maven Enforcer Plugin on modules that have no existing version
conflicts, to ensure they remain conflict-free. The plugin will now fail
the build if new dependency version conflicts are introduced, guarding
against regressions.
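For reference, the enforcer setup looks roughly like the following (a sketch only; the exact rule set enabled in this PR may differ):
```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <executions>
        <execution>
            <id>enforce</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <!-- fails the build when transitive dependencies resolve to conflicting versions -->
                    <dependencyConvergence/>
                </rules>
            </configuration>
        </execution>
    </executions>
</plugin>
```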
## Tests
`mvn clean test` passed
This is a small refactoring: there are a bunch of places where deprecated
methods are used, and these changes fix that.
## General checklist
<!-- Please double-check the following points and mark them like this:
[X] -->
- [x] There are no breaking changes
- [ ] I have added unit and integration tests for my change
- [x] I have manually run all the unit and integration tests in the
module I have added/changed, and they are all green
- [x] I have manually run all the unit and integration tests in the
[core](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-core)
and
[main](https://github.com/langchain4j/langchain4j/tree/main/langchain4j)
modules, and they are all green
<!-- Before adding documentation and example(s) (below), please wait
until the PR is reviewed and approved. -->
- [ ] I have added/updated the
[documentation](https://github.com/langchain4j/langchain4j/tree/main/docs/docs)
- [ ] I have added an example in the [examples
repo](https://github.com/langchain4j/langchain4j-examples) (only for
"big" features)
Implementing RAG applications is hard, especially for those who are just
getting started exploring LLMs and RAG.
This PR introduces an "Easy RAG" feature that should help developers get
started with RAG as easily as possible.
With it, there is no need to learn about
chunking/splitting/segmentation, embeddings, embedding models, vector
databases, retrieval techniques and other RAG-related concepts.
This is similar to how one can simply upload one or multiple files into
[OpenAI Assistants
API](https://platform.openai.com/docs/assistants/overview) and the LLM
will automagically know about their contents when answering questions.
Easy RAG uses a local embedding model running on your CPU (GPU support
can be added later).
Your files are ingested into an in-memory embedding store.
Please note that "Easy RAG" will not replace manual RAG setups and
especially [advanced RAG
techniques](https://github.com/langchain4j/langchain4j/pull/538), but
will provide an easier way to get started with RAG.
The quality of "Easy RAG" should be sufficient for demos, proofs of
concept, and getting started.
To use "Easy RAG", simply import the `langchain4j-easy-rag` dependency
(see the Maven snippet below), which includes everything needed to do RAG:
- Apache Tika document loader (to parse all document types
automatically)
- Quantized [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) in-process embedding model, which has an impressive (for its size) 51.68 retrieval [score](https://huggingface.co/spaces/mteb/leaderboard)
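The Maven coordinates (version shown as a property placeholder; use the release you are on):
```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-easy-rag</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
```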
Here is the proposed API:
```java
// "Assistant" is a user-defined AI Service interface with a `String chat(String userMessage)` method;
// "model" is any ChatLanguageModel instance
List<Document> documents = FileSystemDocumentLoader.loadDocuments(directoryPath); // one can also load documents recursively and filter with a glob/regex

EmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>(); // an in-memory embedding store is used for simplicity
EmbeddingStoreIngestor.ingest(documents, embeddingStore);

Assistant assistant = AiServices.builder(Assistant.class)
        .chatLanguageModel(model)
        .contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
        .build();

String answer = assistant.chat("Who is Charlie?"); // Charlie is a carrot...
```
`FileSystemDocumentLoader` in the above code loads documents using a
`DocumentParser` available on the classpath via SPI, in this case the
`ApacheTikaDocumentParser` imported with the `langchain4j-easy-rag`
dependency.
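If more control is needed, the loader also has overloads that take an explicit parser and a glob/regex `PathMatcher` (a sketch under that assumption; the directory path and glob are just examples):
```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;

// load only PDF files from the directory, parsing them with Apache Tika explicitly
PathMatcher onlyPdfs = FileSystems.getDefault().getPathMatcher("glob:*.pdf");
List<Document> documents = FileSystemDocumentLoader.loadDocuments(directoryPath, onlyPdfs, new ApacheTikaDocumentParser());

// or load all documents from the whole directory tree
List<Document> allDocuments = FileSystemDocumentLoader.loadDocumentsRecursively(directoryPath);
```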
The `EmbeddingStoreIngestor` in the above code (see the sketch after this list):
- splits documents into smaller text segments using a `DocumentSplitter`
loaded via SPI from the `langchain4j-easy-rag` dependency. Currently it
uses `DocumentSplitters.recursive(300, 30, new HuggingFaceTokenizer())`
- embeds text segments using an `AllMiniLmL6V2QuantizedEmbeddingModel`
loaded via SPI from the `langchain4j-easy-rag` dependency
- stores text segments and their embeddings into the specified embedding
store
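Conceptually, with the Easy RAG defaults listed above, the ingestion boils down to something like this (a simplified sketch, not the actual SPI-loaded implementation; `documents` and `embeddingStore` are the variables from the earlier snippet):
```java
// defaults as described above: recursive splitter (300 tokens, 30 overlap) and a quantized local model
DocumentSplitter splitter = DocumentSplitters.recursive(300, 30, new HuggingFaceTokenizer());
EmbeddingModel embeddingModel = new AllMiniLmL6V2QuantizedEmbeddingModel();

for (Document document : documents) {
    List<TextSegment> segments = splitter.split(document);                    // split into segments
    List<Embedding> embeddings = embeddingModel.embedAll(segments).content(); // embed each segment
    embeddingStore.addAll(embeddings, segments);                              // store embeddings with their segments
}
```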
When using `InMemoryEmbeddingStore`, one can serialize/persist it into a
JSON string or into a file.
This way one can skip loading documents and embedding them on each
application run.
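A minimal sketch of that round trip (the variable is typed as `InMemoryEmbeddingStore` because the serialization methods live on that class; the file name is arbitrary):
```java
import java.nio.file.Path;

InMemoryEmbeddingStore<TextSegment> inMemoryStore = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, inMemoryStore);

// persist the store so documents don't have to be re-loaded and re-embedded next time
inMemoryStore.serializeToFile(Path.of("embedding-store.json"));

// on a later run, restore it instead of re-ingesting
InMemoryEmbeddingStore<TextSegment> restored = InMemoryEmbeddingStore.fromFile(Path.of("embedding-store.json"));
```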
It is easy to customize the ingestion in the above code; just change
```java
EmbeddingStoreIngestor.ingest(documents, embeddingStore);
```
into
```java
EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
        //.documentTransformer(...)    // optionally transform (clean, enrich, etc.) documents before splitting
        //.documentSplitter(...)       // optionally use another splitter
        //.textSegmentTransformer(...) // optionally transform (clean, enrich, etc.) segments before embedding
        //.embeddingModel(...)         // optionally use another embedding model
        .embeddingStore(embeddingStore)
        .build();

ingestor.ingest(documents);
```
Over time, we can add an auto-eval feature that will find the most
suitable hyperparameters for the given documents (e.g., which embedding
model to use, which splitting method, possibly advanced RAG techniques,
etc.) so that "Easy RAG" can be comparable to "advanced RAG".
Related:
https://github.com/langchain4j/langchain4j-embeddings/pull/16
---------
Co-authored-by: dliubars <dliubars@redhat.com>