langchain4j/document-transformers/langchain4j-document-transf...
LangChain4j 42c958a458
Extract HtmlTextExtractor into its own module (#1811)
## Issue
Closes #1049

## Change
Extracted `HtmlTextExtractor` into its own module,
`langchain4j-document-transformer-jsoup`.
Renamed `HtmlToTextDocumentTransformer` to `HtmlTextExtractor`.

To use it, add this dependency:
```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-document-transformer-jsoup</artifactId>
    <version>0.35.0</version>
</dependency>
```
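
A minimal usage sketch, assuming the dependency above is on the classpath. The PR text does not state the package of the renamed class, so the import `dev.langchain4j.data.document.transformer.jsoup.HtmlTextExtractor` is an assumption, as is the use of the no-argument constructor to extract text from the whole page. The class name `HtmlTextExtractorExample` and the helper `extractText` are hypothetical and exist only for illustration.

```java
import dev.langchain4j.data.document.Document;
// Assumed package for the extracted class; verify against the module's sources.
import dev.langchain4j.data.document.transformer.jsoup.HtmlTextExtractor;

public class HtmlTextExtractorExample {

    // Hypothetical helper: wraps raw HTML in a Document and returns the extracted plain text.
    static String extractText(String html) {
        Document htmlDocument = Document.from(html);
        // No-argument constructor: assumed to process the whole page without a CSS selector.
        HtmlTextExtractor extractor = new HtmlTextExtractor();
        Document textDocument = extractor.transform(htmlDocument);
        return textDocument.text();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Title</h1><p>Hello world</p></body></html>";
        System.out.println(extractText(html));
    }
}
```

Code that previously referenced `HtmlToTextDocumentTransformer` would need both the new dependency and the new class name.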

## General checklist
- [ ] There are no breaking changes
- [ ] I have added unit and integration tests for my change
- [X] I have manually run all the unit and integration tests in the
module I have added/changed, and they are all green
- [X] I have manually run all the unit and integration tests in the
[core](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-core)
and
[main](https://github.com/langchain4j/langchain4j/tree/main/langchain4j)
modules, and they are all green
- [X] I have added/updated the
[documentation](https://github.com/langchain4j/langchain4j/tree/main/docs/docs)
- [ ] I have added an example in the [examples
repo](https://github.com/langchain4j/langchain4j-examples) (only for
"big" features)
- [ ] I have added/updated [Spring Boot
starter(s)](https://github.com/langchain4j/langchain4j-spring) (if
applicable)
2024-09-24 15:08:23 +02:00
src Extract HtmlTextExtractor into its own module (#1811) 2024-09-24 15:08:23 +02:00
pom.xml Extract HtmlTextExtractor into its own module (#1811) 2024-09-24 15:08:23 +02:00