## Context
See https://github.com/langchain4j/langchain4j/issues/804
## Change
- `OpenAiStreamingChatModel`: if `modelName` is not one of the known OpenAI models, do not return `TokenUsage` in the `Response`. This covers the case where `OpenAiStreamingChatModel` is used to connect to other OpenAI-API-compatible LLM providers such as Ollama and Groq; in such cases it is better to return no `TokenUsage` than a wrong one.
- For all OpenAI models, the default `Tokenizer` will now use the "gpt-3.5-turbo" model name instead of the one provided by the user in the `modelName` parameter. This avoids crashing with "Model 'ft:gpt-3.5-turbo:my-org:custom_suffix:id' is unknown to jtokkit" for fine-tuned OpenAI models. It should be safe to use "gpt-3.5-turbo" by default with all current OpenAI models, as they all use the same cl100k_base encoding.
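To illustrate both changes, here is a minimal hedged sketch (the Ollama base URL, model names, and API key value are assumptions for illustration, not part of this diff):

```java
import dev.langchain4j.model.Tokenizer;
import dev.langchain4j.model.chat.StreamingChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.openai.OpenAiTokenizer;

// 1) With a modelName unknown to OpenAI (e.g. when pointing at an
//    OpenAI-compatible Ollama endpoint), the streamed Response should
//    now carry no TokenUsage instead of a wrong one.
StreamingChatLanguageModel model = OpenAiStreamingChatModel.builder()
        .baseUrl("http://localhost:11434/v1")
        .apiKey("not-needed-for-ollama")
        .modelName("llama2") // not a known OpenAI model
        .build();

// 2) The default Tokenizer now always counts with the "gpt-3.5-turbo"
//    (cl100k_base) encoding, even for fine-tuned names like
//    "ft:gpt-3.5-turbo:my-org:custom_suffix:id" that jtokkit does not know.
Tokenizer tokenizer = new OpenAiTokenizer("gpt-3.5-turbo");
int tokens = tokenizer.estimateTokenCountInText("Hello, world!");
```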
## Checklist
Before submitting this PR, please check the following points:
- [X] I have added unit and integration tests for my change
- [X] All unit and integration tests in the module I have added/changed
are green
- [X] All unit and integration tests in the
[core](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-core)
and
[main](https://github.com/langchain4j/langchain4j/tree/main/langchain4j)
modules are green
- [ ] I have added/updated the
[documentation](https://github.com/langchain4j/langchain4j/tree/main/docs/docs)
- [ ] I have added an example in the [examples
repo](https://github.com/langchain4j/langchain4j-examples) (only for
"big" features)
- [ ] I have added my new module in the
[BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml)
(only when a new module is added)
Public Ollama Client
- a method to list models
- a method to get model details
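For reference, a minimal sketch of the two documented Ollama endpoints such a public client would wrap (plain `java.net.http` here just to show the wire-level calls; the model name is an example):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class OllamaEndpointsSketch {

    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();

        // GET /api/tags lists the locally available models
        HttpResponse<String> models = http.send(
                HttpRequest.newBuilder(URI.create("http://localhost:11434/api/tags")).GET().build(),
                HttpResponse.BodyHandlers.ofString());

        // POST /api/show returns the details of a single model
        HttpResponse<String> details = http.send(
                HttpRequest.newBuilder(URI.create("http://localhost:11434/api/show"))
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString("{\"name\": \"orca-mini\"}"))
                        .build(),
                HttpResponse.BodyHandlers.ofString());

        System.out.println(models.body());
        System.out.println(details.body());
    }
}
```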
### Motivation
In my research project, I'm using Langchain4j, as anyone should :)
From my research, it seems that this client code is in sync with the
Ollama API, and it is the easiest and most maintainable code. So, in my
project, I use Langchain4j, and it's backed by the Ollama provider. In
my use case, I need to be able to list models and, in the future, even create one. Is it possible to make the `OllamaClient` code public?
I wrote some thoughts about OpenAPI in Ollama in this issue:
https://github.com/jmorganca/ollama/issues/716#issuecomment-1904415711
So, if the Ollama developers consider adding an OpenAPI endpoint, I will be the first to make the `OllamaClient` package-private again.
This is a small prototype based on discussions originating from
https://github.com/ai-for-java/openai4j/pull/13
The approach I took here is to allow for decorating the models/builders with additional functionality without having to extend the model classes or builders. I did it for a single model in this prototype (`OpenAiChatModel`), but the pattern could be applied to all of the other models across Langchain4j.
That doesn't mean you couldn't extend the model classes if you wanted to use inheritance; I just prefer to avoid it and use composition instead.
I also added a test which shows how it would be used. Downstream libraries (like Spring Boot or Quarkus) could use this mechanism to extend/enhance the models with their own capabilities which aren't necessarily part of the model itself.
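To make the pattern concrete, here is a made-up sketch (the `LoggingChatModel` name and behavior are purely illustrative, not part of this PR's diff) of decorating a model via composition:

```java
import java.util.List;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.output.Response;

// Illustrative decorator: adds behavior around any ChatLanguageModel
// (e.g. an OpenAiChatModel) without extending its class or builder.
class LoggingChatModel implements ChatLanguageModel {

    private final ChatLanguageModel delegate;

    LoggingChatModel(ChatLanguageModel delegate) {
        this.delegate = delegate;
    }

    @Override
    public Response<AiMessage> generate(List<ChatMessage> messages) {
        System.out.println("Sending " + messages.size() + " message(s)");
        Response<AiMessage> response = delegate.generate(messages);
        System.out.println("Token usage: " + response.tokenUsage());
        return response;
    }
}
```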
Let me know what you think @geoand / @langchain4j !
Happy to continue conversation and see where we can bring this!
- added `OllamaStreamingChatModel`
- added a `format` parameter to all models; you can now get valid JSON with `format="json"`
- added `top_k`, `top_p`, `repeat_penalty`, `seed`, `num_predict`, and `stop` parameters to all models
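A minimal sketch of the new knobs on the streaming model, assuming builder setters named after the parameters above (base URL, model name, and values are illustrative):

```java
import java.util.List;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.StreamingResponseHandler;
import dev.langchain4j.model.ollama.OllamaStreamingChatModel;

OllamaStreamingChatModel model = OllamaStreamingChatModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("orca-mini")
        .format("json")        // the response will now be valid JSON
        .topK(40)
        .topP(0.9)
        .repeatPenalty(1.1)
        .seed(42)
        .numPredict(128)
        .stop(List.of("\n\n"))
        .build();

model.generate("List three colors as a JSON array.", new StreamingResponseHandler<AiMessage>() {

    @Override
    public void onNext(String token) {
        System.out.print(token);
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});
```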
A new implementation of `ChatLanguageModel`, `OllamaChatModel`, is added to handle interactions with Ollama, together with an associated integration test. It includes the necessary configuration and methods for message generation. This increases the project's modularity and provides a more convenient, encapsulated way of interfacing with Ollama.
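As a quick illustration of the intended usage (the base URL, model name, and prompt are assumptions):

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.ollama.OllamaChatModel;

ChatLanguageModel model = OllamaChatModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("orca-mini")
        .build();

String answer = model.generate("Why is the sky blue?");
```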
Currently, integration tests in the Ollama module are disabled because they need a running Ollama instance in order to execute. Testcontainers provides this infrastructure: it not only runs the Ollama container but also automates the model-pull step.
This commit uses the Singleton Container approach to reuse a single instance across multiple integration tests. The pull step is only executed when the image is `ollama/ollama`.
The behavior is as follows:
1st execution:
1. Pull `ollama/ollama` image
2. Start the container based on `ollama/ollama` image
3. Download the `orca-mini` model
4. Create an image based on the current state (with the model in it)
5. Declare the container ready to use
6. Run test
Next executions:
1. Look for the local image created in the 1st execution
2. Start the container based on the local image
3. Declare the container ready to use
4. Run test
The 1st execution is expected to take longer because of the model download (~3 GB). Subsequent executions are much faster.
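A rough sketch of how this singleton pattern might look, using Testcontainers' `GenericContainer` plus the underlying docker-java client to snapshot the container (the local image name, model name, and error handling are illustrative; the actual implementation is in this PR's diff):

```java
import org.testcontainers.DockerClientFactory;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;

abstract class AbstractOllamaIT {

    private static final String BASE_IMAGE = "ollama/ollama";
    private static final String CACHED_IMAGE = "tc-ollama-orca-mini:latest";
    private static final String IMAGE = imageToUse();

    // Singleton Container: started once, shared by all ITs in the module.
    static final GenericContainer<?> OLLAMA =
            new GenericContainer<>(DockerImageName.parse(IMAGE)).withExposedPorts(11434);

    static {
        OLLAMA.start();
        if (BASE_IMAGE.equals(IMAGE)) {
            try {
                // 1st execution: download the model, then snapshot the
                // container state into a reusable local image.
                OLLAMA.execInContainer("ollama", "pull", "orca-mini");
                DockerClientFactory.instance().client()
                        .commitCmd(OLLAMA.getContainerId())
                        .withRepository("tc-ollama-orca-mini")
                        .withTag("latest")
                        .exec();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }

    private static String imageToUse() {
        // Next executions: reuse the local snapshot created by the 1st run.
        boolean cached = !DockerClientFactory.instance().client()
                .listImagesCmd()
                .withImageNameFilter(CACHED_IMAGE)
                .exec()
                .isEmpty();
        return cached ? CACHED_IMAGE : BASE_IMAGE;
    }
}
```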
@langchain4j Hi! I've made some progress integrating with Ollama, see #244. I'm using `retrofit` to define the REST API from the [Ollama doc](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion). Local tests need a local deployment of `Ollama`. Looking forward to your code review!
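For illustration, a minimal Retrofit binding for the documented completion endpoint (`CompletionRequest`/`CompletionResponse` are placeholder types for the request/response fields in the Ollama doc, not the actual classes in #244):

```java
import retrofit2.Call;
import retrofit2.Retrofit;
import retrofit2.converter.gson.GsonConverterFactory;
import retrofit2.http.Body;
import retrofit2.http.POST;

// POST /api/generate, per the Ollama API doc linked above.
interface OllamaApi {

    @POST("api/generate")
    Call<CompletionResponse> completion(@Body CompletionRequest request);
}

// Wiring it up against a local Ollama deployment:
OllamaApi api = new Retrofit.Builder()
        .baseUrl("http://localhost:11434/")
        .addConverterFactory(GsonConverterFactory.create())
        .build()
        .create(OllamaApi.class);
```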
---------
Co-authored-by: Heezer <33568148+Heezer@users.noreply.github.com>