Oracle Database Embedding Store

This module implements EmbeddingStore using Oracle Database.

Requirements

  • Oracle Database 23.4 or newer

Installation

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-oracle</artifactId>
    <version>0.1.0</version>
</dependency>

Usage

Instances of this store are created by configuring a builder. The builder requires a DataSource and an embedding table. The distance between two vectors is calculated using cosine similarity, which measures the cosine of the angle between them.
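To make the similarity measure concrete, the following plain-Java sketch computes the cosine of the angle between two vectors. It is not part of this module: the class and method names are hypothetical, and it only illustrates the measure itself.

```java
// Illustrative sketch only; CosineSimilarityExample is a hypothetical name,
// not a class provided by langchain4j-oracle.
final class CosineSimilarityExample {

    /** Returns dot(a, b) / (|a| * |b|), the cosine of the angle between a and b. */
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Vectors pointing the same way score close to 1, orthogonal vectors score 0.
        System.out.println(cosineSimilarity(new float[] {1f, 2f}, new float[] {2f, 4f}));
        System.out.println(cosineSimilarity(new float[] {1f, 0f}, new float[] {0f, 1f}));
    }
}
```

The result depends only on direction, not magnitude, so scaling an embedding does not change its score.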

It is recommended to configure a DataSource that pools connections, such as the Universal Connection Pool or Hikari. A connection pool avoids the latency of repeatedly creating new database connections.

If an embedding table already exists in your database, provide the table name.

EmbeddingStore<TextSegment> embeddingStore = OracleEmbeddingStore.builder()
   .dataSource(myDataSource)
   .embeddingTable("my_embedding_table")
   .build();

If the table does not already exist, it can be created by passing a CreateOption to the builder.

EmbeddingStore<TextSegment> embeddingStore = OracleEmbeddingStore.builder()
   .dataSource(myDataSource)
   .embeddingTable("my_embedding_table", CreateOption.CREATE_IF_NOT_EXISTS)
   .build();

By default, the embedding table has the following columns:

Name       Type                Description
id         VARCHAR(36)         Primary key. Stores UUID strings generated by the embedding store
embedding  VECTOR(*, FLOAT32)  Stores the embedding
text       CLOB                Stores the text segment
metadata   JSON                Stores the metadata
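As a rough illustration of that layout, the sketch below assembles a CREATE TABLE statement from the column names and types listed above. The class and method names are hypothetical, and the statement the module actually issues may differ; this only shows how the documented columns map onto a table definition.

```java
// Illustrative sketch only; EmbeddingTableDdlSketch is a hypothetical helper,
// and the DDL produced here is an assumption, not the module's actual SQL.
final class EmbeddingTableDdlSketch {

    /**
     * Builds a CREATE TABLE statement using the documented column types.
     * Identifiers are concatenated as-is, so they must come from trusted input.
     */
    static String createTableIfNotExists(
            String table,
            String idColumn,
            String embeddingColumn,
            String textColumn,
            String metadataColumn) {
        return "CREATE TABLE IF NOT EXISTS " + table + " ("
                + idColumn + " VARCHAR(36) NOT NULL, "
                + embeddingColumn + " VECTOR(*, FLOAT32), "
                + textColumn + " CLOB, "
                + metadataColumn + " JSON, "
                + "PRIMARY KEY (" + idColumn + "))";
    }

    public static void main(String[] args) {
        // Default column names from the table above.
        System.out.println(createTableIfNotExists(
                "my_embedding_table", "id", "embedding", "text", "metadata"));
    }
}
```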

If the columns of your existing table do not match the predefined column names, or you would like to use different column names, you can use an EmbeddingTable builder to configure your embedding table.

OracleEmbeddingStore embeddingStore = OracleEmbeddingStore.builder()
    .dataSource(myDataSource)
    .embeddingTable(EmbeddingTable.builder()
        .createOption(CreateOption.CREATE_OR_REPLACE) // use CreateOption.CREATE_NONE if the table already exists
        .name("my_embedding_table")
        .idColumn("id_column_name")
        .embeddingColumn("embedding_column_name")
        .textColumn("text_column_name")
        .metadataColumn("metadata_column_name")
        .build())
    .build();

The builder provides two other methods: one creates an index on the embedding column, and the other configures the use of exact or approximate search.
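To illustrate what "exact" means here, the following plain-Java sketch performs an exact search by scoring the query against every stored vector and keeping the best matches. An approximate search instead relies on an index to avoid scanning everything, trading some accuracy for speed. The class and method names are hypothetical and the code is unrelated to how the database executes a search.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; ExactSearchSketch is a hypothetical name.
final class ExactSearchSketch {

    /** Cosine of the angle between a and b (both non-zero, same length). */
    static double cosine(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indices of the k stored vectors most similar to the query, best first. */
    static List<Integer> topK(float[] query, List<float[]> stored, int k) {
        double[] scores = new double[stored.size()];
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            scores[i] = cosine(query, stored.get(i)); // every vector is compared: exact search
            indices.add(i);
        }
        indices.sort(Comparator.comparingDouble((Integer i) -> scores[i]).reversed());
        return new ArrayList<>(indices.subList(0, Math.min(k, indices.size())));
    }

    public static void main(String[] args) {
        List<float[]> stored = new ArrayList<>();
        stored.add(new float[] {0f, 1f});
        stored.add(new float[] {1f, 0f});
        stored.add(new float[] {1f, 1f});
        System.out.println(topK(new float[] {1f, 0f}, stored, 2));
    }
}
```

The cost of this approach grows linearly with the number of stored vectors, which is the motivation for indexing the embedding column when the table is large.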

For more information about Oracle AI Vector Search, refer to the Oracle Database documentation.

Running the Test Suite

By default, the integration tests run a Docker image of Oracle Database using Testcontainers. Alternatively, the tests can connect to an existing Oracle Database if the following environment variables are configured:

  • ORACLE_JDBC_URL : Set to an Oracle JDBC URL, such as jdbc:oracle:thin:@example:1521/serviceName
  • ORACLE_JDBC_USER : Set to the name of a database user. (Optional)
  • ORACLE_JDBC_PASSWORD : Set to the password of a database user. (Optional)
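
The selection between the two modes can be sketched as follows. This is a minimal illustration, not the module's test code: the class and field names are hypothetical, and only the three environment variable names come from the list above.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only; TestDatabaseConfigSketch is a hypothetical name.
final class TestDatabaseConfigSketch {

    /** Connection settings for an externally managed database. */
    static final class ExternalDatabase {
        final String url;
        final String user;     // null when ORACLE_JDBC_USER is not set
        final String password; // null when ORACLE_JDBC_PASSWORD is not set

        ExternalDatabase(String url, String user, String password) {
            this.url = url;
            this.user = user;
            this.password = password;
        }
    }

    /**
     * Returns the external database settings when ORACLE_JDBC_URL is set,
     * or an empty Optional to signal that a container should be started instead.
     */
    static Optional<ExternalDatabase> fromEnvironment(Map<String, String> env) {
        String url = env.get("ORACLE_JDBC_URL");
        if (url == null || url.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new ExternalDatabase(
                url, env.get("ORACLE_JDBC_USER"), env.get("ORACLE_JDBC_PASSWORD")));
    }

    public static void main(String[] args) {
        String mode = fromEnvironment(System.getenv())
                .map(db -> "external database at " + db.url)
                .orElse("container-managed database");
        System.out.println("Tests would use: " + mode);
    }
}
```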