Commit Graph

9181 Commits

Author SHA1 Message Date
meili-bors[bot] a96b45dda7
Merge #4451
4451: Fix nightly build r=dureuill a=dureuill

# Pull Request

## Related issue
Fixes #4441 

## What does this PR do?
- Change imports following https://github.com/rust-lang/rust/pull/117772

## Note

This one is going to be annoying a bit until the lint stabilizes:

- We only get the warning on nightly, so we will discover them when it runs in the CI that uses the nightly compiler (not on regular PRs)
- There's the case of `TryInto`/`TryFrom` traits. They have been added to the prelude in Rust edition 2021, so it means that `use`ing them is a warning on nightly for 2021 edition crates (most crates), but not `use`ing them is an error anywhere for 2018 Rust edition crates, such as `milli`

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-29 07:20:22 +00:00
Louis Dureuil 452a343a2b
Fix imports 2024-02-28 18:09:40 +01:00
meili-bors[bot] b87485e80d
Merge #4433
4433: Enhance facet incremental r=Kerollmops a=ManyTheFish

# Pull Request

## Related issue
Fixes #4367
Fixes #4409

## What does this PR do?

- Add a test reproducing #4409
- Fix #4409 by removing a document from a level only if it is no more present in all the linked sub-level nodes
- Optimize facet Incremental indexing by creating or deleting a complete level once per field id instead of for each facet value
- Optimize facet Incremental indexing by doing the additions and the deletions in the same process instead of doing them separately


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-28 15:28:46 +00:00
meili-bors[bot] 147a67dc82
Merge #4446
4446: Do not omit vectors when importing a dump r=irevoire a=dureuill

# Pull Request

## Related issue
Fixes #4447 

## What does this PR do?
- Correctly populate the maps of embedders before starting the indexing operations, while importing a dump


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-27 09:11:00 +00:00
Louis Dureuil 716ffc07ee
Build the embedders when importing a dump 2024-02-26 22:15:57 +01:00
meili-bors[bot] b005eb3289
Merge #4435
4435: Make update file deletion atomic r=Kerollmops a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4432
Fixes https://github.com/meilisearch/meilisearch/issues/4438 by adding the logs the user asked

## What does this PR do?
- Adds a bunch of logs to help debug this kind of issue in the future
- Delete the update files AFTER committing the update in the `index-scheduler` (thus, if a restart happens, we are able to re-process the batch successfully)
- Multi-thread the deletion of all update files.


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 17:54:40 +00:00
meili-bors[bot] 9e664d87eb
Merge #4443
4443: Add GPU analytics r=dureuill a=dureuill

# Pull Request

## Related issue

Adds analytics indicating whether Meilisearch  was compiled with the `milli/cuda` feature.

Cc `@macraig` 

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-26 17:13:45 +00:00
meili-bors[bot] 6dcb5219a0
Merge #4442
4442: Send custom task r=ManyTheFish a=irevoire

This PR has already been merged on main but was supposed to be merged on `release-v1.7.0` thus we need to merge it a second time; sorry 😓 

### This PR implements the necessary parameters for the High Availability

Introduce a new CLI flag called `--experimental-replication-parameters` that changes a few behaviors in the engine:
- [The auto-deletion of tasks is disabled](https://specs.meilisearch.com/specifications/text/0060-tasks-api.html#_2-technical-details)
- Upon registering a task, you can choose its task ID by sending a new header: `TaskId: 456645`. It must be a valid number, which must be superior to the last task id ever seen.
- Add the ability to « dry-register » a task. That means meilisearch will answer to you with a valid task ID like everything went well, but won’t actually write anything in the database. To do that, you need to use the `DryRun: true` header.
- Specification’s here: https://github.com/meilisearch/specifications/pull/266

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 15:20:16 +00:00
ManyTheFish 5e83bac448 Fix PR comments 2024-02-26 15:40:15 +01:00
Tamo 0562818c2a fix and remove the file-store hack of /dev/null 2024-02-26 13:59:41 +01:00
Tamo a478392b7a create a test with the dry-run parameter enabled 2024-02-26 13:59:41 +01:00
Tamo bbf3fb88ca rename the cli parameter 2024-02-26 13:59:40 +01:00
Tamo 60510e037b update the discussion link 2024-02-26 13:58:04 +01:00
Tamo 36c27a18a1 implement the dry run ha parameter 2024-02-26 13:58:04 +01:00
Tamo 1eb1c043b5 disable the auto deletion of tasks when the ha mode is enabled 2024-02-26 13:58:04 +01:00
Tamo 507739bd98 add an experimental cli parameter to allow specifying your task id 2024-02-26 13:58:03 +01:00
Tamo eb25b07390 let you specify your task id 2024-02-26 13:56:31 +01:00
meili-bors[bot] 938149f814
Merge #4042
4042: Implements the new replication parameters r=ManyTheFish a=irevoire

### This PR implements the necessary parameters for the High Availability

- [ ] Update the spec

Introduce a new CLI flag called `--experimental-replication-parameters` that changes a few behaviors in the engine:
- [The auto-deletion of tasks is disabled](https://specs.meilisearch.com/specifications/text/0060-tasks-api.html#_2-technical-details)
- Upon registering a task, you can choose its task ID by sending a new header: `TaskId: 456645`. It must be a valid number, which must be superior to the last task id ever seen.
- Add the ability to « dry-register » a task. That means meilisearch will answer to you with a valid task ID like everything went well, but won’t actually write anything in the database. To do that, you need to use the `DryRun: true` header.

----

Old prototype `prototype-custom-task-id-0`:
-  Adds the capability to specify your own task ID via the `TaskId` http header
- Make the task IDs a u64 instead of a u32


Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-26 11:37:34 +00:00
Tamo 066a7a3cde takes only one read transaction per thread 2024-02-26 10:43:04 +01:00
Louis Dureuil 55796406c5
Add GPU analytics 2024-02-26 10:41:47 +01:00
Tamo eb90f0b4fb fix and remove the file-store hack of /dev/null 2024-02-26 10:19:07 +01:00
Tamo c2e2003a80 create a test with the dry-run parameter enabled 2024-02-22 15:51:47 +01:00
Tamo 91cdd502f8 When processing tasks, make the update file deletion atomic 2024-02-22 14:56:22 +01:00
ManyTheFish a493a50825 Fix clippy 2024-02-22 14:53:33 +01:00
ManyTheFish 9d1f489a37 Fix facet incremental indexing 2024-02-21 18:42:16 +01:00
Tamo 693ba8dd15 rename the cli parameter 2024-02-21 14:33:40 +01:00
Tamo e1a3eed1eb update the discussion link 2024-02-21 12:30:28 +01:00
Tamo 05ae291989 implement the dry run ha parameter 2024-02-21 11:21:26 +01:00
Tamo 6ba9994916 disable the auto deletion of tasks when the ha mode is enabled 2024-02-20 12:23:39 +01:00
Tamo 01ae46dd80 add an experimental cli parameter to allow specifying your task id 2024-02-20 11:24:44 +01:00
meili-bors[bot] 12f5389ba7
Merge #4416
4416: Create automation when creating Milestone to create update-version issue r=curquiza a=curquiza

Following our discussion `@irevoire` -> we miss reminder to update cargo version BEFORE rc0

Issue template [here](https://github.com/meilisearch/engine-team/blob/main/issue-templates/update-version-issue.md)

Co-authored-by: curquiza <clementine@meilisearch.com>
2024-02-20 08:47:29 +00:00
Tamo 9ee4f55e6c let you specify your task id 2024-02-19 14:29:33 +01:00
ManyTheFish 865b415b3f Add test rerpoducing bug 2024-02-15 16:00:48 +01:00
meili-bors[bot] 5ee6aaddc4
Merge #4418
4418: Output logs to stderr r=dureuill a=irevoire

Output the logs to `stderr` instead of `stdout`. This was introduced in the `v1.7.0-rc.0` and is a bug; logs should always be outputted to stderr.

Fix #4419

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-15 14:31:37 +00:00
Tamo 4148d391b8 move logs to stderr 2024-02-15 15:24:16 +01:00
meili-bors[bot] 88c6165e20
Merge #4410
4410: Implement the experimental log mode cli flag and log level updates at runtime r=dureuill a=irevoire

# Pull Request
This PR fixes two issues at once because they’re highly correlated in the codebase.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4415
Fixes https://github.com/meilisearch/meilisearch/issues/4413

## What does this PR do?
- It makes the fmt logger configurable to output json or human-readable logs (like we already do today)
- It moves the fmt logger under a `reload` layer so we can update its targets at runtime
- Add the possibility to stream logs in the json mode
- Adds an analytics for the new CLI flag

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-02-15 10:01:06 +00:00
Tamo d097431113
Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-15 10:58:43 +01:00
Tamo 1f8af81ba9 update the log mode discussion link 2024-02-15 10:32:48 +01:00
Tamo 5d3bad4120
Update meilisearch/src/option.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-02-15 10:31:23 +01:00
meili-bors[bot] d34692e30b
Merge #4365
4365: Update charabia r=dureuill a=ManyTheFish

Update Charabia v0.8.7,

- Add Vietnamese Normalization (Ð and Đ into d)

Fixes #4357

Charabia versions:
- https://github.com/meilisearch/charabia/releases/tag/v0.8.6
- https://github.com/meilisearch/charabia/releases/tag/v0.8.7

Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-02-14 16:57:25 +00:00
curquiza 024de0dcf8 Create automation when creating Milestone to create update-version issue 2024-02-14 17:36:47 +01:00
Tamo a081da0d90 add support for the json format in the stream route 2024-02-14 15:34:39 +01:00
ManyTheFish 78e04520fc Update charabia version 2024-02-14 15:16:16 +01:00
meili-bors[bot] 72c1674a31
Merge #4350
4350: Make several indexing optimizations r=Kerollmops a=ManyTheFish

# Summary

Implement several enhancements to reduce the indexing time.

# Steps

- Compute the indexing chunk size dynamically based on the available threads and the data size
- Remove the merging step before the writing step and merge at the writing time
- Remove append function
- Make Facet search indexing incremental

# Running Indexing process

## `main`
Each type of data is written after a merging phase:
![Capture d’écran 2024-01-23 à 10 18 08](https://github.com/meilisearch/meilisearch/assets/6482087/6203c3ce-407c-46b4-8b83-04282da1bb16)

> Highlighted parts are the writings

## `remove-merging-phase-from-indexing`
When the extraction of a chunk is finished, the data is written:
![Capture d’écran 2024-01-23 à 10 18 18](https://github.com/meilisearch/meilisearch/assets/6482087/ab1307b4-d0a9-42ac-abbb-fdeb27ddf0d4)

> Highlighted parts are the writings

## Related

This PR removes the appending writes on several indexing parts, which may fix https://github.com/meilisearch/meilisearch/issues/4300. However, all of the appending writes are not removed. There are 2 remaining calls that could trigger this bug:
- When [putting embedders in the settings](b6fc181993/milli/src/update/settings.rs (L996))
- when [bulk indexing the facets](b6fc181993/milli/src/update/facet/bulk.rs (L150))


Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-02-14 14:12:48 +00:00
ManyTheFish 03bb6372af Change is_batchable_with by mergeable_with 2024-02-14 11:50:22 +01:00
ManyTheFish 3beda8833d Fix and add logs 2024-02-14 11:46:30 +01:00
Tamo 3b6544db6d Implement the experimental log mode cli flag 2024-02-13 18:09:15 +01:00
ManyTheFish 55e942cd45 buggy 2024-02-13 15:26:30 +01:00
ManyTheFish 48026aa75c fix PR comments 2024-02-13 15:19:01 +01:00
Many the fish e5e811e2c9
Update milli/src/update/index_documents/extract/mod.rs
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-13 14:22:21 +01:00