- create a snapshot as well as a dump
- only detect inconsistencies in the facet -> document direction
- mark index as corrupted after creating snapshot and dump
- always abort tasks on indexes marked as corrupted
4746: Fix hybrid search limit offset r=irevoire a=dureuill
# Pull Request
## Related issue
Fixes#4745
## What does this PR do?
- Apply offset and limit to the keyword search results when they are returned early.
- Add a test that is initially failing, and then passes
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
4740: Make `embeddings` optional and improve error message for `regenerate` r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4741
## What does this PR do?
- Make the `embeddings` parameter optional when manually specifying embeddings for an embedder
- Adds a lot of tests around malformed `_vectors.embedder` objects
- Use `deserr` to deserialize the `_vectors.embedder` field, improving error messages
Co-authored-by: Tamo <tamo@meilisearch.com>
4715: Build all arroy indexes that need to be built r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4588
## What does this PR do?
- Update arroy
- Ensure we always rebuild the arroy indexes that need to be built
Co-authored-by: Tamo <tamo@meilisearch.com>
4713: Speed up facet distribution r=ManyTheFish a=Kerollmops
This PR is akin to #4682, but this time, the same logic is applied to the facets. Bitmaps are not decoded, and we do an intersection on the bytes with the search candidates instead of materializing the RoaringBitmap to destroy it just after the operation.
A prospect raised some slow requests when performing facet searches, and I found out that the disk optimization intersection wasn't performed on the facets.
Co-authored-by: Clément Renault <clement@meilisearch.com>
4693: Introduce distinct attributes at search time r=irevoire a=Kerollmops
This PR fixes#4611.
### To Do
- [x] Remove the `distinguishableAttributes` settings (not even a commit about that).
- [x] Use the `filterableAttributes` to be able to use the `distinct` parameter at search.
- [x] Work on the errors and make tests.
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
4649: Don't store the vectors in the documents database r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4607
## What does this PR do?
- Ensure that anything falling under `_vectors` is NOT searchable, filterable or sortable
- [x] per embedder, add a roaring bitmap of documents that provide "userProvided" embeddings
- [x] in the indexing process in extract_vector_points, set the bit corresponding to the document depending on the "userProvided" subfield in the _vectors field.
- [x] in the document DB in typed chunks, when writing the _vectors field, remove all keys corresponding to an embedder
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
- when the feature is disabled, documents are never modified
- when the feature is enabled and `retrieveVectors` is disabled, `_vectors` is removed from documents
- when the feature is enabled and `retrieveVectors` is enabled, vectors from the vectors DB are merged with `_vectors` in documents
Additionally `_vectors` is never displayed when the `displayedAttributes` list does not contain either `*` or `_vectors`
- fixed an issue where `_vectors` was not injected when all vectors in the dataset where always generated
4685: Fix ci tests r=dureuill a=ManyTheFish
# Pull Request
Make the all following CI succeed:
https://github.com/meilisearch/meilisearch/actions/runs/9477183091
## Related issue
Fixes#4629
## What does this PR do?
- Change the test behavior for `swedish-recomposition` feature flag
- Remove the `-v` parameter from grep
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>