smart search: index existing courses when feature enabled

currently only courses updated in the past year are
indexed, to avoid doing a lot of unnecessary work on content
nobody is likely to search. this time period can be adjusted
with the `smart_search_index_days_ago` Setting (a number of
days; the default is 365).
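
as a rough illustration of how the cutoff works, here is a
plain-ruby sketch that runs without canvas or ActiveSupport.
`index_cutoff` and `eligible?` are hypothetical helper names,
not part of this change, and the fixed 86400-second day only
approximates what `.days.ago` does in the real code.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper: approximates
#   Setting.get("smart_search_index_days_ago", "365").to_i.days.ago
# using core Ruby only. `setting_value` is the raw string stored in the Setting.
def index_cutoff(setting_value = "365", now: Time.now)
  now - (setting_value.to_i * SECONDS_PER_DAY)
end

# Hypothetical helper: a course is eligible when it was last updated
# at or after the cutoff (mirrors `where(updated_at: date_cutoff..)`).
def eligible?(updated_at, cutoff)
  updated_at >= cutoff
end
```

lowering the Setting value narrows the window, so fewer
courses are picked up when the flag is enabled.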

test plan:
 - have a root account that has courses with page content
 - enable the smart search feature flag in the account
   (or toggle it off and on if it was on before)
 - an `OpenAi.index_course` job should be enqueued for
   each eligible course in the account, which will
   generate embeddings for all non-deleted pages that
   have not already been indexed
 - you can use `smart_search_index_course_num_strands`
   to allow multiple courses to be indexed at once
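
to see why toggling the flag off and on re-triggers indexing,
here is a standalone sketch of the state check the hook
performs. `triggers_indexing?` and the two constants are
hypothetical names used only for illustration; the state
strings are the ones that appear in the hook in this change.

```ruby
# State strings copied from the hook; constant names are illustrative.
INACTIVE_STATES = %w[off allowed].freeze
ACTIVE_STATES = %w[on allowed_on].freeze

# Hypothetical helper: true only when the flag moves from an inactive
# state to an active one on an account that is not site admin.
def triggers_indexing?(old_state, new_state, site_admin: false)
  INACTIVE_STATES.include?(old_state) &&
    ACTIVE_STATES.include?(new_state) &&
    !site_admin
end
```

a flag that is already on does not count as a transition,
which is why it has to be turned off first.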

flag=smart_search
closes ADV-28

Change-Id: I3b3a2359f53dff9a4e1c9b736e63d376b2b1098f
Reviewed-on: https://gerrit.instructure.com/c/canvas-lms/+/331648
Tested-by: Service Cloud Jenkins <svc.cloudjenkins@instructure.com>
Reviewed-by: Jonathan Featherstone <jfeatherstone@instructure.com>
QA-Review: Jonathan Featherstone <jfeatherstone@instructure.com>
Product-Review: Jonathan Featherstone <jfeatherstone@instructure.com>
Jeremy Stanley 2023-10-30 15:42:02 -06:00
parent 9cbfb3b8be
commit 4dcf8b7d06
3 changed files with 31 additions and 1 deletions

@@ -4,4 +4,4 @@ smart_search:
  applies_to: RootAccount
  display_name: Smart Search
  description: AI driven smart search feature.
  after_state_change_proc: smart_search_after_state_change_hook

@@ -107,5 +107,11 @@ module FeatureFlags
        )
      end
    end

    def self.smart_search_after_state_change_hook(_user, context, old_state, new_state)
      if %w[off allowed].include?(old_state) && %w[on allowed_on].include?(new_state) && !context.site_admin?
        OpenAi.index_account(context)
      end
    end
  end
end

@@ -66,5 +66,29 @@ module OpenAi
      vector_schema = ActiveRecord::Base.connection.extension("vector").schema
      ActiveRecord::Base.connection.add_schema_to_search_path(vector_schema, &)
    end

    def index_account(root_account)
      return unless smart_search_available?(root_account)

      # by default, index all courses updated in the last year
      date_cutoff = Setting.get("smart_search_index_days_ago", "365").to_i.days.ago
      root_account.all_courses.active.where(updated_at: date_cutoff..).find_each do |course|
        delay(priority: Delayed::LOW_PRIORITY,
              singleton: "smart_search_index_course:#{course.global_id}",
              n_strand: "smart_search_index_course").index_course(course)
      end
    end
    handle_asynchronously :index_account, priority: Delayed::LOW_PRIORITY

    def index_course(course)
      return unless smart_search_available?(course.root_account)

      # index non-deleted pages (that have not already been indexed)
      course.wiki_pages.not_deleted
            .where.missing(:wiki_page_embeddings)
            .find_each do |page|
        page.generate_embeddings(synchronous: true)
      end
    end
  end
end