Occasionally the SQS queue that we pipe our feature analytics data
to backs up, and the timeout_protection call we wrap the send with
reports those timeouts to Sentry.

As long as the pipeline is generally working, we are fault tolerant
and don't care much about missing messages during an occasional
backup. This change stops those timeouts from spamming Sentry.
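A minimal sketch of the suppression, assuming a plain Ruby timeout wrapper (the real Canvas `timeout_protection` helper has more options; the method name and SQS client here are illustrative):

```ruby
require "timeout"

# Sketch: wrap the SQS send in a short timeout and swallow the timeout
# quietly instead of reporting it to Sentry, since occasional queue
# backup is tolerable for this pipeline.
def send_with_timeout(sqs, payload, seconds: 0.5)
  Timeout.timeout(seconds) { sqs.send_message(payload) }
  true
rescue Timeout::Error
  # Previously this error was captured to Sentry; now it is dropped.
  false
end
```

The message is simply lost on timeout, which is acceptable here because the analytics data is sampled and best-effort anyway.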
flag = none
refs DE-939
Change-Id: I4110af160dcfe68120b71cc877f417203cca5575
Reviewed-on: https://gerrit.instructure.com/c/canvas-lms/+/281445
Tested-by: Service Cloud Jenkins <svc.cloudjenkins@instructure.com>
Reviewed-by: Isaac Moore <isaac.moore@instructure.com>
QA-Review: Ryan Norton <rnorton@instructure.com>
Product-Review: Ryan Norton <rnorton@instructure.com>
This PS adds logic to compute context about a feature flag at the
time it is evaluated and, if configured, push that result to an SQS
queue. Whether to push the result of an evaluation to the queue is
both sampled and cached so as not to overwhelm the queue or add too
much overhead on the Canvas side. The general process is:
1. Feature flag is evaluated
2. A random value between 0 and 1 is compared against a threshold
stored in the dynamic setting feature_analytics.sampling_rate.
3. If under the threshold, the LocalCache is checked for any recent
evaluations of the same flag / context / value combo.
4. If the result isn't already cached, the message is sent to SQS.
5. If anything unexpectedly goes wrong, an error is sent to Sentry and
feature evaluation continues normally (no user-facing error).
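The steps above can be sketched roughly as follows. This is not the actual Canvas code: the class name, the plain-Hash stand-in for LocalCache, and the duck-typed SQS client are all illustrative assumptions.

```ruby
require "digest"
require "json"

# Sketch of steps 1-5: sample, dedupe via a local cache, then enqueue.
# The "SQS client" is anything responding to send_message; errors are
# swallowed so flag evaluation always continues (step 5).
class FeatureAnalyticsSketch
  def initialize(sampling_rate:, sqs:, rng: Random.new)
    @sampling_rate = sampling_rate
    @sqs = sqs
    @rng = rng
    @cache = {} # stands in for LocalCache; real code would expire entries
  end

  # Called after a flag is evaluated; returns a symbol describing the outcome.
  def track(feature, context, value)
    # Step 2: sample against the configured threshold.
    return :skipped_sampling unless @rng.rand < @sampling_rate

    # Step 3: check for a recent identical flag/context/value evaluation.
    key = Digest::SHA256.hexdigest("#{feature}:#{context}:#{value}")
    return :skipped_cached if @cache[key]

    # Step 4: not cached, so record it and send the message to SQS.
    @cache[key] = true
    @sqs.send_message(JSON.generate(feature: feature, context: context, value: value))
    :sent
  rescue StandardError
    # Step 5: real code reports to Sentry; evaluation proceeds regardless.
    :errored
  end
end
```

With sampling_rate at 1.0 every first evaluation of a combo is sent and repeats are deduped; at 0.0 nothing is ever sent, matching the test plan below.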
refs DE-926
flag = none
Test plan:
- Set up config in dynamic_settings.yml like the following:

    feature_analytics:
      cache_expiry: '60'
      sampling_rate: '1.0'
      queue_url: <canvas-feature-analytics SQS queue url in inseng>
      region: 'us-west-2'
- Start canvas from a vaulted shell with inseng credentials
- Go to a user's dashboard, a few course pages, etc.
- Expect to see messages show up in the SQS queue / its consumers
after a minute or two delay
- Expect to not see the same feature / context / value combo show up
on repeated evaluations
- Update the sampling_rate in dynamic_settings.yml to '0.0' and
restart canvas
- Flush redis cache
- Go to the same pages as before
- Expect to not see any more evaluations show up in SQS
Change-Id: I189e127dd5f31dc5d504bf45d354e93f7208107a
Reviewed-on: https://gerrit.instructure.com/c/canvas-lms/+/278885
Tested-by: Service Cloud Jenkins <svc.cloudjenkins@instructure.com>
Reviewed-by: Simon Williams <simon@instructure.com>
QA-Review: Jeff Largent <jeff.largent@instructure.com>
Product-Review: Jeff Largent <jeff.largent@instructure.com>