closes CNVS-6016
No more error reports! (soon)
this commit builds up sentry integration through the new
Canvas::Errors module, along with other things that need
to happen on every exception. ErrorReports
should now get pushed towards just being used for representing
a complaint a user filed via the get help form.
I fixed about half the things that got linted as well
while I was in here, but because this touches too much
I fear divergence from tackling too many (I think we
can safely say it's "better than we found it")
I left a lot of the infrastructure for error reports in place
until other commits for plugins can be merged
TEST PLAN:
1) set up your raven.yml config file with the dsn for our
sentry install
2) force an error to happen in a request response cycle.
3) see the error in sentry
4) force an error to happen in a job
5) see the error in sentry
6) statsd increments should still fire
7) for the moment, an error report should still get created.
Change-Id: I5a9dc7214598f8d5083451fd15f0423f8f939034
Reviewed-on: https://gerrit.instructure.com/51621
Reviewed-by: Simon Williams <simon@instructure.com>
Reviewed-by: Brian Palmer <brianp@instructure.com>
Tested-by: Jenkins
QA-Review: August Thornton <august@instructure.com>
Product-Review: Ethan Vizitei <evizitei@instructure.com>
refs CNVS-15570
test plan: trigger an ignored redis error, such as running
`Canvas.redis.renamenx('no-such-key', 'new-key')` in console, and verify
that rails doesn't log a 'Failure handling redis' message, or send
anything to statsd.
Change-Id: Ie4480fcf053d626ba8dbdf26672b18815267fa86
Reviewed-on: https://gerrit.instructure.com/41805
Reviewed-by: Cody Cutrer <cody@instructure.com>
Tested-by: Jenkins <jenkins@instructure.com>
Product-Review: Brian Palmer <brianp@instructure.com>
QA-Review: Brian Palmer <brianp@instructure.com>
Matching on "ERR" isn't enough, as some normal operations such as
renaming a non-existent key return a CommandError with "ERR" in the
description.
fixes CNVS-15570
test plan:
* Create a new assignment or update the due date of an existing
assignment as a teacher. Make sure to publish the assignment.
* Go to the course dashboard / home page as both the student and
teacher. The new assignment should show up in the recent activity.
Change-Id: Ia3c48d2bdb7e0efefb40e97af90db472a7799953
Reviewed-on: https://gerrit.instructure.com/41752
Reviewed-by: Cody Cutrer <cody@instructure.com>
Tested-by: Jenkins <jenkins@instructure.com>
QA-Review: Amber Taniuchi <amber@instructure.com>
Product-Review: Brian Palmer <brianp@instructure.com>
Some command errors aren't application logic errors, but things such as
"max number of clients reached", and so should be treated like a
connection failure. Since it's the same exception class, the best we can
do is match by error message.
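
A minimal sketch of the message-matching idea described above, not the actual
patch. `FakeCommandError` stands in for `Redis::CommandError` so the snippet
runs without the redis gem, and the regexp, the `handle` helper and its return
symbols are assumptions made for illustration.

```ruby
# Stand-in for Redis::CommandError so this runs without the redis gem.
class FakeCommandError < StandardError; end

# Assumed pattern: the only message named in this commit is
# "max number of clients reached"; the real code may match differently.
CONNECTION_LIKE_MESSAGE = /max number of clients reached/

# True when a command error should be handled like a connection failure.
def connection_like_failure?(error)
  error.is_a?(FakeCommandError) && CONNECTION_LIKE_MESSAGE.match?(error.message)
end

# Hypothetical dispatcher: degrade gracefully on connection-like failures,
# re-raise everything else as an application logic error.
def handle(error)
  connection_like_failure?(error) ? :treat_as_connection_failure : :reraise
end
```

With this shape, `handle(FakeCommandError.new("ERR no such key"))` falls
through to `:reraise`, because the class alone does not decide the outcome.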
fixes CNVS-14979
test plan: to really test the failure case, you'll have to limit max
clients. i did this locally by:
* start up canvas
* edit the redis config to set maxclients to 1
* restart redis
* open a redis-cli and execute 'get x' or something to grab the 1 conn
* use the canvas web ui or console and verify that you don't get an
error, rather the cache is blacklisted
Change-Id: I64360d165575ab0ef54c9c6d08dec8aa1afebad4
Reviewed-on: https://gerrit.instructure.com/40051
Tested-by: Jenkins <jenkins@instructure.com>
QA-Review: August Thornton <august@instructure.com>
Reviewed-by: Jacob Fugal <jacob@instructure.com>
Product-Review: Brian Palmer <brianp@instructure.com>
test plan:
* configure a ring of redis cache servers
* make sure at least one is not accessible
* login to canvas
* it should not error
Change-Id: I46c464649438225f080f7097dbfc996260aea6cd
Reviewed-on: https://gerrit.instructure.com/28470
Tested-by: Jenkins <jenkins@instructure.com>
QA-Review: August Thornton <august@instructure.com>
Reviewed-by: Brian Palmer <brianp@instructure.com>
Product-Review: Cody Cutrer <cody@instructure.com>
now that we have SIGHUP, we were changing everything to it anyway,
so just let caching in-proc be the default
Change-Id: Id1b44722522ac9693b17695da7107c99a359d5ac
Reviewed-on: https://gerrit.instructure.com/25020
Reviewed-by: Cody Cutrer <cody@instructure.com>
Product-Review: Cody Cutrer <cody@instructure.com>
QA-Review: Cody Cutrer <cody@instructure.com>
Tested-by: Jenkins <jenkins@instructure.com>
The redis_name is usually a fqdn, which was creating very annoying
folder structures
Change-Id: If8aecd3a523673321406fedc8769a92eee6d1a78
Reviewed-on: https://gerrit.instructure.com/22744
Tested-by: Jenkins <jenkins@instructure.com>
Reviewed-by: Cody Cutrer <cody@instructure.com>
Product-Review: Brian Palmer <brianp@instructure.com>
QA-Review: Brian Palmer <brianp@instructure.com>
fixes CNVS-7021
test plan:
* have two separate redis servers (one being localhost and one being
something that doesn't exist is sufficient) configured in
cache_store.yml
* make sure one is inaccessible (i.e. it doesn't exist)
* run canvas. always reload every page. inspect your logs - on the
second request, approximately half of the cache lines should be
a cache hit, and half a cache miss
* you can be more fine grained by doing Rails.cache.write('key',
true); Rails.cache.fetch('key') in script/console for different
keys. Half of the time it should return true, and half of the time
it should return nil.
Change-Id: I85898e9ac5e01c01d042ce7340ad463865a0ba73
Reviewed-on: https://gerrit.instructure.com/22661
Tested-by: Jenkins <jenkins@instructure.com>
Reviewed-by: Jacob Fugal <jacob@instructure.com>
Reviewed-by: Brian Palmer <brianp@instructure.com>
QA-Review: Jeremy Putnam <jeremyp@instructure.com>
Product-Review: Cody Cutrer <cody@instructure.com>
This required building our own fork of the redis-store gem so that we
could update its dependency, and fix one small issue with redis connect
strings getting nil instead of the default value for the port number.
The redis 3.0.x gem now catches all Errno and Timeout errors and
re-raises them as subclasses of Redis::BaseConnectionError. It also now
handles EAGAIN internally, retrying when appropriate. So we've modified
our redis failure handling code to match.
test plan: verify the redis failure handling code still works (specs
pass). for instance, stop redis locally and see that canvas works in the
degraded state. make sure that redis still works for both caching and
non-caching code such as login attempts.
Change-Id: I9e8d3929afa06c522656d30f71efc0427e4ef7cc
Reviewed-on: https://gerrit.instructure.com/11521
Tested-by: Jenkins <jenkins@instructure.com>
Reviewed-by: Cody Cutrer <cody@instructure.com>
test plan: turn redis caching on but don't have redis running, hit a
page, verify the generated error
Change-Id: Iddd525ed468abdcf0cce2dba9becc65e2d5aaa84
Reviewed-on: https://gerrit.instructure.com/9309
Reviewed-by: Cody Cutrer <cody@instructure.com>
Tested-by: Hudson <hudson@instructure.com>
Hook into the redis library at a pretty low level, to try and do
everything we can to avoid erroring if redis goes down. This applies to
both redis-as-cache and redis-as-data-store.
test plan: Set up redis and caching in your local instance. Point it to
both an existing box on a port not running redis, and a non-existent IP.
In both situations, you should not see caching errors or redis data
errors. After the first error, it shouldn't attempt to hit redis again for 5
minutes.
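
The "don't hit redis again for 5 minutes" behaviour can be sketched roughly as
below. This is an illustration only: `FailureGuard`, its injectable `clock`
and the `fallback` argument are hypothetical names, and the real change hooks
into the redis library rather than wrapping calls like this.

```ruby
# Hypothetical guard: after one failure, skip the backend until the
# cooldown has passed, returning a fallback value instead.
class FailureGuard
  COOLDOWN_SECONDS = 5 * 60

  # clock is injectable so the cooldown can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @clock = clock
    @last_failure_at = nil
  end

  def disabled?
    !@last_failure_at.nil? && (@clock.call - @last_failure_at) < COOLDOWN_SECONDS
  end

  # Runs the block unless in cooldown. Returns fallback when skipped or
  # when the block raises. Rescuing StandardError is a simplification;
  # real code would rescue only connection-level errors.
  def call(fallback = nil)
    return fallback if disabled?
    yield
  rescue StandardError
    @last_failure_at = @clock.call
    fallback
  end
end
```

Usage would look like `guard.call(nil) { redis.get("key") }`, where `redis` is
whatever client object the caller already holds.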
Change-Id: I101b2d3d2123151b244eb82ba78b176ed1f4d5ad
Reviewed-on: https://gerrit.instructure.com/8097
Tested-by: Hudson <hudson@instructure.com>
Reviewed-by: Cody Cutrer <cody@instructure.com>
This uses redis to store the nonces as locks that expire after 90
minutes. Timestamps are epoch UTC values, as per the oauth spec.
test plan: send oauth requests to the api endpoint with the same nonce
more than once, or with a too-old timestamp
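
A rough sketch of the nonce check, under stated assumptions: an in-memory hash
stands in for redis, `NonceStore` and `accept?` are hypothetical names, and the
"too-old" cutoff is assumed to equal the 90 minute lock lifetime, which this
commit message does not actually state.

```ruby
# In-memory stand-in for redis; the commit itself stores nonces in redis
# as locks that expire after 90 minutes.
class NonceStore
  TTL_SECONDS = 90 * 60

  def initialize
    @expires_at = {} # nonce => epoch second at which the lock expires
  end

  # timestamp and now are epoch UTC seconds.
  # Returns true when the request is fresh and the nonce is unseen,
  # false on a replayed nonce or a timestamp older than the cutoff.
  def accept?(nonce, timestamp, now: Time.now.to_i)
    return false if timestamp.to_i < now - TTL_SECONDS

    @expires_at.delete_if { |_, expiry| expiry <= now } # drop expired locks
    return false if @expires_at.key?(nonce)

    @expires_at[nonce] = now + TTL_SECONDS
    true
  end
end
```

Against real redis the check-and-record step would need to be atomic, for
example a single set-if-absent with an expiry, which this single-process sketch
does not attempt to show.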
refs #5892
Change-Id: Id6130c2a07e206dad716673aa6adbe9d36565a7c
Reviewed-on: https://gerrit.instructure.com/6683
Tested-by: Hudson <hudson@instructure.com>
Reviewed-by: Brian Whitmer <brian@instructure.com>