There is presently no clean way of telling a caller of `perform_later`
the reason why a job failed to enqueue. When the job is enqueued
successfully, the job object itself is returned, but when the job can
not be enqueued, only `false` is returned. This does not allow callers
to distinguish between classes of failures.
One important class of failures is when the job backend experiences a
network partition when communicating with its underlying datastore. It
is entirely possible for that network partition to recover and as such,
code attempting to enqueue a job may wish to take action to reenqueue
that job after a brief delay. This is distinguished from the class of
failures where due a business rule defined in a callback in the
application, a job fails to enqueue and should not be retried.
This PR changes the following:
- Allows a block to be passed to the `perform_later` method. After the
`enqueue` method is executed, but before the result is returned, the
job will be yielded to the block. This allows the code invoking the
`perform_later` method to inspect the job object, even in failure
scenarios.
- Adds an exception `EnqueueError` which job adapters can raise if they
detect a problem specific to their underlying implementation or
infrastructure during the enqueue process.
- Adds two properties to the job base class: `successfully_enqueued` and
`enqueue_error`. `enqueue_error` will be populated by the `enqueue`
method if it rescues an `EnqueueError` raised by the job backend.
`successfully_enqueued` will be true if the job is not rejected by
callbacks and does not cause the job backend to raise an
`EnqueueError` and will be `false` otherwise.
This will allow developers to do something like the following:
MyJob.perform_later do |job|
unless job.successfully_enqueued?
if job.enqueue_error&.message == "Redis was unavailable"
# invoke some code that will retry the job after a delay
end
end
end
Use inheritance to keep the behavior in the right modules.
The order of Instrumentation and Logging had to change to be
flipped to keep the current behavior.
Before this commit, only StandardError exceptions can be handled by
rescue_from handlers.
This changes the rescue clause to catch all Exception objects, allowing
rescue handlers to be defined for Exception classes not inheriting from
StandardError.
This means that rescue handlers that are rescuing Exceptions outside of
StandardError exceptions may rescue exceptions that were not being
rescued before this change.
Co-authored-by: Adrianna Chang <adrianna.chang@shopify.com>
The implementaiton of `instrument` in `ActiveJob::Instrumentation` was
not keeping the API of `ActiveSupport::Notification.instrument` and
returning the value of the block.
Fixes#40931.
This change broke config.active_job.queue_name_prefix with eager-loading enabled (i.e. in production, by default).
This reverts commit a173a65730, reversing
changes made to 89414f561a.
After upgrading to Rails 6.1 I'm getting a `undefined method `assert_nothing_raised'` using `ActiveJob::TestHelpers` and `assert_enqueued_jobs` in RSpec.
Adding this module fixes it, however not knowing the internals too well, I'm unsure whether it's the right fix.
Before #34953, when using the `:async` Active Job queue adapter, jobs
enqueued in `db/seeds.rb`, such as Active Storage analysis jobs, would
cause a hang (see #34939). Therefore, #34953 changed all jobs enqueued
in `db/seeds.rb` to use the `:inline` queue adapter instead. (This
behavior was later limited to only take effect when the `:async` adapter
was configured, see #35905.) However, inline jobs in `db/seeds.rb`
cleared `CurrentAttributes` values (see #37526). Therefore, #37568
changed the `:inline` adapter to wrap each job in its own thread, for
isolation. However, wrapping a job in its own thread affects which
database connection it uses. Thus inline jobs can no longer execute
within the calling thread's database transaction, including seeing any
uncommitted changes. Additionally, if the calling thread is not wrapped
with the executor, the inline job thread (which is wrapped with the
executor) can deadlock on the load interlock. And when testing (with
`connection_pool.lock_thread = true`), the inline job thread can
deadlock on one of the locks added by #28083.
Therefore, this commit reverts the solutions of #34953 and #37568, and
instead wraps evaluation of `db/seeds.rb` with the executor. This
eliminates the original hang from #34939, which was also due to running
multiple threads and not wrapping all of them with the executor. And,
because nested calls to `executor.wrap` are ignored, any inline jobs in
`db/seeds.rb` will not clear `CurrentAttributes` values.
Alternative fix for #34939.
Reverts #34953.
Reverts #35905.
Partially reverts #35896.
Alternative fix for #37526.
Reverts #37568.
Fixes#40552.
This pull request workarounds Active Job integration test with `resque` reported at https://buildkite.com/rails/rails/builds/72277#410533b8-0676-4ad2-900f-fa8131f9833f/1826
This fix is similar to 7cf8e30 and rails/rails#40451 which is due to https://bugs.ruby-lang.org/issues/17220
- Steps to reproduce
```
- Install Docker
$ git clone https://github.com/rails/rails
$ cd rails
$ git clone https://github.com/rails/buildkite-config .buildkite/
$ RUBY_IMAGE=rubylang/ruby:master-nightly-bionic docker-compose -f .buildkite/docker-compose.yml build base && CI=1 docker-compose -f .buildkite/docker-compose.yml run activejob runner activejob 'AJ_ADAPTER=resque AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 36344'
```
- Without this commit
```
$ RUBY_IMAGE=rubylang/ruby:master-nightly-bionic docker-compose -f .buildkite/docker-compose.yml build base && CI=1 docker-compose -f .buildkite/docker-compose.yml run activejob runner activejob 'AJ_ADAPTER=resque AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 36344'
... snip ...
+++ activejob: AJ_ADAPTER=resque AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 36344
Using resque
/usr/local/lib/ruby/gems/3.0.0/gems/resque-scheduler-4.4.0/lib/resque/scheduler/lock/base.rb:50: warning: Socket.gethostbyname is deprecated; use Addrinfo.getaddrinfo instead.
Run options: --seed 36344
.F
Failure:
QueuingTest#test_should_run_jobs_enqueued_on_a_listening_queue [/rails/activejob/test/integration/queuing_test.rb:14]:
Expected false to be truthy.
bin/test test/integration/queuing_test.rb:11
F
Failure:
QueuingTest#test_current_timezone_is_kept_while_running_perform_later [/rails/activejob/test/integration/queuing_test.rb:119]:
Expected false to be truthy.
bin/test test/integration/queuing_test.rb:110
.SSF
Failure:
QueuingTest#test_current_locale_is_kept_while_running_perform_later [/rails/activejob/test/integration/queuing_test.rb:102]:
Expected false to be truthy.
bin/test test/integration/queuing_test.rb:93
.F
Failure:
QueuingTest#test_should_run_job_enqueued_in_the_future_at_the_specified_time [/rails/activejob/test/integration/queuing_test.rb:76]:
Expected false to be truthy.
bin/test test/integration/queuing_test.rb:71
SSSSSS
Finished in 34.122641s, 0.4396 runs/s, 0.2344 assertions/s.
15 runs, 8 assertions, 4 failures, 0 errors, 8 skips
You have skipped tests. Run with --verbose for details.
$
```
- With this commit
```ruby
$ RUBY_IMAGE=rubylang/ruby:master-nightly-bionic docker-compose -f .buildkite/docker-compose.yml build base && CI=1 docker-compose -f .buildkite/docker-compose.yml run activejob runner activejob 'AJ_ADAPTER=resque AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 36344'
... snip ...
+++ activejob: AJ_ADAPTER=resque AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 36344
Using resque
/usr/local/lib/ruby/gems/3.0.0/gems/resque-scheduler-4.4.0/lib/resque/scheduler/lock/base.rb:50: warning: Socket.gethostbyname is deprecated; use Addrinfo.getaddrinfo instead.
Run options: --seed 36344
....SS...SSSSSS
Finished in 13.850658s, 1.0830 runs/s, 0.7220 assertions/s.
15 runs, 10 assertions, 0 failures, 0 errors, 8 skips
You have skipped tests. Run with --verbose for details.
$
```
This pull request addresses Active Job integration test with `sneakers` reported at https://buildkite.com/rails/rails/builds/72257#20014e66-6e08-47ae-a827-b71de7148306/2017
This fix is similar to 7cf8e30902 which is due to https://bugs.ruby-lang.org/issues/17220
- Steps to reproduce
Install ruby 3.0.0dev using ruby-build or whatever then execute these steps.
```ruby
$ git clone https://github.com/rails/rails
$ cd rails/activejob
$ bundle install
$ bundle exec rake test:integration:sneakers
```
- Result without this commit
```ruby
$ ruby -v
ruby 3.0.0dev (2020-10-24T13:53:53Z master 148961adcd) [x86_64-linux]
$ cd activejob
$ bundle exec rake test:integration:sneakers
... snip ...
/home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/adapters/sneakers.rb:49:in `rescue in start_workers': Failed to start sneakers worker (RuntimeError)
from /home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/adapters/sneakers.rb:41:in `start_workers'
from /home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/helper.rb:27:in `<top (required)>'
from /home/yahonda/src/github.com/rails/rails/activejob/test/helper.rb:12:in `require'
from /home/yahonda/src/github.com/rails/rails/activejob/test/helper.rb:12:in `<top (required)>'
from /home/yahonda/src/github.com/rails/rails/activejob/test/integration/queuing_test.rb:3:in `require'
from /home/yahonda/src/github.com/rails/rails/activejob/test/integration/queuing_test.rb:3:in `<top (required)>'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:17:in `require'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:17:in `block in <main>'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:5:in `select'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:5:in `<main>'
/home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/adapters/sneakers.rb:44:in `sleep': execution expired (Timeout::Error)
from /home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/adapters/sneakers.rb:44:in `block in start_workers'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/3.0.0/timeout.rb:112:in `timeout'
from /home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/adapters/sneakers.rb:42:in `start_workers'
from /home/yahonda/src/github.com/rails/rails/activejob/test/support/integration/helper.rb:27:in `<top (required)>'
from /home/yahonda/src/github.com/rails/rails/activejob/test/helper.rb:12:in `require'
from /home/yahonda/src/github.com/rails/rails/activejob/test/helper.rb:12:in `<top (required)>'
from /home/yahonda/src/github.com/rails/rails/activejob/test/integration/queuing_test.rb:3:in `require'
from /home/yahonda/src/github.com/rails/rails/activejob/test/integration/queuing_test.rb:3:in `<top (required)>'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:17:in `require'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:17:in `block in <main>'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:5:in `select'
from /home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb:5:in `<main>'
rake aborted!
Command failed with status (1): [ruby -w -I"lib:test" -I"/home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib" "/home/yahonda/.rbenv/versions/3.0.0-dev/lib/ruby/gems/3.0.0/gems/rake-13.0.1/lib/rake/rake_test_loader.rb" "test/integration/queuing_test.rb" ]
/home/yahonda/.rbenv/versions/3.0.0-dev/bin/bundle:23:in `load'
/home/yahonda/.rbenv/versions/3.0.0-dev/bin/bundle:23:in `<main>'
Tasks: TOP => test:integration:sneakers
(See full trace by running task with --trace)
$
```
- Result with this commit
```ruby
$ bundle exec rake test:integration:sneakers
... snip ...
15 runs, 6 assertions, 0 failures, 0 errors, 11 skips
```
Sometimes cascading association deletions can cause timeouts due to
an IO issue. Perhaps a model has associations that are destroyed on
deletion which in turn trigger other deletions and this can continue
down a complex tree. Along this tree you may also hit other IO
operations. Such deep deletions can lead to server timeouts while
awaiting completion and really the user may not notice all the
changes on their side immediately making them wait unnecesarially or
worse causing a timeout during the operation.
We now allow associations supporting the `dependent:` key to take `:destroy_async`,
which schedules a background job to destroy associations.
Co-authored-by: Adrianna Chang <adrianna.chang@shopify.com>
Co-authored-by: Rafael Mendonça França <rafael@franca.dev>
Co-authored-by: Cory Gwin @gwincr11 <gwincr11@github.com>
PR #33995 added support for specifying the `args` argument of
`assert_enqueued_with` and `assert_performed_with` as a matcher proc.
In doing so, it added undocumented support for specifying the other
arguments as matcher procs as well. This commit officially documents
that support, and adds tests to ensure the behavior.
The example tests in the method docs for `assert_enqueued_with` and
`assert_performed_with` previously specified `queue` args in their
assertions but not in their setups. Thus, the example tests would not
pass as written. This commit fixes the examples, and properly
demonstrates the `queue` arg.
assert_enqueued_with with a block ignores all the jobs enqueued before
the block for its assertions by counting the number of jobs and dropping
the n first elements from the Array, but since we're now mutating the
Array in perform_enqueued_jobs without a block, it's broken.
This uses another implementation which is correct when the array is
mutated, by getting a duplicated array of jobs, then removing them from
the original array.
Similarly assert_enqueued_jobs with a block was using counts only, now
keeps track of the specific jobs to count them at the end.
Example failure: https://buildkite.com/rails/rails/builds/68010#749ce70d-1e01-4419-90e5-ee4531f66466/1057-1069
Support for testing jobs with a relative time delay was added in #36767.
It was implemented by truncating the `usec` portion of the `at` time in
order to allow for sub-second time differences. However, sub-second
time differences can occur across seconds. Thus, this commit changes
`at` time matching to use an explicit time range.
I guess `NO` is pretty self-explanatory here. But, to be consistent, this commit
describes what does **NO** mean in the context of: `retries, timeaout and priorities`.
This makes sure jobs don't run twice if `perform_enqueued_jobs` is
called twice without a block.
This also mimics the behavior of using `perform_enqueued_jobs` with a
block, where at the end of the block performed jobs are not in
`enqueued_jobs` but instead in `performed_jobs`.
- ### Problem
If we use `perform_enqueued_jobs` without a block, a job that
uses a retry mechanism to reeenqueue itself would get performed
right away.
This behaviour make sense when using `perform_enqueued_jobs` with
a block.
However I'm expecting `perform_enqueued_jobs` without a block to
perform jobs that are **already** in the queue not the ones that
will get enqueued afterwards.
### Solution
Dup the array of jobs given to avoid future mutation.
- ### Problem
If we use `perform_enqueued_jobs` without a block,
a job that raises an error wouldn't be appended to
the list of `performed_jobs`.
### Solution
Push the job in the array before it is actually performed.
author Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1582316102 +0530
committer Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1583159505 +0530
parent 6d0895a489
author Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1582316102 +0530
committer Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1583159327 +0530
Randomize jitter
- This PR attempts to fix a problem with ActiveJob jitter where the `determine_jitter_for_delay` value may not always be randomized. Especially when the jitter delay multplier is between 1 and 2 it always returns 0.
- With this change, we pass a range to `Kernel.rand` beginning with 0 to the `jitter_multiplier`. With positive float values, the return value will be a random float number from the range.
- Includes test cases to verify random wait time when the jitter_multiplier is between 1 and 2.
- Updated relevant test cases stubbing the `Kernel.rand` method, refactored some by removing unwanted stubs for `Kernel.rand` method where jitter is falsey.
Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case
Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case
Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case
Review updates - separated test cases for random wait time with default and exponentially retrying jobs
- Another test case added to make sure negative wait time does not affect the randomization
Review updates
- Instead of using Kernel.rand with range, used simple multiplication with Kernel.rand for calculating delay for jitter
- Updates to the tests according to changes
- ### Problem
In some cirumstances, the deprecation message to warn that AJ won't
run `after_(enqueue/perform)` callbacks when the chain is halted
by a `throw(:abort)` will be thrown even though no `throw(:abort)`
was thrown.
```ruby
run_callback(:foo) do
...
end
```
There is two possible way for the callback body to not be executed:
1) `before` callback throw a `abort`
2) `before` callback raises an error which is rescued by an
around callback (See associated test in this commit for
an example)
When 2) happen we don't want to output a deprecation message,
because what the message says isn't true and doesn't apply.
### Solution
In order to differentiate between 1) and 2), I have added
a `halted_callback_hook` which is called by ActiveSupport callback
whenever the callback chain is halted.
- The `halted_callback_hook` method is called whenever the
`terminator` halt the callback execution.
Usually, this translate to when a `before` callback throw
an `:abort`.
<details>
<summary> Example </summary>
```ruby
class Foo
include ActiveSupport::Callbacks
define_callbacks :save
set_callback(:save, :before) { throw(:abort) }
def run
run_callbacks(:save) do
'hello'
end
end
def halted_callback_hook(filter)
# filter is the proc passed to `set_callback` above
end
end
```
</details>
### Problem
When a class has multiple callbacks, (i.e. `save`, `validate` ...),
it's impossible to tell in the halted_callback_hook which type of
callback halted the execution.
This is useful to take different action based on the callback.
<details>
<summary> Use Case </summary>
```ruby
class Foo
include ActiveSupport::Callbacks
define_callbacks :save
define_callbacks :validate
set_callback(:save, :before) { throw(:abort) }
set_callback(:validate, :before) { throw(:abort) }
def run
run_callbacks(:validate) do
...
end
run_callbacks(:save) do
...
end
end
def halted_callback_hook(filter)
Rails.logger.warn("Couldn't save the record, the ??? callback halted the execution")
end
end
```
</details>
### Solution
Allow `halted_callback_hook` to receive a second argument which is
the name of the callback being run.
- ### Problem
Given the below example the test adapter will retry the job
indefinitely:
```ruby
class BuggyJob < ActiveJob::Base
retry_on(Exception, attempts: 2)
def perform
raise "error"
end
end
BuggyJob.perform_later
perform_enqueued_jobs
```
The problem is that when the job get retried, the
`exception_executions` variable is not serialized/deserialized,
resulting in ActiveJob to not be able to determine how many time
this job was retried.
The solution in this PR is to deserialize the whole job in the test
adapter, and reserialize it before retrying.
Fix#38391
- I made a change in 0d3aec4969 to output a log if a job was aborted
in a before callbacks. I didn't take in consideration that a job
could return a falsy value and thus it would wrongly log
that the job was aborted.
This fixes the problem by checking if the callback chain was halted
rather than the return value of the job.
- Since this is going to be the default in 6.1, let's set it in the
test suite to avoid deprecation warning.
Otherwise one has to do `AS::Deprecation.silence { }` everytime we
add a new test.
Fix#38107
* Add failing ActiveJob exceptions test for "disable retry jitter"
Thanks to @kaspth for the starting point.
* Update ActiveJob retry jitter to correctly use zero value
* Simplify "disable retry jitter" test
We don't need to repeat this many times. Fewer is shorter.
* Refactor determine_delay with jitter
* Fix indentation
* Close the curtains and give JITTER_DEFAULT some privacy
* Use .zero? instead of == to check jitter value
* Add ActiveJob test for explicit zero jitter
Co-authored-by: Kasper Timm Hansen <hey@kaspth.com>
Co-authored-by: Cliff Pruitt <cliff.pruitt@cliffpruitt.com>
ActiveJob will log the entire backtrace when one of the enqueue
callbacks fail. This is not necessary and increases noise. When there is
an Exception object, the object gets bubbled up eventually anyways.
Remove `Array(ex.backtrace).join("\n")` from `def enqueue` and `def
enqueue_at`.
The existing message only mentioned one type of before/after callback,
but the config was named generally. That mismatch is confusing and users
wouldn't necessarily know what the total effect of the config would be.
So instead of handwriting the deprecation warning in the specific instances,
consolidate it in one place and give the appropriate context. That context
is the above, but also that users shouldn't update their app config,
they should uncomment the line in the new defaults file, which now also
has more context.
I'm not totally convinced that we can't move this to when
`after_enqueue`/`after_perform` is called in the job class. Doesn't
seem worth it to blare this after every job enqueue/perform, when we
the score at boot time.
cc @Edouard-chin
-
### Problem
In rails/rails@bbfab0b33a I introduced a change which outputs
a deprecation whenever a class inherits from ActiveJob::Base.
This has the negative effect to trigger a massive amount of
deprecation at boot time especially if your app is eagerloaded
(like it's usually the case on CI).
Another issue with this approach was that the deprecation will
be output no matter if a job define a `after_perform` callbacks
i.e.
```ruby
class MyJob < AJ::Base
before_enqueue { throw(:abort) }
end
# This shouldn't trigger a deprecation since no after callbacks are defined
# The change in 6.2 will be already safe for the app.
```
### Solution
Trigger the deprecation only when a job is abort
(during enqueuing or performing) AND a `after_perform`
callback is defined on the job.
- ### Problem
```ruby
MyJob < ApplicationJob
before_enqueue { throw(:abort) }
after_enqueue { # enters here }
end
```
I find AJ behaviour on after_enqueue and after_perform callbacks
weird as they get run even when the callback chain is halted.
It's counter intuitive to run the after_enqueue callbacks even
though the job wasn't event enqueued.
### Solution
In Rails 6.2, I propose to make the new behaviour the default
and stop running after callbacks when the chain is halted.
For application that wants this behaviour now or in 6.1
they can do so by adding the `config.active_job.skip_after_callbacks_if_terminated = true`
in their configuration file.
- ### Problem
ActiveJob will always log "Enqueued MyJob (Job ID) ..." even
if the job doesn't get enqueued through the adapter.
Same problem happens when performing a Job, "Performed MyJob (Job ID) ..." will be logged even when job wasn't performed at all.
This situation can happen either if the callback chain is terminated
(before_enqueue throwing an `abort`) or if an exception is raised.
### Solution
Check if the callback chain is aborted/exception is raised, and log accordingly.
Follow up to https://github.com/rails/rails/pull/31872
Particularly, the closing `</tt>` tag of the documented wait algorithm had a misplaced `<`. Threw in a space between `15% (0.15)` for good measure.
[ci skip]
Prior to this change, exponentially_longer had adverse consequences
during system-wide downstream failures. This change adds a random value to the
back off calculation in order to prevent the thundering herd
problem, whereby all retry jobs would retry at the same time.
Specifically this change adds a jitter option to retry_on to enable users of it to
scope the randomness calculation to a reasonable amount. The default is
15% of the exponential back off calculation.
class SensitiveJob < ApplicationJob
self.log_arguments = false
def perform(my_sensitive_argument)
end
end
When dealing with sensitive arugments as password and tokens it is
now possible to configure the job to not put the sensitive argument
in the logs.
Closes#34438.