The problem was one hedge was committed before another had drained
start(). This was not testable because HedgingRunnable checks whether
scheduledHedgingRef is cancelled, which is racy, but there's no way to
deterministically trigger either race.
The same problem couldn't be triggered with retries because only one
attempt will be draining at a time. Retries with cancellation also
couldn't trigger it, for the surprising reason that the noop stream used
in cancel() wasn't considered drained.
This commit marks the noop stream as drained with cancel(), which allows
memory to be garbage collected sooner and exposes the race for tests.
That then showed the stream as hanging, because inFlightSubStreams
wasn't being decremented.
Fixes#9185
This change has these main aspects to it:
1. Removal of any name resolution responsibility from ManagedChannelImpl
2. Creation of a new RetryScheduler to own generic retry logic
- Can also be used outside the name resolution context
3. Creation of a new RetryingNameScheduler that can be used to wrap any
polling name resolver to add retry capability
4. A new facility in NameResolver to allow implementations to notify
listeners on the success of name resolution attempts
- RetryingNameScheduler relies on this
ExpectedException is deprecated, so I fixed the new warnings. However,
we are still using ExpectedException many places and had previously
supressed the warning. See
https://github.com/grpc/grpc-java/issues/7467 . I did not fix those
existing instances that had suppressed the warning, since it is
unrelated to the upgrade and we have been free to fix them at any time
since we dropped Java 7.
Use a volatile to publish the value even though it is final. TSAN
ignores the final aspect of a field, which is fair since another thread
may not see the parent's pointer become updated and use a stale value.
The lack of synchronization was clearly against lastServiceConfig's
documentation.
Fixes#9267
This helps to prevent retryable stream from using calloptions.executor when it shouldn't, e.g. call is already notified closed. It is done by delaying notifying upper layer (through masterListener).
Trying to upgrade Gradle to 7.6 improved the checkstyle plugin such that
it appears to have been running in new occasions. That in turn exposed
us to https://github.com/checkstyle/checkstyle/issues/5088. That bug was
fixed in 8.28, which also fixed lots of other bugs. So now we have
better checking and some existing volations needed fixing. Since the
code style fixes generated a lot of noise, this is a pre-fix to reduce
the size of a Gradle upgrade.
I did not upgrade past 8.28 because at some point some other bugs were
introduced, in particular with the Indentation module. I chose the
oldest version that had the particular bug impacting me fixed. Upgrading
to this old-but-newer version still makes it easier to upgrade to a
newer version in the future.
If an artifact on Maven Central exposes a type from gRPC on its API
surface, then consumers of that artifact need that gRPC API in the
compile classpath. Bazel handles this by making hjars for transitive
dependencies, but if the dependencies are runtime_deps then Bazel won't
generate hjars containing the needed symbols.
We don't export netty-shaded because the classes already don't match
Maven Central. If an artifact on Maven Central is exposing a
netty-shaded class on its API surface, it wouldn't work anyway since the
class simply doesn't exist for the Bazel build.
Fixes#9772
This change has these main aspects to it:
1. Removal of any name resolution responsibility from ManagedChannelImpl
2. Creation of a new RetryScheduler to own generic retry logic
- Can also be used outside the name resolution context
3. Creation of a new RetryingNameScheduler that can be used to wrap any
polling name resolver to add retry capability
4. A new facility in NameResolver to allow implementations to notify
listeners on the success of name resolution attempts
- RetryingNameScheduler relies on this
updateAndGet is only available in API level 24+. It is unclear why
AnimalSniffer didn't detect this.
This was noticed when doing the import, but the Android CI has also been
failing because of it.
Add big negative integer to pending stream count when cancelled. The count is used to delay closing master listener until streams fully drained.
Increment pending stream count before creating one. The count is also used to indicate callExecutor is safe to be used. New stream will not be created if big negative number was added, i.e. stream cancelled. New stream is created if not cancelled, callExecutor is safe to be used, because cancel will be delayed.
Create new streams (retry, hedging) is moved to the main thread, before callExecutor calls drain.
Minor refactor the masterListener.close() scenario.
Certain status codes are inappropriate for a control plane to be
returning and indicate a bug in the control plane. These status codes
will be replaced by INTERNAL, to make it clearer to see that the control
plane is misbehaving.
My motivation for making this change is that [`ByteBuffer` is becoming
`sealed`](https://download.java.net/java/early_access/loom/docs/api/java.base/java/nio/ByteBuffer.html)
in new versions of Java. This makes it impossible for Mockito's
_current_ default mockmaker to mock it.
That said, Mockito will likely [switch its default
mockmaker](https://github.com/mockito/mockito/issues/2589) to an
alternative that _is_ able to mock `sealed` classes. However, there are
downside to that, such as [slower
performance](https://github.com/mockito/mockito/issues/2589#issuecomment-1192725206),
so it's probably better to leave our options open by avoiding mocking at
all.
And in this case, it's equally easy to use real objects.
As a bonus, I think that real objects makes the code a little easier to
follow: Before, we created mocks that the code under test never
interacted with in any way. (The code just passed them through to a
delegate.) When I first read the tests, I was confused, since I assumed
that the mock we were creating was the same mock that we then passed to
`verify` at the end of the method. That turned out not to be the case.
Forwarding acceptResolvedAddresses() to a delegate in ForwardingLoadBalancer can cause problems if an extending class expects its handleResolvedAddresses implementation to be called even when a client calls handleResolvedAddresses(). This would not happen as ForwardingLoadBalancer would directly send the call to the delegate.