6efa9ee3 added `volatile` to `attributes` after TSAN detected a data
race that was added in 91d15ce4. The race was because attributes may be
read from another thread after `transportReady()`, and the
post-filtering assignment occurred after `transportReady()`. The code
now filters the attributes separately so they are updated before calling
`transportReady()`.
Original TSAN failure:
```
Read of size 4 at 0x0000cd44769c by thread T23:
#0 io.grpc.netty.NettyClientHandler.getAttributes()Lio/grpc/Attributes; NettyClientHandler.java:327
#1 io.grpc.netty.NettyClientTransport.getAttributes()Lio/grpc/Attributes; NettyClientTransport.java:363
#2 io.grpc.netty.NettyClientTransport.newStream(Lio/grpc/MethodDescriptor;Lio/grpc/Metadata;Lio/grpc/CallOptions;[Lio/grpc/ClientStreamTracer;)Lio/grpc/internal/ClientStream; NettyClientTransport.java:183
#3 io.grpc.internal.MetadataApplierImpl.apply(Lio/grpc/Metadata;)V MetadataApplierImpl.java:74
#4 io.grpc.auth.GoogleAuthLibraryCallCredentials$1.onSuccess(Ljava/util/Map;)V GoogleAuthLibraryCallCredentials.java:141
#5 com.google.auth.oauth2.OAuth2Credentials$FutureCallbackToMetadataCallbackAdapter.onSuccess(Lcom/google/auth/oauth2/OAuth2Credentials$OAuthValue;)V OAuth2Credentials.java:534
#6 com.google.auth.oauth2.OAuth2Credentials$FutureCallbackToMetadataCallbackAdapter.onSuccess(Ljava/lang/Object;)V OAuth2Credentials.java:525
...
Previous write of size 4 at 0x0000cd44769c by thread T24:
#0 io.grpc.netty.NettyClientHandler$FrameListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V NettyClientHandler.java:920
#1 io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V DefaultHttp2ConnectionDecoder.java:515
...
```
io.grpc.util.CertificateUtils does much of the same thing as xds's
CertificateUtils, but also supports EC keys. The xds code pre-dates the
grpc-util class, so it isn't surprising it wasn't using it.
There's a good number of usages of the xds CertificateUtils, so I just
got rid of the duplicate implementation, but didn't yet bother changing
callers io.grpc.util.
* Provide a default implementation for new method added to ManagedTransport.Listener to support ClientTransportFilters
* Relax test constraint to reduce flakiness due to timing.
* Add test for listener.filterTransport.
Currently, if caching is enabled (as is often the case) and AsyncSecurityPolicy returns a failed future, then this future is cached forever, without giving the SecurityPolicy implementation a chance to be retried. Going forward, new invocations will trigger new security checks if the last one could not be completed successfuly.
This fixes two bugs in outbound message size checking:
* When thet checke failed, the thrown StatusRuntimeException with a status code
of RESOURCE_EXHAUSTED was been caught and rewrapped in another
StatusRuntimeException but this time with status code UNKNOWN.
* This applies the max outbound message size check to messages prior to, and
after compression, since compression of a message that is smaller than the
maximum send size can result in a larger message that exceeds the maximum
send size.
gcr.io/distroless/java:8 is no longer being updated. Java 8 isn't a
distroless option any more. Java 11 and 17 are options, but only Java 17
with Debian 12. The main alternative is to stick with Java 8 and use
something like docker.io/library/eclipse-temurin:8-jre . But there
doesn't seem to be much need to use an old JDK for these containers.
jib needed updating to support the oci manifest format used in the
updated image.
Originally you had to confirm that awaitTermination() returned true, but
that was annoying and useless, especially after calling shutdownNow().
The behavior was changed in ce2ae1fb because the awaitTermination()
detection logic could prevent the channel from getting garbage
collected.
Fixes#10732
The KeepAliveManager clamps the keepalive and keepalive time values such that the result is larger than the minimum values specified here. Therefore, a second check here is unnecessary.
All the changes outside libs.versions.toml and examples were
because of ErrorProne. It didn't actually find anything to fix; signal
vs noise has gotten pretty bad with the newer checks.
Status was changed for ErrorProne's SuperCallToObjectMethod. With the
old code it didn't notice the trivial implementation. The fail-for-test
code wasn't used, so it was easiest to just remove it.
Some of the libs had their versions inlined; now that we have
:checkForUpdates it isn't much of a risk for versions to diverge when
there's only a few artifacts sharing a version. If we need 4+ artifacts
to have the same version, then it makes sense to still use a shared
version.
Dependencies not upgraded: google-auth-libray, mockito, netty, cronet
Removing the $ prompt makes it easier to copy+paste. At no point are we
running as root, so there's no # vs $ distinction, not that many people
would even notice the difference.
There is a risk here that not all commands end up getting run. When
pasting multiple commands at once, gradle or another tool might read in
stdin and discard them. But it's probably not worth continuing the
copy-each-command-separately-and-avoiding-the-$.
This change has health checking consumer (new pick first) to install a listener through and health checking producer (outlier detection and client health checking) producing health checks. Health notification chain is built reusing the previous connectivity state chain.
Pickfirst installs the health listener, and is capable of detecting when no health checking producer is installed in the system. In that case, it sets health status to be READY so that health system is no-op.
This commit makes a small change to BinlogHelper to make it compatible with the Protobuf Java Lite runtime.
In the Lite runtime, the `addXBuilder` for repeated fields is not available. Instead, the `addX` method must be used with a manually-constructed Builder.
* Handle slow security policies without blocking gRPC threads.
- Introduce PendingAuthListener to handle a ListenableFuture<Status>, progressing the gRPC through each stage in sequence once the future completes and is OK.
- Move unit tests away from `checkAuthorizationForService` and into `checkAuthorizationForServiceAsync` since that should be the only method called in production now.
- Some tests in `ServerSecurityPolicyTest` had their expectations updated; they previously called synchornous APIs that transformed failed `ListenableFuture<Status>` into one or another status. Now, we call the sync API, so those transformations do not happen anymore, thus the test needs to deal with failed futures directly.
- I couldn't figure out if this PR needs extra tests. AFAICT `BinderSecurityTest` should already cover the new codepaths, but please let me know otherwise.
This removes the benefit of including the PR number in the title without
also requiring using github APIs to query the PR number. It still
provides the same details about the change, and indirectly links to the
PR if the user wants to see the review.
Gradle is forcing a move away from using 'project' during task excution
and because of some interactions there, this is easiest by making them
real classes. That makes them start looking quite strange in the build
file, so they are now moved to buildSrc/. We could have continued using
Groovy, but it is weird in some ways that are more apparent when making
classes and not just scripting. Instead, they were converted to Java.
They are compatible with delayed configuration resolution as well.
We already do this for WRR. Notably, we are no longer trying to avoid
the modulus each pick. It was of questionable value, and removing it is
necessary to continue sharing the same integer when the list size
changes.
The change means we can implement a stronger isEquivalentTo() by
comparing the AtomicInteger references. It is strong enough that the
operation aligns with normal equals(). Using equals() instead of
isEquivalentTo() also made more obvious an equals() optimization that
uses the hashCode() before the more expensive HashSet creation; equals()
should now be very fast except when they are (very likely) equal.
They are a lot faster. Instead of 1-3 minutes of test execution, I now
see 2-22 seconds. There still may be 3 minutes of overhead for the
gcloud command to complete, but the reduction is noticable in the total
execution time. And it seems the tests are actually being run, as there
is some flakiness. The flakiness appears to be at a lower rate.
The script was slightly reorganized to make it easier to copy commands
to run locally.