Protobuf uses Guava 30.1.1, so I upgrade it at the same time. It also
caused an update to rules_jvm_external and reworking the Bazel build.
Protobuf no longer requires bind() so they were dropped. Although
Protobuf's protobuf_deps() brings in rules_jvm_external, and so we don't
need to define it ourselves, it seems better to define it directly and
not depend on transitive deps since we use it directly.
Protobuf now has support for maven_install() by exposing
PROTOBUF_MAVEN_ARTIFACTS, which required reorganizing the WORKSPACE to
use maven_install() after loading protobuf. Protobuf still doesn't
define target overrides for itself so we still maintain those. When
reorganizing the WORKSPACE I noticed http_archive should ideally be
above io_grpc_grpc_java as most users will need it there, so I fixed
that since there were lots of other load()-reordering already.
The `RlsProtoData.RouteLookupConfig` class is out-of-date.
- Some of the fields were long, but now are of `Duration` type.
- Some of the fields are deleted.
- The validation of some of the fields either have been changed or were wrong since beginning.
Now overhaul all the fields in `RlsProtoData.RouteLookupConfig` class based on the spec http://go/grpc-rls-lb-policy-design#heading=h.y3h669gfpown.
Also move the validation logic in json parsing rather than in the constructor of `RouteLookupConfig`.
Fix the NPE as shown in the following stacktrace:
```
Caused by: java.lang.RuntimeException: java.lang.NullPointerException with message: null
at io.grpc.census.CensusStatsModule$ClientTracer.recordFinishedAttempt(CensusStatsModule.java:388) ~[grpc-census-1.42.0.jar:1.42.0]
at io.grpc.census.CensusStatsModule$CallAttemptsTracerFactory.recordFinishedCall(CensusStatsModule.java:525) ~[grpc-census-1.42.0.jar:1.42.0]
at io.grpc.census.CensusStatsModule$CallAttemptsTracerFactory.attemptEnded(CensusStatsModule.java:492) ~[grpc-census-1.42.0.jar:1.42.0]
at io.grpc.census.CensusStatsModule$ClientTracer.streamClosed(CensusStatsModule.java:345) ~[grpc-census-1.42.0.jar:1.42.0]
at io.grpc.internal.StatsTraceContext.streamClosed(StatsTraceContext.java:155) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState.closeListener(AbstractClientStream.java:458) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState.access$400(AbstractClientStream.java:221) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState$1.run(AbstractClientStream.java:442) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState.deframerClosed(AbstractClientStream.java:278) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.Http2ClientStreamTransportState.deframerClosed(Http2ClientStreamTransportState.java:31) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.MessageDeframer.close(MessageDeframer.java:233) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.MessageDeframer.closeWhenComplete(MessageDeframer.java:191) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractStream$TransportState.closeDeframer(AbstractStream.java:200) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState.transportReportStatus(AbstractClientStream.java:445) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState.transportReportStatus(AbstractClientStream.java:401) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.AbstractClientStream$TransportState.inboundTrailersReceived(AbstractClientStream.java:384) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.internal.Http2ClientStreamTransportState.transportTrailersReceived(Http2ClientStreamTransportState.java:183) ~[grpc-core-1.42.0.jar:1.42.0]
at io.grpc.netty.shaded.io.grpc.netty.NettyClientStream$TransportState.transportHeadersReceived(NettyClientStream.java:334) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
at io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler.onHeadersRead(NettyClientHandler.java:372) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
at io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler.access$1200(NettyClientHandler.java:91) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
at io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler$FrameListener.onHeadersRead(NettyClientHandler.java:934) ~[grpc-netty-shaded-1.42.0.jar:1.42.0]
```
The NPE can happen when `ClientCall.Listener.onClose()` and `StatsTraceContext.streamClosed()` (or `ClientStreamListener.closed()`) are invoked concurrently in different threads. Note that `CensusStatsModule$CallAttemptsTracerFactory.attemptEnded()` in the above stack trace would observe `callEnded==true` in such a race condition.
The following are the possible scenarios that the race between `ClientCall.Listener.onClose()` and `ClientStreamListener.closed()` can happen:
- Deadline exceeded but the underlying real stream is [not committed](https://github.com/grpc/grpc-java/blob/v1.42.0/core/src/main/java/io/grpc/internal/RetriableStream.java#L486-L495), the `ClientCall.Listener` may be closed earlier than the stream listener. (This is the case of the above stack trace, in which the inbound end-of-stream is received observing `callEnded==true`. Even if nothing inbound is received, there is still a chance that the NPE can happen.)
- DelayedClientTransport.PendingStream has created a realStream but `setStream(realStream)` is [not called yet](https://github.com/grpc/grpc-java/blob/v1.42.0/core/src/main/java/io/grpc/internal/DelayedClientTransport.java#L366-L372), when deadline exceeded. (This has little chance to happen, only for the very first RPC on the channel.)
- Hedging case.
In deadline-exceeded cases, the shorter the deadline is, the more likely the race can happen.
Implement applying `server_listener_resource_name_template` and `client_listener_resource_name_template` with xdstp scheme, extracting authorities from xdstp resource URI and lookup authorities map in bootstrap.
As documented in https://developers.google.com/protocol-buffers/docs/proto3#json,
the canonical proto-to-json converter converts int64 (Java long) values to string values in Json rather than Json numbers (Java Double). Conversely, either Json string value or number value are accepted to be converted to int64 proto value.
To better support service configs defined by protobuf messages, support parsing String values as numbers in `JsonUtil`.
This introduces new TLS 1.2 cipher suites (#8610) and prepares the
internal okhttp implementation for TLS1.3. A new method for creating
internal ConnectionSpec was added to be able to use the newly introduced
cipher suites in the OkHttpChannelBuilder. Okhttp cipher suites
synchronized with the ones from netty.
DirectPath is going to support non-default service account. This commit
allows users to pass CallCredentials to GoogleDefaultChannelCredentials.
See design in go/directpath-file-credential-google-default-creds
- Partially revert the change of RlsProtoData.java in #8612 by removing `public` accessor
- Have grpc-xds no longer strongly depend on grpc-rls. The application will need grpc-rls as runtime dependencies if they need route lookup feature in xds.
- Parse RouteLookupServiceClusterSpecifierPlugin config to the Json/Map representation of `io.grpc.lookup.v1.RouteLookupClusterSpecifier` instead of `io.grpc.rls.RlsProtoData.RouteLookupConfig`
Fix bugs:
1. Invalid resource at xdsClient, the watcher should have been delivered an error instead of resource not found.
2. If the resource is properly determined to not exist, it shouldn't cause start() to fail. From A36 xDS for Servers:
"XdsServer's start must not fail due to transient xDS issues, like missing xDS configuration from the xDS server."
The addition of the authz tests in 0d345721 is causing the tests to
exceed their timeout. By itself, the authz test takes about an hour in
this environment. Before the authz tests, xds-k8s was taking an hour
and a half.
Generating a uuid in filterChain breaks the de-duplication detection which causes XdsServer to cycle connections, so removing it.
An empty name is now allowed. The name is currently only used for debug purpose.
Add AbstractXdsInteropTest, XdsTestControlPlaneService and only ping-pong testcase in initial implementation.
AbstractXdsInteropTest sets up the test control plane, create xdsClient and xdServer using bootstrap override, test case extending AbstractXdsInteropTest is supposed to override the control plane config and run the verification.
XdsTestControlPlaneService only has static xds configurations, not able to keep states.
How to run:
./gradlew :grpc-interop-testing:installDist -PskipCodegen=true
./interop-testing/build/install/grpc-interop-testing/bin/xds-e2e-test-client
Addresses a problem where we initially only resolve addresses to the backends, but not the load balancer and then later resolve addresses to both. In this situation the fallback timer was started during the second instance even if it resulted in the timer later failing as we were already using fallback backends.
This change assures that a fallback time is only ever started if we are not already using the fallback backends.
This is a follow-up fix to #8253.
The previous attempt at this CL relied on guava's Hashing class which
is still in beta. This update compares Signature objects directly instead
of SHA256 hashs, removing the need for the Hashing class.
Add additional comments to the security policy class, to mention that
implementing new policies requires significant care.
With that in mind, add security policies to check the peer app's
signature, so people can create cross-app communication without
having to implement their own policy.
Finally, add the UntrustedSecurityPolicies class, since that's
inevitably a policy which is sometimes needed.
Add additional comments to the security policy class, to mention that implementing new policies requires significant care.
Also add security policies which check the Sha256 of a peer apps's signature, so people can do trusted cross-app communication without having to implement their own policy.
Finally, add the UntrustedSecurityPolicies class, since that's inevitably a policy you sometimes need as well.
Add RlsClusterSpecifierPlugin as per the [design doc](http://go/grpc-rls-in-xds#heading=h.dmyrvi6ohebx)
The structure of `ClusterSpecifierPlugin` is very similar to `io.grpc.xds.Filter`.
The following changes to the existing code are made:
- move `ConfigOrError` class out of `Filter` class to be shared with `ClusterSpecifierPlugin`
- make `io.grpc.rls.RlsProtoData` public to be accessible by `io.grpc.xds`
- treat empty defaultTarget in `io.grpc.rls.RlsProtoData.RouteLookupConfig` as null to support both json and proto config without defaultTarget field specified.
Support anonymous in-process servers, and InProcessChannelBuilder.forTarget.
Anonymous servers aren't registered statically, meaning they can't be looked up by name.
Only the AnonymousInProcessSocketAddress passed to InProcessServerBuilder.forAddress(),
(or subsequently fetched from Server.getListenSockets()) can be used to connect to the server.
Supporting InProcessChannelBuilder.forTarget is particularly useful for production
Android usage of in-process servers, where process startup latency is crucial.
A custom name resolver can be used to create the server instance on demand
without directly impacting the startup latency of in-process gRPC clients.
Together, these features support a more-standard approach to "OnDeviceServer" referenced in gRFC L73.
https://github.com/grpc/proposal/blob/master/L73-java-binderchannel.md#ondeviceserver
Fix connectivity state aggregation as per http://go/grpc-rls-lb-policy-design#heading=h.6e8tt7xcwcdn
> Note that, for the purposes of aggregation, when a child policy reports TRANSIENT_FAILURE, we consider it to continue to be in that state until it reports READY (i.e., we ignore CONNECTING in between the two, no matter how many times it bounces back and forth between TRANSIENT_FAILURE and CONNECTING).