Commit Graph

5346 Commits

Author SHA1 Message Date
Eric Anderson de7db565a3
benchmarks: Propagate errors in LoadWorkerTest startup
Also clean up resources at the end of test.
2022-05-03 16:55:08 -07:00
sanjaypujare 41c027c11b
netty: implement UdsNameResolver and UdsNettyChannelProvider (#9113)
* netty: implement UdsNameResolver and UdsNettyChannelProvider
When the scheme is "unix:" we get the UdsNettyChannelProvider to
create a NettyChannelBuilder with DomainSocketAddress type and
other related params needed for UDS sockets
2022-05-02 16:41:50 -07:00
Eric Anderson cb61a5e284 benchmarks: Increase timeout of LoadWorkerTest
This should fix test failures on aarch64.
```
expected to be less than: 0.0
but was                 : 0.0
	at app//io.grpc.benchmarks.driver.LoadWorkerTest.assertWorkOccurred(LoadWorkerTest.java:198)
	at app//io.grpc.benchmarks.driver.LoadWorkerTest.runUnaryBlockingClosedLoop(LoadWorkerTest.java:90)
```

runUnaryBlockingClosedLoop() has been failing while the other tests
succeed. The failure is complaining that getCount() == 0, which means
no RPCs completed. The slowest successful test has a mean RPC time of
226 ms (the unit was logged incorrectly) and, compared to the x86 tests,
runUnaryBlockingClosedLoop() is ~2x as slow because it executes first.
So this is probably _barely_ failing and 4 attempts instead of 3 would
be sufficient. While the test tries to wait for 10 RPCs to complete, it
seems likely it is stopping early even for the successful runs on
aarch64. There are 4 concurrent RPCs, so to get 10 RPCs we need to wait
for 3 batches of RPCs to complete which would be 1356 ms (5 loops)
assuming a 452 ms mean latency. Bumping timeout by 10x to give lots of
headroom.
2022-05-02 13:09:34 -07:00
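The timeout estimate in the commit message above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in the message (4 concurrent RPCs, a target of 10 RPCs, an assumed 452 ms mean latency); the class and method names are hypothetical and are not part of LoadWorkerTest.

```java
// Hypothetical helper, not part of grpc-java: reproduces the batch arithmetic
// from the commit message.
final class BatchEstimate {
  private BatchEstimate() {}

  /** Number of full batches needed to complete at least targetRpcs RPCs. */
  static long batchesNeeded(long targetRpcs, long concurrentRpcs) {
    // Ceiling division: 10 RPCs at 4 per batch needs 3 batches.
    return (targetRpcs + concurrentRpcs - 1) / concurrentRpcs;
  }

  /** Lower bound on the wait, assuming every batch takes the mean latency. */
  static long minimumWaitMillis(long targetRpcs, long concurrentRpcs, long meanLatencyMillis) {
    return batchesNeeded(targetRpcs, concurrentRpcs) * meanLatencyMillis;
  }

  public static void main(String[] args) {
    System.out.println(batchesNeeded(10, 4));          // prints 3
    System.out.println(minimumWaitMillis(10, 4, 452)); // prints 1356
  }
}
```

The result is only a lower bound, since it ignores startup cost and latency variance, which is consistent with the decision to add generous headroom.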
Eric Anderson 2c3eca57e4 benchmarks: Shut down LoadClient at end of test 2022-05-02 12:14:08 -07:00
Ran e258fc743b
Use `ImmutableMap.Builder.buildOrThrow()` instead of deprecated `build()`. (#9132) 2022-05-02 11:51:42 -07:00
Eric Anderson fe5511cf21 benchmarks: Use Truth in LoadWorkerTest
This will produce better error messages when the comparisons fail. This
is to help debug aarch64 test failures.
2022-05-02 10:15:34 -07:00
marvinliu 80f1cbf6c4
binder: add hasPermissions security policy and test 2022-05-02 10:15:22 -07:00
Eric Anderson 812264ef87
interop-testing: Improve ConcurrencyTest error reporting
When a problem happens, it will now report back quickly instead of
waiting until the timeout expires. The timeout exception will also
report each RPC's state.

This is to help diagnose aarch64 test failures.
2022-05-02 10:10:42 -07:00
Eric Anderson 0431aee1ac xds: Remove unnecessary SuppressWarnings("unchecked") 2022-05-02 09:51:18 -07:00
yifeizhuang f3378c8876
xds: fix java doc warnings in orca (#9091) 2022-05-02 07:52:31 -07:00
Eric Anderson e6ddace2b8 rls: Increase RPC timeout for flaky rls_withCustomRlsChannelServiceConfig
The test appears to be slow because of classloading. The failure cases
were very slow at 14-16 seconds, but looking at other logs it succeeds
after 12 seconds. It is the first test in the class, and the other tests
run much faster. This could be solved with warmup code, but increasing
the RPC deadline is easier.

Two back-to-back failures on aarch64:
https://source.cloud.google.com/results/invocations/c4612a28-d594-42e9-b8ab-12c999690b40/targets
https://source.cloud.google.com/results/invocations/3d5d1dc2-6b47-493d-b15c-e99458067d73/targets

```
expected to be true
	at app//io.grpc.rls.CachingRlsLbClientTest.rls_withCustomRlsChannelServiceConfig(CachingRlsLbClientTest.java:267)
```

The next run failed on a different line but seems to have the same cause:
https://source.cloud.google.com/results/invocations/546b83d1-cd26-4b87-8871-a7a06a60dc06/targets

```
expected to be true
	at app//io.grpc.rls.CachingRlsLbClientTest.rls_withCustomRlsChannelServiceConfig(CachingRlsLbClientTest.java:273)
```

Reproduced with:
```diff
diff --git a/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java b/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
index 9fac852fa..631d632eb 100644
--- a/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
+++ b/rls/src/test/java/io/grpc/rls/CachingRlsLbClientTest.java
@@ -264,6 +264,11 @@ public class CachingRlsLbClientTest {

     // initial request
     CachedRouteLookupResponse resp = getInSyncContext(routeLookupRequest);
+    try {
+      Thread.sleep(2000);
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
     assertThat(resp.isPending()).isTrue();

     // server response
```
2022-04-29 16:42:00 -07:00
yifeizhuang 9a5467b2ff
xds: move orca to java package io.grpc.xds.orca (#9086)
1. move orca from xds and from service to io.grpc.xds.orca new package
2. keep CallMetricsRecorder and InternalCallMetricsRecorder in service
3. add APIs for recording utilization/requestCost/cpuUtilization/memoryUtilization for per-query requests, and add an internal data structure equivalent to OrcaLoadReport
2022-04-29 14:56:44 -07:00
Terry Wilson e147b5ebfb
cds: Import Envoy load balancing extension protos. (#9133) 2022-04-29 14:42:27 -07:00
yifeizhuang 5686018d40
xds: NACK EDS resources with duplicate localities in the same priority (#9119) 2022-04-29 10:17:47 -07:00
Sergii Tkachenko b3d4607a62
binder, xds: address minor linter fixes (#9130) 2022-04-29 09:42:15 -07:00
Marvin Liu fdd9ab4f96 Adding a security policy that allows access if and only if all given security policies allow access. This contributes to b/221149437 and is similar to cl/442582915 2022-04-28 15:48:23 -07:00
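The "if and only if all given policies allow access" rule can be sketched as a simple combinator. The `Policy` interface and `allOf` method below are hypothetical stand-ins written for illustration; they are not the actual binder security policy API.

```java
import java.util.Arrays;
import java.util.List;

final class AllOfPolicySketch {
  private AllOfPolicySketch() {}

  /** Hypothetical policy type: decides whether a caller uid may connect. */
  interface Policy {
    boolean allows(int uid);
  }

  /** Returns a policy that allows access only when every given policy does. */
  static Policy allOf(Policy... policies) {
    // Defensive copy so later changes to the caller's array have no effect.
    List<Policy> copy = Arrays.asList(policies.clone());
    return uid -> {
      for (Policy policy : copy) {
        if (!policy.allows(uid)) {
          return false; // a single denial is enough to deny access
        }
      }
      return true;
    };
  }

  public static void main(String[] args) {
    Policy evenUid = uid -> uid % 2 == 0;
    Policy smallUid = uid -> uid < 100;
    Policy both = allOf(evenUid, smallUid);
    System.out.println(both.allows(42));  // prints true
    System.out.println(both.allows(43));  // prints false
    System.out.println(both.allows(200)); // prints false
  }
}
```

With no policies the sketch allows everything; a real implementation may prefer to reject an empty list.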
Terry Wilson 4c916c4ed1
xds: A new wrr_locality load balancer. (#9103)
This LB is the parent for weighted_target and will configure it based on the child policy it gets in its configuration and locality weights that come in a ResolvedAddresses attribute.

Described in [A52: gRPC xDS Custom Load Balancer Configuration](https://github.com/grpc/proposal/pull/298)
2022-04-28 13:51:46 -07:00
Marvin Liu 40973aedbe usage of ErrorProne CheckReturnValue 2022-04-28 13:02:43 -07:00
Eric Anderson f2348b0157 bom: Automatically exclude unpublished projects
grpc-observability was accidentally included in grpc-bom in 1.45 even
though it was not published to Maven Central. This is intended to reduce
the likelihood of such things reoccurring. We only include a project in
the bom if it is using maven-publish and if the publishing task is
enabled.

onlyIf is very similar to enabled, except it is processed just before
the task is run. We need a more static property here, so swap to
enabled. If a project uses onlyIf in the future, grpc-bom won't be able
to automatically exclude it.
2022-04-28 09:30:32 -07:00
Sergii Tkachenko 88770009fb
Increase memory in Linux aarch64 (emulated) builds (#9111)
Fix the issue with `Linux aarch64 (emulated)` builds failing with:

```
Expiring Daemon because JVM heap space is exhausted
Daemon will be stopped at the end of the build after running out of JVM memory
```

This fixes the build itself; however, certain tests still fail.
2022-04-27 16:02:05 -07:00
Eric Anderson 369f87becd Revert "auth: Add support for Retryable interface"
This reverts commit 0963f3151d. This
causes dependency problems when importing into Google, as
google-auth-library-java needs to be upgraded and that requires an
upgrade to google-http-java-client to bring in
https://github.com/googleapis/google-http-java-client/pull/1505 .
Reverting for now and will roll forward once those upgrades are
performed.
2022-04-27 15:38:13 -07:00
Sergii Tkachenko 7dc4fc929c
xds, api: minor cleanup to address linter suggestions (#9116) 2022-04-27 13:02:46 -07:00
Sergii Tkachenko ecf1b37746
xds: remove google.security.meshca.v1 proto (#9115)
Remove unused xds/third_party/istio/src/main/proto/security/proto/providers/google/meshca.proto
and xds/src/generated/main/grpc/com/google/security/meshca/v1/MeshCertificateServiceGrpc.java
generated from it.
2022-04-26 17:39:09 -07:00
Eric Anderson 0963f3151d
auth: Add support for Retryable interface
Retryable was added in google-auth-library 1.5.3 to make clear the
situations that deserve a retry of the RPC. Bump to that version and
swap away from the imprecise IOException heuristic.
go/auth-correct-retry

Fixes #6808
2022-04-26 08:59:08 -07:00
Casey eeeeff0702 fix artifact name in IO_GRPC_GRPC_JAVA_ARTIFACTS 2022-04-25 17:11:56 -07:00
Sergii Tkachenko b1720f10a5
xds: Envoy proto sync to 2022-04-08 (#9101)
Proto updates:

- cncf/xds: Sort xds/import.sh protos alphabetically
- cncf/xds: Sync protos to cncf/xds@d92e9ce (commit 2021-12-16, corresponding to
  envoy cl/440193522). It's a no-op for used protos, but helpful to import the
  latest matcher.proto
- cncf/xds: Import xds/type/matcher/v3/matcher.proto with dependencies
- envoyproxy/protoc-gen-validate: Sync protos to
  envoyproxy/protoc-gen-validate@dfcdc5e (commit 2022-03-10, corresponding to
  envoy cl/440193522) to pick up ignore_empty field required for the following
  envoy sync
- envoyproxy/envoy: Sync protos to envoyproxy/envoy@e33f444 (commit 2022-04-07,
  cl/440193522). This is the minimal version needed to pick up
  ClusterSpecifierPlugin.is_optional.
  a. Generated code: AggregatedDiscoveryServiceGrpc was regenerated from the
     updated proto. This is a no-op, just a minor change to the docblocks.
  b. Deprecated fields had to be taken care of manually, see "Manual updates
     to the code" below.
- envoyproxy/envoy: Sync protos to the latest imported version
  envoyproxy/envoy@5d74719 (commit 2022-04-08, cl/443359189). Not needed for
  anything specific, just the last version, and was easy to import.


Manual updates to the code as the result of envoyproxy/envoy@e33f444 sync:

- Deprecated ConfigSource.path replaced with the ConfigSource.path_config_source
  in test fake resources. The ConfigSource.path isn't in active code paths, so
  no prod code changes needed.
- Suppress CertificateValidationContext.match_subject_alt_names deprecations in
  test files. Surprisingly, we don't report deprecations in prod files, despite
  the fact this field is used in prod code a few times.
2022-04-25 16:38:17 -07:00
sanjaypujare 538db03d56
api: add support for SocketAddress types in ManagedChannelProvider (#9076)
* api: add support for SocketAddress types in ManagedChannelProvider
Also add support for SocketAddress types in NameResolverProvider.
Use the scheme in the target URI to select a NameResolverProvider and get
that provider's supported SocketAddress types.
Implement selection in ManagedChannelRegistry of the appropriate
ManagedChannelProvider based on the NameResolver's SocketAddress types.
2022-04-22 09:10:55 -07:00
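The selection described in the commit above can be pictured roughly as follows. This is a simplified sketch: `Provider`, `FakeDomainSocketAddress`, and the scheme table are hypothetical stand-ins for ManagedChannelProvider, a Unix domain socket address type, and the name resolver lookup, and it assumes each resolver produces a single address type. The real registry logic, method names, and priority handling may differ.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ProviderSelectionSketch {
  private ProviderSelectionSketch() {}

  /** Hypothetical address type standing in for a Unix domain socket address. */
  static final class FakeDomainSocketAddress extends SocketAddress {
    private static final long serialVersionUID = 1L;
  }

  /** Hypothetical stand-in for a channel provider and the address types it supports. */
  static final class Provider {
    final String name;
    final Set<Class<? extends SocketAddress>> supportedTypes;

    Provider(String name, Class<? extends SocketAddress> supportedType) {
      this.name = name;
      this.supportedTypes =
          Collections.<Class<? extends SocketAddress>>singleton(supportedType);
    }
  }

  /** Picks the first provider that supports the address type produced for the target's scheme. */
  static String select(
      String target,
      Map<String, Class<? extends SocketAddress>> addressTypeByScheme,
      List<Provider> providers) {
    String scheme = URI.create(target).getScheme();
    Class<? extends SocketAddress> produced = addressTypeByScheme.get(scheme);
    if (produced == null) {
      return null; // no resolver registered for this scheme
    }
    for (Provider provider : providers) {
      if (provider.supportedTypes.contains(produced)) {
        return provider.name;
      }
    }
    return null; // no provider can handle the resolver's address type
  }

  /** Builds a small example table and runs the selection for the given target. */
  static String demo(String target) {
    Map<String, Class<? extends SocketAddress>> byScheme = new HashMap<>();
    byScheme.put("dns", InetSocketAddress.class);
    byScheme.put("unix", FakeDomainSocketAddress.class);
    List<Provider> providers = Arrays.asList(
        new Provider("tcp-provider", InetSocketAddress.class),
        new Provider("uds-provider", FakeDomainSocketAddress.class));
    return select(target, byScheme, providers);
  }
}
```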
Terry Wilson 8e65700edc
xds: ClientXdsClient to provide JSON LB configurations (v2) (#9095)
This refactoring is done in preparation for a larger change where LB configuration will be provided in the xDS Cluster proto message load_balancing_policy field. This field will allow for the configuration of custom LB policies with arbitrary configuration data.

- Instead of directly creating Java configuration objects, the client delegates to a new factory class to generate JSON configurations
- This factory is considered a "legacy" one as a separate factory will be introduced to build configs based on the new load_balancing_policy field
- The client will use a LoadBalancerProvider to parse the generated config to ensure it is valid.
- Overlapping LB config validation that exists both in ClientXdsClient and LB providers will be removed from the client.

This is a second attempt at #8996 that was reverted by #9092.

The initial PR was reverted because the change caused the duplicate CDS update detection in ClientXdsClient to fail. This was because equality checking of PolicySelection instances cannot be relied on. This PR uses the JSON config instead - CdsLoadBalancer2 will handle the conversion from JSON config to PolicySelection.
2022-04-21 14:18:42 -07:00
Sergii Tkachenko a5829107c3
xds: include node ID in RPC failure status messages from the XdsClient (#9099) 2022-04-21 09:52:31 -07:00
yifeizhuang 3a303af02f
xds: priority reset failover timer when connecting if seen ready or idle since TF (#9093)
changes in priority:
Keep track of whether a child has seen TRANSIENT_FAILURE more recently than IDLE or READY, and use this to decide whether to restart the failover timer when a child reports CONNECTING. This ensures that we properly start the failover timer when the ring_hash child policy transitions from IDLE to CONNECTING at startup.
The behaviour change also affects address updates that take the current priority from CONNECTING to CONNECTING: previously it reported one CONNECTING; now it does not report and waits there because the failover timer is in effect. This helps to try the next priority.
2022-04-20 10:40:03 -07:00
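The bookkeeping described in the commit above can be sketched as a small state tracker. This is an illustrative sketch only: the class, the field names, the initial flag value, and the counter that stands in for starting the failover timer are all assumptions, not the priority LB implementation.

```java
final class FailoverTimerSketch {
  enum State { IDLE, CONNECTING, READY, TRANSIENT_FAILURE }

  // True when the child has reported READY or IDLE more recently than
  // TRANSIENT_FAILURE. The initial value is an assumption of this sketch.
  private boolean seenReadyOrIdleSinceTransientFailure = true;

  // Stand-in for the failover timer: counts how often it would be (re)started.
  private int timerStarts;

  void onChildState(State state) {
    switch (state) {
      case READY:
      case IDLE:
        seenReadyOrIdleSinceTransientFailure = true;
        break;
      case TRANSIENT_FAILURE:
        seenReadyOrIdleSinceTransientFailure = false;
        break;
      case CONNECTING:
        // Only restart the timer if the child was healthy or idle since its
        // last failure; otherwise keep waiting so the next priority is tried.
        if (seenReadyOrIdleSinceTransientFailure) {
          timerStarts++;
        }
        break;
    }
  }

  int timerStarts() {
    return timerStarts;
  }
}
```

In this sketch an IDLE to CONNECTING transition at startup starts the timer, while CONNECTING right after TRANSIENT_FAILURE does not.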
Terry Wilson e2449a7738
Revert "xds: ClientXdsClient to make PolicySelection determination" (#9092)
* Revert "- Change config builder to a static factory class. - Remove validation and default value logic that already exists in providers from the factory. - Using the PolicySelection in CdsUpdate instead of the JSON config."

This reverts commit 54c72b945e.

* Revert "xds: ClientXdsClient to provide LB config in JSON"

This reverts commit 4903b44a82.
2022-04-19 09:24:04 -07:00
yifeizhuang 467ac7a4e8
xds: fix presubmit lint errors for style (#9090) 2022-04-18 17:44:41 -07:00
yifeizhuang 81c4571282
xds: fix ring-hash-picker behaviour (#9085) 2022-04-18 12:16:08 -07:00
yifeizhuang a0da558b12
xds: change ring_hash LB aggregation rule to handle transient_failures (#9084) 2022-04-17 20:45:34 -07:00
Eric Anderson 592a227686 okhttp: Allow keepalive scheduled executor to be overridden
Users should be able to inject all executors. The transport shouldn't be
hard-coded to create the TIMER_SERVICE, especially since a scheduler is
already available to the builder.
2022-04-15 15:28:58 -07:00
Eric Anderson 8862dca624 okhttp: Use ObjectPool for executors internally in Builder
This matches what we do in ManagedChannelImplBuilder and
NettyChannelBuilder. It also fixes a (probably unimportant) bug where
the factory returned from swapChannelCredentials() didn't have its own
references to the executors and so could not outlive the parent factory.
2022-04-15 15:28:58 -07:00
Terry Wilson 54c72b945e - Change config builder to a static factory class.
- Remove validation and default value logic that already exists in providers from the factory.
- Using the PolicySelection in CdsUpdate instead of the JSON config.
2022-04-13 12:40:43 -07:00
Terry Wilson 4903b44a82 xds: ClientXdsClient to provide LB config in JSON
This refactoring is done in preparation for a larger change where LB
configuration will be provided in the xDS Cluster proto message
load_balancing_policy field. This field will allow for the configuration
of custom LB policies with arbitrary configuration data.

- Instead of directly creating Java configuration objects, the client
  delegates to a new builder to generate JSON configurations
- This factory is considered a "legacy" one as a separate factory will
  be introduced to build configs based on the new load_balancing_policy
  field
- The client will use a LoadBalancerProvider to parse the generated
  config to ensure it is valid.
- CdsLoadBalancer2 will parse the config again to produce the LB config
  object passed down to child LBs.
2022-04-13 12:40:43 -07:00
markb74 4e9dab9c70
Support includeStatusWithCause from InternalInProcess. (#9080) 2022-04-12 16:59:08 +02:00
Eric Anderson 4a137d6ef0 Start 1.47.0 development cycle 2022-04-11 10:41:33 -07:00
Eric Anderson 78ccc81fd5 okhttp: Remove dead code in io.grpc.okhttp.internal.Util
A substantial portion of the methods are unused. While these don't
contribute to the size of Android builds because of dead code
elimination in the build process, they still show up in static analysis
and raise questions like "when are we using MD5" or "when are we special
casing exception message text" (answer: "we're not").
2022-04-08 08:24:35 -07:00
Eric Anderson 569b7b0b95 xds: Unconditionally apply backoff on LRS stream recreation
This would limit LRS stream creation to one per second, even if the
old stream was considered good as it received a response. This is the
same change as made to ADS in 957079194a.

b/224833499
2022-04-07 14:44:36 -07:00
Eric Anderson 054cb49b49
okhttp: Remove RPCs-before-ready tests
In the olden days, before LB policies, transports had to accept RPCs as
soon as they were created. This hasn't been true for a very long time,
so remove the tests.

Since a978c9ed we're using real, legit code flows in the tests. This
allowed TSAN to discover that `attributes` is racy when read when
creating a new stream before the transport is ready. We could use a lock
or volatile, but the value of the attributes would still be incorrect
for any RPCs that are created before the transport is ready.

Since there's now only one test that delays the connection, I inline the
support code.
2022-04-07 13:30:25 -07:00
Eric Anderson 5351fb9c25 okhttp: Pass TransportFactory directly to transport constructor
This greatly reduces the number of arguments passed to the constructor
and allows using the builder in tests to change specific arguments
without having to pass all the other arguments. It also makes it easier
to see where tests are doing something special.

While it is weird to expose fields as package-private for digging-into
in the constructor, it's actually very similar to the pattern of passing
the builder instance into the constructor. In this case, the weirdness is
because the builder isn't a nested class of the transport and there is
an additional level of building going on (Builder and TransportFactory).
We do this pattern already in ManagedChannelImpl which only has the one
level of building.
2022-04-07 09:07:13 -07:00
yifeizhuang 584622c5fa
Revert "stub: enable GRPC_CLIENT_CALL_REJECT_RUNNABLE in ThreadlessExecutor shutdown (#9035)" (#9067)
This reverts commit c53c3ad01b.
2022-04-06 14:01:05 -07:00
Eric Anderson 9208c49572 rls: Use Ticker for durations
Ticker is powered by System.nanoTime() which is CLOCK_MONOTONIC.
TimeProvider is powered by System.currentTimeMillis() which is
CLOCK_REALTIME. For durations, the monotonic clock is appropriate, not
the wall time which can jump around.
2022-04-06 08:45:53 -07:00
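The distinction in the commit above can be illustrated with the JDK clocks directly. This is a minimal sketch, not the rls code: it measures an elapsed duration with System.nanoTime(), whose differences are unaffected by wall-clock adjustments. The class and method names are made up for the example.

```java
import java.util.concurrent.TimeUnit;

final class ElapsedTimeSketch {
  private ElapsedTimeSketch() {}

  /** Runs the task and returns how long it took, in nanoseconds. */
  static long timeNanos(Runnable task) {
    // nanoTime() has an arbitrary origin; only differences are meaningful.
    long start = System.nanoTime();
    task.run();
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    long elapsed = timeNanos(() -> {
      try {
        Thread.sleep(20);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    System.out.println("elapsed ms: " + TimeUnit.NANOSECONDS.toMillis(elapsed));
  }
}
```

Computing the same difference from System.currentTimeMillis() could yield a negative or inflated value if the system clock is adjusted while the task runs.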
Eric Anderson 1426e2a670 rls: Use FakeClock like rest of grpc tests
No need to create a new (mock-based) ScheduledExecutorService
implementation; it is easy enough to teach FakeClock
scheduleAtFixedRate().
2022-04-06 08:45:53 -07:00
John Cormie fba4ae496a
binder: Work around an Android Intent bug (#9061)
Where filterEquals() can be inconsistent with filterHashCode().

Fixes #9045
2022-04-06 07:37:00 -07:00
Eric Anderson 3c2c357efa binder: Use Ticker for durations
Ticker is powered by System.nanoTime() which is CLOCK_MONOTONIC.
TimeProvider is powered by System.currentTimeMillis() which is
CLOCK_REALTIME. For durations, the monotonic clock is appropriate, not
the wall time which can jump around.
2022-04-06 07:16:21 -07:00
DNVindhya 8d69a352a9
adding @Internal annotation for internal classes (#9063) 2022-04-05 15:02:43 -07:00