An AsciiString object may only use a subsection of its backing byte array. We need to test for this and return a copy of the subsection if necessary.
Big thanks to @normanmaurer for uncovering this issue: https://github.com/netty/netty/issues/5472
Resolves#1756
The thread-unsafe method `io.grpc.testing.TestUtils.pickUnusedPort` causes flakes (#1756) in windows. Need to avoid use of this method in test as in windows the tests are running in different jvms and concurrent calls of this method in multiple processes tend to return the same port number.
There are some usages of this method in benchmarks, so moved the method to `io.grpc.benchmarks.Utils` and the method will only be used in benchmarks and not in test.
A transport is "in use" iff number of streams > 0. In following changes
the channel will use this information when deciding whether it should
transit to the IDLE mode (#1276).
Introduce CallCredentials as a first-class option to allow applications
to set per-call credentials into headers for outgoing RPCs. This will
supersede ClientAuthInterceptor. It has access to more
information (e.g., transport attributes, MethodDescriptor) and allow
results to be returned asynchronously, e.g., from a blocking I/O, which
was problemantic with ClientAuthInterceptor.
adding
ClientStream newStream(MethodDescriptor<?, ?> method, Metadata headers, CallOptions callOptions);
to ClientTransport interface
Created this PR first because both fail fast implementation and another change will be using this interface change
This introduces an AbstractStream2 that is intended to replace the
current AbstractStream. Only server-side is implemented in this commit
which is why AbstractStream remains. This is mostly a reorganization of
AbstractStream and children, but minor internal behavioral changes were
required which makes it appear more like a reimplementation.
A strong focus was on splitting state that is maintained on the
application's thread (with Stream) and state that is maintained by the
transport (and used for StreamListener). By splitting the state it makes
it much easier to verify thread-safety and to reason about interactions.
I consider this a stepping stone for making even more changes to
simplify the Stream implementations and do not think some of the changes
are yet at their logical conclusion. Some of the changes may also
immediately be replaced with something better. The focus was to improve
readability and comprehesibility to more easily make more interesting
changes.
The only thing really removed is some state checking during sending
which is already occurring in ServerCallImpl.
See #933
- Create InternalHandlerRegistry, an immutable look-up table. Handlers
passed to ServerBuilder.addService() go to this registry. This covers
the most common use cases. By keeping the registry internal we could
freely change the registry's interface to accommodate optimizations,
e.g., for hpack.
- The internal registry uses a flat fullMethodName -> handler look-up
table instead of a hierarchical one used before. It faster because it
saves one look-up and a substring.
- Introduces the fallback registry, settable by
ServerBuilder.fallbackHandlerRegistry(), for advanced users who want a
dynamic registry. Moved the current MutableHandlerRegistryImpl to
io.grpc.util.MutableHandlerRegistry as a stock implementation of the
fallback registry. The io.grpc.MutableHandlerRegistry interface is now
removed.
So far, we have passed a custom Executor to the NioEventLoopGroup constructor,
in order to get custom thread names and be compatible with both Netty 4 and
Netty 5. However, Netty 5 is no more (RIP) and Netty's DefaultThreadFactory
includes some optimizations around thread local storage, that Guava's executor
does not have.
The thread names will be a bit different, as DefaultThreadExecutor additionally
puts in the thread pool id after the name prefix.
For example:
Before:
grpc-default-boss-ELG-0
grpc-default-worker-ELG-0
grpc-default-worker-ELG-1
After:
grpc-default-boss-ELG-0-0
grpc-default-worker-ELG-1-0
grpc-default-worker-ELG-1-1
To ManagedChannelImpl, TransportSet and all client transport
implementations, so they can be correlated in the logs. Also added more
life-cycle logging in general.
Long-lived streams or lengthy RPCs can keep the transport open for
minutes after a GOAWAY is received. Previously, during this time any new
RPCs would fail with a message like:
> Cannot create stream 5 since this endpoint has received a GOAWAY frame
> with last stream id 3
All usages of goAwayStatus were replaced with lifecycleManager. Although
note that previously goAwayStatus() would never return null because it
would generate a Status if the current field was null.
getShutdownStatus() does not have this feature, so some code was
rearranged to guarantee the Status is non-null before retrieving it.
The listener handling was simplified by 1) avoiding the need for
thread-safety and 2) moving state keeping into a small class for easy
comprehensibility and simplified usage in tests.
NettyClientTransport.shutdown() no longer calls transportShutdown()
because it lies (because the message can be delayed for quite some time)
and because it was the only usage of lifecycleManager not on the event
loop.
Fixes#1359
Our tests are detecting cases where we are still getting
ClosedChannelException. We need to fix that because it is a useless
status, but until it is fixed we want a stable CI.
Fixes#1513 and NettyTransportTest.serverNotListening failures
Netty client shutdown would race with the negotiation handling and
circumvent AbstractBufferingHandler. Use a new command in order to
leave channel.close() available for abrupt killing of the connection
when connecting.
ping_afterTermination was previously racey that made it succeed. After
fixing the test, Netty would consistently fail to call callback. After
fixing Netty to fail the callback it was not using the right status
because when Netty's channel is closed none of our handlers are run.
This reliably fails the future with ClosedChannelException, which is
useless, so now we special-case that exception and fill in the reason
for shutdown.
To prevent accidentally reporting Status.OK, the transports no longer
use OK when calling transportShutdown. The OK status was already no
longer being consumed, since keying off whether transportReady was
called is more helpful.
This fixes#1330
DelayedClientTransport.PendingStream will override cancel(), which has a
clearer semantic.
Also permitting all status codes except OK in ClientStream.cancel(),
instead of just 4 codes.
Although the changes were determined automatically, they were manually
applied to the codebase.
ClientCalls actually has a bug fix, since the suggestion to add
interrupt() made it obvious that interrupted() was inappropriate.
Always return a completed future from `TransportSet`. If a (real) transport has not been created (e.g., in reconnect back-off), a `DelayedClientTransport` will be returned.
Eventually we will get rid of the transport futures everywhere, and have streams always __owned__ by some transports.
DelayedClientTransport
----------------------
After we get rid of the transport future, this is what `ClientCallImpl` and `LoadBalancer` get when a real transport has not been created yet. It buffers new streams and pings until `setTransport()` is called, after which point all buffered and future streams/pings are transferred to the real transport.
If a buffered stream is cancelled, `DelayedClientTransport` will remove it from the buffer list, thus #1342 will be resolved after the larger refactoring is complete.
This PR only makes `TransportSet` use `DelayedClientTransport`. Follow-up changes will be made to allow `LoadBalancer.pickTransport()` to return null, in which case `ManagedChannelImpl` will give `ClientCallImpl` a `DelayedClientTransport`.
Changes to ClientTransport shutdown semantics
---------------------------------------------
Previously when shutdown() is called, `ClientTransport` should not accept newStream(), and when all existing streams have been closed, `ClientTransport` is terminated. Only when a transport is terminated would a transport owner (e.g., `TransportSet`) remove the reference to it.
`DelayedClientTransport` brings about a new case: when `setTransport()` is called, we switch to the real transport and no longer need the delayed transport. This is achieved by calling `shutdown()` on the delayed transport and letting it terminate. However, as the delayed transport has already been handed out to users, we would like `newStream()` to keep working for them, even though the delayed transport is already shut down and terminated.
In order to make it easy to manage the life-cycle of `DelayedClientTransport`, we redefine the shutdown semantics of transport:
- A transport can own a stream. Typically the transport owns the streams
it creates, but there may be exceptions. `DelayedClientTransport` DOES
NOT OWN the streams it returns from `newStream()` after `setTransport()`
has been called. Instead, the ownership would be transferred to the
real transport.
- After `shutdown()` has been called, the transport stops owning new
streams, and `newStream()` may still succeed. With this idea,
`DelayedClientTransport`, even when terminated, will continue
passing `newStream()` to the real transport.
- When a transport is in shutdown state, and it doesn't own any stream,
it then can enter terminated state.
ManagedClientTransport / ClientTransport
----------------------------------------
Remove life-cycle interfaces from `ClientTransport`, and put them in its subclass - `ManagedClientTransport`, with the same idea that we have `Channel` and `ManagedChannel`. Only the one who creates the transport will get `ManagedClientTransport` thus is able to start and shutdown the transport. The users of transport, e.g., `LoadBalancer`, can only get `ClientTransport` thus are not alter its state. This change clarifies the responsibility of transport life-cycle management.
Fix TransportSet shutdown semantics
-----------------------------------
Currently, if `TransportSet.shutdown()` has been called, it no longer create new transports, which is wrong.
The correct semantics of `TransportSet.shutdown()` should be:
- Shutdown all transports, thus stop new streams being created on them
- Stop `obtainActiveTransport()` from returning transports
- Streams that already created, including those buffered in delayed transport, should continue. That means if delayed transport has buffered streams, we should let the existing reconnect task continue.
Having Handler implement ChannelInboundHandler is overspecifying and
unnecessary, as the code compiles fine just by changing "implements
ChannelInboundHandler" to "implements ChannelHandler".
PlaintextHandler was swapped to ChannelHandlerAdapter instead of
ChannelDuplexHandler because it just needs the ChannelHandler methods.
This just propagates the deprecated annotation from ChannelHandler. Note
that exceptionCaught is _not_ deprecated for ChannelInboundHandler and
ChannelDuplexHandler.
This provides more structured data into the application for it to do
special handling.
In general, we would hope most people don't need this functionality, but
it is a good escape hatch to allow users to workaround infrastructure
problems.
This reverts commit eca1f7c1d6.
We want to preserve the status message identical to what the server
sent. We'll need a better way to communicate debugging details.
We think this broke when the stream lifecycle listener was removed.
Observing the stream lifecycle would be the "proper" fix, but it had
notification ordering issues where streams would close before we were
notified of the event that caused the closure, which made it difficult
to provide useful error messages. The ordering of notifications was also
largely undefined.
The long term fix we look forward to is the HTTP/2 child channels, which
should have clearly defined ordering between error notification and
channel closure, and in the order that we need here.
Fixes#1251
NettyServer wasn't usable as public, since its constructor was
package-private. So although this reduces our API, it shouldn't actually
impact anyone.
Fixes#1047
This isn't expected to have much impact once the connection is
established because AbstractNettyHandler.exceptionCaught() wraps all
unknown exceptions with Http2Exception. Fixing that is future work.
Some cases we may now be unwrapping a StatusException, such as one
thrown by MessageDeframer. In general, that seems a Good Thing™, but it
is unclear exactly if it would be perceivable.
Fixes#1181
A few things to note:
- ByteString has gone away in favor of AsciiString.
- Http2Headers now uses CharSequence for all methods, so there are a few places that we have to explicitly check for AsciiString to get the optimizations.
- We now have to specify a graceful shutdown timeout for our Netty handlers. Using 5 seconds.
We've seen an NPE as a side-effect of a failure in protocol negotiation:
Oct 27, 2015 9:27:09 PM io.grpc.transport.netty.ProtocolNegotiators$AbstractBufferingHandler fail
SEVERE: Transport failed during protocol negotiation
java.lang.NullPointerException
at io.grpc.transport.netty.NettyClientHandler$2.visit(NettyClientHandler.java:218)
- Channel builders decide the default port based on whether TLS is used.
- Channel builders populate the default port via an Attributes object
passed to NameResolver.Factory#newNameResolver
- NameResolverRegistry contains all the official NameResolvers. Users
can also add custom NameResolvers to it. It looks up NameResolver by
try-and-fail. It is the default NameResolver.Factory for builders.
DnsNameResolver.
- Pass target as Strings instead of URIs from the channel builder to
ManagedChannelImpl. A target string is not necessarily a valid URI, in
which case ManagedChannelImpl will add "dns:///" to the beginning of
the target and use it as URI.
- DnsNameResolver will require scheme "dns" to be present. It no longer
allows scheme-absent URIs.
- Add NameResolver and LoadBalancer interfaces.
- ManagedChannelImpl now uses NameResolver and LoadBalancer for
transport selection, which may return transports to multiple
addresses.
- Transports are still owned by ManagedChannelImpl, which implements
TransportManager interface that is provided to LoadBalancer, so that
LoadBalancer doesn't worry about Transport lifecycles.
- Channel builders can be created by forTarget() that accepts the fully
qualified target name, which is an URI. (TODO) it's not tested.
- The old address-based construction pattern is supported by using
AbstractManagedChannelImplBuilder.DirectAddressNameResolver.
- (TODO) DnsNameResolver and SimpleLoadBalancer are currently
incomplete. They merely work for the single-address scenario.
NettyClientHandler currently handles non-HTTP/2 exceptions properly by forcing a shutdown of the connection. We need to do the server-side as well.
Fixes#1097
Previously, if the content type was being ignored the error code would
have been UNKNOWN since there was no grpc-status. That seems very to
accidentally pass, so now if the content type is ignored it would get
OK.
ServerCall already had "headers must be sent before any messages, which
must be sent before closing," but the implementation did not enforce it
and our async server handler didn't obey.
The benefit of forcing sending headers first is that it removes the only
implicit call in our API and interceptors dealing just with metadata
don't need to override sendMessage. The implicit behavior was bug-prone
since it wasn't obvious you were forgetting that headers may not be
sent.
We want to allow overriding authority in the ManagedChannelBuilder for
testing. In doing that, we basically require that all Channels support
authority. In reality, this simplifies things and is already being done
by the C implementation, as their unix domain socket support uses
"localhost" just like our in-process transport now does.
We can debate some whether "localhost" is really the most appropriate
authority for the in-process transport, but that should probably happen
later since "localhost" is "good enough" for now.
Although the functionality is currently available by passing a
manually-created InetAddress, that requires that the user do I/O before
calling our API and does not work with naming in the future.
This provides an API for applications to use gRPC without using
ExperimentalApis. It also allows swapping out a transport implementation
in the future.
Resolves#926. Transports can still be alive when the Server shuts down,
but they are using the worker event loops. Only release the worker event
loops when all transport's channels are closed.
Client:
* New ManagedChannel abstract class.
* Adding ping to Channel.
* Moving builders and implementations to internal.
Server:
* Added lifecycle management API to Server (mirroring ManagedChannel).
* Moved ServerImpl, AbstractServerBuilder and handler registries to internal.
* New ServerBuilder abstract class (mirroring ManagedChannelBuilder).
Fixes#545
The URI no longer needs to be provided to the Credential explicitly,
which prevents needing to know a magic string and allows using the same
Credential with multiple services.
The current process of building a channel is a bit complicated in that transports have to provide a own shutdown hook to the channel builder in order to close shared executors. This somewhat entagled creation pattern makes it difficult to separate the process of channel building from transport building. Better separating these two should make the code more readable and maintainable moving forward.
This takes some steps towards #525, but it won't be fixed until we
officially support OpenSSL. This is due to the fact that ALPN->NPN fallback
isn't supported with Jetty (since only one of the bootstrap plugins can be provided).
Reserve io.grpc for public API only, and all internal stuff in core to
io.grpc.internal, including the non-stable transport API.
Raise the netty/okhttp/inprocess subpackages one level up to io.grpc,
because they are public API and entry points for most users.
Details:
- Rename io.grpc.transport to io.grpc.internal;
- Move SharedResourceHolder and SerializingExecutor to io.grpc.internal
- Rename io.grpc.transport.{netty|okhttp|inprocess} to
io.grpc.{netty|okhttp|inprocess}
ChannelInactive should be called in all cases of channel going down, so
we only need to cancel ping there. Use goAwayStatus for the error, since
we will be putting the most effort into making that status useful.
When connectionError was set, goAwayStatus was also set, so we shouldn't
lose any errors.
NettyClientTransport doesn't really need a Throwable, it just needs a
Status. Passing a Status out of NettyClientHandler reduces the number of
places that need to do transport-specific translation of Throwables into
Status codes.
As described in SslHandler's documentation, handshakeFuture() and
SslHandshakeCompletionEvent are equivalent forms of learning of
handshake completion. Watching both causes double-logging and serves no
purpose.
Otherwise new writes will be written to the channel and will fail in
some unhelpful way.
Logging was removed as we really want to propagate the failure back to
the application via Calls, which is done by failing the
CreateStreamCommand message. Propagating back to the application via
call removes uncontrollable log spam and is necessary anyway to inform
the application what sort of failure occurred in order to appropriately
to react.
Note that even more importantly, this translates a RST_STREAM error code
to a gRPC status code. This is generally useful, but also necessary for
DEADLINE_EXCEEDED to be more reliable in 0eae0d9.
Fixes#687
Improved some consistency. writeHeaders was the only non-final
implementation method of ServerStream, even though it is really no
different than the others.
Resolves#511.
- In generated code, make CONFIG private and METHOD_* fields public.
METHOD_* fields are MethodDescriptors now, users of the CONFIG field
should switch to using the METHOD_* fields.
- Move MethodType into MethodDescriptor (#529).
- Unify the fully qualified method name. It is fully qualified service
name + slash + short method name. It doesn't have the leading slash.
- HandlerRegistry switches the key from short method name to fully
qualified method name.
- Pass CallOptions to Channel.newCall() and
ClientInterceptor.interceptCall().
- Remove timeout from AbstractStub.StubConfigBuilder and add deadline,
which is stored in a CallOptions inside the stub.
- Deadline is in nanoseconds in the clock defined by System.nanoTime().
It is converted to timeout before transmitting on the wire. Fail the
call with DEADLINE_EXCEEDED if it's already expired.
The mapping is poorly suited for gRPC. C and Go don't even do any
mapping. We can improve the mapping in the future, but it is very
important that users don't start depending on the current mapping.
This change is "inspired by" the original code, but is even more
conservative.
Fixes#477
GCM is very slow, and doesn't provide any benefit in unit tests. Even if
we were using tcnative and GCM is fast, using more available ciphers in
tests still makes sense. With this change building with Java 7 works
again, although that isn't the reason for the change.
On my machine with parallel building, it cuts full build time from
92 seconds to 39 seconds. For an incremental build after only changing
an interop test, the build time is cut from 73 seconds to 15 seconds.
When shutting down the Netty event loop, we have already guaranteed that
all users of it are no longer running. Doing a shutdownGracefully is
just delaying graceful JVM termination by two seconds. This is very
noticeable for short-lived processes, like our integration tests.
We would actually also prefer to shutdown quickly and get a
RejectedExecutionException for any newly queued tasks, because that
would be a legitimate bug.
shutdown() is deprecated, thus we do shutdownGracefully with a timeout
of 0.
isReady() can provide pushback while the call is in progress, but it
can also provide the pushback necessary when the client creates more
streams than permitted by MAX_CONCURRENT_STREAMS.
As part of this commit, OkHttp is now calling onReady() after call
creation (it previously never called onReady()).