OpenCloudOS-Kernel/include
Jason Baron df0108c5da epoll: add EPOLLEXCLUSIVE flag
Currently, epoll file descriptors or epfds (the fd returned from
epoll_create[1]()) that are added to a shared wakeup source are always
added in a non-exclusive manner.  This means that when we have multiple
epfds attached to a shared fd source they are all woken up.  This creates
thundering herd type behavior.

Introduce a new 'EPOLLEXCLUSIVE' flag that can be passed as part of the
'event' argument during an epoll_ctl() EPOLL_CTL_ADD operation.  This new
flag allows for exclusive wakeups when there are multiple epfds attached
to a shared fd event source.
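
As a rough userspace illustration of the paragraph above, the sketch below
attaches one shared fd to a freshly created epfd in exclusive mode.  The
helper name make_exclusive_epfd() is hypothetical and not part of this
patch; the fallback #define assumes a libc whose headers predate the flag
and uses the value from the uapi header.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)  /* fallback for libc headers that predate the flag */
#endif

/*
 * Hypothetical helper: create a new epoll instance and attach `fd` to it
 * with EPOLLEXCLUSIVE.  Returns the new epfd, or -1 with errno set.
 */
int make_exclusive_epfd(int fd)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };
    ev.data.fd = fd;

    /* The flag is only accepted as part of an EPOLL_CTL_ADD operation. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        int saved = errno;   /* close() may overwrite errno */
        close(epfd);
        errno = saved;
        return -1;
    }
    return epfd;
}
```

A typical caller would invoke this once per worker against the same shared
fd (for example a listening socket), so that each worker owns its own epfd
and sleeps in epoll_wait() on it.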

The implementation walks the list of exclusive waiters, and queues an
event to each epfd, until it finds the first waiter that has threads
blocked on it via epoll_wait().  The idea is to search for threads which
are idle and ready to process the wakeup events.  Thus, we queue an event
to at least 1 epfd, but may still potentially queue an event to all epfds
that are attached to the shared fd source.
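
The walk described above can be modelled in a few lines of userspace C.
This is only a simplified sketch of the delivery rule, not the kernel
code: struct excl_waiter and deliver_exclusive() are hypothetical names,
and the real implementation works on wait queue entries rather than an
array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one epfd on the exclusive waiter list. */
struct excl_waiter {
    bool has_blocked_threads;  /* a thread is sleeping in epoll_wait() on it */
    int  queued_events;        /* events queued to this epfd so far */
};

/*
 * Walk the waiters in list order, queue one event to each, and stop right
 * after the first waiter that has a blocked thread.  Returns the number of
 * epfds that received an event.
 */
size_t deliver_exclusive(struct excl_waiter *w, size_t n)
{
    size_t notified = 0;

    for (size_t i = 0; i < n; i++) {
        w[i].queued_events++;
        notified++;
        if (w[i].has_blocked_threads)
            break;  /* an idle thread will handle the event; stop here */
    }
    return notified;
}
```

With waiters {busy, idle, idle} the model notifies the first two and skips
the third; when no waiter has a blocked thread, every epfd gets the event,
which matches the "may still potentially queue an event to all epfds" case.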

Performance testing was done by Madars Vitolins using a modified version
of Enduro/X.  The use of the 'EPOLLEXCLUSIVE' flag reduced the run time of
this particular workload from 860s down to 24s.

Sample epoll_ctl text:

EPOLLEXCLUSIVE

  Sets an exclusive wakeup mode for the epfd file descriptor that is
  being attached to the target file descriptor, fd.  Thus, when an event
  occurs and multiple epfd file descriptors are attached to the same
  target file using EPOLLEXCLUSIVE, one or more epfds will receive an
  event with epoll_wait(2).  The default in this scenario (when
  EPOLLEXCLUSIVE is not set) is for all epfds to receive an event.
  EPOLLEXCLUSIVE may only be specified with the op EPOLL_CTL_ADD.
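
The EPOLL_CTL_ADD-only restriction can be checked from userspace with a
small probe such as the one below.  It is a sketch under assumptions: the
function name exclusive_mod_errno() is hypothetical, and the expectation
that the kernel reports EINVAL comes from the epoll_ctl(2) manual page
rather than from the text above.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)  /* fallback for libc headers that predate the flag */
#endif

/*
 * Hypothetical probe: register a pipe's read end normally, then try to
 * switch it to exclusive mode with EPOLL_CTL_MOD.  Returns 0 if the MOD
 * was accepted, the errno of the failing call otherwise, or -1 if the
 * setup itself failed.
 */
int exclusive_mod_errno(void)
{
    int p[2];
    int result = -1;

    if (pipe(p) < 0)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd >= 0) {
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = p[0];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0) {
            /* Attempt to add the flag after the fact. */
            ev.events = EPOLLIN | EPOLLEXCLUSIVE;
            if (epoll_ctl(epfd, EPOLL_CTL_MOD, p[0], &ev) < 0)
                result = errno;
            else
                result = 0;
        }
        close(epfd);
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```

To change the wakeup mode of an existing registration, the fd would have
to be removed with EPOLL_CTL_DEL and added again with the flag set.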

Signed-off-by: Jason Baron <jbaron@akamai.com>
Tested-by: Madars Vitolins <m@silodev.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
acpi Merge branches 'acpi-scan', 'acpi-bus', 'acpi-osl' and 'acpi-pm' 2016-01-12 01:10:03 +01:00
asm-generic virtio: barrier rework+fixes 2016-01-18 16:44:24 -08:00
clocksource arm64: KVM: Implement timer save/restore 2015-12-14 11:30:39 +00:00
crypto Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-01-17 19:13:15 -08:00
drm Merge tag 'topic/drm-misc-2016-01-17' of git://anongit.freedesktop.org/drm-intel into drm-next 2016-01-18 07:01:16 +10:00
dt-bindings The clk framework and driver changes for 4.5 look pretty typical. The 2016-01-15 18:21:28 -08:00
keys Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2015-12-26 16:06:53 +11:00
kvm KVM: arm/arm64: vgic-v3: Make the LR indexing macro public 2015-12-14 11:30:38 +00:00
linux include/linux/radix-tree.h: fix error in docs about locks 2016-01-20 17:09:18 -08:00
math-emu
media [media] media-entitiy: add a function to create multiple links 2016-01-11 12:19:26 -02:00
memory
misc
net tcp_memcontrol: Forward declare cgroup_subsys and mem_cgroup stucts 2016-01-17 12:02:37 -05:00
pcmcia
ras
rdma IB/mad: Require CM send method for everything except ClassPortInfo 2015-12-08 12:19:11 -05:00
rxrpc
scsi Merge branch 'jejb-scsi' into misc 2016-01-07 15:51:13 -08:00
soc QE: Move QE from arch/powerpc to drivers/soc 2015-12-22 17:12:56 -06:00
sound ALSA: hda_intel: add card number to irq description 2016-01-12 21:05:16 +01:00
target target: Fix race for SCF_COMPARE_AND_WRITE_POST checking 2015-11-28 19:33:15 -08:00
trace Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-01-18 12:44:40 -08:00
uapi epoll: add EPOLLEXCLUSIVE flag 2016-01-20 17:09:18 -08:00
video omapdss: remove CONFIG_OMAP2_DSS_VENC from omapdss.h 2015-12-29 11:07:46 +02:00
xen virtio: barrier rework+fixes 2016-01-18 16:44:24 -08:00
Kbuild