It used to be that:
git ls-files "*Kconfig*"
would find all Kconfig files and would only find Kconfig files. This
commit renames TREE_RCU-Kconfig.txt to TREE_RCU-kconfig.txt so that this
is once again true.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit adds the kvm-recheck-lock.sh plug-in for locktorture to
print out lock-specific progress statistics.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit adds a trivial set of configuration files for lock
torturing.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit uses the standard software ploy of introducing another
level of indirection below the configs directory. This allows each
torture-test suite to have its own set of Kconfig files, boot parameters,
and version-specific scripts. Initially, we have only rcu, but lock
will follow soonish.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The kvm-test-1-rcu.sh is not specific to RCU, so this commit renames it
to kvm-test-1-run.sh.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The current set of functions in ver_functions.sh have APIs that are
specific to RCU. This commit therefore makes an RCU-independent function
that outputs version-specific boot arguments. This has the benefit that
a test-type-independent call in kvm-test-1-rcu.sh can now handle any type
of test, given a test-type-specific set of files in a configs directory.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Currently, CONFIG_RCU_TORTURE_TEST=y is hardcoded into the
kvm-test-1-rcu.sh script and CONFIG_PRINTK_TIME=y is mentioned in each
and every configs file. This commit creates a CFcommon file for these
two Kconfig parameters, and modifies kvm-test-1-rcu.sh to copy this new
file into the .config file during the build. This change will allow
these scripts to operate on torture types other than just rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit adds a pair of files in the configs directory to allow
test-the-test runs of rcutorture via a "--configs BUSTED" argument to
the kvm.sh script.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit creates a plug-in to allow kvm-recheck.sh to process
non-rcutorture console output.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit prevents the results directory from being created for
dryruns. However, a script generated from a dryrun will create
the results directory should it be run.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org
This commit adds a "(!)" flag after the number of CPUs required by a
given test if that test requires more than the available number of CPUs.
Note that these flags appear only when the number of CPUs is specified
using the --cpus argument. In the absence of a --cpus argument, no
tests are flagged.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Running the standard set of rcutorture tests on 24 CPUs results in
the following sub-optimal schedule:
----start batch----
TREE07 16
----start batch----
TREE08 16
SRCU-P 8
----start batch----
TREE01 8
TREE02 8
TREE03 8
----start batch----
TREE04 8
TREE05 8
TREE06 8
----start batch----
SRCU-N 4
TINY01 1
TINY02 1
TREE09 1
If one of the eight-CPU runs were to be moved into the first batch,
the test suite would complete in four batches rather than five.
This commit therefore uses a greedy algorithm to re-order the test
entries so that the sequential batching will produce an optimal schedule
in this case:
----start batch----
TREE07 16
SRCU-P 8
----start batch----
TREE08 16
TREE01 8
----start batch----
TREE02 8
TREE03 8
TREE04 8
----start batch----
TREE05 8
TREE06 8
SRCU-N 4
TINY01 1
TINY02 1
TREE09 1
Please note that this is still not an optimal bin-packing algorithm,
however, it does produce optimal solutions for most common scenarios.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit fixes handling numbering of multiple runs of the same test
so as to disambiguate output.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Actual rcutorture tests take considerable time and machine resources,
so it is inconvenient to actually do an rcutorture run when optimizing
the bin-packing algorithm. This commit therefore adds a --dryrun
argument, which defaults to doing a run, but for which "sched"
says to simply print the run schedule and "script" dumps the script
without running it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The message complains about a build directory when it should instead
be complaining about the results directory, so this commit fixes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcutorture tests run by default range from using one CPU to using
sixteen of them. Therefore, rcutorture testing could be sped up
significantly simply by running the kernels in parallel. Building
them in parallel is not all that helpful: "make -j" is usually a
better bet. So this commit takes a new "--cpus" argument that
specifies how many CPUs rcutorture is permitted to use for its
parallel runs. The default of zero does sequential runs as before.
The bin-packing is minimal, and will be grossly suboptimal for
some configurations. However, powers of two work reasonably well.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Both SRCU-P and SRCU-N specify eight CPUs, which results in four
iterations for a parallel run on 32 CPUs. This commit reduces SRCU-N
to four CPUs (but leaving SRCU-P at eight) to speed up parallel runs,
while maintaining essentially the same test coverage.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Currently, most qemu flags are calculated in kvm-test-1-rcu.sh,
except that -nographics is set up by kvm.sh. This commit promotes
one-stop shopping by consolidating the determination of qemu flags into
kvm-test-1-rcu.sh.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Parallel rcutorture runs is valuable on large systems, but it is not a
good idea to do (say) five builds in parallel if each build believes it
has the whole system at its disposal, especially if the system is shared.
It is also bad to restrict the build to (say) a single CPU just because
the corresponding rcutorture run uses only a single CPU. This commit
therefore adds a kvm-test-1-rcu.sh ability to pause after the build
completes, which will allow kvm.sh to do a number of builds serially
(with each build thus having the full system at its disposal), then
allow the rcutorture runs to proceed in parallel.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Currently, most boot flags are calculated in kvm-test-1-rcu.sh, except
that rcutorture.test_no_idle_hz and rcutorture.verbose are set up by
kvm.sh. This commit promotes one-stop shopping by consolidating the
determination of boot flags into kvm-test-1-rcu.sh.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Although the script name and arguments are logged in the results directory,
it is more convenient to see it in the output. This commit therefore
adds the output of this information.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Sometime problems can manifest themselves as unusually slow grace periods.
This commit therefore prints the number of rcutorture updates during the
test and the number per second. These statistics are harvested from the
config.out and qemu-cmd files, and are silently omitted if these files
are not available, as would be the case if there was a build failure or
a boot-time hang.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
To help avoid an architecture failing to correctly check kernel/user
boundaries when handling copy_to_user, copy_from_user, put_user, or
get_user, perform some simple tests and fail to load if any of them
behave unexpectedly.
Specifically, this is to make sure there is a way to notice if things
like what was fixed in commit 8404663f81 ("ARM: 7527/1: uaccess:
explicitly check __user pointer when !CPU_USE_DOMAINS") ever regresses
again, for any architecture.
Additionally, adds new "user" selftest target, which loads this module.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All of the rcutorture scripts has the usual GPL header, which contains
a long-obsolete postal address for FSF. To avoid the need to track the
FSF office's movements, this commit substitutes the URL where GPL may
be found.
Reported-by: Greg KH <gregkh@linuxfoundation.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The output of the rcutorture scripts often requires interpretation, so
this commit simplifies this interpretation by tagging messages as
BUGs (colored red) or WARNINGs (colored yellow).
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Repeatedly running a given test, for example, by repeating the name
as in "--configs "TREE08 TREE08 TREE08" records the results only of
the last run of this test. This is because the earlier results are
overwritten by the later results.
This commit therefore checks for earlier results, using numbered
file extensions to distinguish multiple runs. The earlier example
would therefore create directories TREE01, TREE01.2, and TREE01.3.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The commit causes kvm.sh to invoke kvm-recheck.sh at the end of each
run, and causes kvm-recheck.sh to print only the name of the test, not
the full path to the corresponding Kconfig file.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The TREE08 Kconfig fragment does not enable tracing, which is appropriate
for its test case. However, this can be inconvenient in cases where
TREE08 locates RCU bugs. This commit therefore adds a TREE08-T that
differs from TREE08 only in enabling CONFIG_RCU_TRACE.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit adds the --kmake-arg to kvm.sh, which allows passing in
things like "V=1" to see the build commands, as well as enabling the
CROSS_COMPILE= make macro used for cross-building.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit adds the --no-initrd argument to kvm.sh, which permits
initrd to be contained in a root partition specified by the --bootargs
argument. Without --no-initrd, the kernel build expects an initrd
directory in the same rcutorture directory that contains bin and configs.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commits adds the --qemu-args argument to kvm.sh that is required
to pass boot devices down through to qemu.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit allows easy specification of trace_event lists, among other
things.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit adds --buildonly, which does the builds specified by the
--configs argument, but does not boot or test the resulting kernels.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Don't grab the configuration fragment from the configs directory because
it might well have been changed since the test was run. Instead, use
the ConfigFragment file that was placed in the results directory.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
As it stands, the default kernel boot parameters generated from
the Kconfig fragment will override any supplied with the .boot
file that can optionally accompany a Kconfig fragment. Rearrange
ordering to permit the specific .boot arguments to override those
generated by analyzing the Kconfig fragment.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit expands the checks for what architecture is running to generate
additional qemu-system- commands, then uses the resulting qemu-system-
command name to choose different qemu arguments as needed for different
architectures.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The --rcu-kvm argument was intended to allow the scripts to live in
an alternate location. Unfortunately, this prevents the kvm.sh script
from using common functions until after it finished parsing arguments,
because it doesn't know where to find them until then. However, "cp -a"
and "ln -s" work pretty well, so lack of an --rcu-kvm argument can be
easily worked around.
This commit therefore removes this argument.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The qemu -name argument doesn't seem to be useful in this environment,
so this commit removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The task of working out which flavor of qemu to use gets more complex
as more types of CPUs are supported. Adding Power makes three in addition
to 32-bit and 64-bit x86, so it is time to pull this out into a function.
This commit therefore creates an identify_qemu function and also adds
a --qemu-cmd command-line argument for the inevitable case where the
identify_qemu cannot figure it out.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The commit uses configcheck.sh from within configinit.sh, replacing the
imperfect inline expansion that was there before.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit drops no-longer-needed diagnostics from the output. Some of
them are retained in logfiles, in case they are ever needed.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The TINY_RCU test cases were first put in place many years ago, and have
been incrementally modified rather than being reworked. This commit
therefore completes a long-overdue reworking of the TINY_RCU test cases.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The TREE_RCU test cases were first put in place many years ago, and have
been incrementally modified rather than being reworked. This commit
therefore completes a long-overdue reworking of the TREE_RCU test cases.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Use .boot facility to ease inclusion of SRCU into automated testing.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The v3.12 version of the kernel added the CONFIG_NO_HZ_FULL_SYSIDLE
Kconfig parameter, so this commit adds a version transition at that
point.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Some Kconfig fragments require rcutorture module parameters to
do optimal testing, for example, a configuration for SRCU would
need rcutorture.torture_type=srcu. This commit therefore adds a
per-Kconfig-fragment boot-parameter capability.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Different Kconfig parameters apply to different kernel versions, as
do different rcutorture module parameters. This commit allows the
rcutorture test scripts to adjust for different kernel versions.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Allow datestamp to be specified to allow tests to be broken up and run
in parallel.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
This commit adds the test framework that I used to test RCU under KVM.
This consists of a group of scripts and Kconfig fragments.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
The err variable is intended to receive the timer_create() return before
checking it
Signed-off-by: Felipe Pena <felipensp@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit adds a test of instruction counting using the PMU on powerpc.
Although the bulk of the code is architecture agnostic, the code needs to
run a precisely sized loop which is implemented in assembler.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This commit adds support code used by upcoming powerpc tests.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This commit adds a powerpc subdirectory to tools/testing/selftests,
for tests that are powerpc specific.
On other architectures nothing is built. The makefile supports cross
compilation if the user sets ARCH and CROSS_COMPILE.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull networking updates from David Miller:
"This is a re-do of the net-next pull request for the current merge
window. The only difference from the one I made the other day is that
this has Eliezer's interface renames and the timeout handling changes
made based upon your feedback, as well as a few bug fixes that have
trickeled in.
Highlights:
1) Low latency device polling, eliminating the cost of interrupt
handling and context switches. Allows direct polling of a network
device from socket operations, such as recvmsg() and poll().
Currently ixgbe, mlx4, and bnx2x support this feature.
Full high level description, performance numbers, and design in
commit 0a4db187a9 ("Merge branch 'll_poll'")
From Eliezer Tamir.
2) With the routing cache removed, ip_check_mc_rcu() gets exercised
more than ever before in the case where we have lots of multicast
addresses. Use a hash table instead of a simple linked list, from
Eric Dumazet.
3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
Marek Puzyniak, Michal Kazior, and Sujith Manoharan.
4) Support reporting the TUN device persist flag to userspace, from
Pavel Emelyanov.
5) Allow controlling network device VF link state using netlink, from
Rony Efraim.
6) Support GRE tunneling in openvswitch, from Pravin B Shelar.
7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
Daniel Borkmann and Eric Dumazet.
8) Allow controlling of TCP quickack behavior on a per-route basis,
from Cong Wang.
9) Several bug fixes and improvements to vxlan from Stephen
Hemminger, Pravin B Shelar, and Mike Rapoport. In particular,
support receiving on multiple UDP ports.
10) Major cleanups, particular in the area of debugging and cookie
lifetime handline, to the SCTP protocol code. From Daniel
Borkmann.
11) Allow packets to cross network namespaces when traversing tunnel
devices. From Nicolas Dichtel.
12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
manner akin to how we monitor real network traffic via ptype_all.
From Daniel Borkmann.
13) Several bug fixes and improvements for the new alx device driver,
from Johannes Berg.
14) Fix scalability issues in the netem packet scheduler's time queue,
by using an rbtree. From Eric Dumazet.
15) Several bug fixes in TCP loss recovery handling, from Yuchung
Cheng.
16) Add support for GSO segmentation of MPLS packets, from Simon
Horman.
17) Make network notifiers have a real data type for the opaque
pointer that's passed into them. Use this to properly handle
network device flag changes in arp_netdev_event(). From Jiri
Pirko and Timo Teräs.
18) Convert several drivers over to module_pci_driver(), from Peter
Huewe.
19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
O(1) calculation instead. From Eric Dumazet.
20) Support setting of explicit tunnel peer addresses in ipv6, just
like ipv4. From Nicolas Dichtel.
21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.
22) Prevent a single high rate flow from overruning an individual cpu
during RX packet processing via selective flow shedding. From
Willem de Bruijn.
23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
Dumazet.
24) Don't just drop GSO packets which are above the TBF scheduler's
burst limit, chop them up so they are in-bounds instead. Also
from Eric Dumazet.
25) VLAN offloads are missed when configured on top of a bridge, fix
from Vlad Yasevich.
26) Support IPV6 in ping sockets. From Lorenzo Colitti.
27) Receive flow steering targets should be updated at poll() time
too, from David Majnemer.
28) Fix several corner case regressions in PMTU/redirect handling due
to the routing cache removal, from Timo Teräs.
29) We have to be mindful of ipv4 mapped ipv6 sockets in
upd_v6_push_pending_frames(). From Hannes Frederic Sowa.
30) Fix L2TP sequence number handling bugs, from James Chapman."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
drivers/net: caif: fix wrong rtnl_is_locked() usage
drivers/net: enic: release rtnl_lock on error-path
vhost-net: fix use-after-free in vhost_net_flush
net: mv643xx_eth: do not use port number as platform device id
net: sctp: confirm route during forward progress
virtio_net: fix race in RX VQ processing
virtio: support unlocked queue poll
net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
Documentation: Fix references to defunct linux-net@vger.kernel.org
net/fs: change busy poll time accounting
net: rename low latency sockets functions to busy poll
bridge: fix some kernel warning in multicast timer
sfc: Fix memory leak when discarding scattered packets
sit: fix tunnel update via netlink
dt:net:stmmac: Add dt specific phy reset callback support.
dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
dt:net:stmmac: Allocate platform data only if its NULL.
net:stmmac: fix memleak in the open method
ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
net: ipv6: fix wrong ping_v6_sendmsg return value
...
Pull timer core updates from Thomas Gleixner:
"The timer changes contain:
- posix timer code consolidation and fixes for odd corner cases
- sched_clock implementation moved from ARM to core code to avoid
duplication by other architectures
- alarm timer updates
- clocksource and clockevents unregistration facilities
- clocksource/events support for new hardware
- precise nanoseconds RTC readout (Xen feature)
- generic support for Xen suspend/resume oddities
- the usual lot of fixes and cleanups all over the place
The parts which touch other areas (ARM/XEN) have been coordinated with
the relevant maintainers. Though this results in an handful of
trivial to solve merge conflicts, which we preferred over nasty cross
tree merge dependencies.
The patches which have been committed in the last few days are bug
fixes plus the posix timer lot. The latter was in akpms queue and
next for quite some time; they just got forgotten and Frederic
collected them last minute."
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits)
hrtimer: Remove unused variable
hrtimers: Move SMP function call to thread context
clocksource: Reselect clocksource when watchdog validated high-res capability
posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting
posix_timers: fix racy timer delta caching on task exit
posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()
selftests: add basic posix timers selftests
posix_cpu_timers: consolidate expired timers check
posix_cpu_timers: consolidate timer list cleanups
posix_cpu_timer: consolidate expiry time type
tick: Sanitize broadcast control logic
tick: Prevent uncontrolled switch to oneshot mode
tick: Make oneshot broadcast robust vs. CPU offlining
x86: xen: Sync the CMOS RTC as well as the Xen wallclock
x86: xen: Sync the wallclock when the system time is set
timekeeping: Indicate that clock was set in the pvclock gtod notifier
timekeeping: Pass flags instead of multiple bools to timekeeping_update()
xen: Remove clock_was_set() call in the resume path
hrtimers: Support resuming with two or more CPUs online (but stopped)
timer: Fix jiffies wrap behavior of round_jiffies_common()
...
The x bit can easily get lost (patch(1) loses it, for example).
Reported-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the confusing naming indicates, this test has some overlap with
pre-existing tests. Would be nice to merge them eventually. But since it
is only test code, cleanliness is much less important than mere existence.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
thuge-gen was forgotten. Fix it by removing the duplication, so we don't
get too many repeats.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In case this ever gets scripted, it should return 0 on success and 1 on
failure. Parsing the output should be left to meatbags.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add some initial basic tests on a few posix timers interface such as
setitimer() and timer_settime().
These simply check that expiration happens in a reasonable timeframe after
expected elapsed clock time (user time, user + system time, real time,
...).
This is helpful for finding basic breakages while hacking
on this subsystem.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Olivier Langlois <olivier@trillion01.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The TPACKET_V3 test code consists of a lot of unecessary macro
wrappers that rather obfuscate what members are accessed in what
way. So get rid of them and make the code more readable. Also
credit Chetan for providing tpacket_v3 example code. Furthermore,
get rid of private offset usage, as we do not need it here.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Revert commit 58c7be84fe ("selftest: add simple test for soft-dirty
bit"). This is the self test for Pavel's pagemap2 patches which didn't
actually get merged.
Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
"Highlights (1721 non-merge commits, this has to be a record of some
sort):
1) Add 'random' mode to team driver, from Jiri Pirko and Eric
Dumazet.
2) Make it so that any driver that supports configuration of multiple
MAC addresses can provide the forwarding database add and del
calls by providing a default implementation and hooking that up if
the driver doesn't have an explicit set of handlers. From Vlad
Yasevich.
3) Support GSO segmentation over tunnels and other encapsulating
devices such as VXLAN, from Pravin B Shelar.
4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.
5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
Dukkipati.
6) In the PHY layer, allow supporting wake-on-lan in situations where
the PHY registers have to be written for it to be configured.
Use it to support wake-on-lan in mv643xx_eth.
From Michael Stapelberg.
7) Significantly improve firewire IPV6 support, from YOSHIFUJI
Hideaki.
8) Allow multiple packets to be sent in a single transmission using
network coding in batman-adv, from Martin Hundebøll.
9) Add support for T5 cxgb4 chips, from Santosh Rastapur.
10) Generalize the VXLAN forwarding tables so that there is more
flexibility in configurating various aspects of the endpoints.
From David Stevens.
11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
from Dmitry Kravkov.
12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
Neira Ayuso.
13) Start adding networking selftests.
14) In situations of overload on the same AF_PACKET fanout socket, or
per-cpu packet receive queue, minimize drop by distributing the
load to other cpus/fanouts. From Willem de Bruijn and Eric
Dumazet.
15) Add support for new payload offset BPF instruction, from Daniel
Borkmann.
16) Convert several drivers over to mdoule_platform_driver(), from
Sachin Kamat.
17) Provide a minimal BPF JIT image disassembler userspace tool, from
Daniel Borkmann.
18) Rewrite F-RTO implementation in TCP to match the final
specification of it in RFC4138 and RFC5682. From Yuchung Cheng.
19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
you like netlink, so I implemented netlink dumping of netlink
sockets.") From Andrey Vagin.
20) Remove ugly passing of rtnetlink attributes into rtnl_doit
functions, from Thomas Graf.
21) Allow userspace to be able to see if a configuration change occurs
in the middle of an address or device list dump, from Nicolas
Dichtel.
22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
Frederic Sowa.
23) Increase accuracy of packet length used by packet scheduler, from
Jason Wang.
24) Beginning set of changes to make ipv4/ipv6 fragment handling more
scalable and less susceptible to overload and locking contention,
from Jesper Dangaard Brouer.
25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
instead. From Hong Zhiguo.
26) Optimize route usage in IPVS by avoiding reference counting where
possible, from Julian Anastasov.
27) Convert IPVS schedulers to RCU, also from Julian Anastasov.
28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
Eitzenberger.
29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
nfnetlink_log, and nfnetlink_queue. From Gao feng.
30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.
31) Support several new r8169 chips, from Hayes Wang.
32) Support tokenized interface identifiers in ipv6, from Daniel
Borkmann.
33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.
34) Add 802.1ad vlan offload support, from Patrick McHardy.
35) Support mmap() based netlink communication, also from Patrick
McHardy.
36) Support HW timestamping in mlx4 driver, from Amir Vadai.
37) Rationalize AF_PACKET packet timestamping when transmitting, from
Willem de Bruijn and Daniel Borkmann.
38) Bring parity to what's provided by /proc/net/packet socket dumping
and the info provided by netlink socket dumping of AF_PACKET
sockets. From Nicolas Dichtel.
39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
Poirier"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
filter: fix va_list build error
af_unix: fix a fatal race with bit fields
bnx2x: Prevent memory leak when cnic is absent
bnx2x: correct reading of speed capabilities
net: sctp: attribute printl with __printf for gcc fmt checks
netlink: kconfig: move mmap i/o into netlink kconfig
netpoll: convert mutex into a semaphore
netlink: Fix skb ref counting.
net_sched: act_ipt forward compat with xtables
mlx4_en: fix a build error on 32bit arches
Revert "bnx2x: allow nvram test to run when device is down"
bridge: avoid OOPS if root port not found
drivers: net: cpsw: fix kernel warn on cpsw irq enable
sh_eth: use random MAC address if no valid one supplied
3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
tg3: fix to append hardware time stamping flags
unix/stream: fix peeking with an offset larger than data in queue
unix/dgram: fix peeking with an offset larger than data in queue
unix/dgram: peek beyond 0-sized skbs
openvswitch: Remove unneeded ovs_netdev_get_ifindex()
...
* Dump signals from process-wide and per-thread queues with
different sizes of buffers.
* Check error paths for buffers with restricted permissions. A part of
buffer or a whole buffer is for read-only.
* Try to get nonexistent signal.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pedro Alves <palves@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It creates a mapping of 3 pages and checks that reads, writes and
clear-refs result in present and soft-dirt bits reported from pagemap2
set as expected.
[akpm@linux-foundation.org: alphasort the Makefile TARGETS to reduce rejects]
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Testing like this for TP_STATUS_AVAILABLE clearly is a stupid bug
since it always returns true. Fix this by only checking for flags
where the kernel owns the packet and negate this result, since we
also could run into the non-zero status TP_STATUS_WRONG_FORMAT
and need to reclaim frames.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a simple test case that probes the packet socket's
TPACKET_V1, TPACKET_V2 and TPACKET_V3 behavior regarding mmap(2)'ed
I/O for a small burst of 100 packets. The test currently runs for ...
TPACKET_V1: RX_RING, TX_RING
TPACKET_V2: RX_RING, TX_RING
TPACKET_V3: RX_RING
... and will output on success:
test: TPACKET_V1 with PACKET_RX_RING .................... 100 pkts (9600 bytes)
test: TPACKET_V1 with PACKET_TX_RING .................... 100 pkts (9600 bytes)
test: TPACKET_V2 with PACKET_RX_RING .................... 100 pkts (9600 bytes)
test: TPACKET_V2 with PACKET_TX_RING .................... 100 pkts (9600 bytes)
test: TPACKET_V3 with PACKET_RX_RING .................... 100 pkts (9600 bytes)
OK. All tests passed
Reusable parts of psock_fanout.c have been put into a psock_lib.h
file for common usage. Test case successfully tested on x86_64.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The packetsocket fanout test uses a packet ring. Use TPACKET_V2
instead of TPACKET_V1 to work around a known 32/64 bit issue in
the older ring that manifests on sparc64.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix flaky results with PACKET_FANOUT_HASH depending on whether the
two flows hash into the same packet socket or not.
Also adds tests for PACKET_FANOUT_LB and PACKET_FANOUT_CPU and
replaces the counting method with a packet ring.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changes:
v3->v2: rebase (no other changes)
passes selftest
v2->v1: read f->num_members only once
fix bug: test rollover mode + flag
Minimize packet drop in a fanout group. If one socket is full,
roll over packets to another from the group. Maintain flow
affinity during normal load using an rxhash fanout policy, while
dispersing unexpected traffic storms that hit a single cpu, such
as spoofed-source DoS flows. Rollover breaks affinity for flows
arriving at saturated sockets during those conditions.
The patch adds a fanout policy ROLLOVER that rotates between sockets,
filling each socket before moving to the next. It also adds a fanout
flag ROLLOVER. If passed along with any other fanout policy, the
primary policy is applied until the chosen socket is full. Then,
rollover selects another socket, to delay packet drop until the
entire system is saturated.
Probing sockets is not free. Selecting the last used socket, as
rollover does, is a greedy approach that maximizes chance of
success, at the cost of extreme load imbalance. In practice, with
sufficiently long queues to absorb bursts, sockets are drained in
parallel and load balance looks uniform in `top`.
To avoid contention, scales counters with number of sockets and
accesses them lockfree. Values are bounds checked to ensure
correctness.
Tested using an application with 9 threads pinned to CPUs, one socket
per thread and sufficient busywork per packet operation to limits each
thread to handling 32 Kpps. When sent 500 Kpps single UDP stream
packets, a FANOUT_CPU setup processes 32 Kpps in total without this
patch, 270 Kpps with the patch. Tested with read() and with a packet
ring (V1).
Also, passes psock_fanout.c unit test added to selftests.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stricter validation was introduced with commit da27a24383
("efivarfs: guid part of filenames are case-insensitive") and commit
47f531e8ba ("efivarfs: Validate filenames much more aggressively"),
which is necessary for the guid portion of efivarfs filenames, but we
don't need to be so strict with the first part, the variable name. The
UEFI specification doesn't impose any constraints on variable names
other than they be a NULL-terminated string.
The above commits caused a regression that resulted in users seeing
the following message,
$ sudo mount -v /sys/firmware/efi/efivars mount: Cannot allocate memory
whenever pstore EFI variables were present in the variable store,
since their variable names failed to pass the following check,
/* GUID should be right after the first '-' */
if (s - 1 != strchr(str, '-'))
as a typical pstore filename is of the form, dump-type0-10-1-<guid>.
The fix is trivial since the guid portion of the filename is GUID_LEN
bytes, we can use (len - GUID_LEN) to ensure the '-' character is
where we expect it to be.
(The bogus ENOMEM error value will be fixed in a separate patch.)
Reported-by: Joseph Yasi <joe.yasi@gmail.com>
Tested-by: Joseph Yasi <joe.yasi@gmail.com>
Reported-by: Lingzhu Xiang <lxiang@redhat.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: <stable@vger.kernel.org> # v3.8
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
This change adds a little documentation to the tests under
tools/testing/selftests/, based on akpm's explanation.
[akpm@linux-foundation.org: move from Documentation to tools/testing/selftests/README.txt]
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Do it one-per-line to reduce patch conflict pain.
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Test that reads from a newly-created efivarfs file (with no data written)
will return EOF.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Lingzhu Xiang <lxiang@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Lingzhu Xiang <lxiang@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This change adds a few initial efivarfs tests to the
tools/testing/selftests directory.
The open-unlink test is based on code from Lingzhu Xiang.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Lingzhu Xiang <lxiang@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This test can be used to check wheither kernel supports IPC message queue
copy and restore features (required by CRIU project).
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I was curious why sys_kcmp wasn't working, which led me to the testcase.
It turned out I hadn't enabled CHECKPOINT_RESTORE in the kernel I was
testing. Add a decoding of errno to the testcase to make that obvious.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In case breakpoint test exit non zero value it will cause make error.
Better way is just print the test failure status.
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In case kcmp_test exit non zero value it will cause make error.
Better way is just print the test failure status.
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
make run_tests need the target is run_tests instead of run-tests
Also gcc output should be kcmp_test. Fix these two issues.
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original behavior:
bash-4.1$ make -C mqueue run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
./mq_open_tests /test1
Not running as root, but almost all tests require root in order to modify
system settings. Exiting.
make: *** [run_tests] Error 1
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
After applying the patch:
bash-4.1$ make -C mqueue run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
Not running as root, but almost all tests require root in order to modify
system settings. Exiting.
mq_open_tests: [FAIL]
Not running as root, but almost all tests require root in order to modify
system settings. Exiting.
mq_perf_tests: [FAIL]
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original behavior:
bash-4.1$ make -C vm run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
/bin/sh ./run_vmtests
./run_vmtests: line 24: /proc/sys/vm/nr_hugepages: Permission denied
Please run this test as root
make: *** [run_tests] Error 1
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
After applying the patch:
bash-4.1$ make -C vm run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
./run_vmtests: line 24: /proc/sys/vm/nr_hugepages: Permission denied
Please run this test as root
vmtests: [FAIL]
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert commit 03a7beb55b ("epoll: support for disabling items, and a
self-test app") pending resolution of the issues identified by Michael
Kerrisk, copied below.
We'll revisit this for 3.8.
: I've taken a look at this patch as it currently stands in 3.7-rc1, and
: done a bit of testing. (By the way, the test program
: tools/testing/selftests/epoll/test_epoll.c does not compile...)
:
: There are one or two places where the behavior seems a little strange,
: so I have a question or two at the end of this mail. But other than
: that, I want to check my understanding so that the interface can be
: correctly documented.
:
: Just to go though my understanding, the problem is the following
: scenario in a multithreaded application:
:
: 1. Multiple threads are performing epoll_wait() operations,
: and maintaining a user-space cache that contains information
: corresponding to each file descriptor being monitored by
: epoll_wait().
:
: 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
: a file descriptor from the epoll interest list, and
: delete the corresponding record from the user-space cache.
:
: 3. The problem with (2) is that some other thread may have
: previously done an epoll_wait() that retrieved information
: about the fd in question, and may be in the middle of using
: information in the cache that relates to that fd. Thus,
: there is a potential race.
:
: 4. The race can't solved purely in user space, because doing
: so would require applying a mutex across the epoll_wait()
: call, which would of course blow thread concurrency.
:
: Right?
:
: Your solution is the EPOLL_CTL_DISABLE operation. I want to
: confirm my understanding about how to use this flag, since
: the description that has accompanied the patches so far
: has been a bit sparse
:
: 0. In the scenario you're concerned about, deleting a file
: descriptor means (safely) doing the following:
: (a) Deleting the file descriptor from the epoll interest list
: using EPOLL_CTL_DEL
: (b) Deleting the corresponding record in the user-space cache
:
: 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
: conjunction with EPOLLONESHOT.
:
: 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
: conjunction is a logical error.
:
: 3. The correct way to code multithreaded applications using
: EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
:
: a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
: should EPOLLONESHOT.
:
: b. When a thread wants to delete a file descriptor, it
: should do the following:
:
: [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
: [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
: was zero, then the file descriptor can be safely
: deleted by the thread that made this call.
: [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
: then the descriptor is in use. In this case, the calling
: thread should set a flag in the user-space cache to
: indicate that the thread that is using the descriptor
: should perform the deletion operation.
:
: Is all of the above correct?
:
: The implementation depends on checking on whether
: (events & ~EP_PRIVATE_BITS) == 0
: This replies on the fact that EPOLL_CTL_AD and EPOLL_CTL_MOD always
: set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
: causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
: cleared.
:
: A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
: is only useful in conjunction with EPOLLONESHOT. However, as things
: stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
: not have EPOLLONESHOT set in 'events' This results in the following
: (slightly surprising) behavior:
:
: (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
: (the indicator that the file descriptor can be safely deleted).
: (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
:
: This doesn't seem particularly useful, and in fact is probably an
: indication that the user made a logic error: they should only be using
: epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
: EPOLLONESHOT was set in 'events'. If that is correct, then would it
: not make sense to return an error to user space for this case?
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Latest Linus head run of "make selftests" in the tools directory failed
with references to undefined variables. Reference was to
'write_thread_data' which is the name of a struct that is being used, not
the variable itself. Change reference so it points to the variable.
Signed-off-by: Daniel Hazelton <dshadowwolf@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enhanced epoll_ctl to support EPOLL_CTL_DISABLE, which disables an epoll
item. If epoll_ctl doesn't return -EBUSY in this case, it is then safe to
delete the epoll item in a multi-threaded environment. Also added a new
test_epoll self- test app to both demonstrate the need for this feature
and test it.
Signed-off-by: Paton J. Lewis <palewis@adobe.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Paul Holland <pholland@adobe.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull the trivial tree from Jiri Kosina:
"Tiny usual fixes all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
doc: fix old config name of kprobetrace
fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc
btrfs: fix the commment for the action flags in delayed-ref.h
btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID
vfs: fix kerneldoc for generic_fh_to_parent()
treewide: fix comment/printk/variable typos
ipr: fix small coding style issues
doc: fix broken utf8 encoding
nfs: comment fix
platform/x86: fix asus_laptop.wled_type module parameter
mfd: printk/comment fixes
doc: getdelays.c: remember to close() socket on error in create_nl_socket()
doc: aliasing-test: close fd on write error
mmc: fix comment typos
dma: fix comments
spi: fix comment/printk typos in spi
Coccinelle: fix typo in memdup_user.cocci
tmiofb: missing NULL pointer checks
tools: perf: Fix typo in tools/perf
tools/testing: fix comment / output typos
...
This adds two selftests
* tools/testing/selftests/cpu-hotplug/on-off-test.sh is testing script
for CPU hotplug
1. Online all hot-pluggable CPUs
2. Offline all hot-pluggable CPUs
3. Online all hot-pluggable CPUs again
4. Exit if cpu-notifier-error-inject.ko is not available
5. Offline all hot-pluggable CPUs in preparation for testing
6. Test CPU hot-add error handling by injecting notifier errors
7. Online all hot-pluggable CPUs in preparation for testing
8. Test CPU hot-remove error handling by injecting notifier errors
* tools/testing/selftests/memory-hotplug/on-off-test.sh is doing the
similar thing for memory hotplug.
1. Online all hot-pluggable memory
2. Offline 10% of hot-pluggable memory
3. Online all hot-pluggable memory again
4. Exit if memory-notifier-error-inject.ko is not available
5. Offline 10% of hot-pluggable memory in preparation for testing
6. Test memory hot-add error handling by injecting notifier errors
7. Online all hot-pluggable memory in preparation for testing
8. Test memory hot-remove error handling by injecting notifier errors
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While doing the checkpoint-restore in the user space one need to determine
whether various kernel objects (like mm_struct-s of file_struct-s) are
shared between tasks and restore this state.
The 2nd step can be solved by using appropriate CLONE_ flags and the
unshare syscall, while there's currently no ways for solving the 1st one.
One of the ways for checking whether two tasks share e.g. mm_struct is to
provide some mm_struct ID of a task to its proc file, but showing such
info considered to be not that good for security reasons.
Thus after some debates we end up in conclusion that using that named
'comparison' syscall might be the best candidate. So here is it --
__NR_kcmp.
It takes up to 5 arguments - the pids of the two tasks (which
characteristics should be compared), the comparison type and (in case of
comparison of files) two file descriptors.
Lookups for pids are done in the caller's PID namespace only.
At moment only x86 is supported and tested.
[akpm@linux-foundation.org: fix up selftests, warnings]
[akpm@linux-foundation.org: include errno.h]
[akpm@linux-foundation.org: tweak comment text]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: Michal Marek <mmarek@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the mq_perf_tests tool I used when creating my mq performance patch.
Also add a local .gitignore to keep the binaries from showing up in git
status output.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a directory to house POSIX message queue subsystem specific tests.
Add first test which checks the operation of mq_open() under various
corner conditions.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Joe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hugepage-mmap.c, hugepage-shm.c and map_hugetlb.c in Documentation/vm are
simple pass/fail tests, It's better to promote them to
tools/testing/selftests.
Thanks suggestion of Andrew Morton about this. They all need firstly
setting up proper nr_hugepages and hugepage-mmap need to mount hugetlbfs.
So I add a shell script run_vmtests to do such work which will call the
three test programs and check the return value of them.
Changes to original code including below:
a. add run_vmtests script
b. return error when read_bytes mismatch with writed bytes.
c. coding style fixes: do not use assignment in if condition
[akpm@linux-foundation.org: build the targets before trying to execute them]
[akpm@linux-foundation.org: Documentation/vm/ no longer has a Makefile. Fixes "make clean"]
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So a "make run_tests" will build the tests before trying to run them.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the run_tests script and launch the selftests by calling "make
run_tests" from the selftests top directory instead. This delegates to
the Makefile in each selftest directory, where it is decided how to launch
the local test.
This removes the need to add each selftest directory to the now removed
"run_tests" top script.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bring a first selftest in the relevant directory. This tests several
combinations of breakpoints and watchpoints in x86, as well as icebp traps
and int3 traps. Given the amount of breakpoint regressions we raised
after we merged the generic breakpoint infrastructure, such selftest
became necessary and can still serve today as a basis for new patches that
touch the do_debug() path.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bring a new kernel selftests directory in tools/testing/selftests. To
add a new selftest, create a subdirectory with the sources and a
makefile that creates a target named "run_test" then add the
subdirectory name to the TARGET var in tools/testing/selftests/Makefile
and tools/testing/selftests/run_tests script.
This can help centralizing and maintaining any useful selftest that
developers usually tend to let rust in peace on some random server.
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>