Commit Graph

171 Commits

Author SHA1 Message Date
Matthew Wilcox ad3d6c7263 xarray: Add XArray load operation
The xa_load function brings with it a lot of infrastructure; xa_empty(),
xa_is_err(), and large chunks of the XArray advanced API that are used
to implement xa_load.

As the test-suite demonstrates, it is possible to use the XArray functions
on a radix tree.  The radix tree functions depend on the GFP flags being
stored in the root of the tree, so it's not possible to use the radix
tree functions on an XArray.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:57 -04:00
Matthew Wilcox 01959dfe77 xarray: Define struct xa_node
This is a direct replacement for struct radix_tree_node.  A couple of
struct members have changed name, so convert those.  Use a #define so
that radix tree users continue to work without change.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-21 10:45:56 -04:00
Matthew Wilcox f8d5d0cc14 xarray: Add definition of struct xarray
This is a direct replacement for struct radix_tree_root.  Some of the
struct members have changed name; convert those, and use a #define so
that radix_tree users continue to work without change.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-21 10:45:53 -04:00
Matthew Wilcox 02c02bf12c xarray: Change definition of sibling entries
Instead of storing a pointer to the slot containing the canonical entry,
store the offset of the slot.  Produces slightly more efficient code
(~300 bytes) and simplifies the implementation.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29 22:47:49 -04:00
Matthew Wilcox 3159f943aa xarray: Replace exceptional entries
Introduce xarray value entries and tagged pointers to replace radix
tree exceptional entries.  This is a slight change in encoding to allow
the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
value entry).  It is also a change in emphasis; exceptional entries are
intimidating and different.  As the comment explains, you can choose
to store values or pointers in the xarray and they are both first-class
citizens.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29 22:47:49 -04:00
Matthew Wilcox 66ee620f06 idr: Permit any valid kernel pointer to be stored
An upcoming change to the encoding of internal entries will set the bottom
two bits to 0b10.  Unfortunately, m68k only aligns some data structures
to 2 bytes, so the IDR will interpret them as internal entries and things
will go badly wrong.

Change the radix tree so that it stops either when the node indicates
that it's the bottom of the tree (shift == 0) or when the entry is not an
internal entry.  This means we cannot insert an arbitrary kernel pointer
as a multiorder entry, but the IDR does not permit multiorder entries.

Annoyingly, this means the IDR can no longer take advantage of the radix
tree's ability to store a single entry at offset 0 without allocating
memory.  A pointer which is 2-byte aligned cannot be stored directly in
the root as it would be indistinguishable from a node, so we must allocate
a node in order to store a 2-byte pointer at index 0.  The idr_replace()
function does not take a GFP flags argument, so cannot allocate memory.
If a user inserts a 4-byte aligned pointer at index 0 and then replaces
it with a 2-byte aligned pointer, we must be able to store it.

Arbitrary pointer values are still not permitted; pointers of the
form 2 + (i * 4) for values of i between 0 and 1023 are reserved for
the implementation.  These are not valid kernel pointers as they would
point into the zero page.

This change does cause a runtime memory consumption regression for
the IDA.  I will recover that later.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
2018-09-29 22:47:48 -04:00
Matthew Wilcox f272668deb test_ida: check_ida_destroy and check_ida_alloc
Move these tests from the userspace test-suite to the kernel test-suite.
Also convert check_ida_random to the new API.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:54:21 -04:00
Matthew Wilcox 5c78b0b1eb test_ida: Convert check_ida_conv to new API
Move as much as possible to kernel space; leave the parts in user space
that rely on checking memory allocation failures to detect the
transition between an exceptional entry and a bitmap.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:54:20 -04:00
Matthew Wilcox 161b47e31f test_ida: Move ida_check_max
Convert to new API and move to kernel space.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:54:20 -04:00
Matthew Wilcox 0a3856392c test_ida: Move ida_check_leaf
Convert to new API and move to kernel space.  Take the opportunity to
test the situation a little more thoroughly (ie at different offsets).

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:54:20 -04:00
Matthew Wilcox 06b0111366 idr-test: Convert ida_check_nomem to new API
We can't move this test to kernel space because there's no way to
force kmalloc to fail.  But we can use the new API and check this
works when the test is in userspace.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:54:20 -04:00
Matthew Wilcox 8ab8ba38d4 ida: Start new test_ida module
Start transitioning the IDA tests into kernel space.  Framework heavily
cribbed from test_xarray.c.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:54:20 -04:00
Matthew Wilcox d1c0d5e3c6 radix tree test suite: Enable ubsan
Add support for the undefined behaviour sanitizer and fix the bugs
that ubsan pointed out.  Nothing major, and all in the test suite,
not the code.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:31:20 -04:00
Matthew Wilcox c9b933521a radix tree test suite: Fix compilation
An include of xarray.h was added to lib/idr.c without updating the test
suite.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-08-21 23:31:20 -04:00
Matthew Wilcox 7a4deea1aa idr: fix invalid ptr dereference on item delete
If the radix tree underlying the IDR happens to be full and we attempt
to remove an id which is larger than any id in the IDR, we will call
__radix_tree_delete() with an uninitialised 'slot' pointer, at which
point anything could happen.  This was easiest to hit with a single
entry at id 0 and attempting to remove a non-0 id, but it could have
happened with 64 entries and attempting to remove an id >= 64.

Roman said:

  The syzcaller test boils down to opening /dev/kvm, creating an
  eventfd, and calling a couple of KVM ioctls. None of this requires
  superuser. And the result is dereferencing an uninitialized pointer
  which is likely a crash. The specific path caught by syzbot is via
  KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are
  other user-triggerable paths, so cc:stable is probably justified.

Matthew added:

  We have around 250 calls to idr_remove() in the kernel today. Many of
  them pass an ID which is embedded in the object they're removing, so
  they're safe. Picking a few likely candidates:

  drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl.
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar
  drivers/atm/nicstar.c could be taken down by a handcrafted packet

Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org
Fixes: 0a835c4f09 ("Reimplement IDR and IDA using the radix tree")
Reported-by: <syzbot+35666cba7f0a337e2e79@syzkaller.appspotmail.com>
Debugged-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-25 18:12:10 -07:00
Ross Zwisler fd8f58c40b radix tree test suite: multi-order iteration race
Add a test which shows a race in the multi-order iteration code.  This
test reliably hits the race in under a second on my machine, and is the
result of a real bug report against kernel a production v4.15 based
kernel (4.15.6-300.fc27.x86_64).  With a real kernel this issue is hit
when using order 9 PMD DAX radix tree entries.

The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree.  Remember that an order 2
entry looks like this:

  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry.'

When we delete 'entry' from the tree, we call :

  radix_tree_delete()
    radix_tree_delete_item()
      __radix_tree_delete()
        replace_slot()

replace_slot() first removes the siblings in order from the first to the
last, then at then replaces 'entry' with NULL.  This means that for a
brief period of time we end up with one or more of the siblings removed,
so:

  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
mm/filemap.c.

The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:

                                      V preceding slot
  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                              ^ current slot

This lets you find the first sibling, and you skip them all in order.

But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:

                                             V preceding slot
  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                    ^ current slot

This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers.  This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.

In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at
'entry'.

In the radix tree test suite this will be caught by the address
sanitizer:

  ==27063==ERROR: AddressSanitizer: heap-buffer-overflow on address
  0x60c0008ae400 at pc 0x00000040ce4f bp 0x7fa89b8fcad0 sp 0x7fa89b8fcac0
  READ of size 8 at 0x60c0008ae400 thread T3
      #0 0x40ce4e in __radix_tree_next_slot /home/rzwisler/project/linux/tools/testing/radix-tree/radix-tree.c:1660
      #1 0x4022cc in radix_tree_next_slot linux/../../../../include/linux/radix-tree.h:567
      #2 0x4022cc in iterator_func /home/rzwisler/project/linux/tools/testing/radix-tree/multiorder.c:655
      #3 0x7fa8a088d50a in start_thread (/lib64/libpthread.so.0+0x750a)
      #4 0x7fa8a03bd16e in clone (/lib64/libc.so.6+0xf516e)

Link: http://lkml.kernel.org/r/20180503192430.7582-5-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
Ross Zwisler 3e252fa7d4 radix tree test suite: add item_delete_rcu()
Currently the lifetime of "struct item" entries in the radix tree are
not controlled by RCU, but are instead deleted inline as they are
removed from the tree.

In the following patches we add a test which has threads iterating over
items pulled from the tree and verifying them in an
rcu_read_lock()/rcu_read_unlock() section.  This means that though an
item has been removed from the tree it could still be being worked on by
other threads until the RCU grace period expires.  So, we need to
actually free the "struct item" structures at the end of the grace
period, just as we do with "struct radix_tree_node" items.

Link: http://lkml.kernel.org/r/20180503192430.7582-4-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
Ross Zwisler 8d9fa88edd radix tree test suite: fix mapshift build target
Commit c6ce3e2fe3 ("radix tree test suite: Add config option for map
shift") introduced a phony makefile target called 'mapshift' that ends
up generating the file generated/map-shift.h.  This phony target was
then added as a dependency of the top level 'targets' build target,
which is what is run when you go to tools/testing/radix-tree and just
type 'make'.

Unfortunately, this phony target doesn't actually work as a dependency,
so you end up getting:

  $ make
  make: *** No rule to make target 'generated/map-shift.h', needed by 'main.o'.  Stop.
  make: *** Waiting for unfinished jobs....

Fix this by making the file generated/map-shift.h our real makefile
target, and add this a dependency of the top level build target.

Link: http://lkml.kernel.org/r/20180503192430.7582-2-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
Matthew Wilcox fa290cda10 radix tree: use GFP_ZONEMASK bits of gfp_t for flags
Patch series "XArray", v9.  (First part thereof).

This patchset is, I believe, appropriate for merging for 4.17.  It
contains the XArray implementation, to eventually replace the radix
tree, and converts the page cache to use it.

This conversion keeps the radix tree and XArray data structures in sync
at all times.  That allows us to convert the page cache one function at
a time and should allow for easier bisection.  Other than renaming some
elements of the structures, the data structures are fundamentally
unchanged; a radix tree walk and an XArray walk will touch the same
number of cachelines.  I have changes planned to the XArray data
structure, but those will happen in future patches.

Improvements the XArray has over the radix tree:

 - The radix tree provides operations like other trees do; 'insert' and
   'delete'. But what most users really want is an automatically
   resizing array, and so it makes more sense to give users an API that
   is like an array -- 'load' and 'store'. We still have an 'insert'
   operation for users that really want that semantic.

 - The XArray considers locking as part of its API. This simplifies a
   lot of users who formerly had to manage their own locking just for
   the radix tree. It also improves code generation as we can now tell
   RCU that we're holding a lock and it doesn't need to generate as much
   fencing code. The other advantage is that tree nodes can be moved
   (not yet implemented).

 - GFP flags are now parameters to calls which may need to allocate
   memory. The radix tree forced users to decide what the allocation
   flags would be at creation time. It's much clearer to specify them at
   allocation time.

 - Memory is not preloaded; we don't tie up dozens of pages on the off
   chance that the slab allocator fails. Instead, we drop the lock,
   allocate a new node and retry the operation. We have to convert all
   the radix tree, IDA and IDR preload users before we can realise this
   benefit, but I have not yet found a user which cannot be converted.

 - The XArray provides a cmpxchg operation. The radix tree forces users
   to roll their own (and at least four have).

 - Iterators take a 'max' parameter. That simplifies many users and will
   reduce the amount of iteration done.

 - Iteration can proceed backwards. We only have one user for this, but
   since it's called as part of the pagefault readahead algorithm, that
   seemed worth mentioning.

 - RCU-protected pointers are not exposed as part of the API. There are
   some fun bugs where the page cache forgets to use rcu_dereference()
   in the current codebase.

 - Value entries gain an extra bit compared to radix tree exceptional
   entries. That gives us the extra bit we need to put huge page swap
   entries in the page cache.

 - Some iterators now take a 'filter' argument instead of having
   separate iterators for tagged/untagged iterations.

The page cache is improved by this:

 - Shorter, easier to read code

 - More efficient iterations

 - Reduction in size of struct address_space

 - Fewer walks from the top of the data structure; the XArray API
   encourages staying at the leaf node and conducting operations there.

This patch (of 8):

None of these bits may be used for slab allocations, so we can use them
as radix tree flags as long as we mask them off before passing them to
the slab allocator. Move the IDR flag from the high bits to the
GFP_ZONEMASK bits.

Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:39 -07:00
Matthew Wilcox 4b0ad07653 idr: Fix handling of IDs above INT_MAX
Khalid reported that the kernel selftests are currently failing:

selftests: test_bpf.sh
========================================
test_bpf: [FAIL]
not ok 1..8 selftests:  test_bpf.sh [FAIL]

He bisected it to 6ce711f275 ("idr: Make
1-based IDRs more efficient").

The root cause is doing a signed comparison in idr_alloc_u32() instead
of an unsigned comparison.  I went looking for any similar problems and
found a couple (which would each result in the failure to warn in two
situations that aren't supposed to happen).

I knocked up a few test-cases to prove that I was right and added them
to the test-suite.

Reported-by: Khalid Aziz <khalid.aziz@oracle.com>
Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-26 14:39:30 -05:00
Matthew Wilcox 3d4d5d6186 radix tree test suite: Fix build
- Add an empty linux/compiler_types.h (now being included by kconfig.h)
 - Add __GFP_ZERO
 - Add kzalloc
 - Test __GFP_DIRECT_RECLAIM instead of __GFP_NOWARN

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-25 06:00:11 -05:00
Matthew Wilcox 6ce711f275 idr: Make 1-based IDRs more efficient
About 20% of the IDR users in the kernel want the allocated IDs to start
at 1.  The implementation currently searches all the way down the left
hand side of the tree, finds no free ID other than ID 0, walks all the
way back up, and then all the way down again.  This patch 'rebases' the
ID so we fill the entire radix tree, rather than leave a gap at 0.

Chris Wilson says: "I did the quick hack of allocating index 0 of the
idr and that eradicated idr_get_free() from being at the top of the
profiles for the many-object stress tests. This improvement will be
much appreciated."

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-06 16:41:29 -05:00
Matthew Wilcox 460488c58c idr: Remove idr_alloc_ext
It has no more users, so remove it.  Move idr_alloc() back into idr.c,
move the guts of idr_alloc_cmn() into idr_alloc_u32(), remove the
wrappers around idr_get_free_cmn() and rename it to idr_get_free().
While there is now no interface to allocate IDs larger than a u32,
the IDR internals remain ready to handle a larger ID should a need arise.

These changes make it possible to provide the guarantee that, if the
nextid pointer points into the object, the object's ID will be initialised
before a concurrent lookup can find the object.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-06 16:41:28 -05:00
Matthew Wilcox 6e6d301490 IDR test suite: Check handling negative end correctly
One of the charming quirks of the idr_alloc() interface is that you
can pass a negative end and it will be interpreted as "maximum".  Ensure
we don't break that.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-06 15:07:20 -05:00
Matthew Wilcox 490645d027 idr test suite: Fix ida_test_random()
The test was checking the wrong errno; ida_get_new_above() returns
EAGAIN, not ENOMEM on memory allocation failure.  Double the number of
threads to increase the chance that we actually exercise this path
during the test suite (it was a bit sporadic before).

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-06 15:07:20 -05:00
Matthew Wilcox 051803c0d0 radix tree test suite: Remove ARRAY_SIZE
This is now defined in tools/include/linux/kernel.h, so our
definition generates a warning.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-06 15:07:20 -05:00
Mel Gorman c7df8ad291 mm, truncate: do not check mapping for every page being truncated
During truncation, the mapping has already been checked for shmem and
dax so it's known that workingset_update_node is required.

This patch avoids the checks on mapping for each page being truncated.
In all other cases, a lookup helper is used to determine if
workingset_update_node() needs to be called.  The one danger is that the
API is slightly harder to use as calling workingset_update_node directly
without checking for dax or shmem mappings could lead to surprises.
However, the API rarely needs to be used and hopefully the comment is
enough to give people the hint.

sparsetruncate (tiny)
                              4.14.0-rc4             4.14.0-rc4
                             oneirq-v1r1        pickhelper-v1r1
Min          Time      141.00 (   0.00%)      140.00 (   0.71%)
1st-qrtle    Time      142.00 (   0.00%)      141.00 (   0.70%)
2nd-qrtle    Time      142.00 (   0.00%)      142.00 (   0.00%)
3rd-qrtle    Time      143.00 (   0.00%)      143.00 (   0.00%)
Max-90%      Time      144.00 (   0.00%)      144.00 (   0.00%)
Max-95%      Time      147.00 (   0.00%)      145.00 (   1.36%)
Max-99%      Time      195.00 (   0.00%)      191.00 (   2.05%)
Max          Time      230.00 (   0.00%)      205.00 (  10.87%)
Amean        Time      144.37 (   0.00%)      143.82 (   0.38%)
Stddev       Time       10.44 (   0.00%)        9.00 (  13.74%)
Coeff        Time        7.23 (   0.00%)        6.26 (  13.41%)
Best99%Amean Time      143.72 (   0.00%)      143.34 (   0.26%)
Best95%Amean Time      142.37 (   0.00%)      142.00 (   0.26%)
Best90%Amean Time      142.19 (   0.00%)      141.85 (   0.24%)
Best75%Amean Time      141.92 (   0.00%)      141.58 (   0.24%)
Best50%Amean Time      141.69 (   0.00%)      141.31 (   0.27%)
Best25%Amean Time      141.38 (   0.00%)      140.97 (   0.29%)

As you'd expect, the gain is marginal but it can be detected.  The
differences in bonnie are all within the noise which is not surprising
given the impact on the microbenchmark.

radix_tree_update_node_t is a callback for some radix operations that
optionally passes in a private field.  The only user of the callback is
workingset_update_node and as it no longer requires a mapping, the
private field is removed.

Link: http://lkml.kernel.org/r/20171018075952.10627-3-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:06 -08:00
Greg Kroah-Hartman b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Matthew Wilcox f0f3f2d0a3 radix tree test suite: Specify -m32 in LDFLAGS too
Michael's patch to use the default make rule for linking and the patch
from Rehas to use -m32 if building a 32-bit test-suite on a 64-bit
platform don't work well together.

Reported-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:24 -05:00
Matthew Wilcox 4ecd9542db ida: Free correct IDA bitmap
There's a relatively rare race where we look at the per-cpu preallocated
IDA bitmap, see it's NULL, allocate a new one, and atomically update it.
If the kmalloc() happened to sleep and we were rescheduled to a different
CPU, or an interrupt came in at the exact right time, another task
might have successfully allocated a bitmap and already deposited it.
I forgot what the semantics of cmpxchg() were and ended up freeing the
wrong bitmap leading to KASAN reporting a use-after-free.

Dmitry found the bug with syzkaller & wrote the patch.  I wrote the test
case that will reproduce the bug without his patch being applied.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:23 -05:00
Matthew Wilcox 3f1b6f9d49 radix tree test suite: Depend on Makefile and quieten grep
Changing the CFLAGS in the Makefile didn't always lead to a
recompilation because the OFILES didn't depend on the Makefile.
Also, after doing make clean, grep would still complain about
a missing map-shift.h; we need -s as well as -q.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:22 -05:00
Michael Ellerman 284d96a494 radix tree test suite: Fix build with --as-needed
Currently the radix tree test suite doesn't build with toolchains that
use --as-needed by default, for example Ubuntu's:

  cc -I. -I../../include -g -O2 -Wall -D_LGPL_SOURCE -fsanitize=address -lpthread -lurcu main.o ... -o main
  /usr/bin/ld: regression1.o: undefined reference to symbol 'pthread_join@@GLIBC_2.17'
  /lib/powerpc64le-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
  collect2: error: ld returned 1 exit status

This is caused by the custom makefile rules placing LDFLAGS before the
.o files that need the libraries.

We could fix it by using --no-as-needed, or rewriting the custom rules.
But we can also just drop the custom rules and move the libraries to
LDLIBS, and then the default rules work correctly - with the one caveat
that we need to add -fsanitize=address to LDFLAGS because that must be
passed to the linker as well as the compiler.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:22 -05:00
Rehas Sachdeva c4634b08d9 radix tree test suite: Build 32 bit binaries
Add option 'make BUILD=32' for building 32-bit binaries.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:21 -05:00
Rehas Sachdeva 54f4d3341c radix tree test suite: Add performance test for radix_tree_join()
Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:21 -05:00
Rehas Sachdeva 6478581c85 radix tree test suite: Add performance test for radix_tree_split()
Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:20 -05:00
Rehas Sachdeva 0d4a41c1a0 radix tree test suite: Add performance benchmarks
Add performance benchmarks for radix tree insertion, tagging and deletion.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:20 -05:00
Rehas Sachdeva c629a344ac radix tree test suite: Add test for radix_tree_clear_tags()
Assert that radix_tree_clear_tags() clears the tags on the passed node and
slot. Assert that the case where the radix tree has only one entry at index
zero and the node is NULL, is also handled.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:19 -05:00
Rehas Sachdeva 166bb1f532 radix tree test suite: Add tests for ida_simple_get() and ida_simple_remove()
Assert that ida_simple_get() allocates an id in the passed range or returns
error on failure, and ida_simple_remove() releases an allocated id.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:19 -05:00
Rehas Sachdeva 2eacc79c27 radix tree test suite: Add test for idr_get_next()
Assert that idr_get_next() returns the next populated entry in the tree with
an ID greater than or equal to the value pointed to by @nextid argument.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-03-07 13:18:18 -05:00
Rehas Sachdeva c6ce3e2fe3 radix tree test suite: Add config option for map shift
Add config option "SHIFT=<value>" to Makefile for building test suite
with any value of RADIX_TREE_MAP_SHIFT between 3 and 7 inclusive.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
[mawilcox@microsoft.com: .gitignore, quieten grep, remove on clean]
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:10 -05:00
Matthew Wilcox c0cdbf819c radix tree test suite: Run iteration tests for longer
If the -l flag is set, run the tests for 100 seconds each instead of
the normal 10 seconds.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 21:44:08 -05:00
Matthew Wilcox 3b7869c31f radix tree test suite: Fix split/join memory leaks
The last of the memory leaks in the test suite was a couple of places in
the split/join testing where I forgot to free the element being removed
from the tree.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 21:44:08 -05:00
Matthew Wilcox 6da0396cac radix tree test suite: Fix leaks in regression2.c
None of the malloc'ed data structures were ever being freed.  Found with
-fsanitize=address.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 21:44:07 -05:00
Matthew Wilcox 18d0c57394 radix tree test suite: Fix leaky tests
If item_insert() or item_insert_order() failed to insert an item, they
would leak the item they had just created.  This was causing runaway
memory consumption while running the iteration_check testcase, which
proves that Ross has too much memory in his workstation ;-)

Make sure to free the item on error.  Found with -fsanitize=address.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 21:44:06 -05:00
Matthew Wilcox 829f83d365 radix tree test suite: Enable address sanitizer
I was looking for a memory scribble and instead found a pile of memory
leaks.  Ensure no more occur in future.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 21:44:06 -05:00
Matthew Wilcox 1293d5c5f5 radix-tree: Chain preallocated nodes through ->parent
Chaining through the ->private_data member means we have to zero
->private_data after removing preallocated nodes from the list.
We're about to initialise ->parent anyway, so we can avoid zeroing it.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:04 -05:00
Rehas Sachdeva 73bc029b76 radix tree test suite: Dial down verbosity with -v
Make the output of radix tree test suite less verbose by default and add
-v and -vv command line options for increasing level of verbosity.

Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:04 -05:00
Matthew Wilcox 5eeb2d23df radix tree test suite: Introduce kmalloc_verbose
To help track down where memory leaks may be, add the ability to turn
on/off printing allocations, frees and delayed frees.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:03 -05:00
Matthew Wilcox 8ac0486831 radix tree test suite: Build separate binaries for some tests
To allow developers to run a subset of tests, build separate multiorder
and idr-test binaries which will run just the tests in those files.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 21:44:02 -05:00
Matthew Wilcox d37cacc5ad ida: Use exceptional entries for small IDAs
We can use the root entry as a bitmap and save allocating a 128 byte
bitmap for an IDA that contains only a few entries (30 on a 32-bit
machine, 62 on a 64-bit machine).  This costs about 300 bytes of kernel
text on x86-64, so as long as 3 IDAs fall into this category, this
is a net win for memory consumption.

Thanks to Rasmus Villemoes for his work documenting the problem and
collecting statistics on IDAs.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:02 -05:00
Matthew Wilcox 7ad3d4d85c ida: Move ida_bitmap to a percpu variable
When we preload the IDA, we allocate an IDA bitmap.  Instead of storing
that preallocated bitmap in the IDA, we store it in a percpu variable.
Generally there are more IDAs in the system than CPUs, so this cuts down
on the number of preallocated bitmaps that are unused, and about half
of the IDA users did not call ida_destroy() so they were leaking IDA
bitmaps.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:01 -05:00
Matthew Wilcox 0a835c4f09 Reimplement IDR and IDA using the radix tree
The IDR is very similar to the radix tree.  It has some functionality that
the radix tree did not have (alloc next free, cyclic allocation, a
callback-based for_each, destroy tree), which is readily implementable on
top of the radix tree.  A few small changes were needed in order to use a
tag to represent nodes with free space below them.  More extensive
changes were needed to support storing NULL as a valid entry in an IDR.
Plain radix trees still interpret NULL as a not-present entry.

The IDA is reimplemented as a client of the newly enhanced radix tree.  As
in the current implementation, it uses a bitmap at the last level of the
tree.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-13 21:44:01 -05:00
Matthew Wilcox bfa11193c4 radix tree test suite: Remove obsolete CONFIG
radix-tree.c doesn't use these CONFIG options any more.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 16:09:43 -05:00
Matthew Wilcox 991af734c4 radix tree test suite: Use vpath to find lib files
Instead of specifying how to build find_bit.o from lib/find_bit.o,
use vpath to tell make where to find find_bit.c.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 16:09:42 -05:00
Matthew Wilcox ab3a1ffd11 radix tree test suite: Reduce kernel.h
Many of the definitions in the radix-tree kernel.h are redundant with
others in tools/include, or are no longer used, such as panic().
Move the definition of __init to init.h and in_interrupt() to preempt.h

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 16:09:42 -05:00
Matthew Wilcox 7a4f11b889 radix tree test suite: Remove export.h
The tools/include export.h contains everything we need.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 16:09:41 -05:00
Matthew Wilcox 12ea65390b radix tree test suite: Remove types.h
Move the pieces we still need to tools/include and update a few implicit
includes.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 16:09:41 -05:00
Matthew Wilcox a3c7890790 radix tree test suite: Remove mempool
The radix tree hasn't used a mempool since the beginning of git history.
Remove the userspace mempool implementation.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 16:09:36 -05:00
Matthew Wilcox 43a30c2a31 radix tree test suite: Depend on tools/include/asm files
Changing tools/include/asm/bug.h showed a missing dependency in the
Makefile.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
2017-02-13 16:08:47 -05:00
Matthew Wilcox b246a9d267 tools: Provide a definition of WARN_ON
The definition of WARN_ON being used by the radix tree test suite was
deficient in two ways: it did not provide a return value, and it stopped
execution instead of continuing.  This version of WARN_ON tells you
which file & line the assertion was triggered in.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-01-27 21:29:39 -05:00
Matthew Wilcox c68a2aab33 radix tree test suite: Remove duplicate bitops code
By adding __set_bit and __clear_bit to the tools include directory, we
can share the bitops code.  This reveals an include loop between kernel.h,
log2.h, bitmap.h and bitops.h.  Break it the same way as the kernel does;
by moving the kernel.h include from bitops.h to bitmap.h.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-01-27 21:29:39 -05:00
Matthew Wilcox b9a0deb96b redo: radix tree test suite: fix compilation
[ This resurrects commit 53855d10f4, which was reverted in
  2b41226b39.  It depended on commit d544abd5ff ("lib/radix-tree:
  Convert to hotplug state machine") so now it is correct to apply ]

Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test
suite as it adds a call to cpuhp_setup_state_nocalls() which is not
currently emulated in the test suite.  Add it, and delete the emulation
of the old CPU hotplug mechanism.

Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-15 11:04:20 -08:00
Matthew Wilcox e1e14ab841 radix tree test suite: delete unused rcupdate.c
This file was used to implement call_rcu() before liburcu implemented
that function.  It hasn't even been compiled since before the test suite
was added to the kernel.  Remove it to reduce confusion.

Link: http://lkml.kernel.org/r/1481667692-14500-5-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 092bc0b225 radix tree test suite: add new tag check
We have a check that setting a tag on a single entry at root succeeds,
but we were missing a check that clearing a tag on that same entry also
succeeds.

Link: http://lkml.kernel.org/r/1481667692-14500-4-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox e8de434076 radix-tree: ensure counts are initialised
radix_tree_join() was freeing nodes with a non-zero ->exceptional count,
and radix_tree_split() wasn't zeroing ->exceptional when it allocated
the new node.  Fix this by making all callers of radix_tree_node_alloc()
pass in the new counts (and some other always-initialised fields), which
will prevent the problem recurring if in future we decide to do
something similar.

Link: http://lkml.kernel.org/r/1481667692-14500-3-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox bbe9d71f2c radix tree test suite: cache recently freed objects
The kmem_cache_alloc implementation simply allocates new memory from
malloc() and calls the ctor, which zeroes out the entire object.  This
means it cannot spot bugs where the object isn't properly reinitialised
before being freed.

Add a small (11 objects) cache before freeing objects back to malloc.
This is enough to let us write a test to catch it, although the memory
allocator is now aware of the structure of the radix tree node, since it
chains free objects through ->private_data (like the percpu cache does).

Link: http://lkml.kernel.org/r/1481667692-14500-2-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox de1af8f62a radix tree test suite: add some more functionality
IDR needs more functionality from the kernel: kmalloc()/kfree(), and
xchg().

Link: http://lkml.kernel.org/r/1480369871-5271-67-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 3e3cdc68be radix tree test suite: check multiorder iteration
The random iteration test only inserts order-0 entries currently.
Update it to insert entries of order between 7 and 0.  Also make the
maximum index configurable, make some variables static, make the test
duration variable, remove some useless spinning, and add a fifth thread
which calls tag_tagged_items().

Link: http://lkml.kernel.org/r/1480369871-5271-62-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox a90eb3a2a4 radix-tree: fix replacement for multiorder entries
When replacing an entry with NULL, we need to delete any sibling
entries.  Also account deleting exceptional entries properly.  Also fix
a bug with radix_tree_iter_replace() where we would fail to remove
entirely freed nodes.  Also fix accounting bug when switching between
normal and exceptional entries with replace_slot.  Also add testcases
for all these bugs.

Link: http://lkml.kernel.org/r/1480369871-5271-61-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 2791653a68 radix-tree: add radix_tree_split_preload()
Calculate how many nodes we need to allocate to split an old_order entry
into multiple entries, each of size new_order.  The test suite checks
that we allocated exactly the right number of nodes; neither too many
(checked by rtp->nr == 0), nor too few (checked by comparing
nr_allocated before and after the call to radix_tree_split()).

Link: http://lkml.kernel.org/r/1480369871-5271-60-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox e157b55594 radix-tree: add radix_tree_split
This new function splits a larger multiorder entry into smaller entries
(potentially multi-order entries).  These entries are initialised to
RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't
confused.  The caller should then call radix_tree_for_each_slot() and
radix_tree_replace_slot() in order to turn these retry entries into the
intended new entries.  Tags are replicated from the original multiorder
entry into each new entry.

Link: http://lkml.kernel.org/r/1480369871-5271-59-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 175542f575 radix-tree: add radix_tree_join
This new function allows for the replacement of many smaller entries in
the radix tree with one larger multiorder entry.  From the point of view
of an RCU walker, they may see a mixture of the smaller entries and the
large entry during the same walk, but they will never see NULL for an
index which was populated before the join.

Link: http://lkml.kernel.org/r/1480369871-5271-58-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 268f42de71 radix-tree: delete radix_tree_range_tag_if_tagged()
This is an exceptionally complicated function with just one caller
(tag_pages_for_writeback).  We devote a large portion of the runtime of
the test suite to testing this one function which has one caller.  By
introducing the new function radix_tree_iter_tag_set(), we can eliminate
all of the complexity while keeping the performance.  The caller can now
use a fairly standard radix_tree_for_each() loop, and it doesn't need to
worry about tricksy things like 'start' wrapping.

The test suite continues to spend a large amount of time investigating
this function, but now it's testing the underlying primitives such as
radix_tree_iter_resume() and the radix_tree_for_each_tagged() iterator
which are also used by other parts of the kernel.

Link: http://lkml.kernel.org/r/1480369871-5271-57-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 478922e2b0 radix-tree: delete radix_tree_locate_item()
This rather complicated function can be better implemented as an
iterator.  It has only one caller, so move the functionality to the only
place that needs it.  Update the test suite to follow the same pattern.

Link: http://lkml.kernel.org/r/1480369871-5271-56-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 148deab223 radix-tree: improve multiorder iterators
This fixes several interlinked problems with the iterators in the
presence of multiorder entries.

1. radix_tree_iter_next() would only advance by one slot, which would
   result in the iterators returning the same entry more than once if
   there were sibling entries.

2. radix_tree_next_slot() could return an internal pointer instead of
   a user pointer if a tagged multiorder entry was immediately followed by
   an entry of lower order.

3. radix_tree_next_slot() expanded to a lot more code than it used to
   when multiorder support was compiled in.  And I wasn't comfortable with
   entry_to_node() being in a header file.

Fixing radix_tree_iter_next() for the presence of sibling entries
necessarily involves examining the contents of the radix tree, so we now
need to pass 'slot' to radix_tree_iter_next(), and we need to change the
calling convention so it is called *before* dropping the lock which
protects the tree.  Also rename it to radix_tree_iter_resume(), as some
people thought it was necessary to call radix_tree_iter_next() each time
around the loop.

radix_tree_next_slot() becomes closer to how it looked before multiorder
support was introduced.  It only checks to see if the next entry in the
chunk is a sibling entry or a pointer to a node; this should be rare
enough that handling this case out of line is not a performance impact
(and such impact is amortised by the fact that the entry we just
processed was a multiorder entry).  Also, radix_tree_next_slot() used to
force a new chunk lookup for untagged entries, which is more expensive
than the out of line sibling entry skipping.

Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 0629573e6b radix tree test suite: use common find-bit code
Remove the old find_next_bit code in favour of linking in the find_bit
code from tools/lib.

Link: http://lkml.kernel.org/r/1480369871-5271-48-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 101d9607ff radix tree test suite: record order in each item
This probably doubles the size of each item allocated by the test suite
but it lets us check a few more things, and may be needed for upcoming
API changes that require the caller pass in the order of the entry.

Link: http://lkml.kernel.org/r/1480369871-5271-46-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox 3ad75f8a1d radix tree test suite: handle exceptional entries
item_kill_tree() assumes that everything in the tree is a pointer to a
struct item, which is annoying when testing the behaviour of exceptional
entries.  Fix it to delete exceptional entries on the assumption they
don't need to be freed.

Link: http://lkml.kernel.org/r/1480369871-5271-45-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:10 -08:00
Matthew Wilcox af1c5cca90 radix tree test suite: use rcu_barrier
Calling rcu_barrier() allows all of the rcu-freed memory to be actually
returned to the pool, and allows nr_allocated to return to 0.  As well
as allowing diffs between runs to be more useful, it also lets us
pinpoint leaks more effectively.

Link: http://lkml.kernel.org/r/1480369871-5271-44-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Konstantin Khlebnikov cfa40bcfd6 radix tree test suite: benchmark for iterator
This adds simple benchmark for iterator similar to one I've used for
commit 78c1d78488 ("radix-tree: introduce bit-optimized iterator")

Building with make BENCHMARK=1 set radix tree order to 6, this allows to
get performance comparable to in kernel performance.

Link: http://lkml.kernel.org/r/1480369871-5271-43-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Matthew Wilcox ba20cd60c9 radix tree test suite: iteration test misuses RCU
Each thread needs to register itself with RCU, otherwise the reading
thread's read lock has no effect and the freeing thread will free the
memory in the tree without waiting for the read lock to be dropped.

Link: http://lkml.kernel.org/r/1480369871-5271-42-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Matthew Wilcox 061ef3936b radix tree test suite: make runs more reproducible
Instead of reseeding the random number generator every time around the
loop in big_gang_check(), seed it at the beginning of execution.  Use
rand_r() and an independent base seed for each thread in
iteration_test() so they don't stomp all over each others state.  Since
this particular test depends on the kernel scheduler, the iteration test
can't be reproduced based purely on the random seed, but at least it
won't pollute the other tests.

Print the seed, and allow the seed to be specified so that a run which
hits a problem can be reproduced.

Link: http://lkml.kernel.org/r/1480369871-5271-41-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Matthew Wilcox 6df5ee7867 radix tree test suite: free preallocated nodes
It can be a source of mild concern when the test suite shows that we're
leaking nodes.  While poring over the source code looking for leaks can
lead to some fascinating bugs being discovered, sometimes the leak is
simply that these nodes were preallocated and are sitting on the per-CPU
list.  Free them by calling the CPU dead callback.

Link: http://lkml.kernel.org/r/1480369871-5271-40-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Matthew Wilcox 847d357635 radix tree test suite: track preempt_count
Rather than simply NOP out preempt_enable() and preempt_disable(), keep
track of preempt_count and display it regularly in case either the test
suite or the code under test is forgetting to balance the enables &
disables.  Only found a test-case that was forgetting to re-enable
preemption, but it's a possibility worth checking.

Link: http://lkml.kernel.org/r/1480369871-5271-39-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Matthew Wilcox 31023cd664 radix tree test suite: allow GFP_ATOMIC allocations to fail
In order to test the preload code, it is necessary to fail GFP_ATOMIC
allocations, which requires defining GFP_KERNEL and GFP_ATOMIC properly.
Remove the obsolete __GFP_WAIT and copy the definitions of the __GFP
flags which are used from the kernel include files.  We also need the
real definition of gfpflags_allow_blocking() to persuade the radix tree
to actually use its preallocated nodes.

Link: http://lkml.kernel.org/r/1480369871-5271-38-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Matthew Wilcox ebb9a9aedb tools: add WARN_ON_ONCE
Patch series "Radix tree patches for 4.10", v3.

Mostly these are improvements; the only bug fixes in here relate to
multiorder entries (which are unused in the 4.9 tree).

This patch (of 32):

The radix tree uses its own buggy WARN_ON_ONCE.  Replace it with the
definition from asm-generic/bug.h

Link: http://lkml.kernel.org/r/1480369871-5271-37-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Johannes Weiner 6d75f366b9 lib: radix-tree: check accounting of existing slot replacement users
The bug in khugepaged fixed earlier in this series shows that radix tree
slot replacement is fragile; and it will become more so when not only
NULL<->!NULL transitions need to be caught but transitions from and to
exceptional entries as well.  We need checks.

Re-implement radix_tree_replace_slot() on top of the sanity-checked
__radix_tree_replace().  This requires existing callers to also pass the
radix tree root, but it'll warn us when somebody replaces slots with
contents that need proper accounting (transitions between NULL entries,
real entries, exceptional entries) and where a replacement through the
slot pointer would corrupt the radix tree node counts.

Link: http://lkml.kernel.org/r/20161117193021.GB23430@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:08 -08:00
Linus Torvalds 2b41226b39 Revert "radix tree test suite: fix compilation"
This reverts commit 53855d10f4.

It shouldn't have come in yet - it depends on the changes in linux-next
that will come in during the next merge window.  As Matthew Wilcox says,
the test suite is broken with the current state without the revert.

Requested-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-09 10:41:42 -08:00
Matthew Wilcox 53855d10f4 radix tree test suite: fix compilation
Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test
suite as it adds a call to cpuhp_setup_state_nocalls() which is not
currently emulated in the test suite.  Add it, and delete the emulation
of the old CPU hotplug mechanism.

Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07 17:10:00 -08:00
Ross Zwisler e0176a2f1e radix-tree tests: properly initialize mutex
The pthread_mutex_t in regression1.c wasn't being initialized properly.

Link: http://lkml.kernel.org/r/20160815194237.25967-4-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Ross Zwisler eec4852543 radix-tree tests: add iteration test
There are four cases I can see where we could end up with a NULL 'slot' in
radix_tree_next_slot().  This unit test exercises all four of them, making
sure that if in the future we have an unsafe path through
radix_tree_next_slot(), we'll catch it.

Here are details on the four cases:

1) radix_tree_iter_retry() via a non-tagged iteration like
radix_tree_for_each_slot().  In this case we currently aren't seeing a bug
because radix_tree_iter_retry() sets

    iter->next_index = iter->index;

which means that in in the else case in radix_tree_next_slot(), 'count' is
zero, so we skip over the while() loop and effectively just return NULL
without ever dereferencing 'slot'.

2) radix_tree_iter_retry() via tagged iteration like
radix_tree_for_each_tagged().  This case was giving us NULL pointer
dereferences in testing, and was fixed with this commit:

commit 3cb9185c67 ("radix-tree: fix radix_tree_iter_retry() for tagged
iterators.")

This fix doesn't explicitly check for 'slot' being NULL, though, it works
around the NULL pointer dereference by instead zeroing iter->tags in
radix_tree_iter_retry(), which makes us bail out of the if() case in
radix_tree_next_slot() before we dereference 'slot'.

3) radix_tree_iter_next() via via a non-tagged iteration like
radix_tree_for_each_slot().  This currently happens in shmem_tag_pins()
and shmem_partial_swap_usage().

As with non-tagged iteration, 'count' in the else case of
radix_tree_next_slot() is zero, so we skip over the while() loop and
effectively just return NULL without ever dereferencing 'slot'.

4) radix_tree_iter_next() via tagged iteration like
radix_tree_for_each_tagged().  This happens in shmem_wait_for_pins().

radix_tree_iter_next() zeros out iter->tags, so we end up exiting
radix_tree_next_slot() here:

    if (flags & RADIX_TREE_ITER_TAGGED) {
	    void *canon = slot;

	    iter->tags >>= 1;
	    if (unlikely(!iter->tags))
		    return NULL;

Link: http://lkml.kernel.org/r/20160815194237.25967-3-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Linus Torvalds 597f03f9d1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
 "Yet another batch of cpu hotplug core updates and conversions:

   - Provide core infrastructure for multi instance drivers so the
     drivers do not have to keep custom lists.

   - Convert custom lists to the new infrastructure. The block-mq custom
     list conversion comes through the block tree and makes the diffstat
     tip over to more lines removed than added.

   - Handle unbalanced hotplug enable/disable calls more gracefully.

   - Remove the obsolete CPU_STARTING/DYING notifier support.

   - Convert another batch of notifier users.

   The relayfs changes which conflicted with the conversion have been
   shipped to me by Andrew.

   The remaining lot is targeted for 4.10 so that we finally can remove
   the rest of the notifiers"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  cpufreq: Fix up conversion to hotplug state machine
  blk/mq: Reserve hotplug states for block multiqueue
  x86/apic/uv: Convert to hotplug state machine
  s390/mm/pfault: Convert to hotplug state machine
  mips/loongson/smp: Convert to hotplug state machine
  mips/octeon/smp: Convert to hotplug state machine
  fault-injection/cpu: Convert to hotplug state machine
  padata: Convert to hotplug state machine
  cpufreq: Convert to hotplug state machine
  ACPI/processor: Convert to hotplug state machine
  virtio scsi: Convert to hotplug state machine
  oprofile/timer: Convert to hotplug state machine
  block/softirq: Convert to hotplug state machine
  lib/irq_poll: Convert to hotplug state machine
  x86/microcode: Convert to hotplug state machine
  sh/SH-X3 SMP: Convert to hotplug state machine
  ia64/mca: Convert to hotplug state machine
  ARM/OMAP/wakeupgen: Convert to hotplug state machine
  ARM/shmobile: Convert to hotplug state machine
  arm64/FP/SIMD: Convert to hotplug state machine
  ...
2016-10-03 19:43:08 -07:00
Matthew Wilcox 62fd5258eb radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
When we replace a multiorder entry, check that all indices reflect the
new value.

Also, compile the test suite with -O2, which shows other problems with
the code due to some dodgy pointer operations in the radix tree code.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-25 11:49:16 -07:00
Thomas Gleixner ee1e714b94 cpu/hotplug: Remove CPU_STARTING and CPU_DYING notifier
All users are converted to state machine, remove CPU_STARTING and the
corresponding CPU_DYING.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160818125731.27256-2-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-06 18:30:19 +02:00
Valdis Kletnieks 117dec978c tools/testing/radix-tree/linux/gfp.h: fix bitrotted value
Apparently, the tools/testing version dates to a few flags ago, and
we've sprouted 4 new ones since.  Keep in sync with the value in the
main tree...

Link: http://lkml.kernel.org/r/23400.1469702675@turing-police.cc.vt.edu
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
Dan Carpenter b301aac5ad testing/radix-tree: fix a macro expansion bug
There are no parentheses around this macro and it causes a problem when
we do:

	index = rand() % THRASH_SIZE;

Link: http://lkml.kernel.org/r/20160715210953.GC19522@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-23 10:25:54 +09:00
Matthew Wilcox 8c1244de00 radix-tree: tidy up next_chunk
Convert radix_tree_next_chunk to use 'child' instead of 'slot' as the
name of the child node.  Also use node_maxindex() where it makes sense.

The 'rnode' variable was unnecessary; it doesn't overlap in usage with
'node', so we can just use 'node' the whole way through the function.

Improve the testcase to start the walk from every index in the carefully
constructed tree, and to accept any index within the range covered by
the entry.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox b194d16c27 radix-tree: rename radix_tree_is_indirect_ptr()
As with indirect_to_ptr(), ptr_to_indirect() and
RADIX_TREE_INDIRECT_PTR, change radix_tree_is_indirect_ptr() to
radix_tree_is_internal_node().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox 4dd6c0987c radix-tree: rename indirect_to_ptr() to entry_to_node()
Mirrors the earlier commit introducing node_to_entry().

Also change the type returned to be a struct radix_tree_node pointer.
That lets us simplify a couple of places in the radix tree shrink &
extend paths where we could convert an entry into a pointer, modify the
node, then convert the pointer back into an entry.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Matthew Wilcox 0694f0c9e2 radix tree test suite: remove dependencies on height
verify_node() can use node->shift instead of the height.

tree_verify_min_height() can be converted over to using node_maxindex()
and shift_maxindex() instead of radix_tree_maxindex().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00