Commit Graph

1044717 Commits

Author SHA1 Message Date
Naohiro Aota a85f05e59b btrfs: zoned: avoid chunk allocation if active block group has enough space
The current extent allocator tries to allocate a new block group when the
existing block groups do not have enough space. On a ZNS device, a new
block group means a new active zone. If the number of active zones has
already reached the max_active_zones, activating a new zone needs to finish
an existing zone, leading to wasting the free space there.

So, instead, it should reuse the existing active block groups as much as
possible when we can't activate any other zones without sacrificing an
already activated block group.

While at it, I converted find_free_extent_update_loop() to check the
found_extent() case early and made the other conditions simpler.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota a12b0dc0aa btrfs: move ffe_ctl one level up
We are passing too many variables as it is from btrfs_reserve_extent() to
find_free_extent(). The next commit will add min_alloc_size to ffe_ctl, and
that means another pass-through argument. Take this opportunity to move
ffe_ctl one level up and drop the redundant arguments.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota eb66a010d5 btrfs: zoned: activate new block group
Activate new block group at btrfs_make_block_group(). We do not check the
return value. If failed, we can try again later at the actual extent
allocation phase.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota 2e654e4bb9 btrfs: zoned: activate block group on allocation
Activate a block group when trying to allocate an extent from it. We check
read-only case and no space left case before trying to activate a block
group not to consume the number of active zones uselessly.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota 68a384b5ab btrfs: zoned: load active zone info for block group
Load activeness of underlying zones of a block group. When underlying zones
are active, we add the block group to the fs_info->zone_active_bgs list.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota afba2bc036 btrfs: zoned: implement active zone tracking
Add zone_is_active flag to btrfs_block_group. This flag indicates the
underlying zones are all active. Such zone active block groups are tracked
by fs_info->active_bg_list.

btrfs_dev_{set,clear}_active_zone() take responsibility for the underlying
device part. They set/clear the bitmap to indicate zone activeness and
count the number of zones we can activate left.

btrfs_zone_{activate,finish}() take responsibility for the logical part and
the list management. In addition, btrfs_zone_finish() wait for any writes
on it and send REQ_OP_ZONE_FINISH to the zone.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota dafc340dbd btrfs: zoned: introduce physical_map to btrfs_block_group
We will use a block group's physical location to track active zones and
finish fully written zones in the following commits. Since the zone
activation is done in the extent allocation context which already holding
the tree locks, we can't query the chunk tree for the physical locations.
So, copy the location info into a block group and use it for activation.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota ea6f8ddcde btrfs: zoned: load active zone information from devices
The ZNS specification defines a limit on the number of zones that can be in
the implicit open, explicit open or closed conditions. Any zone with such
condition is defined as an active zone and correspond to any zone that is
being written or that has been only partially written. If the maximum
number of active zones is reached, we must either reset or finish some
active zones before being able to chose other zones for storing data.

Load queue_max_active_zones() and track the number of active zones left on
the device.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota 8376d9e1ed btrfs: zoned: finish superblock zone once no space left for new SB
If there is no more space left for a new superblock in a superblock zone,
then it is better to ZONE_FINISH the zone and frees up the active zone
count.

Since btrfs_advance_sb_log() can now issue REQ_OP_ZONE_FINISH, we also need
to convert it to return int for the error case.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota 9658b72ef3 btrfs: zoned: locate superblock position using zone capacity
sb_write_pointer() returns the write position of next superblock. For READ,
we need a previous location. When the pointer is at the head, the previous
one is the last one of the other zone. Calculate the last one's position
from zone capacity.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota 5daaf552d1 btrfs: zoned: consider zone as full when no more SB can be written
We cannot write beyond zone capacity. So, we should consider a zone as
"full" when the write pointer goes beyond capacity - the size of super
info.

Also, take this opportunity to replace a subtle duplicated code with a loop
and fix a typo in comment.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota d8da0e8567 btrfs: zoned: tweak reclaim threshold for zone capacity
With the introduction of zone capacity, the range [capacity, length] is
always zone unusable. Counting this region as a reclaim target will
cause reclaiming too early. Reclaim block groups based on bytes that can
be usable after resetting.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:59 +02:00
Naohiro Aota 98173255bd btrfs: zoned: calculate free space from zone capacity
Now that we introduced capacity in a block group, we need to calculate free
space using the capacity instead of the length. Thus, bytes we account
capacity - alloc_pointer as free, and account bytes [capacity, length] as
zone unusable.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:58 +02:00
Naohiro Aota c46c4247ab btrfs: zoned: move btrfs_free_excluded_extents out of btrfs_calc_zone_unusable
btrfs_free_excluded_extents() is not neccessary for
btrfs_calc_zone_unusable() and it makes btrfs_calc_zone_unusable()
difficult to reuse. Move it out and call btrfs_free_excluded_extents()
in proper context.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:58 +02:00
Naohiro Aota 8eae532be7 btrfs: zoned: load zone capacity information from devices
The ZNS specification introduces the concept of a Zone Capacity.  A zone
capacity is an additional per-zone attribute that indicates the number of
usable logical blocks within each zone, starting from the first logical
block of each zone. It is always smaller or equal to the zone size.

With the SINGLE profile, we can set a block group's "capacity" as the same
as the underlying zone's Zone Capacity. We will limit the allocation not
to exceed in a following commit.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:58 +02:00
Qu Wenruo c22a3572cb btrfs: defrag: enable defrag for subpage case
With the new infrastructure which has taken subpage into consideration,
now we should be safe to allow defrag to work for subpage case.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:58 +02:00
Qu Wenruo c635757365 btrfs: defrag: remove the old infrastructure
Now the old infrastructure can all be removed, defrag

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:58 +02:00
Qu Wenruo 7b508037d4 btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()
The function defrag_one_cluster() is able to defrag one range well
enough, we only need to do preparation for it, including:

- Clamp and align the defrag range
- Exclude invalid cases
- Proper inode locking

The old infrastructures will not be removed in this patch, as it would
be too noisy to review.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:58 +02:00
Qu Wenruo b18c3ab234 btrfs: defrag: introduce helper to defrag one cluster
This new helper, defrag_one_cluster(), will defrag one cluster (at most
256K):

- Collect all initial targets

- Kick in readahead when possible

- Call defrag_one_range() on each initial target
  With some extra range clamping.

- Update @sectors_defragged parameter

This involves one behavior change, the defragged sectors accounting is
no longer as accurate as old behavior, as the initial targets are not
consistent.

We can have new holes punched inside the initial target, and we will
skip such holes later.
But the defragged sectors accounting doesn't need to be that accurate
anyway, thus I don't want to pass those extra accounting burden into
defrag_one_range().

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:57 +02:00
Qu Wenruo e9eec72151 btrfs: defrag: introduce helper to defrag a range
A new helper, defrag_one_range(), is introduced to defrag one range.

This function will mostly prepare the needed pages and extent status for
defrag_one_locked_target().

As we can only have a consistent view of extent map with page and extent
bits locked, we need to re-check the range passed in to get a real
target list for defrag_one_locked_target().

Since defrag_collect_targets() will call defrag_lookup_extent() and lock
extent range, we also need to teach those two functions to skip extent
lock.  Thus new parameter, @locked, is introduced to skip extent lock if
the caller has already locked the range.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:57 +02:00
Qu Wenruo 22b398eeee btrfs: defrag: introduce helper to defrag a contiguous prepared range
A new helper, defrag_one_locked_target(), introduced to do the real part
of defrag.

The caller needs to ensure both page and extents bits are locked, and no
ordered extent exists for the range, and all writeback is finished.

The core defrag part is pretty straight-forward:

- Reserve space
- Set extent bits to defrag
- Update involved pages to be dirty

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:07:13 +02:00
Qu Wenruo eb793cf857 btrfs: defrag: introduce helper to collect target file extents
Introduce a helper, defrag_collect_targets(), to collect all possible
targets to be defragged.

This function will not consider things like max_sectors_to_defrag, thus
caller should be responsible to ensure we don't exceed the limit.

This function will be the first stage of later defrag rework.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:06:53 +02:00
Qu Wenruo 5767b50c00 btrfs: defrag: factor out page preparation into a helper
In cluster_pages_for_defrag(), we have complex code block inside one
for() loop.

The code block is to prepare one page for defrag, this will ensure:

- The page is locked and set up properly.
- No ordered extent exists in the page range.
- The page is uptodate.

This behavior is pretty common and will be reused by later defrag
rework.

So factor out the code into its own helper, defrag_prepare_one_page(),
for later usage, and cleanup the code by a little.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:06:34 +02:00
Qu Wenruo 76068cae63 btrfs: defrag: replace hard coded PAGE_SIZE with sectorsize
When testing subpage defrag support, I always find some strange inode
nbytes error, after a lot of debugging, it turns out that
defrag_lookup_extent() is using PAGE_SIZE as size for
lookup_extent_mapping().

Since lookup_extent_mapping() is calling __lookup_extent_mapping() with
@strict == 1, this means any extent map smaller than one page will be
ignored, prevent subpage defrag to grab a correct extent map.

There are quite some PAGE_SIZE usage in ioctl.c, but most of them are
correct usages, and can be one of the following cases:

- ioctl structure size check
  We want ioctl structure to be contained inside one page.

- real page operations

The remaining cases in defrag_lookup_extent() and
check_defrag_in_cache() will be addressed in this patch.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:06:15 +02:00
Qu Wenruo cae7968680 btrfs: defrag: also check PagePrivate for subpage cases in cluster_pages_for_defrag()
In function cluster_pages_for_defrag() we have a window where we unlock
page, either start the ordered range or read the content from disk.

When we re-lock the page, we need to make sure it still has the correct
page->private for subpage.

Thus add the extra PagePrivate check here to handle subpage cases
properly.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:05:18 +02:00
Qu Wenruo 1ccc2e8a86 btrfs: defrag: pass file_ra_state instead of file to btrfs_defrag_file()
Currently btrfs_defrag_file() accepts both "struct inode" and "struct
file" as parameter.  We can easily grab "struct inode" from "struct
file" using file_inode() helper.

The reason why we need "struct file" is just to re-use its f_ra.

Change this to pass "struct file_ra_state" parameter, so that it's more
clear what we really want.  Since we're here, also add some comments on
the function btrfs_defrag_file().

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:04:39 +02:00
Anand Jain a09f23c355 btrfs: rename and switch to bool btrfs_chunk_readonly
btrfs_chunk_readonly() checks if the given chunk is writeable. It
returns 1 for readonly, and 0 for writeable. So the return argument type
bool shall suffice instead of the current type int.

Also, rename btrfs_chunk_readonly() to btrfs_chunk_writeable() as we
check if the bg is writeable, and helps to keep the logic at the parent
function simpler to understand.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:03:57 +02:00
Sidong Yang 44bee215f7 btrfs: reflink: initialize return value to 0 in btrfs_extent_same()
Fix a warning reported by smatch that ret could be returned without
initialized.  The dedupe operations are supposed to to return 0 for a 0
length range but the caller does not pass olen == 0. To keep this
behaviour and also fix the warning initialize ret to 0.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:03:57 +02:00
Qu Wenruo 72a69cd030 btrfs: subpage: pack all subpage bitmaps into a larger bitmap
Currently we use u16 bitmap to make 4k sectorsize work for 64K page
size.

But this u16 bitmap is not large enough to contain larger page size like
128K, nor is space efficient for 16K page size.

To handle both cases, here we pack all subpage bitmaps into a larger
bitmap, now btrfs_subpage::bitmaps[] will be the ultimate bitmap for
subpage usage.

Each sub-bitmap will has its start bit number recorded in
btrfs_subpage_info::*_start, and its bitmap length will be recorded in
btrfs_subpage_info::bitmap_nr_bits.

All subpage bitmap operations will be converted from using direct u16
operations to bitmap operations, with above *_start calculated.

For 64K page size with 4K sectorsize, this should not cause much
difference.

While for 16K page size, we will only need 1 unsigned long (u32) to
store all the bitmaps, which saves quite some space.

Furthermore, this allows us to support larger page size like 128K and
258K.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:03:55 +02:00
Qu Wenruo 8481dd80ab btrfs: subpage: introduce btrfs_subpage_bitmap_info
Currently we use fixed size u16 bitmap for subpage bitmap.  This is fine
for 4K sectorsize with 64K page size.

But for 4K sectorsize and larger page size, the bitmap is too small,
while for smaller page size like 16K, u16 bitmaps waste too much space.

Here we introduce a new helper structure, btrfs_subpage_bitmap_info, to
record the proper bitmap size, and where each bitmap should start at.

By this, we can later compact all subpage bitmaps into one u32 bitmap.
This patch is the first step.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Qu Wenruo 651fb41927 btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directly
The existing calling convention of btrfs_alloc_subpage() is pretty
awful.  Change it to a more common pattern by returning struct
btrfs_subpage directly and let the caller to determine if the call
succeeded.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Qu Wenruo fdf250db89 btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller than PAGE_SIZE
There are two call sites of btrfs_alloc_subpage():

- btrfs_attach_subpage()
  We have ensured sectorsize is smaller than PAGE_SIZE

- alloc_extent_buffer()
  We call btrfs_alloc_subpage() unconditionally.

The alloc_extent_buffer() forces us to check the sectorsize size against
page size inside btrfs_alloc_subpage().

Since the function name, btrfs_alloc_subpage(), already indicates it
should only get called for subpage cases, do the check in
alloc_extent_buffer() and add an ASSERT() in btrfs_alloc_subpage().

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Su Yue 9675ea8c9d btrfs: update comment for fs_devices::seed_list in btrfs_rm_device
Update it since commit 944d3f9fac ("btrfs: switch seed device to
list api") did conversion from fs_devices::seed to fs_devices::seed_list.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Su Yue <l@damenly.su>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Anand Jain 991a3daeda btrfs: drop unnecessary ret in ioctl_quota_rescan_status
There is no need for the variable ret after d66105cfa873 ("btrfs:
allocate btrfs_ioctl_quota_rescan_args on stack"), remove it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Marcos Paulo de Souza 0e3dd5bce8 btrfs: send: simplify send_create_inode_if_needed
The out label is being overused, we can simply return if the condition
permits.

No functional changes.

Reviewed-by: Su Yue <l@damenly.su>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Nikolay Borisov f6f39f7a0a btrfs: rename btrfs_alloc_chunk to btrfs_create_chunk
The user facing function used to allocate new chunks is
btrfs_chunk_alloc, unfortunately there is yet another similar sounding
function - btrfs_alloc_chunk. This creates confusion, especially since
the latter function can be considered "private" in the sense that it
implements the first stage of chunk creation and as such is called by
btrfs_chunk_alloc.

To avoid the awkwardness that comes with having similarly named but
distinctly different in their purpose function rename btrfs_alloc_chunk
to btrfs_create_chunk, given that the main purpose of this function is
to orchestrate the whole process of allocating a chunk - reserving space
into devices, deciding on characteristics of the stripe size and
creating the in-memory structures.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Linus Torvalds 3906fe9bb7 Linux 5.15-rc7 2021-10-25 11:30:31 -07:00
Matthew Wilcox (Oracle) cb68543239 secretmem: Prevent secretmem_users from wrapping to zero
Commit 110860541f ("mm/secretmem: use refcount_t instead of atomic_t")
attempted to fix the problem of secretmem_users wrapping to zero and
allowing suspend once again.

But it was reverted in commit 87066fdd2e ("Revert 'mm/secretmem: use
refcount_t instead of atomic_t'") because of the problems it caused - a
refcount_t was not semantically the right type to use.

Instead prevent secretmem_users from wrapping to zero by forbidding new
users if the number of users has wrapped from positive to negative.
This stops a long way short of reaching the necessary 4 billion users
where it wraps to zero again, so there's no need to be clever with
special anti-wrap types or checking the return value from atomic_inc().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jordy Zomer <jordy@pwning.systems>
Cc: Kees Cook <keescook@chromium.org>,
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25 11:27:31 -07:00
Linus Torvalds ac8a6eba2a spi: Fix tegra20 build with CONFIG_PM=n once again
Commit efafec27c5 ("spi: Fix tegra20 build with CONFIG_PM=n") already
fixed the build without PM support once.  There was an alternative fix
by Guenter in commit 2bab94090b ("spi: tegra20-slink: Declare runtime
suspend and resume functions conditionally"), and Mark then merged the
two correctly in ffb1e76f4f ("Merge tag 'v5.15-rc2' into spi-5.15").

But for some inexplicable reason, Mark then merged things _again_ in
commit 59c4e190b1 ("Merge tag 'v5.15-rc3' into spi-5.15"), and screwed
things up at that point, and the __maybe_unused attribute on
tegra_slink_runtime_resume() went missing.

Reinstate it, so that alpha (and other architectures without PM support)
builds cleanly again.

Btw, this is another prime example of how random back-merges are not
good.  Just don't do them.  Subsystem developers should not merge my
tree in any normal circumstances.  Both of those merge commits pointed
to above are bad: even the one that got the merge result right doesn't
even mention _why_ it was done, and the one that got it wrong is
obviously broken.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25 10:46:41 -07:00
Linus Torvalds c2b43854aa ARM updates for 5.15:
- Fix clang-related relocation warning in futex code
 - Fix incorrect use of get_kernel_nofault()
 - Fix bad code generation in __get_user_check() when kasan is enabled
 - Ensure TLB function table is correctly aligned
 - Remove duplicated string function definitions in decompressor
 - Fix link-time orphan section warnings
 - Fix old-style function prototype for arch_init_kprobes()
 - Only warn about XIP address when not compile testing
 - Handle BE32 big endian for keystone2 remapping
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmF2tT8ACgkQ9OeQG+St
 rGRfpxAAhAco8l1Lm5+0zHAozIi4CogLAcg1EigsQEgorrpJJBSQb0PP5VS0BAnU
 Q48KmE4r5WNGouWNwhHALXMX7Vzv72S5XoFf1Df/LImrIP9qUvuqgbr1gfvgt8M0
 Ktc5P1eS4HC9WxrHHAcWsKaO/Uye+M3adNLNl5K50ADywSExa9VpY6I7ak/OfPot
 BlO9bXkk2991yI/Fg+9cqW7ub9WkabayioYWLuCaTtt99+MSNCDmYcZkTUQkLeSQ
 btF0+jW6/+odUXrA8zFx5QvIp8v35uO2w6fAw8FPjrXm0u2copr7JOAb/yVN4hJR
 sSKSrr+kRFZa/TCjUt8t+2fephQU6ppxJI9BlL+lQ3dyX+dyUmMKRjZP1ju8R2wc
 xKv6cMGfXbZu4jBUgpkekXZsvPs+05nr9op/yBDDHsuIuz0wa3n+oSVNaE9sm0az
 d/QUxA9ZeofKxnzWMvo2D4RWOVprCoqASqt4700Z1KXvNWd6kZaBL0HVREOaFdvt
 /AFNh3nVNDxjhED4NPVPFPv+INrY1EtUF6q8QUJmlj+7xcqaqkV7CgC6Ku8Wkbi1
 ELTx7gy0mZlM8wuMbMeNCdW/qQxuR/8JCETtNTF+ZpiTWWQ+uox7lEWi4xzFQNtG
 NDMoHd077Y8WFRJKaSpQHAbdcPlfSv6TDQ0uOkiPL1D6yiBeqvw=
 =cxhy
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - Fix clang-related relocation warning in futex code

 - Fix incorrect use of get_kernel_nofault()

 - Fix bad code generation in __get_user_check() when kasan is enabled

 - Ensure TLB function table is correctly aligned

 - Remove duplicated string function definitions in decompressor

 - Fix link-time orphan section warnings

 - Fix old-style function prototype for arch_init_kprobes()

 - Only warn about XIP address when not compile testing

 - Handle BE32 big endian for keystone2 remapping

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
  ARM: 9141/1: only warn about XIP address when not compile testing
  ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
  ARM: 9138/1: fix link warning with XIP + frame-pointer
  ARM: 9134/1: remove duplicate memcpy() definition
  ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
  ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
  ARM: 9125/1: fix incorrect use of get_kernel_nofault()
  ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
2021-10-25 10:28:52 -07:00
Linus Torvalds 4862649f16 libata fixes for 5.15-rc7
A single fix in this pull request addressing an invalid error code
 return in the sata_mv driver (from Zheyu).
 
 Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCYXX2FwAKCRDdoc3SxdoY
 dh2CAQC4Cha8PiSZvR6QVzJDaznQj3O59wP236x/lO9Y8+AFPwEAgFhyVyuMnipR
 fslV8bA4PsiSIluXHT0ACJhFC25Lmgw=
 =pkuU
 -----END PGP SIGNATURE-----

Merge tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull libata fix from Damien Le Moal:
 "A single fix in this pull request addressing an invalid error code
  return in the sata_mv driver (from Zheyu)"

* tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: sata_mv: Fix the error handling of mv_chip_id()
2021-10-25 09:57:28 -07:00
Linus Torvalds a51aec4109 Pin control fixes for the v5.15 series:
- Three fixes pertaining to Broadcom DT bindings. Some stuff
   didn't work out as inteded, we need to back out.
 
 - A resume bug fix in the STM32 driver.
 
 - Disable and mask the interrupts on probe in the AMD pinctrl
   driver, affecting Microsoft surface.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmF1vjkACgkQQRCzN7AZ
 XXNhGBAAsCP22E192zDwDFNpZMhf1yI7nfnCl12xMl3Q1iKz+oCKn5xOv/ShSEZf
 Ude+5GT3s7BrP7ZlP15ZkIxSZ6IRIsXAgovXnlUdDD9vUZ2z0yc0nTy3SVJbkF4I
 oJ2G2NFONkXLJWb1e7RdXUqSf3Uc6/Ka+/5W2xHLbhWVgJNzH9JWmus2atWXRbmn
 MR8tHOXqmFoeNEkipQE32YV7EOT+Yi2Kp5WtFNDZn8Un/f/4/cDpxPI3iRAvTGKC
 NPOGFqYV0U3l/Xvrez3jflt5FutdGt/kB492criRN4aWmDHmHK7eWHFzp1ZehK1F
 3wj+iPIMceKNofmdyEr8BHGjVCXTDT93+Qjlrg9Sl7lglMZjtz6OOxq6cvYrQO9T
 usxdJoD08NyRr8087TSzYZvjzGT+/FE7EjYqbVZT18G0VYsoT62tCAP2vLjkwxTv
 VW011TiQX3KSZVQQb4MtpnB1Mnr1TBOka8DY7lqqAnKE5EaQqo7EkhaLbqGiG7o8
 EMBEz73xTvzF2T9lyIujyHS1hCl/TxsCP0Bv2q3taVFiqBxnqThu05yASeQ1OSAY
 Gs8dId8Sp7dEJvC8KUk3sQkIP9hJc8faOuuaiBnj7BEKTp8x5NVXPR6D9Hyp7q82
 cS5sVplBjH+lTXbbxwniQ4jch62mJrUMwdl0jESLHSfPAU/4jso=
 =cgpF
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Some late pin control fixes, the most generally annoying will probably
  be the AMD IRQ storm fix affecting the Microsoft surface.

  Summary:

   - Three fixes pertaining to Broadcom DT bindings. Some stuff didn't
     work out as inteded, we need to back out

   - A resume bug fix in the STM32 driver

   - Disable and mask the interrupts on probe in the AMD pinctrl driver,
     affecting Microsoft surface"

* tag 'pinctrl-v5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: amd: disable and mask interrupts on probe
  pinctrl: stm32: use valid pin identifier in stm32_pinctrl_resume()
  Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
  dt-bindings: pinctrl: brcm,ns-pinmux: drop unneeded CRU from example
  Revert "dt-bindings: pinctrl: bcm4708-pinmux: rework binding to use syscon"
2021-10-25 09:47:18 -07:00
LABBE Corentin 00568b8a63 ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
My intel-ixp42x-welltech-epbx100 no longer boot since 4.14.
This is due to commit 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel
mapping regression")
which forgot to handle CONFIG_CPU_ENDIAN_BE32 as possible BE config.

Suggested-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression")
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-25 13:11:34 +01:00
Zheyu Ma a0023bb9dd ata: sata_mv: Fix the error handling of mv_chip_id()
mv_init_host() propagates the value returned by mv_chip_id() which in turn
gets propagated by mv_pci_init_one() and hits local_pci_probe().

During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.

Since this is a bug rather than a recoverable runtime error we should
use dev_alert() instead of dev_err().

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-10-25 08:53:04 +09:00
Linus Torvalds 87066fdd2e Revert "mm/secretmem: use refcount_t instead of atomic_t"
This reverts commit 110860541f.

Converting the "secretmem_users" counter to a refcount is incorrect,
because a refcount is special in zero and can't just be incremented (but
a count of users is not, and "no users" is actually perfectly valid and
not a sign of a free'd resource).

Reported-by: syzbot+75639e6a0331cd61d3e2@syzkaller.appspotmail.com
Cc: Jordy Zomer <jordy@pwning.systems>
Cc: Kees Cook <keescook@chromium.org>,
Cc: Jordy Zomer <jordy@jordyzomer.github.io>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-24 09:48:33 -10:00
Linus Torvalds b20078fd69 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull autofs fix from Al Viro:
 "Fix for a braino of mine (in getting rid of open-coded
  dentry_path_raw() in autofs a couple of cycles ago).

  Mea culpa...  Obvious -stable fodder"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  autofs: fix wait name hash calculation in autofs_wait()
2021-10-24 09:36:06 -10:00
Linus Torvalds 6c62666d88 - Reset clang's Shadow Call Stack on hotplug to prevent it from overflowing
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF1Kz4ACgkQEsHwGGHe
 VUoQDA//UQhp6iDIAS9IVeca/ZZH3PWeyEJPQW/067UOM8jx+LkNvVBZnHn2mWai
 Zwlz9MvUsfo7O5mB0SMKly9hT10E9kHDD9jBDPeLS4sVN3wv1Ku5YdkK4esS+49X
 gbHHAPwL0SzR77Gx835I3grMUNbFrXgBkgP//DBcYSxX0nusey1XdgEuAoijTCC8
 tDWEmd5Wz7dSPgrw8ntxGrWsM2SwRPTfY3culuRJ+Xws0gE+THs3cQ2HUnNW6qiu
 g08fBBS+vD0X5UTv4iL0LHlPzmLUiMo/v6CsP1tyMoia3QgVYTVczz8CK0aAOFp8
 i7O8rD/k8BE0hNlwTjoB2R99weN69RqCJtHJYo5898AhHXZ3A0I1N1H/eZNXldo8
 cXlbFB4XPfhm+JwF+NPTNR5u2+YbyrT4+yrdCvljYGtm5w4imn0RGOhUkXEMYnEp
 XqhRSP3k8KUD0YIpMrHcBRHKbrZxo5ldNzXp7U//gLn5W2hTrNl+LPAArqfUx9DM
 NTjxc93gZjYm7/S9CUhPUaofLiU8nm+SDZDJi7NuxWO7d9OpyBckYk2y4yi+tGML
 MxdBtxGxUUwTWSvls0H+gPrnpLjllw1VZz1OnURypCu2I2HntHW9yTswDpnzPzAL
 Uykd5Ha4l8DEE59Qhy4ICKKiwpSQSe6ED/0LPPxPt5gW05tRnVM=
 =z6Fz
 -----END PGP SIGNATURE-----

Merge tag 'sched_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:
 "Reset clang's Shadow Call Stack on hotplug to prevent it from
  overflowing"

* tag 'sched_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/scs: Reset the shadow stack when idle_task_exit
2021-10-24 07:04:21 -10:00
Linus Torvalds 16bc177666 - Add Dave Hansen to the x86 maintainers team
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF1KRMACgkQEsHwGGHe
 VUoC+w/7B3nxBZ7VOgFXuR0+nN4SvMsb6RsR8qFdhGYEZxuyFSvsslD7JRiOtSWc
 8ZfCloLkGxfpuf5baOihR8bd9xoAuedFFZf5MTeIfsG3gnKo2vB6/dKKAY8M0fpQ
 Wpwc9zFWDNzs/RiW1vnU4ry/GYBh+o5WKP+kwlCdXfQmX6QAP3LG7xb0mqS2+anw
 WSQCek+B8eEvbVrZ0dbcPGvaNXlcBdt/t+Mit8ofpqPNVOAfWiN20dChPsm5EAoZ
 6A+Lg+T44Vb7jyUmsvP65t0ZF9e5ta+KKVXgOsioFtAWonv/Bmtz3xVsgnnwDMQW
 thUbjHPIKyN0Q5Egr16TSB+6vqex0miSLVDDYODSkQkBhzMR6MaRDwwq0u5M4XJM
 xNLpbD/Z3iJMosu1XCCoX0gc13Y7ji4YIFdj1I0nexC9xBtICXB8yQ6OebecMcl6
 DtZ83JBGC/SRj+8b1DoESiAb4OmHRc+nr37wolbHiWS6arDjwJV9MZ3cNZ6EDC+q
 qusRcA1ozcxpHQQTehRpOaJk7z4LRc6wab9EjdtgYa0yVGbz2JFAKVI1WWDoQ37r
 7LY9wrXuF7UYhi35CLG6RNRFjriDjAwiCS3f+gCHFrLMnrNgydkCRgjcHTM3GU4P
 C48Uk7zqJ3Wuos0iJp1ZwZpcpKjcu8m6LTA+QKWgrCzWPVgzpZE=
 =iEPb
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:
 "A single change adding Dave Hansen to our maintainers team"

* tag 'x86_urgent_for_v5.15_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add Dave Hansen to the x86 maintainer team
2021-10-24 07:00:15 -10:00
Linus Torvalds c460e7896e Ten fixes for the ksmbd kernel server, for improved security and additional buffer overflow checks
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmFzb/4ACgkQiiy9cAdy
 T1HhOQv/bTsGX85V5QFBQyW7F7ymYaSuDpvb/nYFzzs4fMz8/lbCJ6GQHMK7G+uR
 ZYFdZw5c5A+KsfjRoRAKOc6HJbGERnKWvIlRkNWnmcpsXcR3PFhhOxXKXxCTBjga
 fHxAJBENbsyCtGz25pUtuOKVG9zVaz31lyXgnz53b0ATrb8CxIll7zdKTp7aGdK0
 UIWUEYeUwctIhvyBXyms8ni+15keop6/7Y1KhUvodL/VhU4YCkKerFFQojsqLEBk
 4N6ZrgEEvzZPSjMt3KkbAapMNvf8Jgy6hKrB10MWB2sJG3bG1A4MuWYzVz7hlLbx
 KXztSjbEiExCw7Y8ZXeA0pT/P51TVB9uxSLaDJhGVrXIxkZ4exUz2pS0Tp2E8eMH
 4BGylAV9vrZqjmF3HQHJu8c/+f833dwbmMzgDFlFgR01U8NQYQvLf6ZoxmnQxdJC
 5CVzO2rGV1bAVEbCJAN/wnKSCpEzjcN0ruz9bcje8MFF9mXXiJxATIgNahuaj17t
 AvOnAbwd
 =FhM8
 -----END PGP SIGNATURE-----

Merge tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd

Pull ksmbd fixes from Steve French:
 "Ten fixes for the ksmbd kernel server, for improved security and
  additional buffer overflow checks:

   - a security improvement to session establishment to reduce the
     possibility of dictionary attacks

   - fix to ensure that maximum i/o size negotiated in the protocol is
     not less than 64K and not more than 8MB to better match expected
     behavior

   - fix for crediting (flow control) important to properly verify that
     sufficient credits are available for the requested operation

   - seven additional buffer overflow, buffer validation checks"

* tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd:
  ksmbd: add buffer validation in session setup
  ksmbd: throttle session setup failures to avoid dictionary attacks
  ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests
  ksmbd: validate credit charge after validating SMB2 PDU body size
  ksmbd: add buffer validation for smb direct
  ksmbd: limit read/write/trans buffer size not to exceed 8MB
  ksmbd: validate compound response buffer
  ksmbd: fix potencial 32bit overflow from data area check in smb2_write
  ksmbd: improve credits management
  ksmbd: add validation in smb2_ioctl
2021-10-24 06:43:59 -10:00
Linus Torvalds 0f386a604c SCSI fixes on 20211023
Ten fixes, seven of which are in drivers.  The core fixes are one to
 fix a potential crash on resume, one to sort out our reference count
 releases to avoid releasing in-use modules and one to adjust the cmd
 per lun calculation to avoid an overflow in hyper-v.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYXQw8SYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbRFAQCv6lLT
 ZRbWK+uCGlmrVNyutqsRWC9D8KOhxHKOosOSlQEA3d53mBpFuvgyKvZWyybMTkGj
 u77ksdhMbSdr7wRCaIA=
 =F1xU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Ten fixes, seven of which are in drivers.

  The core fixes are one to fix a potential crash on resume, one to sort
  out our reference count releases to avoid releasing in-use modules and
  one to adjust the cmd per lun calculation to avoid an overflow in
  hyper-v"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: ufs-pci: Force a full restore after suspend-to-disk
  scsi: qla2xxx: Fix unmap of already freed sgl
  scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
  scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
  scsi: sd: Fix crashes in sd_resume_runtime()
  scsi: mpi3mr: Fix duplicate device entries when scanning through sysfs
  scsi: core: Put LLD module refcnt after SCSI device is released
  scsi: storvsc: Fix validation for unsolicited incoming packets
  scsi: iscsi: Fix set_param() handling
  scsi: core: Fix shost->cmd_per_lun calculation in scsi_add_host_with_dma()
2021-10-24 06:23:48 -10:00