Threaded interrupts can perform the function of the tasklet, and much
more safely too - without races when trying to take the tasklet and
interrupt down on device removal.
With the old code, there is a window where we call tasklet_kill(). If
the interrupt handler happens to be running on a different CPU, and
subsequently calls tasklet_schedule(), the tasklet will be re-scheduled
for execution.
Switching to a hardirq/threadirq combination implementation avoids this,
and it also means generic code deals with the teardown sequencing of the
threaded and non-threaded parts.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper to map the source scatterlist into the descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper function to perform the descriptor allocation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Strictly, dma_map_sg() may coalesce SG entries, but in practise on iMX
hardware, this will never happen. However, dma_map_sg() can fail, and
we completely fail to check its return value. So, fix this properly.
Arrange the code to map the scatterlist early, so we know how many
scatter table entries to allocate, and then fill them in. This allows
us to keep relatively simple error cleanup paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure that we clean up allocations and DMA mappings after encountering
an error rather than just giving up and leaking memory and resources.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since the extended descriptor includes the hardware descriptor, and the
sec4 scatterlist immediately follows this, we can declare it as a array
at the very end of the extended descriptor. This allows us to get rid
of an initialiser for every site where we allocate an extended
descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mark the hardware descriptor as being cache line aligned; on DMA
incoherent architectures, the hardware descriptor should sit in a
separate cache line from the CPU accessed data to avoid polluting
the caches.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rather than giving the descriptor as hw_desc[0], give it's real size.
All places where we allocate an ahash_edesc incorporate DESC_JOB_IO_LEN
bytes of job descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
caamhash contains this weird code:
src_nents = sg_count(req->src, req->nbytes);
dma_map_sg(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE);
...
edesc->src_nents = src_nents;
sg_count() returns zero when sg_nents_for_len() returns zero or one.
This means we don't need to use a hardware scatterlist. However,
setting src_nents to zero causes problems when we unmap:
if (edesc->src_nents)
dma_unmap_sg_chained(dev, req->src, edesc->src_nents,
DMA_TO_DEVICE, edesc->chained);
as zero here means that we have no entries to unmap. This causes us
to leak DMA mappings, where we map one scatterlist entry and then
fail to unmap it.
This can be fixed in two ways: either by writing the number of entries
that were requested of dma_map_sg(), or by reworking the "no SG
required" case.
We adopt the re-work solution here - we replace sg_count() with
sg_nents_for_len(), so src_nents now contains the real number of
scatterlist entries, and we then change the test for using the
hardware scatterlist to src_nents > 1 rather than just non-zero.
This change passes my sshd, openssl tests hashing /bin and tcrypt
tests.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Properly allocate enough memory to respect the fallback.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the probe function only emits an output on success
when debug is specifically enabled. It would be more useful
if this happens by default.
Signed-off-by: James Hartley <james.hartley@imgtec.com>
Reviewed-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the img-hash accelerator does not probe
successfully due to a change in the checks made during
registration with the crypto framework. This is due to
import and export functions not being defined. Correct
this.
Signed-off-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Current img hash claims sys and periph gate clocks
and this can be gated in system suspend scenarios.
Add support for Device pm ops for img hash to gate
the clocks claimed by img hash.
Signed-off-by: Govindraj Raja <Govindraj.Raja@imgtec.com>
Reviewed-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Burst length of 16 drives the hash accelerator out of spec
and causes stability issues in some cases. Reduce this to
stop data being lost.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move 0 length buffer to end of structure to stop overwriting
fallback request data. This doesn't cause a bug itself as the
buffer is never used alongside the fallback but should be
changed.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Sporadic null pointer exceptions came from here. Fix them.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
"if (!ret == template[i].fail)" is confusing to compilers (gcc5):
crypto/testmgr.c: In function '__test_aead':
crypto/testmgr.c:531:12: warning: logical not is only applied to the
left hand side of comparison [-Wlogical-not-parentheses]
if (!ret == template[i].fail) {
^
Let there be 'if (template[i].fail == !ret) '.
Signed-off-by: Yanjiang Jin <yanjiang.jin@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A second CCP is available, identical to the first, with
its ownn PCI ID. Make it available for use by the crypto
subsystem, as well as for DMA activity and random
number generation.
This device is not pre-configured at at boot time. The
driver must configure it (during the probe) for use.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Every CCP is capable of providing general DMA services.
Register the device as a provider.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Enable equivalent function on a v5 CCP. Add support for a
version 5 CCP which enables AES/XTS/SHA services. Also,
more work on the data structures to virtualize
functionality.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Available queue space is used to decide (by counting free slots)
if we have to put a command on hold or if it can be sent
to the engine immediately.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make the RNG support code common (where possible) in
preparation for adding a v5 device.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the KSB access/management functions to the v3
device file, and add function pointers to the actions
structure. At the operations layer all of the references
to the storage block will be generic (virtual). This is
in preparation for a version 5 device, in which the
private storage block is managed differently.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Form and use of the local storage block in the CCP is
particular to the device version. Much of the code that
accesses the storage block can treat it as a virtual
resource, and will under go some renaming. Device-specific
access to the memory will be moved into device file.
Service functions will be added to the actions
structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use more concise field names; "perform_" is too verbose.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Device-specific values for the BAR and offset should be found
in the version data structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Most error branches following the call to npe_request contain a call to
npe_request. This patch add a call to npe_release to error branches
following the call to npe_request that do not have it.
This issue was found with Hector.
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since 6de62f15b5 ("crypto: algif_hash - Require setkey before
accept(2)"), the AF_ALG interface requires userspace to provide a key
to any algorithm that has a setkey method. However, the non-HMAC
algorithms are not keyed, so setting a key is unnecessary.
Fix this by removing the setkey method from the non-keyed hash
algorithms.
Fixes: 6de62f15b5 ("crypto: algif_hash - Require setkey before accept(2)")
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The optimised crc32c implementation depends on VMX (aka. Altivec)
instructions, so the kernel must be built with Altivec support in order
for the crc32c code to build.
Fixes: 6dd7a82cc5 ("crypto: powerpc - Add POWER8 optimised crc32c")
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To be able to generate shared descriptors for AEAD, the authentication size
needs to be known. However, there is no imposed order of calling .setkey,
.setauthsize callbacks.
Thus, in case authentication size is not known at .setkey time, defer it
until .setauthsize is called.
The authsize != 0 check was incorrectly removed when converting the driver
to the new AEAD interface.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are a few things missed by the conversion to the
new AEAD interface:
1 - echainiv(authenc) encrypt shared descriptor
The shared descriptor is incorrect: due to the order of operations,
at some point in time MATH3 register is being overwritten.
2 - buffer used for echainiv(authenc) encrypt shared descriptor
Encrypt and givencrypt shared descriptors (for AEAD ops) are mutually
exclusive and thus use the same buffer in context state: sh_desc_enc.
However, there's one place missed by s/sh_desc_givenc/sh_desc_enc,
leading to errors when echainiv(authenc(...)) algorithms are used:
DECO: desc idx 14: Header Error. Invalid length or parity, or
certain other problems.
While here, also fix a typo: dma_mapping_error() is checking
for validity of sh_desc_givenc_dma instead of sh_desc_enc_dma.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On 32-bit (e.g. with m68k-linux-gnu-gcc-4.1):
crypto/sha3_generic.c:27: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:28: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:29: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:29: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:31: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:31: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:32: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:32: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:32: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:33: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:33: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:34: warning: integer constant is too large for ‘long’ type
crypto/sha3_generic.c:34: warning: integer constant is too large for ‘long’ type
Fixes: 53964b9ee6 ("crypto: sha3 - Add SHA-3 hash algorithm")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull more block fixes from Jens Axboe:
"As mentioned in the pull the other day, a few more fixes for this
round, all related to the bio op changes in this series.
Two fixes, and then a cleanup, renaming bio->bi_rw to bio->bi_opf. I
wanted to do that change right after or right before -rc1, so that
risk of conflict was reduced. I just rebased the series on top of
current master, and no new ->bi_rw usage has snuck in"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: rename bio bi_rw to bi_opf
target: iblock_execute_sync_cache() should use bio_set_op_attrs()
mm: make __swap_writepage() use bio_set_op_attrs()
block/mm: make bdev_ops->rw_page() take a bool for read/write
Pull drm zpos property support from Dave Airlie:
"This tree was waiting on some media stuff I hadn't had time to get a
stable branchpoint off, so I just waited until it was all in your tree
first.
It's been around a bit on the list and shouldn't affect anything
outside adding the generic API and moving some ARM drivers to using
it"
* tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux:
drm: rcar: use generic code for managing zpos plane property
drm/exynos: use generic code for managing zpos plane property
drm: sti: use generic zpos for plane
drm: add generic zpos property
Since commit 63a4cc2486, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokeness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.
No intended functional changes in this commit.
Signed-off-by: Jens Axboe <axboe@fb.com>
The original commit missed this function, it needs to mark it a
write flush.
Cc: Mike Christie <mchristi@redhat.com>
Fixes: e742fc32fc ("target: use bio op accessors")
Signed-off-by: Jens Axboe <axboe@fb.com>
Commit abf545484d changed it from an 'rw' flags type to the
newer ops based interface, but now we're effectively leaking
some bdev internals to the rest of the kernel. Since we only
care about whether it's a read or a write at that level, just
pass in a bool 'is_write' parameter instead.
Then we can also move op_is_write() and friends back under
CONFIG_BLOCK protection.
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
"make help" if sphinx isn't present.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXo8sIAAoJEI3ONVYwIuV6po0P/0ZZo+YF0GrPvOHr7uuUqAND
0+4WRfSsT74z5Rn/W3apeX6CM7IGBMSR2zM89E2nWmbE2Uo7bIbrwj6C+Y6gMMfd
aws0Xi9899Jr6hVkeFVZ9foze+M2yc3tE1vFBby035uW3Zwyz2XHzaU/9vyLOLkJ
c7jhqCWebqFEqOSWtw2FZYegt2oHSjUsQgGCh3kk2pCU+DzLHntwbblJLeMuTy+h
zPVxTTBcBkUZcIjpkSvhqc/oCLCiWKLElmwxPBwfpNU9UlE0rol2Lx1eLClxadFl
QVlb1UAIjPcLnHQoM8dL9NR0tkfGopIDuNCL26GU5ie9N4zurOj5a6hj+G5mZKLB
tsMqIw+N7ig5FnaQhaCx3oN/VMZ0djxURu9XvKsHBmOCd2Bp8SDoqpCkTwCqCxcN
DVdUjpS1WUT9w2A1jhH32mx+23eRwJa5oaTFpM3Y0z7Bl9N40ScY2WJcgBKWqHgx
LRROJAzNOPojbBkwTDNsRValwgtutCcqaRw5mNQTp3YjjmltmqylCvJH3AST+z5r
CmMDO96O3rUGsCZYoBhxafC2FUUh5RkUwQq/Cy8nrioMookE3Yd5A9DN6wWQ2pTt
tev/z6s3ov8dygeF6u3noOHCa8GPUpSHO62FyHUKYnn6Tl8xh3x7rmUkUqrJZi5a
dnXOZzp34eVhev5xDeDN
=iD7L
-----END PGP SIGNATURE-----
Merge tag 'doc-4.8-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"Three fixes for the docs build, including removing an annoying warning
on 'make help' if sphinx isn't present"
* tag 'doc-4.8-fixes' of git://git.lwn.net/linux:
DocBook: use DOCBOOKS="" to ignore DocBooks instead of IGNORE_DOCBOOKS=1
Documenation: update cgroup's document path
Documentation/sphinx: do not warn about missing tools in 'make help'
First off, the intention of this pull is to declare that I'll be the
binfmt_misc maintainer (mainly on the grounds of you touched it last,
it's yours). There's no MAINTAINERS entry, but get_maintainers.pl
will now finger me.
The update itself is to allow architecture emulation containers to
function such that the emulation binary can be housed outside the
container itself. The container and fs parts both have acks from
relevant experts.
The change is user visible. To use the new feature you have to add an
F option to your binfmt_misc configuration. However, the existing
tools, like systemd-binfmt work with this without modification.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJXmW5WAAoJEAVr7HOZEZN4K1QQAKgx5MPkoTU3QKKgzaMBBnWH
pSMdoN8BhVSwENE/YJGMEyLaRa0zmrHVtFcnH2CHQE/GoXNnaej9l3LtBIwJ9K2P
nrv4Rlhla5BxjhDkg8IWf3iG7iKDDHGZoyuVPx4dwxHFK1yCNH4SDeHaJCKK5qsC
aLltMJMRnjsgJvBUC01dCUlp8srkWywHcyk9M9ic/Fr5vJ6JzdUr6/Md29eHmAXe
NgCGwkVgSDiKfnTGZjIMsAtpwPsJ6RqBWQTcTdM/mkIpqwrMiVuaVOHqu2cmMU2i
j4cQE6rQpy3sedDKZbHBQMOfYJNT4QYgYGuvyIWce9EPkIpOWHzQ7kYPJ/A/jZCE
lN37TeyodbUDCnyuKk1YOrTBjJ0qdtc4FXJ1aq5s92GkgDs+LtxMdGzKDf3yUGiU
W0TsE/wVy4rmEaeiyut33661ud4vivP4WklWK1Y+bklQcIcKQKKWnOCnDFDR5vuz
CbL5ykVcJb3F28YhGYHvGLeXl0YcR3SwngWnnPCDPtBCeSirohuKb1SEe21C/RaB
rm9S27d+LcKCXJyCqKh8BGsqroZ0iSZQI0Lbdqt+BCuuBw2rQhGStDeccDDUp9jg
MOwpQwabjEseK0n75+hZ2SFS5Q+TQ6pccMlUJIDiBKWmRly8NpKlSKKWvBX8obIe
0Gq6hgX1IwQnXI1O8QMC
=6OjN
-----END PGP SIGNATURE-----
Merge tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc
Pull binfmt_misc update from James Bottomley:
"This update is to allow architecture emulation containers to function
such that the emulation binary can be housed outside the container
itself. The container and fs parts both have acks from relevant
experts.
To use the new feature you have to add an F option to your binfmt_misc
configuration"
From the docs:
"The usual behaviour of binfmt_misc is to spawn the binary lazily when
the misc format file is invoked. However, this doesn't work very well
in the face of mount namespaces and changeroots, so the F mode opens
the binary as soon as the emulation is installed and uses the opened
image to spawn the emulator, meaning it is always available once
installed, regardless of how the environment changes"
* tag 'binfmt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/binfmt_misc:
binfmt_misc: add F option description to documentation
binfmt_misc: add persistent opened binary handler for containers
fs: add filp_clone_open API
In most cases, EPERM is returned on immutable inode, and there're only a
few places returning EACCES. I noticed this when running LTP on
overlayfs, setxattr03 failed due to unexpected EACCES on immutable
inode.
So converting all EACCES to EPERM on immutable inode.
Acked-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull more vfs updates from Al Viro:
"Assorted cleanups and fixes.
In the "trivial API change" department - ->d_compare() losing 'parent'
argument"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
cachefiles: Fix race between inactivating and culling a cache object
9p: use clone_fid()
9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
vfs: make dentry_needs_remove_privs() internal
vfs: remove file_needs_remove_privs()
vfs: fix deadlock in file_remove_privs() on overlayfs
get rid of 'parent' argument of ->d_compare()
cifs, msdos, vfat, hfs+: don't bother with parent in ->d_compare()
affs ->d_compare(): don't bother with ->d_inode
fold _d_rehash() and __d_rehash() together
fold dentry_rcuwalk_invalidate() into its only remaining caller
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXpRdVAAoJEK3oKUf0dfod2tkP/24f1Znl9OQEPHoSZty9nXF0
dSjOzE2lHbR4xjjuYjbn1siFnIX0A5nPPqleBYmt3gatiO+24vE1BiNWjM6Y/y7r
3KHENRqmfSj26ha6wl/TUNaKnuFooBcQ0BaHI1IExFROitOSvZgPJPSrk29AH/Er
OVJkaoi3N3o9mrfUpF9/M55Yi/DhQiPBYxkqcXvaqcakbL91EIj5TLZ72MJqgfje
d6og33zxb21EDx9eIJEA0cWX4MLO2UQqFAuiJLzk2RkSAm6vRjbRJyYGG9jv81tP
9ZX1gAw47v0qk3nPVyAgbi862ukYCYzmr1g2b4S2b0UKLXxQb8Fw8D2mRbFXl2wg
wq0nKLg9jwsd8Yo7k8qOrUI9nl/E9Ytmj8t92Y49XvPjtsVFZREoCw3ojyjmlyZA
9BywL5BzMHF6SsXe6LBGJpoebrxCnq5176FREBnpmH7UHM0BcWa4YSekQShwg3DW
PFlBOxk5saz4Ktr5V3YUY+G6XgZ/AXWKlDox5+dESLIOgG0hyzbiVbPNSTQgDrnR
m9yUJPef1NQj2JWSZbqKn7FSZDO6/IT2aeokn1KuoaDJww5HC80juyB1VThmpZnl
QJGN6nmsYDVCLYjbT6scAzyGMYw9ZVhTM7eEk3kqAtCBf/nEyqJM+H0HYUDjfg9B
cG5cRtZNDDkc30lFezJX
=nXKv
-----END PGP SIGNATURE-----
Merge tag 'xfs-rmap-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull more xfs updates from Dave Chinner:
"This is the second part of the XFS updates for this merge cycle, and
contains the new reverse block mapping feature for XFS.
Reverse mapping allows us to track the owner of a specific block on
disk precisely. It is implemented as a set of btrees (one per
allocation group) that track the owners of allocated extents.
Effectively it is a "used space tree" that is updated when we allocate
or free extents. i.e. it is coherent with the free space btrees we
already maintain and never overlaps with them.
This reverse mapping infrastructure is the building block of several
upcoming features - reflink, copy-on-write data, dedupe, online
metadata and data scrubbing, highly accurate bad sector/data loss
reporting to users, and significantly improved reconstruction of
damaged and corrupted filesystems. There's a lot of new stuff coming
along in the next couple of cycles,a nd it all builds in the rmap
infrastructure.
As such, it's a huge chunk of new code with new on-disk format
features and internal infrastructure. It warns at mount time as an
experimental feature and that it may eat data (as we do with all new
on-disk features until they stabilise). We have not released
userspace suport for it yet - userspace support currently requires
download from Darrick's xfsprogs repo and build from source, so the
access to this feature is really developer/tester only at this point.
Initial userspace support will be released at the same time kernel
with this code in it is released.
The new rmap enabled code regresses 3 xfstests - all are ENOSPC
related corner cases, one of which Darrick posted a fix for a few
hours ago. The other two are fixed by infrastructure that is part of
the upcoming reflink patchset. This new ENOSPC infrastructure
requires a on-disk format tweak required to keep mount times in
check - we need to keep an on-disk count of allocated rmapbt blocks so
we don't have to scan the entire btrees at mount time to count them.
This is currently being tested and will be part of the fixes sent in
the next week or two so users will not be exposed to this change"
* tag 'xfs-rmap-for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (52 commits)
xfs: move (and rename) the deferred bmap-free tracepoints
xfs: collapse single use static functions
xfs: remove unnecessary parentheses from log redo item recovery functions
xfs: remove the extents array from the rmap update done log item
xfs: in btree_lshift, only allocate temporary cursor when needed
xfs: remove unnecesary lshift/rshift key initialization
xfs: remove the get*keys and update_keys btree ops pointers
xfs: enable the rmap btree functionality
xfs: don't update rmapbt when fixing agfl
xfs: disable XFS_IOC_SWAPEXT when rmap btree is enabled
xfs: add rmap btree block detection to log recovery
xfs: add rmap btree geometry feature flag
xfs: propagate bmap updates to rmapbt
xfs: enable the xfs_defer mechanism to process rmaps to update
xfs: log rmap intent items
xfs: create rmap update intent log items
xfs: add rmap btree insert and delete helpers
xfs: convert unwritten status of reverse mappings
xfs: remove an extent from the rmap btree
xfs: add an extent to the rmap btree
...