This makes moves the knowledge about bouncing out of the callers into the
block core (just like we do for the normal I/O path), and allows to unexport
blk_queue_bounce.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of explicitly calling scsi_req_init() after blk_get_request(),
call that function from inside blk_get_request(). Add an
.initialize_rq_fn() callback function to the block drivers that need
it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn()
because it is too small to keep it as a separate function. Keep the
scsi_req_init() call in ide_prep_sense() because it follows a
blk_rq_init() call.
References: commit 82ed4db499 ("block: split scsi_request out of struct request")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently we use nornal Linux errno values in the block layer, and while
we accept any error a few have overloaded magic meanings. This patch
instead introduces a new blk_status_t value that holds block layer specific
status codes and explicitly explains their meaning. Helpers to convert from
and to the previous special meanings are provided for now, but I suspect
we want to get rid of them in the long run - those drivers that have a
errno input (e.g. networking) usually get errnos that don't know about
the special block layer overloads, and similarly returning them to userspace
will usually return somethings that strictly speaking isn't correct
for file system operations, but that's left as an exercise for later.
For now the set of errors is a very limited set that closely corresponds
to the previous overloaded errno values, but there is some low hanging
fruite to improve it.
blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
typechecking, so that we can easily catch places passing the wrong values.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
We will only have sense data if the command executed and got a SCSI
result, so this is pointless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Here is the big set of new char/misc driver drivers and features for
4.12-rc1.
There's lots of new drivers added this time around, new firmware drivers
from Google, more auxdisplay drivers, extcon drivers, fpga drivers, and
a bunch of other driver updates. Nothing major, except if you happen to
have the hardware for these drivers, and then you will be happy :)
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWQvAgg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yknsACgzkAeyz16Z97J3UTaeejbR7nKUCAAoKY4WEHY
8O9f9pr9gj8GMBwxeZQa
=OIfB
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of new char/misc driver drivers and features for
4.12-rc1.
There's lots of new drivers added this time around, new firmware
drivers from Google, more auxdisplay drivers, extcon drivers, fpga
drivers, and a bunch of other driver updates. Nothing major, except if
you happen to have the hardware for these drivers, and then you will
be happy :)
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits)
firmware: google memconsole: Fix return value check in platform_memconsole_init()
firmware: Google VPD: Fix return value check in vpd_platform_init()
goldfish_pipe: fix build warning about using too much stack.
goldfish_pipe: An implementation of more parallel pipe
fpga fr br: update supported version numbers
fpga: region: release FPGA region reference in error path
fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe()
mei: drop the TODO from samples
firmware: Google VPD sysfs driver
firmware: Google VPD: import lib_vpd source files
misc: lkdtm: Add volatile to intentional NULL pointer reference
eeprom: idt_89hpesx: Add OF device ID table
misc: ds1682: Add OF device ID table
misc: tsl2550: Add OF device ID table
w1: Remove unneeded use of assert() and remove w1_log.h
w1: Use kernel common min() implementation
uio_mf624: Align memory regions to page size and set correct offsets
uio_mf624: Refactor memory info initialization
uio: Allow handling of non page-aligned memory regions
hangcheck-timer: Fix typo in comment
...
This passes on the scsi_cmnd result field to users of passthrough
requests. Currently we abuse req->errors for this purpose, but that
field will go away in its current form.
Note that the old IDE code abuses the errors field in very creative
ways and stores all kinds of different values in it. I didn't dare
to touch this magic, so the abuses are brought forward 1:1.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The function only returns -EIO if rq->errors is non-zero, which is not
very useful and lets a large number of callers ignore the return value.
Just let the callers figure out their error themselves.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Instead of bloating the generic struct request with it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
We don't call the remove() function unless probe() succeeds so "oud"
can't be NULL here. Plus, if it were NULL, we dereference it on the
next line so it would crash anyway.
[mkp: applied by hand]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This driver did not set kobj.parent so it likely suffered from
a potential use after free race if the user unregistered the
device while it was in use.
This was not so straightforward a conversion but I think this patch
cleans up its probe's error path significantly.
This patch adds device_initialize, which is required for
cdev_device_add. Then it switches to put_device instead of kfree as
recommended by device_initialize's documentation. This removes a lot
from the error path which was already in __remove.
A couple things needed to be re-ordered to be entirely correct, though.
ida_remove is also moved out of __remove and into unregister to
simplify things and follow the pattern other devices are using.
This also drop an extra unnecessary get_device/put_device in the code.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is the set of stuff that didn't quite make the initial pull and a
set of fixes for stuff which did. The new stuff is basically lpfc
(nvme), qedi and aacraid. The fixes cover a lot of previously
submitted stuff, the most important of which probably covers some of
the failing irq vectors allocation and other fallout from having the
SCSI command allocated as part of the block allocation functions.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYug/3AAoJEAVr7HOZEZN4bRwP/RvAW/NWlvs812gUeHNahUHx
HF3EtY5YHneZev1L4fa1Myd8cNcB3gic53e1WvQhN5f+LczeI7iXHH8xIQK/B4GT
OI5W7ZCGlhIamgsU8tUFGrtIOM6N/LIMZPMsaE62dcev0jC0Db4p/5yv5bUg1yuL
JOLWqIaTYTx7b5FwR2cXKv7KGYpXonvcoVUDV6s8aMYToJhJiQ6JtWmjaAA+/OJ7
HTcqNX/KSWQBG3ERjE3A/rGBFxPR9KPq8kpqVJlw39KEN9tMjXb0QAGvAeauNye2
xaIjV41tXH1Ys3aoh6ZU+TpzN3HLlH/gegRe8671fUwqu/RnSzFC8Eidoddzgqne
z5BF0Dy36FA5ol2HTFEwS5WM1gRM+z6YGnbhpE/XfyTL2sLHRynSXKxb02IdjRNi
gCdV89AL8xdfPPFJjsx4hGWUJhalW/RwuPayAupePKGQHePabHa19McI2ssg0WBg
OC6uuEk7vT95xuTZVJlrwXTJPnc9dxWJwzh21oEnvMfWM0VMW06EKT4AcX8FOsjL
RswiYp9wl8JDmfNxyJnQJj0+QXeIE/gh35vBQCGsLVKY4+xzbi3yZfb1sZB0XOum
yLK2LFQ+Ltk1TaPPSZsgXQZfodot0dFJPTnWr+whOfY1kepNlFRwQHltXnCWzgRs
QG//WHsk0u1Mj2gehwmQ
=e+6v
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is the set of stuff that didn't quite make the initial pull and a
set of fixes for stuff which did.
The new stuff is basically lpfc (nvme), qedi and aacraid. The fixes
cover a lot of previously submitted stuff, the most important of which
probably covers some of the failing irq vectors allocation and other
fallout from having the SCSI command allocated as part of the block
allocation functions"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (59 commits)
scsi: qedi: Fix memory leak in tmf response processing.
scsi: aacraid: remove redundant zero check on ret
scsi: lpfc: use proper format string for dma_addr_t
scsi: lpfc: use div_u64 for 64-bit division
scsi: mac_scsi: Fix MAC_SCSI=m option when SCSI=m
scsi: cciss: correct check map error.
scsi: qla2xxx: fix spelling mistake: "seperator" -> "separator"
scsi: aacraid: Fixed expander hotplug for SMART family
scsi: mpt3sas: switch to pci_alloc_irq_vectors
scsi: qedf: fixup compilation warning about atomic_t usage
scsi: remove scsi_execute_req_flags
scsi: merge __scsi_execute into scsi_execute
scsi: simplify scsi_execute_req_flags
scsi: make the sense header argument to scsi_test_unit_ready mandatory
scsi: sd: improve TUR handling in sd_check_events
scsi: always zero sshdr in scsi_normalize_sense
scsi: scsi_dh_emc: return success in clariion_std_inquiry()
scsi: fix memory leak of sdpk on when gd fails to allocate
scsi: sd: make sd_devt_release() static
scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.
...
Now that %z is standartised in C99 there is no reason to support %Z.
Unlike %L it doesn't even make format strings smaller.
Use BUILD_BUG_ON in a couple ATM drivers.
In case anyone didn't notice lib/vsprintf.o is about half of SLUB which
is in my opinion is quite an achievement. Hopefully this patch inspires
someone else to trim vsprintf.c more.
Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typos and add the following to the scripts/spelling.txt:
embeded||embedded
Link: http://lkml.kernel.org/r/1481573103-11329-12-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's a tiny structure that can be allocated on the stack, don't
complicate the code by making it optional.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of keeping two levels of indirection for requests types, fold it
all into the operations. The little caveat here is that previously
cmd_type only applied to struct request, while the request and bio op
fields were set to plain REQ_OP_READ/WRITE even for passthrough
operations.
Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
private requests, althought it has to add two for each so that we
can communicate the data in/out nature of the request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
And require all drivers that want to support BLOCK_PC to allocate it
as the first thing of their private data. To support this the legacy
IDE and BSG code is switched to set cmd_size on their queues to let
the block layer allocate the additional space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
A lot of the REQ_* flags are only used on struct requests, and only of
use to the block layer and a few drivers that dig into struct request
internals.
This patch adds a new req_flags_t rq_flags field to struct request for
them, and thus dramatically shrinks the number of common requests. It
also removes the unfortunate situation where we have to fit the fields
from the same enum into 32 bits for struct bio and 64 bits for
struct request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
I wish the OSD code could simply use blk_rq_map_* helpers like
everyone else, but the complex nature of deciding if we have
DATA IN and/or DATA OUT buffers might make this impossible
(at least for a mere human like me).
But using blk_rq_append_bio at least allows sharing the setup code
between request with or without dat a buffers, and given that this
is the last user of blk_make_request it allows getting rid of that
somewhat awkward interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch converts the simple bi_rw use cases in the block,
drivers, mm and fs code to set/get the bio operation using
bio_set_op_attrs/bio_op
These should be simple one or two liner cases, so I just did them
in one patch. The next patches handle the more complicated
cases in a module per patch.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
We currently set REQ_WRITE/WRITE for all non READ IOs
like discard, flush, writesame, etc. In the next patches where we
no longer set up the op as a bitmap, we will not be able to
detect a operation direction like writesame by testing if REQ_WRITE is
set.
This patch converts the drivers and cgroup to use the
op_is_write helper. This should just cover the simple
cases. I did dm, md and bcache in their own patches
because they were more involved.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The variable is_ver1 is always true and so OSD_CAP_LEN can never be
used.
Reported by Coverity.
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Boaz harrosh <ooo@elecrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If char is signed and one of these bytes happen to have a value outside
the ascii range, the corresponding output will consist of "ffffff"
followed by the two hex chars that were actually intended. One way to
fix it would be to change the casts to (u8*) aka
(unsigned char*), but it is much simpler (and generates smaller code)
to use the %ph extension which was created for such short hexdumps.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull trivial tree from Jiri Kosina:
"Usual trivial tree updates. Nothing outstanding -- mostly printk()
and comment fixes and unused identifier removals"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
goldfish: goldfish_tty_probe() is not using 'i' any more
powerpc: Fix comment in smu.h
qla2xxx: Fix printks in ql_log message
lib: correct link to the original source for div64_u64
si2168, tda10071, m88ds3103: Fix firmware wording
usb: storage: Fix printk in isd200_log_config()
qla2xxx: Fix printk in qla25xx_setup_mode
init/main: fix reset_device comment
ipwireless: missing assignment
goldfish: remove unreachable line of code
coredump: Fix do_coredump() comment
stacktrace.h: remove duplicate declaration task_struct
smpboot.h: Remove unused function prototype
treewide: Fix typo in printk messages
treewide: Fix typo in printk messages
mod_devicetable: fix comment for match_flags
The driver core driver structure has grown an owner field and now
requires it to be set for all modular drivers. Set it up for
all scsi_driver instances and get rid of the now superflous
scsi_driver owner field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Shane M Seymour <shane.seymour@hp.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
The blk_get_request function may fail in low-memory conditions or during
device removal (even if __GFP_WAIT is set). To distinguish between these
errors, modify the blk_get_request call stack to return the appropriate
ERR_PTR. Verify that all callers check the return status and consider
IS_ERR instead of a simple NULL pointer check.
For consistency, make a similar change to the blk_mq_alloc_request leg
of blk_get_request. It may fail if the queue is dead, or the caller was
unwilling to wait.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
With the optimizations around not clearing the full request at alloc
time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC
up to the user allocating the request.
Add a blk_rq_set_block_pc() that sets the command type to
REQ_TYPE_BLOCK_PC, and properly initializes the members associated
with this type of request. Update callers to use this function instead
of manipulating rq->cmd_type directly.
Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed
attempt.
Signed-off-by: Jens Axboe <axboe@fb.com>
The dev_attrs field of struct class is going away soon, dev_groups
should be used instead. This converts the scsi osd class code to use
the correct field.
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: James E.J. Bottomley <JBottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Disk names may contain arbitrary strings, so they must not be
interpreted as format strings. It seems that only md allows arbitrary
strings to be used for disk names, but this could allow for a local
memory corruption from uid 0 into ring 0.
CVE-2013-2851
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Correct spelling typos in various part of printk.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This is an assorted set of stragglers into the merge window with driver
updates for qla2xxx, megaraid_sas, storvsc and ufs. It also includes pulls of
the uapi tree (all the remaining SCSI pieces) and the fcoe tree (updates to
fcoe and libfc)
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJRMHJHAAoJEDeqqVYsXL0M9tAH/2YG3TCfy0RFAejGgLfX9OGH
6eFe71m7Z8nfIEneAnm5BuKjCx3QFRp5UFjJZdFHLP1Qv0TbpKs4FnZyeSGTxLQp
S1fZc5sTWmsb5qYxLaukKopC6sFx+hNI2dvB+rgKcd+nWy1RzG7lGqbS4CRNE76q
UNByqlfqJxn5cfQw7dg2zOUKlGaGL2jSyFf0QFXR2IZzO33PeyBPfKDFeJC6b+oc
XTy9MK9V5u6ne3XimDTU2hP4lPAsZaJtcqsv1Gvv2y+BHalQiPqfL6bZMvN3Zbfq
hfT+i2xnYy85858gxtyIhzHwU14zF7I0HEWiVpddsF9NDK7iNKvK8aWHaTs7qis=
=hvGQ
-----END PGP SIGNATURE-----
Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is an assorted set of stragglers into the merge window with
driver updates for qla2xxx, megaraid_sas, storvsc and ufs.
It also includes pulls of the uapi tree (all the remaining SCSI
pieces) and the fcoe tree (updates to fcoe and libfc)"
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (81 commits)
[SCSI] ufs: Separate PCI code into glue driver
[SCSI] ufs: Segregate PCI Specific Code
[SCSI] scsi: fix lpfc build when wmb() is defined as mb()
[SCSI] storvsc: Handle dynamic resizing of the device
[SCSI] storvsc: Restructure error handling code on command completion
[SCSI] storvsc: avoid usage of WRITE_SAME
[SCSI] aacraid: suppress two GCC warnings
[SCSI] hpsa: check for dma_mapping_error in hpsa_passthru ioctls
[SCSI] hpsa: reorganize error handling in hpsa_passthru_ioctl
[SCSI] hpsa: check for dma_mapping_error in hpsa_map_sg_chain_block
[SCSI] hpsa: Check for dma_mapping_error for all code paths using fill_cmd
[SCSI] hpsa: Check for dma_mapping_error in hpsa_map_one
[SCSI] dc395x: uninitialized variable in device_alloc()
[SCSI] Fix range check in scsi_host_dif_capable()
[SCSI] storvsc: Initialize the sglist
[SCSI] mpt2sas: Add support for OEM specific controller
[SCSI] ipr: Fix oops while resetting an ipr adapter
[SCSI] fnic: Fnic Trace Utility
[SCSI] fnic: New debug flags and debug log messages
[SCSI] fnic: fnic driver may hit BUG_ON on device reset
...
There wasn't any error handling for this kzalloc().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
All in-kernel users of class_find_device() don't really need mutable
data for match callback.
In two places (kernel/power/suspend_test.c, drivers/scsi/osd/osd_uld.c)
this patch changes match callbacks to use const search data.
The const is propagated to rtc_class_open() and power_supply_get_by_name()
parameters.
Note that there's a dev reference leak in suspend_test.c that's not
touched in this patch.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the support for the following two read-only sysfs attributes
to scsi_osd class members : osdname & systemid
These attributes will show up as below in sysfs class hierarchy:
/sys/class/scsi_osd/osdX/osdname
/sys/class/scsi_osd/osdX/systemid
The osdname & systemid are OSD device attributes which uniquely
identify a device on the network, while it's IP and certainly
it's /dev/osdX device path might change.
Userspace utilities (e.g. mkfs.exofs) can parse these attributes to
identify the correct OSD in safer and faster way.
(Today osd apps open each device in the system and send a
attributes query for these, in order to access the user
requested device)
Signed-off-by: Sachin Bhamare <sbhamare@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It used to be that minors where 8 bit. But now they
are actually 20 bit. So the fix is simplicity itself.
I've tested with 300 devices and all user-mode utils
work just fine. I have also mechanically added 10,000
to the ida (so devices are /dev/osd10000, /dev/osd10001 ...)
and was able to mkfs an exofs filesystem and access osds
from user-mode.
All the open-osd user-mode code uses the same library
to access devices through their symbolic names in
/dev/osdX so I'd say it's pretty safe. (Well tested)
This patch is very important because some of the systems
that will be deploying the 3.2 pnfs-objects code are larger
than 64 OSDs and will stop to work properly when reaching
that number.
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The module.h header was implicitly present everywhere, so files
with no explicit include of the module infrastructure would build
anyway. We are now removing the implicit include, and so we need
to call out the module.h file that we need explicitly.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
The OSD protocol calls for all kind of security levels that use
CRYPTO_HMAC and SH1, but the current code only supports NO_SEC,
which does not use any of these.
Remove a wrong FIXME that calls for them. Thanks Maxin for
reporting on this.
Reported-by: "Maxin B. John" <maxin.john@gmail.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Since sg-read is a bidi operation, it is a gain to convert
a single sg entry into a regular read. Better do this in the
generic layer then force each caller to do so.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
bio_map_kern() returns ERR_PTRs on failure and never returns NULL.
[jejb: remove redundant unlikely spotted by Tobias Klauser]
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is a trivial addition to the SG API that can receive kernel
pointers. It is only used by the out-of-tree test module. So
it's immediate need is questionable. For maintenance ease it might
just get in, as it's very small.
John.
do you need this in the Kernel, or is it only for osd_ktest.ko?
Signed-off-by: John A. Chandy <john.chandy@uconn.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds the Scatter-Gather (sg) API to libosd.
Scatter-gather enables a write/read of multiple none-contiguous
areas of an object, in a single call. The extents may overlap
and/or be in any order.
The Scatter-Gather list is sent to the target in what is called
a "cdb continuation segment". This is yet another possible segment
in the osd-out-buffer. It is unlike all other segments in that it
sits before the actual "data" segment (which until now was always
first), and that it is signed by itself and not part of the data
buffer. This is because the cdb-continuation-segment is considered
a spill-over of the CDB data, and is therefor signed under
OSD_SEC_CAPKEY and higher.
TODO: A new osd_finalize_request_ex version should be supplied so
the @caps received on the network also contains a size parameter
and can be spilled over into the "cdb continuation segment".
Thanks to John Chandy <john.chandy@uconn.edu> for the original
code, and investigations. And the implementation of SG support
in the osd-target.
Original-coded-by: John Chandy <john.chandy@uconn.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
At osd_end_request first free the request that might
point to pages, then free these pages. In reverse order
of allocation. For now it's just anal neatness. When we'll
use mempools It'll also pay in performance.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The _osd_req_finalize_attr_page was off by a mile, when trying to
append the enc_get_attr segment instead of the proper set_attr segment.
Also properly support when we don't have any attribute to set while
getting a full page. And when clearing an attribute by setting it's
size to zero.
Reported-by: John Chandy <john.chandy@uconn.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.
The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.
New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time. Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.
Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.
Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.
===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
// but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}
@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}
@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}
@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}
@ fops0 @
identifier fops;
@@
struct file_operations fops = {
...
};
@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
.llseek = llseek_f,
...
};
@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
.read = read_f,
...
};
@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
.write = write_f,
...
};
@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
.open = open_f,
...
};
// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
... .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};
@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
... .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};
// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
... .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};
// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};
// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};
@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+ .llseek = default_llseek, /* write accesses f_pos */
};
// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////
@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
.write = write_f,
.read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};
@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};
@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};
@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Return of the bi_rw tests is no longer bool after commit 74450be1. So
testing against constants doesn't make sense anymore. Fix this bug in
osd_req_read by removing "== 1" in test.
This is not a problem now, where REQ_WRITE is 1, but this can change
in the future and we don't want to rely on that.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>