linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Yan, Zheng	4d1d0534f5	ceph: Hold caps_list_lock when adjusting caps_{use, total}_count Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>	2012-11-04 03:08:24 -08:00
Alex Elder	9e15b77d9a	rbd: get additional info in parent spec When a layered rbd image has a parent, that parent is identified only by its pool id, image id, and snapshot id. Images that have been mapped also record names for those three id's. Add code to look up these names for parent images so they match mapped images more closely. Skip doing this for an image if it already has its pool name defined (this will be the case for images mapped by the user). It is possible that the name of a parent image can't be determined, even if the image id is valid. If this occurs it does not preclude correct operation, so don't treat this as an error. On the other hand, defined pools will always have both an id and a name. And any snapshot of an image identified as a parent for a clone image will exist, and will have a name (if not it indicates some other internal error). So treat failure to get these bits of information as errors. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	72afc71ffc	libceph: define ceph_pg_pool_name_by_id() Define and export function ceph_pg_pool_name_by_id() to supply the name of a pg pool whose id is given. This will be used by the next patch. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	86b00e0da6	rbd: get parent spec for version 2 images Add support for getting the information identifying the parent image for rbd images that have them. The child image holds a reference to its parent image specification structure. Create a new entry "parent" in /sys/bus/rbd/image/N/ to report the identifying information for the parent image, if any. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	a92ffdf8a9	rbd: allow null image name Format 2 parent images are partially identified by their image id, but it may not be possible to determine their image name. The name is not strictly needed for correct operation, so we won't be treating it as an error if we don't know it. Handle this case gracefully in rbd_name_show(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	2c0d0a10ea	rbd: allow null image name We will know the image id for format 2 parent images, but won't initially know its image name. Avoid making the query for an image id in rbd_dev_image_id() if it's already known. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	83a0626362	rbd: encapsulate last part of probe Group the activities that now take place after an rbd_dev_probe() call into a single function, and move the call to that function into rbd_dev_probe() itself. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	c53d589337	rbd: define rbd_dev_{create,destroy}() helpers Encapsulate the creation/initialization and destruction of rbd device structures. The rbd_client and the rbd_spec structures provided on creation hold references whose ownership is transferred to the new rbd_device structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	bd4ba6554d	rbd: consolidate rbd_dev init in rbd_add() Group the allocation and initialization of fields of the rbd device structure created in rbd_add(). Move the grouped code down later in the function, just prior to the call to rbd_dev_probe(). This is for the most part simple code movement. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	9d3997fdf4	rbd: don't pass rbd_dev to rbd_get_client() The only reason rbd_dev is passed to rbd_get_client() is so its rbd_client field can get assigned. Instead, just return the rbd_client pointer as a result and have the caller do the assignment. Change rbd_put_client() so it takes an rbd_client structure, so it follows the more typical symmetry with rbd_get_client(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	859c31df9c	rbd: fill rbd_spec in rbd_add_parse_args() Pass the address of an rbd_spec structure to rbd_add_parse_args(). Use it to hold the information defining the rbd image to be mapped in an rbd_add() call. Use the result in the caller to initialize the rbd_dev->id field. This means rbd_dev is no longer needed in rbd_add_parse_args(), so get rid of it. Now that this transformation of rbd_add_parse_args() is complete, correct and expand on its header documentation to reflect the new reality. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Alex Elder	8b8fb99c5c	rbd: add reference counting to rbd_spec With layered images we'll share rbd_spec structures, so add a reference count to it. It neatens up some code also. A silly get/put pair is added to the alloc routine just to avoid "defined but not used" warnings. It will go away soon. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:41 -05:00
Alex Elder	0d7dbfce9d	rbd: define image specification structure Group the fields that uniquely specify an rbd image into a new reference-counted rbd_spec structure. This structure will be used to describe the desired image when mapping an image, and when probing parent images in layered rbd devices. Replace the set of fields in the rbd device structure with a pointer to a dynamically allocated rbd_spec. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:30 -05:00
Alex Elder	dc79b113d6	rbd: have rbd_add_parse_args() return error Change the interface to rbd_add_parse_args() so it returns an error code rather than a pointer. Return the ceph_options result via a pointer whose address is passed as an argument. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:30 -05:00
Alex Elder	4e9afeba7a	rbd: pass and populate rbd_options structure Have the caller pass the address of an rbd_options structure to rbd_add_parse_args(), to be initialized with the information gleaned as a result of the parse. I know, this is another near-reversal of a recent change... Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	819d52bf72	rbd: remove snap_name arg from rbd_add_parse_args() The snapshot name returned by rbd_add_parse_args() just gets saved in the rbd_dev eventually. So just do that inside that function and do away with the snap_name argument, both in rbd_add_parse_args() and rbd_dev_set_mapping(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	f28e565a1b	rbd: remove options args from rbd_add_parse_args() The "options" argument to rbd_add_parse_args() (and its partner options_size) is now only needed within the function, so there's no need to have the caller allocate and pass the options buffer. Just allocate the options buffer within the function using dup_token(). Also distinguish between failures due to failed memory allocation and failing because a required argument was missing. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	e5c3553404	rbd: get rid of snap_name_len The value returned in the "snap_name_len" argument to rbd_add_parse_args() is never actually used, so get rid of it. The snap_name_len recorded in rbd_dev_v2_snap_name() is not useful either, so get rid of that too. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	0ddebc0c6c	rbd: do all argument parsing in one place This patch makes rbd_add_parse_args() be the single place all argument parsing occurs for an image map request: - Move the ceph_parse_options() call into that function - Use local variables rather than parameters to hold the list of monitor addresses supplied - Rather than returning it, pass the snapshot name (and its length) back via parameters - Have the function return a ceph_options structure pointer Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	78cea76e05	rbd: move ceph_parse_options() call up Move option parsing out of rbd_get_client() and into its caller. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	daba5fdb4c	rbd: rename snap_exists field A Boolean field "snap_exists" in an rbd mapping is used to indicate whether a mapped snapshot has been removed from an image's snapshot context, to stop sending requests for that snapshot as soon as we know it's gone. Generalize the interpretation of this field so it applies to non-snapshot (i.e. "head") mappings. That is, define its value to be false until the mapping has been set, and then define it to be true for both snapshot mappings or head mappings. Rename the field "exists" to reflect the broader interpretation. The rbd_mapping structure is on its way out, so move the field back into the rbd_device structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	971f839a76	rbd: move snap info out of rbd_mapping struct Moving the snap_id and snap_name fields into the separate rbd_mapping structure was misguided. (And in time, perhaps we'll do away with that structure altogether...) Move these fields back into struct rbd_device. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	86992098e7	rbd: make pool_id a 64 bit value If a format 2 image has a parent, its pool id will be specified using a 64-bit value. Change the pool id we save for an image to match that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:29 -05:00
Alex Elder	41f38c2b2f	rbd: remove snapshots on error in rbd_add() If rbd_dev_snaps_update() has ever been called for an rbd device structure there could be snapshot structures on its snaps list. In rbd_add(), this function is called but a subsequent error path neglected to clean up any of these snapshots. Add a call to rbd_remove_all_snaps() in the appropriate spot to remedy this. Change a couple of error labels to be a little clearer while there. Drop the leading underscores from the function name; there's nothing special about that function that they might signify. As suggested in review, the leading underscores in __rbd_remove_snap_dev() have been removed as well. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:28 -05:00
Alex Elder	f7760dad28	rbd: simplify rbd_rq_fn() When processing a request, rbd_rq_fn() makes clones of the bio's in the request's bio chain and submits the results to osd's to be satisfied. If a request bio straddles the boundary between objects backing the rbd image, it must be represented by two cloned bio's, one for the first part (at the end of one object) and one for the second (at the beginning of the next object). This has been handled by a function bio_chain_clone(), which includes an interface only a mother could love, and which has been found to have other problems. This patch defines two new fairly generic bio functions (one which replaces bio_chain_clone()) to help out the situation, and then revises rbd_rq_fn() to make use of them. First, bio_clone_range() clones a portion of a single bio, starting at a given offset within the bio and including only as many bytes as requested. As a convenience, a request to clone the entire bio is passed directly to bio_clone(). Second, bio_chain_clone_range() performs a similar function, producing a chain of cloned bio's covering a sub-range of the source chain. No bio_pair structures are used, and if successful the result will represent exactly the specified range. Using bio_chain_clone_range() makes bio_rq_fn() a little easier to understand, because it avoids the need to pass very much state information between consecutive calls. By avoiding the need to track a bio_pair structure, it also eliminates the problem described here: http://tracker.newdream.net/issues/2933 Note that a block request (and therefore the complete length of a bio chain processed in rbd_rq_fn()) is an unsigned int, while the result of rbd_segment_length() is u64. This change makes this range truncation explicit, and trips a bug if the segment boundary is too far off. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-30 08:34:28 -05:00
Sage Weil	0ed7285e00	libceph: fix osdmap decode error paths Ensure that we set the err value correctly so that we do not pass a 0 value to ERR_PTR and confuse the calling code. (In particular, osd_client.c handle_map() will BUG(!newmap)). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-30 08:21:05 -05:00
Alex Elder	069a4b5690	rbd: kill rbd_device->rbd_opts The rbd_device structure has an embedded rbd_options structure. Such a structure is needed to work with the generic ceph argument parsing code, but there's no need to keep it around once argument parsing is done. Use a local variable to hold the rbd options used in parsing in rbd_get_client(), and just transfer its content (it's just a read_only flag) into the field in the rbd_mapping sub-structure that requires that information. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	e5cfeed281	rbd: simplify rbd_merge_bvec() The aim of this patch is to make what's going on in rbd_merge_bvec() a bit more obvious than it was before. This was an issue when a recent btrfs bug led us to question whether the merge function was working correctly. Use "obj" rather than "chunk", because the units whose boundaries we care about are what we call (rados) "objects". Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	d4b125e9eb	rbd: increase maximum snapshot name length Change RBD_MAX_SNAP_NAME_LEN to be based on NAME_MAX. That is a practical limit for the length of a snapshot name (based on the presence of a directory using the name under /sys/bus/rbd to represent the snapshot). The /sys entry is created by prefixing it with "snap_"; define that prefix symbolically, and take its length into account in defining the snapshot name length limit. Enforce the limit in rbd_add_parse_args(). Also delete a dout() call in that function that was not meant to be committed. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	db2388b6ee	rbd: verify rbd image order value This adds a verification that an rbd image's object order is within the upper and lower bounds supported by this implementation. It must be at least 9 (SECTOR_SHIFT), because the Linux bio system assumes that minimum granularity. It also must be less than 32 (at the moment anyway) because there exist spots in the code that store the size of a "segment" (object backing an rbd image) in a signed int variable, which can be 32 bits including the sign. We should be able to relax this limit once we've verified the code uses 64-bit types where needed. Note that the CLI tool already limits the order to the range 12-25. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	4634246db8	rbd: consolidate rbd_do_op() calls The two calls to rbd_do_op() from rbd_rq_fn() differ only in the value passed for the snapshot id and the snapshot context. For reads the snapshot always comes from the mapping, and for writes the snapshot id is always CEPH_NOSNAP. The snapshot context is always null for reads. For writes, the snapshot context always comes from the rbd header, but it is acquired under protection of header semaphore and could change thereafter, so we can't simply use what's available inside rbd_do_op(). Eliminate the snapid parameter from rbd_do_op(), and set it based on the I/O direction inside that function instead. Always pass the snapshot context acquired in the caller, but reset it to a null pointer inside rbd_do_op() if the operation is a read. As a result, there is no difference in the read and write calls to rbd_do_op() made in rbd_rq_fn(), so just call it unconditionally. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	ff2e4bb5b3	rbd: drop rbd_do_op() opcode and flags The only callers of rbd_do_op() are in rbd_rq_fn(), where call one is used for writes and the other used for reads. The request passed to rbd_do_op() already encodes the I/O direction, and that information can be used inside the function to set the opcode and flags value (rather than passing them in as arguments). So get rid of the opcode and flags arguments to rbd_do_op(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	13f4042c05	rbd: kill rbd_req_{read,write}() Both rbd_req_read() and rbd_req_write() are simple wrapper routines for rbd_do_op(), and each is only called once. Replace each wrapper call with a direct call to rbd_do_op(), and get rid of the wrapper functions. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	be466c1cc3	rbd: fix read-only option name The name of the "read-only" mapping option was inadvertently changed in this commit: `f84344f3` rbd: separate mapping info in rbd_dev Revert that hunk to return it to what it should be. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	a0ea3a40fd	rbd: zero return code in rbd_dev_image_id() When rbd_dev_probe() calls rbd_dev_image_id() it expects to get a 0 return code if successful, but it is getting a positive value. The reason is that rbd_dev_image_id() returns the value it gets from rbd_req_sync_exec(), which returns the number of bytes read in as a result of the request. (This ultimately comes from ceph_copy_from_page_vector() in rbd_req_sync_op()). Force the return value to 0 when successful in rbd_dev_image_id(). Do the same in rbd_dev_v2_object_prefix(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>	2012-10-26 17:18:08 -05:00
Alex Elder	b213e0b1a6	rbd: fix bug in rbd_dev_id_put() In rbd_dev_id_put(), there's a loop that's intended to determine the maximum device id in use. But it isn't doing that at all; the effect of how it's written is to simply use the just-put id number, which ignores the whole purpose of this function. Fix the bug. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-26 17:18:08 -05:00
David Zafman	b000056a5a	ceph: Fix NULL ptr crash in strlen() set_request_path_attr() checks for NULL ptr before calling strlen() This fixes http://tracker.newdream.net/issues/3404 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-26 16:35:07 -05:00
Sage Weil	7246240c7c	libceph: avoid NULL kref_put from NULL alloc_msg return The ceph_on_in_msg_alloc() method calls the ->alloc_msg() helper which may return NULL. It also drops con->mutex while it allocates a message, which means that the connection state may change (e.g., get closed). If that happens, we clean up and bail out. Avoid calling ceph_msg_put() on a NULL return value and triggering a crash. This was observed when an ->alloc_msg() call races with a timeout that resends a zillion messages and resets the connection, and ->alloc_msg() returns NULL (because the request was resent to another target). Fixes http://tracker.newdream.net/issues/3342 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-26 16:35:04 -05:00
David Zafman	0f9831a893	ceph: fix dentry reference leak in encode_fh() Call to d_find_alias() needs a corresponding dput() This fixes http://tracker.newdream.net/issues/3271 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-26 16:34:53 -05:00
Alex Elder	35152979e6	rbd: activate v2 image support Now that v2 image support is fully implemented, have rbd_dev_v2_probe() return 0 to indicate a successful probe. (Note that an image that implements layering will fail the probe early because of the feature check.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:44:01 -07:00
Alex Elder	d889140c4a	rbd: implement feature checks Version 2 images have two sets of feature bit fields. The first indicates features possibly used by the image. The second indicates features that the client must support in order to use the image. When an image (or snapshot) is first examined, we need to make sure that the local implementation supports the image's required features. If not, fail the probe for the image. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:43:51 -07:00
Alex Elder	117973fb4c	rbd: define rbd_dev_v2_refresh() Define a new function rbd_dev_v2_refresh() to update/refresh the snapshot context for a format version 2 rbd image. This function will update anything that is not fixed for the life of an rbd image--at the moment this is mainly the snapshot context and (for a base mapping) the size. Update rbd_refresh_header() so it selects which function to use based on the image format. Rename __rbd_refresh_header() to be rbd_dev_v1_refresh() to be consistent with the naming of its version 2 counterpart. Similarly rename rbd_refresh_header() to be rbd_dev_refresh(). Unrelated--we use rbd_image_format_valid() here. Delete the other use of it, which was primarily put in place to ensure that function was referenced at the time it was defined. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:43:39 -07:00
Alex Elder	9478554ae5	rbd: define rbd_update_mapping_size() Encapsulate the code that handles updating the size of a mapping after an rbd image has been refreshed. This is done in anticipation of the next patch, which will make this common code for format 1 and 2 images. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-10-10 07:43:28 -07:00
Alex Elder	802c6d967f	rbd: define common queue_con_delay() This patch defines a single function, queue_con_delay() to call queue_delayed_work() for a connection. It basically generalizes what was previously queue_con() by adding the delay argument. queue_con() is now a simple helper that passes 0 for its delay. queue_con_delay() returns 0 if it queued work or an errno if it did not for some reason. If con_work() finds the BACKOFF flag set for a connection, it now calls queue_con_delay() to handle arranging to start again after a delay. Note about connection reference counts: con_work() only ever gets called as a work item function. At the time that work is scheduled, a reference to the connection is acquired, and the corresponding con_work() call is then responsible for dropping that reference before it returns. Previously, the backoff handling inside con_work() silently handed off its reference to delayed work it scheduled. Now that queue_con_delay() is used, a new reference is acquired for the newly-scheduled work, and the original reference is dropped by the con->ops->put() call at the end of the function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-09 22:00:44 -07:00
Alex Elder	8618e30bc1	rbd: let con_work() handle backoff Both ceph_fault() and con_work() include handling for imposing a delay before doing further processing on a faulted connection. The latter is used only if ceph_fault() is unable to. Instead, just let con_work() always be responsible for implementing the delay. After setting up the delay value, set the BACKOFF flag on the connection unconditionally and call queue_con() to ensure con_work() will get called to handle it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-09 22:00:21 -07:00
Alex Elder	588377d619	rbd: reset BACKOFF if unable to re-queue If ceph_fault() is unable to queue work after a delay, it sets the BACKOFF connection flag so con_work() will attempt to do so. In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't result in newly-queued work, it simply ignores this condition and proceeds as if no backoff delay were desired. There are two problems with this--one of which is a bug. The first problem is simply that the intended behavior is to back off, and if we aren't able to queue the work item to run after a delay we're not doing that. The only reason queue_delayed_work() won't queue work is if the provided work item is already queued. In the messenger, this means that con_work() is already scheduled to be run again. So if we simply set the BACKOFF flag again when this occurs, we know the next con_work() call will again attempt to hold off activity on the connection until after the delay. The second problem--the bug--is a leak of a reference count. If queue_delayed_work() returns 0 in con_work(), con->ops->put() drops the connection reference held on entry to con_work(). However, processing is (was) allowed to continue, and at the end of the function a second con->ops->put() is called. This patch fixes both problems. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-09 21:59:52 -07:00
Alex Elder	6285bc2312	ceph: avoid 32-bit page index overflow A pgoff_t is defined (by default) to have type (unsigned long). On architectures such as i686 that's a 32-bit type. The ceph address space code was attempting to produce 64 bit offsets by shifting a page's index by PAGE_CACHE_SHIFT, but the result was not what was desired because the shift occurred before the result got promoted to 64 bits. Fix this by converting all uses of page->index used in this way to use the page_offset() macro, which ensures the 64-bit result has the intended value. This fixes http://tracker.newdream.net/issues/3112 Reported-by: Mohamed Pakkeer <pakkeer.mohideen@realimage.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-10-03 10:51:18 -05:00
Sage Weil	457712a0bc	ceph: return EIO on invalid layout on GET_DATALOC ioctl If the user calls GET_DATALOC on a file with an invalid (e.g., zeroed) layout, return EIO to userland. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-03 10:51:17 -05:00
Sage Weil	6cae3717cd	rbd: BUG on invalid layout This shouldn't actually be possible because the layout struct is constructed from the RBD header and validated then. [elder@inktank.com: converted BUG() call to equivalent rbd_assert()] Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
Sage Weil	6816282dab	ceph: propagate layout error on osd request creation If we are creating an osd request and get an invalid layout, return an EINVAL to the caller. We switch up the return to have an error code instead of NULL implying -ENOMEM. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
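
Commit 6285bc2312 in the table above describes a page index held in a 32-bit type being shifted before the result is promoted to 64 bits. The following is a minimal user-space sketch of that pitfall, not the kernel code. The names pgoff32_t, SKETCH_PAGE_SHIFT, offset_shift_then_widen(), offset_widen_then_shift() and sketch_self_check() are hypothetical; uint32_t stands in for a 32-bit pgoff_t and a shift of 12 is assumed (4 KiB pages).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pgoff_t on a 32-bit architecture. */
typedef uint32_t pgoff32_t;

/* Assumed page shift for this sketch (4 KiB pages). */
#define SKETCH_PAGE_SHIFT 12

/* Problem pattern: the shift result is kept in 32 bits, so the
 * high bits are lost before the value is widened to 64 bits. */
static uint64_t offset_shift_then_widen(pgoff32_t index)
{
    pgoff32_t shifted = index << SKETCH_PAGE_SHIFT;
    return (uint64_t)shifted;
}

/* Fixed pattern: widen to 64 bits first, then shift. */
static uint64_t offset_widen_then_shift(pgoff32_t index)
{
    return (uint64_t)index << SKETCH_PAGE_SHIFT;
}

/* Small self-check: page index 2^20 corresponds to byte offset 2^32. */
static void sketch_self_check(void)
{
    pgoff32_t index = 0x00100000u;

    /* The truncated result wraps to zero. */
    assert(offset_shift_then_widen(index) == 0);
    /* The widened result keeps the full 64-bit offset. */
    assert(offset_widen_then_shift(index) == 0x100000000ULL);
    /* Below the 4 GiB boundary both patterns agree. */
    assert(offset_shift_then_widen(1u) == offset_widen_then_shift(1u));
}
```

Per the commit message, the kernel fix uses the page_offset() macro to get the intended 64-bit result; the second function above only illustrates the widen-before-shift idea.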
