Commit Graph

77 Commits

Author SHA1 Message Date
Matias Bjørling aff3fb18f9 lightnvm: move bad block and chunk state logic to core
pblk implements two data paths for recovery line state. One for 1.2
and another for 2.0, instead of having pblk implement these, combine
them in the core to reduce complexity and make available to other
targets.

The new interface will adhere to the 2.0 chunk definition,
including managing open chunks with an active write pointer. To provide
this interface, a 1.2 device recovers the state of the chunks by
manually detecting if a chunk is either free/open/close/offline, and if
open, scanning the flash pages sequentially to find the next writeable
page. This process takes on average ~10 seconds on a device with 64 dies,
1024 blocks and 60us read access time. The process can be parallelized
but is left out for maintenance simplicity, as the 1.2 specification is
deprecated. For 2.0 devices, the logic is maintained internally in the
drive and retrieved through the 2.0 interface.

Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09 08:25:06 -06:00
Matias Bjørling 4b5d56edb8 lightnvm: pblk: fix rqd.error return value in pblk_blk_erase_sync
rqd.error is masked by the return value of pblk_submit_io_sync.
The rqd structure is then passed on to the end_io function, which
assumes that any error should lead to a chunk being marked
offline/bad. Since the pblk_submit_io_sync can fail before the
command is issued to the device, the error value maybe not correspond
to a media failure, leading to chunks being immaturely retired.

Also, the pblk_blk_erase_sync function prints an error message in case
the erase fails. Since the caller prints an error message by itself,
remove the error message in this function.

Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Reviewed-by: Javier González <javier@cnexlabs.com>
Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09 08:25:06 -06:00
Matias Bjørling d7b6801673 lightnvm: combine 1.2 and 2.0 command flags
Add nvm_set_flags helper to enable core to appropriately
set the command flags for read/write/erase depending on which version
a drive supports.

The flags arguments can be distilled into the access hint,
scrambling, and program/erase suspend. Replace the access hint with
a "is_seq" parameter. The rest of the flags are dependent on the
command opcode, which is trivial to detect and set.

Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09 08:25:05 -06:00
Gustavo A. R. Silva 884b031b28 lightnvm: pblk: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-13 08:14:45 -06:00
Matias Bjørling 4e495a46b1 lightnvm: pblk: expose generic disk name on pr_* msgs
The error messages in pblk does not say which pblk instance that
a message occurred from. Update each error message to reflect the
instance it belongs to, and also prefix it with pblk, so we know
the message comes from the pblk module.

Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-13 08:14:43 -06:00
Matias Bjørling 880eda5440 lightnvm: move NVM_DEBUG to pblk
There is no users of CONFIG_NVM_DEBUG in the LightNVM subsystem. All
users are in pblk. Rename NVM_DEBUG to NVM_PBLK_DEBUG and enable
only for pblk.

Also fix up the CONFIG_NVM_PBLK entry to follow the code style for
Kconfig files.

Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-13 08:14:31 -06:00
Javier González 9cfd5a9538 lightnvm: pblk: take bitmap alloc. out of critical section
pblk allocates line bitmaps within the line lock unnecessarily. In order
to take pressure out of the fast patch, allocate line bitmaps outside
of this lock and refactor accordingly.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg cc9c9a00b1 lightnvm: pblk: kick writer on new flush points
Unless we kick the writer directly when setting a new flush point, the
user risks having to wait for up to one second (the default timeout for
the write thread to be kicked) for the IO to complete.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Igor Konopko f142ac0b5d lightnvm: proper error handling for pblk_bio_add_pages
Currently in case of error caused by bio_pc_add_page in
pblk_bio_add_pages two issues occur when calling from
pblk_rb_read_to_bio(). First one is in pblk_bio_free_pages, since we
are trying to free pages not allocated from our mempool. Second one
is the warn from dma_pool_free, that we are trying to free NULL
pointer dma.

This commit fix both issues.

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg 6cf17a2f83 lightnvm: pblk: fix smeta write error path
Smeta write errors were previously ignored. Skip these
lines instead and throw them back on the free
list, so the chunks will go through a reset cycle
before we attempt to use the line again.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg 48b8d20895 lightnvm: pblk: garbage collect lines with failed writes
Write failures should not happen under normal circumstances,
so in order to bring the chunk back into a known state as soon
as possible, evacuate all the valid data out of the line and let the
fw judge if the block can be written to in the next reset cycle.

Do this by introducing a new gc list for lines with failed writes,
and ensure that the rate limiter allocates a small portion of
the write bandwidth to get the job done.

The lba list is saved in memory for use during gc as we
cannot gurantee that the emeta data is readable if a write
error occurred.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Javier González 72b6cdbb11 lightnvm: pblk: remove dead function
Remove dead function for manual sync. I/O

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 07:43:53 -06:00
Javier González a7c9e9109c lightnvm: pass flag on graceful teardown to targets
If the namespace is unregistered before the LightNVM target is removed
(e.g., on hot unplug) it is too late for the target to store any metadata
on the device - any attempt to write to the device will fail. In this
case, pass on a "gracefull teardown" flag to the target to let it know
when this happens.

In the case of pblk, we pad the open line (close all open chunks) to
improve data retention. In the event of an ungraceful shutdown, avoid
this part and just clean up.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 07:43:53 -06:00
Javier González 8e55c07b2b lightnvm: pblk: remove unnecessary argument
Remove unnecessary argument on pblk_line_free()

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 07:43:53 -06:00
Javier González b6730dd4a9 lightnvm: pblk: return NVM_ error on failed submission
Return a meaningful error when the sanity vector I/O check fails.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 07:43:53 -06:00
Javier González 1d8b33e05c lightnvm: pblk: recheck for bad lines at runtime
Bad blocks can grow at runtime. Check that the number of valid blocks in
a line are within the sanity threshold before allocating the line for
new writes.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 07:43:53 -06:00
Kent Overstreet b906bbb699 lightnvm: convert to bioset_init()/mempool_init()
Convert lightnvm to embedded bio sets.

Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-30 15:33:32 -06:00
Javier González 32ef9412c1 lightnvm: pblk: implement get log report chunk
In preparation of pblk supporting 2.0, implement the get log report
chunk in pblk. Also, define the chunk states as given in the 2.0 spec.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Javier González 6947151374 lightnvm: add support for 2.0 address format
Add support for 2.0 address format. Also, align address bits for 1.2 and
2.0 to be able to operate on channel and luns without requiring a format
conversion. Use a generic address format for this purpose.

Also, convert the generic operations to the generic format in pblk.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Javier González a40afad90b lightnvm: normalize geometry nomenclature
Normalize nomenclature for naming channels, luns, chunks, planes and
sectors as well as derivations in order to improve readability.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Javier González e46f4e4822 lightnvm: simplify geometry structure
Currently, the device geometry is stored redundantly in the nvm_id and
nvm_geo structures at a device level. Moreover, when instantiating
targets on a specific number of LUNs, these structures are replicated
and manually modified to fit the instance channel and LUN partitioning.

Instead, create a generic geometry around nvm_geo, which can be used by
(i) the underlying device to describe the geometry of the whole device,
and (ii) instances to describe their geometry independently.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Javier González e411b33117 lightnvm: pblk: refactor bad block identification
In preparation for the OCSSD 2.0 spec. bad block identification,
refactor the current code to generalize bad block get/set functions and
structures.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Hans Holmberg b966c50b14 lightnvm: pblk: allow allocation of new lines during shutdown
When shutting down pblk the write buffer is flushed and if the
current line can't fit the data in the write buffer we need
to allocate a new line, so remove the check that prevents this.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Hans Holmberg 76758390f8 lightnvm: pblk: export write amplification counters to sysfs
In a SSD, write amplification, WA, is defined as the average
number of page writes per user page write. Write amplification
negatively affects write performance and decreases the lifetime
of the disk, so it's a useful metric to add to sysfs.

In plkb's case, the number of writes per user sector is the sum of:

    (1) number of user writes
    (2) number of sectors written by the garbage collector
    (3) number of sectors padded (i.e. due to syncs)

This patch adds persistent counters for 1-3 and two sysfs attributes
to export these along with WA calculated with five decimals:

    write_amp_mileage: the accumulated write amplification stats
                      for the lifetime of the pblk instance

    write_amp_trip: resetable stats to facilitate delta measurements,
                    values reset at creation and if 0 is written
                    to the attribute.

64-bit counters are used as a 32 bit counter would wrap around
already after about 17 TB worth of user data. It will take a
long long time before the 64 bit sector counters wrap around.

The counters are stored after the bad block bitmap in the first
emeta sector of each written line. There is plenty of space in the
first emeta sector, so we don't need to bump the major version of
the line data format.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Hans Holmberg d0ab0b1ab9 lightnvm: pblk: check data lines version on recovery
As a preparation for future bumps of data line persistent storage
versions, we need to start checking the emeta line version during
recovery. Also slit up the current emeta/smeta version into two
bytes (major,minor).

Recovering lines with the same major number as the current pblk data
line version must succeed. This means that any changes in the
persistent format must be:

 (1) Backward compatible: if we switch back to and older
     kernel, recovery of lines stored with major == current_major
     and minor > current_minor must succeed.

 (2) Forward compatible: switching to a newer kernel,
     recovery of lines stored with major=current_major and
     minor < minor must handle the data format differences
     gracefully(i.e. initialize new data structures to default values).

If we detect lines that have a different major number than
the current we must abort recovery. The user must manually
migrate the data in this case.

Previously the version stored in the emeta header was copied
from smeta, which has version 1, so we need to set the minor
version to 1.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Hans Holmberg cfe1c9e2e2 lightnvm: pblk: handle bad sectors in the emeta area correctly
Unless we check if there are bad sectors in the entire emeta-area
we risk ending up with valid bitmap / available sector count inconsistency.
This results in lines with a bad chunk at the last LUN marked as bad,
so go through the whole emeta area and mark up the invalid sectors.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-29 17:29:09 -06:00
Javier González 8f554597e0 lightnvm: pblk: do not log recovery read errors
On scan recovery, reads can fail. This happens because the first page
for each line is read in order to determined if the line has been used
(and thus needs to be recovered), or not. This can lead to "empty page"
read errors.

Since these errors are normal, do not log them, as they are confusing
when reviewing the logs.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González a7689938ef lightnvm: pblk: use exact free block counter in RL
Until now, pblk's rate-limiter has used a heuristic to reserve space for
GC I/O given that the over-provision area was fixed.

In preparation for allowing to define the over-provision area on target
creation, define a dedicated free_block counter in the rate-limiter to
track the number of blocks being used for user data.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González b1bcfda105 lightnvm: pblk: compress and reorder helper functions
Through time, we have generated some redundant helper functions.
Refactor them to eliminate redundant and unnecessary code. Also, reorder
them to improve readability

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Matias Bjørling fae7fae407 lightnvm: make geometry structures 2.0 ready
Prepare for the 2.0 revision by adapting the geometry
structures to coexist with the 1.2 revision.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Kees Cook 87c1d2d373 lightnvm: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Matias Bjorling <mb@lightnvm.io>
Cc: linux-block@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:46:44 -08:00
Javier González 1a94b2d484 lightnvm: implement generic path for sync I/O
Implement a generic path for sending sync I/O on LightNVM. This allows
to reuse the standard synchronous path trough blk_execute_rq(), instead
of implementing a wait_for_completion on the target side (e.g., pblk).

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 8bd400204b lightnvm: pblk: cleanup unused and static functions
Cleanup up unused and static functions across the whole codebase.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Hans Holmberg d6b992f7ab lightnvm: pblk: gc all lines in the pipeline before exit
Finish garbage collect of the lines that are in the gc pipeline
before exiting. Ensure that all lines already in in the pipeline
goes through, from read to write.

Do this by keeping track of how many lines are in the pipeline
and waiting for that number to reach zero before exiting the gc
reader task.

Since we're adding a new gc line counter, change the name of
inflight_gc to read_inflight_gc to make the distinction clear.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Rakesh Pandit e480689bd1 lightnvm: pblk: remove useless line
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 21d2287119 lightnvm: pblk: enable 1 LUN configuration
Metadata I/Os are scheduled to minimize their impact on user data I/Os.
When there are enough LUNs instantiated (i.e., enough bandwidth), it is
easy to interleave metadata and data one after the other so that
metadata I/Os are the ones being blocked and not vice-versa.

We do this by calculating the distance between the I/Os in terms of the
LUNs that are not in used, and selecting a free LUN that satisfies a
the simple heuristic that metadata is scheduled behind. The per-LUN
semaphores guarantee consistency. This works fine on >1 LUN
configuration. However, when a single LUN is instantiated, this design
leads to a deadlock, where metadata waits to be scheduled on a free LUN.

This patch implements the 1 LUN case by simply scheduling the metadada
I/O after the data I/O. In the process, we refactor the way a line is
replaced to ensure that metadata writes are submitted after data writes
in order to guarantee block sequentiality. Note that, since there is
only one LUN, both I/Os will block each other by design. However, such
configuration only pursues tight read latencies, not write bandwidth.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 7bd4d370db lightnvm: pblk: guarantee line integrity on reads
When a line is recycled during garbage collection, reads can still be
issued to the line. If the line is freed in the middle of this process,
data corruption might occur.

This patch guarantees that lines are not freed in the middle of reads
that target them (lines). Specifically, we use the existing line
reference to decide when a line is eligible for being freed after the
recycle process.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 26532ee52b lightnvm: pblk: use rqd->end_io for completion
For consistency with the rest of pblk, use rqd->end_io to point to the
function taking care of ending the request on the completion path.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 67bf26a322 lightnvm: pblk: refactor rqd alloc/free
Refactor the rqd allocation and free functions so that all I/O types can
use these helper functions.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González e2cddf2082 lightnvm: pblk: improve naming for internal req.
Each request type sent to the LightNVM subsystem requires different
metadata. Until now, we have tailored this metadata based on write, read
and erase commands. However, pblk uses different metadata for internal
writes that do not hit the write buffer. Instead of abusing the metadata
for reads, create a new request type - internal write to improve
code readability.

In the process, create internal values for each I/O type instead of
abusing the READ/WRITE macros, as suggested by Christoph.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 55e836d401 lightnvm: pblk: put bio on bio completion
Simplify put bio by doing it on bio end_io instead of manually putting
it on the completion path.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González d340121eb7 lightnvm: pblk: simplify data validity check on GC
When a line is selected for recycling by the garbage collector (GC), the
line state changes and the invalid bitmap is frozen, preventing
invalidations from happening. Throughout the GC, the L2P map is checked
to verify that not data being recycled has been updated. The last check
is done before the new map is being stored on the L2P table. Though
this algorithm works, it requires a number of corner cases to be checked
each time the L2P table is being updated. This complicates readability
and is error prone in case that the recycling algorithm is modified.

Instead, this patch makes the invalid bitmap accessible even when the
line is being recycled. When recycled data is being remapped, it is
enough to check the invalid bitmap for the line before updating the L2P
table.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 9f6cb13bb4 lightnvm: pblk: normalize ppa namings
Normalize the way we name ppa variables to improve code readability.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 2942f50fa3 lightnvm: pblk: remove checks on mempool alloc.
As part of the mempool audit on pblk, remove unnecessary mempool
allocation checks on mempools.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González e72ec1d31b lightnvm: pblk: do not use a mempool for line bitmaps
pblk holds two sector bitmaps: one to keep track of the mapped sectors
while the line is active and another one to keep track of the invalid
sectors. The latter is kept during the whole live of the line, until it
is recycled. Since we cannot guarantee forward progress for the mempool
in this case, get rid of the mempool and simply allocate memory through
kmalloc.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González 0d880398cb lightnvm: pblk: decouple read/erase mempools
Since read and erase paths offer different guarantees for inflight I/Os,
separate the mempools to set the right min_nr for each on creation.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González b84ae4a8b8 lightnvm: pblk: simplify work_queue mempool
In pblk, we have a mempool to allocate a generic structure that we
pass along workqueues. This is heavily used in the GC path in order
to have enough inflight reads and fully utilize the GC bandwidth.

However, the current GC path copies data to the host memory and puts it
back into the write buffer. This requires a vmalloc allocation for the
data and a memory copy. Thus, guaranteeing the allocation by using a
mempool for the structure in itself does not give us much. Until we
implement support for vector copy to avoid moving data through the host,
just allocate the workqueue structure using kmalloc.

This allows us to have a much smaller mempool.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González bd43241768 lightnvm: pblk: fix min size for page mempool
pblk uses an internal page mempool for allocating pages on internal
bios. The main two users of this memory pool are partial reads (reads
with some sectors in cache and some on media) and padded writes, which
need to add dummy pages to an existing bio already containing valid
data (and with a large enough bioset allocated). In both cases, the
maximum number of pages per bio is defined by the maximum number of
physical sectors supported by the underlying device.

This patch fixes a bad mempool allocation, where the min_nr of elements
on the pool was fixed (to 16), which is lower than the maximum number
of sectors supported by NVMe (as of the time for this patch). Instead,
use the maximum number of allowed sectors reported by the device.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Javier González cd8ddbf7a5 lightnvm: pblk: free padded entries in write buffer
When a REQ_FLUSH reaches pblk, the bio cannot be directly completed.
Instead, data on the write buffer is flushed and the bio is completed on
the completion pah. This might require some sectors to be padded in
order to guarantee a successful write.

This patch fixes a memory leak on the padded pages. A consequence of
this bad free was that internal bios not containing data (only a flush)
were not being completed.

Fixes: a4bd217b43 ("lightnvm: physical block device (pblk) target")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00
Rakesh Pandit 32825ebb06 lightnvm: pblk: reuse pblk_gc_should_kick
This is a trivial change which reuses pblk_gc_should_kick instead of
repeating it again in pblk_rl_free_lines_inc.

Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Made it apply to the common case.
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-10-13 08:34:57 -06:00