This patch provides the possibility to adjust the "maximum expected number of
bad blocks per 1024 blocks" (max_beb_per1024) for each mtd device.
The majority of NAND devices have their max_beb_per1024 equal to 20, but
sometimes it's more.
Now, we can adjust that via a kernel parameter:
ubi.mtd=<name|num|path>[,<vid_hdr_offs>[,max_beb_per1024]]
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
max_beb_per1024 shouldn't be negative, and a 0 value will be treated as
the default value. For the upper bound, 768/1024 should be enough.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
This patch prepare the way for the addition of max_beb_per1024 module
parameter. There's no functional change.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
No functional changes here, just to prepare for next patch.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
UBI has changed the MTD_UBI_BEB_LIMIT semantics. It used to be a percent of
total amount of eraseblock in the partition, and now it is the maximum
amount of bad eraseblocks on the entire devise per 1024 eraseblocks. So not
only the units changed, but also the meaning.
Richard Genoud <richard.genoud@gmail.com> says:
"I found the board:
https://www.olimex.com/dev/sam9-L9260.html
and the nand datasheet:
http://www.rockbox.org/wiki/pub/Main/LyrePrototype/K9xxG08UXM.pdf
page 11, we can see that the max_bad_bebper1024 is 25 (100 for 4096)"
Thus, use "25" for sam9.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
On NAND flash devices, UBI reserves some physical erase blocks (PEB) for
bad block handling. Today, the number of reserved PEB can only be set as a
percentage of the total number of PEB in each MTD partition. For example, for a
NAND flash with 128KiB PEB, 2 MTD partition of 20MiB (mtd0) and 100MiB (mtd1)
and 2% reserved PEB:
- the UBI device on mtd0 will have 2 PEB reserved
- the UBI device on mtd1 will have 16 PEB reserved
The problem with this behaviour is that NAND flash manufacturers give a
minimum number of valid block (NVB) during the endurance life of the
device, e.g.:
Parameter Symbol Min Max Unit Notes
--------------------------------------------------------------
Valid block number NVB 1004 1024 Blocks 1
From this number we can deduce the maximum number of bad PEB that a device will
contain during its endurance life: a 128MiB NAND flash (1024 PEB) will not have
less than 20 bad blocks during the flash endurance life.
But the manufacturer doesn't tell where those bad block will appear. He doesn't
say either if they will be equally disposed on the whole device (and I'm pretty
sure they won't). So, according to the datasheets, we should reserve the
maximum number of bad PEB for each UBI device (worst case scenario: 20 bad
blocks appears on the smallest MTD partition).
So this patch make UBI use the whole MTD device size to calculate the maximum
bad expected eraseblocks.
The Kconfig option is in per1024 blocks, thus it can have a default value of 20
which is *very* common for NAND devices.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
'mtd_get_device_size()' returns the size of the whole MTD device, that is the
mtd_info master size. This will be used by UBI to calculate the maximum number
of bad blocks (MBB) on a MTD device.
Artem: amended the patch a bit.
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
'struct mtd_info' is not modified by 'mtd_is_partition()' so it can be marked
as "const".
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
CONFIG_MTD_UBI_BEB_RESERVE has been removed and now we use
CONFIG_MTD_UBI_BEB_LIMIT instead.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
CONFIG_MTD_UBI_BEB_RESERVE and MIN_RESEVED_PEBS are no longer used,
since the amount of reserved eraseblocks for bad PEB handling is now
derived from 'ubi->bad_peb_limit' (ubi's maximum expected bad
eraseblocks).
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
The existing mechanism of reserving PEBs for bad PEB handling has two
flaws:
- It is calculated as a percentage of good PEBs instead of total PEBs.
- There's no limit on the amount of PEBs UBI reserves for future bad
eraseblock handling.
This patch changes the mechanism to overcome these flaws.
The desired level of PEBs reserved for bad PEB handling (beb_rsvd_level)
is set to the maximum expected bad eraseblocks (bad_peb_limit) minus the
existing number of bad eraseblocks (bad_peb_count).
The actual amount of PEBs reserved for bad PEB handling is usually set
to the desired level (but in some circumstances may be lower than the
desired level, e.g. when attaching to a device that has too few
available PEBs to satisfy the desired level).
In the case where the device has too many bad PEBs (above the expected
limit), then the desired level, and the actual amount of PEBs reserved
are set to zero. No PEBs will be set aside for future bad eraseblock
handling - even if some PEBs are made available (e.g. by shrinking a
volume).
If another PEB goes bad, and there are available PEBs, then the
eraseblock will be marked bad (consuming one available PEB). But if
there are no available PEBs, ubi will go into readonly mode.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Introduce 'ubi->bad_peb_limit', which specifies an upper limit of PEBs
UBI expects to go bad. Currently, it is initialized to a fixed percentage
of total PEBs in the UBI device (configurable via CONFIG_MTD_UBI_BEB_LIMIT).
The 'bad_peb_limit' is intended to be used for calculating the amount of PEBs
UBI needs to reserve for bad eraseblock handling.
Artem: minor amendments.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
We are going to kill the CONFIG_MTD_UBI_BEB_RESERVE configuration option soon
and use the CONFIG_MTD_UBI_BEB_LIMIT instead. In order to do this smoothly,
we now introduce the new configuration option to sam9_l9260_defconfig, and
will kill the old one after the corresponding UBI changes are done.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Andreas Bombe reported that the added ktime_t overflow checking added to
timespec_valid in commit 4e8b14526c ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger then KTIME_T to be invalid.
Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.
This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping codes
internal checking to use this more strict function.
Reported-and-tested-by: Andreas Bombe <aeb@debian.org>
Cc: Zhouping Liu <zliu@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a set of two bug fixes. One is the ATOMIC problem which is now
causing a compile failure in certain situations. The other is mishandling of
PER_LINUX32 which may also cause user visible effects.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJQQN+IAAoJEDeqqVYsXL0Mza8H/RZHZSk6xRkMXXnmNXeoPpQT
HH8ILKJtfzLDhsurznNUFcHvUeU0QiwAey9NXTuJ6leXa/f9nsRtE1izejGWbxId
1JQPH0VFz0913y9PtmWMfLedKuQLt3muynKyXbfkUO6jZsfbJK4XcU2rVHHDpPh7
PgbtWQmsOqqpmsR3sN3TcU/NglACCw27V4ZhHqoFfru2loyS84BcjdYRIxoMr6W8
AldXulb0RSBseOQXvDrp5XJMV7i75WYx2EM5l8YfK4rFC/kT0TMaTGfzTLjZRjMd
bgY2e/PEZhCzTK23+d4WtrtghKD+fTtWOJoAx5a1DZP/w/A8S9Lp7xzNJ7X2MD8=
=7In9
-----END PGP SIGNATURE-----
Merge tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
"This is a set of two bug fixes. One is the ATOMIC problem which is
now causing a compile failure in certain situations. The other is
mishandling of PER_LINUX32 which may also cause user visible effects.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] fix personality flag check in copy_thread()
[PARISC] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
Pull s390 fixes from Martin Schwidefsky:
"A couple of s390 bug fixes for 3.5-rc4"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/32: Don't clobber personality flags on exec
s390/smp: add missing smp_store_status() for !SMP
s390/dasd: fix ioctl return value
s390: Always use "long" for ssize_t to match size_t
Pull drm fixes from Dave Airlie:
"A bunch of scattered fixes ati/intel/nouveau, couple of core ones,
nothing too shocking or different."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm: Add EDID_QUIRK_FORCE_REDUCED_BLANKING for ASUS VW222S
gma500: Consider CRTC initially active.
drm/radeon: fix dig encoder selection on DCE61
drm/radeon: fix double free in radeon_gpu_reset
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
drm/radeon: rework panel mode setup
drm/radeon/atom: powergating fixes for DCE6
drm/radeon/atom: rework DIG modesetting on DCE3+
drm/radeon: don't disable plls that are in use by other crtcs
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
drm/radeon: initialize tracked CS state
drm/radeon: fix reading CB_COLORn_MASK from the CS
drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard
drm/i915: Use the correct size of the GTT for placing the per-process entries
drm: Check for invalid cursor flags
drm: Initialize object type when using DRM_MODE() macro
drm/i915: fix color order for BGR formats on IVB
drm/i915: fix wrong order of parameters in port checking functions
In native 32 bit mode the personality flags were not correctly inherited.
This is the s390 version of 59e4c3a2 "powerpc/32: Don't clobber personality
flags on exec".
Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Connecting an ASUS VW222S [1] over VGA a garbled screen is shown with
vertical stripes in the top half.
In commit bc42aabc [2]
commit bc42aabc6a
Author: Adam Jackson <ajax@redhat.com>
Date: Wed May 23 16:26:54 2012 -0400
drm/edid/quirks: ViewSonic VA2026w
Adam Jackson added the quirk `EDID_QUIRK_FORCE_REDUCED_BLANKING` which
is also needed for this ASUS monitor.
All log files and output from `xrandr` is included in the referenced
Bugzilla report #17629.
Please note that this monitor only has a VGA (D-Sub) connector [1].
[1] http://www.asus.com/Display/LCD_Monitors/VW222S/
[2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=bc42aabc6a01b92b0f961d65671564e0e1cd7592
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=17629
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Cc: <dri-devel@lists.freedesktop.org>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Ian Pilcher <arequipeno@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex writes:
Highlights:
- fix a gart regression on older IGP chips
- more MSAA fixes
- fix a double free in gpu reset code
- modesetting fixes
- trinity dig encoder fix.
* 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix dig encoder selection on DCE61
drm/radeon: fix double free in radeon_gpu_reset
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
drm/radeon: rework panel mode setup
drm/radeon/atom: powergating fixes for DCE6
drm/radeon/atom: rework DIG modesetting on DCE3+
drm/radeon: don't disable plls that are in use by other crtcs
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
drm/radeon: initialize tracked CS state
drm/radeon: fix reading CB_COLORn_MASK from the CS
[this one ideally should make 3.6 - it fixes the very annoying mode setting bug]
This causes the pipe to be forced off prior to initial mode set, which
roughly mirrors the behavior of the i915 driver. It fixes initial mode
setting on my Intel DN2800MT (Cedarview) board. Without it, mode
setting triggers an out-of-range error from the monitor for most modes,
but only on initial configuration (i.e. they can be configured
successfully from userspace after that).
Signed-off-by: Forest Bond <forest.bond@rapidrollout.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Was using the DCE41 code which was wrong. Fixes
blank displays on a number of Trinity systems.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Pull btrfs fixes from Chris Mason:
"I've split out the big send/receive update from my last pull request
and now have just the fixes in my for-linus branch. The send/recv
branch will wander over to linux-next shortly though.
The largest patches in this pull are Josef's patches to fix DIO
locking problems and his patch to fix a crash during balance. They
are both well tested.
The rest are smaller fixes that we've had queued. The last rc came
out while I was hacking new and exciting ways to recover from a
misplaced rm -rf on my dev box, so these missed rc3."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
Btrfs: fix that repair code is spuriously executed for transid failures
Btrfs: fix ordered extent leak when failing to start a transaction
Btrfs: fix a dio write regression
Btrfs: fix deadlock with freeze and sync V2
Btrfs: revert checksum error statistic which can cause a BUG()
Btrfs: remove superblock writing after fatal error
Btrfs: allow delayed refs to be merged
Btrfs: fix enospc problems when deleting a subvol
Btrfs: fix wrong mtime and ctime when creating snapshots
Btrfs: fix race in run_clustered_refs
Btrfs: don't run __tree_mod_log_free_eb on leaves
Btrfs: increase the size of the free space cache
Btrfs: barrier before waitqueue_active
Btrfs: fix deadlock in wait_for_more_refs
btrfs: fix second lock in btrfs_delete_delayed_items()
Btrfs: don't allocate a seperate csums array for direct reads
Btrfs: do not strdup non existent strings
Btrfs: do not use missing devices when showing devname
Btrfs: fix that error value is changed by mistake
Btrfs: lock extents as we map them in DIO
...
Pull watchdog fixes from Wim Van Sebroeck:
"This will fix a warning for watchdog-test.c and it will remove a
duplicate include of delay.h"
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: da9052: Remove duplicate inclusion of delay.h
watchdog: fix watchdog-test.c build warning
cache_grow() can reenable irqs so the cpu (and node) can change, so ensure
that we take list_lock on the correct nodelist.
This fixes an issue with commit 072bb0aa5e ("mm: sl[au]b: add
knowledge of PFMEMALLOC reserve pages") where list_lock for the wrong
node was taken after growing the cache.
Reported-and-tested-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
radeon_ring_restore is freeing the memory for the saved
ring data. We need to remember that, otherwise we try to
restore the ring data again on the next try. Additional
to that it shouldn't try the reset infinitely if we have
saved ring data.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
It seems some of those IGP dislike non dma32 page despite what
documentation says. Fix regression since we allowed non dma32
pages. It seems it only affect some revision of those IGP chips
as we don't know which one just force dma32 for all of them.
https://bugzilla.redhat.com/show_bug.cgi?id=785375
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adjust the panel mode setup to match the behavior
of the vbios. Rather than checking for specific
bridge chip ids, just check the eDP configuration register.
This saves extra aux transactions and works across
DP bridge chips without requiring additional per chip
id checking.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Power gating is per crtc pair, but the powergating registers
should be called individually. The hw handles power up/down
properly. The pair is powered up if either crtc in the pair
is powered up and the pair is not powered down until both
crtcs in the pair are powered down. This simplifies
programming and should save additional power as the previous
code never actually power gated the crtc pair.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The ordering is important and the current drm code
wasn't cutting it for modern DIG encoders. We need
to have information about crtc before setting up
the encoders so I've shifted the ordering a bit.
Probably we'll need a full rework akin to danvet's
recent intel patchs. This patch fixes numerous
issues with DP bridge chips and makes link training
much more reliable.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Some plls are shared for DP.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Checking of the second colorbuffer was skipped on r700, because
CB_TARGET_MASK was 0xf. With r600, CB_TARGET_MASK is changed to 0xff,
so we must set the number of samples of the second colorbuffer to 1 in order
to pass the CS checker.
The DRM version is bumped, because RESOLVE_BOX is always rejected without this
fix on r600.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This should help catch uninitialized registers and reject commands
because of that.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix compiler warning by making the function static:
Documentation/watchdog/src/watchdog-test.c:34:6: warning: no previous prototype for 'term'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Daniel writes:
"Just a few smaller things:
- Fix up a pipe vs. plane confusion from a refactoring, fixes a regression
from 3.1 (Anhua Xu).
- Fix ivb sprite pixel formats (Vijay).
- Fixup ppgtt pde placement for machines where the Bios artifically limits
the availbale gtt space in the name of ... product differentiation
(Chris). This fixes an oops.
- Yet another no_lvds quirk entry."
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard
drm/i915: Use the correct size of the GTT for placing the per-process entries
drm/i915: fix color order for BGR formats on IVB
drm/i915: fix wrong order of parameters in port checking functions
Ben says its just a single fix to avoid the wrong pcopy units being used.
* 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
On some Fermi chipsets (NVCE particularly) PCOPY1 doesn't exist. And if
what I've seen on Kepler is true of Fermi too, chipsets of the same type
can have different PCOPY units available.
This should fix a v3.5 regression reported by a number of people effecting
suspend/resume on NVC8/NVCE chipsets.
Cc: stable@vger.kernel.org [3.5]
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If verify_parent_transid() fails for all mirrors, the current code
calls repair_io_failure() anyway which means:
- that the disk block is rewritten without repairing anything and
- that a kernel log message is printed which misleadingly claims
that a read error was corrected.
This is an example:
parent transid verify failed on 615015833600 wanted 110423 found 110424
parent transid verify failed on 615015833600 wanted 110423 found 110424
btrfs read error corrected: ino 1 off 615015833600 (dev /dev/...)
It is wrong to ignore the results from verify_parent_transid() and to
call repair_eb_io_failure() when the verification of the transids failed.
This commit fixes the issue.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
We cannot just return error before freeing ordered extent and releasing reserved
space when we fail to start a transacion.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This bug is introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9
(Btrfs: lock extents as we map them in DIO).
In dio write, we should unlock the section which we didn't do IO on in case that
we fall back to buffered write. But we need to not only unlock the section
but also cleanup reserved space for the section.
This bug was found while running xfstests 133, with this 133 no longer complains.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
We can deadlock with freeze right now because we unconditionally start a
transaction in our ->sync_fs() call. To fix this just check and see if we
have a running transaction to commit. This saves us from the deadlock
because at this point we'll have the umount sem for the sb so we're safe
from freezes coming in after we've done our check. With this patch the
freeze xfstests no longer deadlocks. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Commit 442a4f6308 added btrfs device
statistic counters for detected IO and checksum errors to Linux 3.5.
The statistic part that counts checksum errors in
end_bio_extent_readpage() can cause a BUG() in a subfunction:
"kernel BUG at fs/btrfs/volumes.c:3762!"
That part is reverted with the current patch.
However, the counting of checksum errors in the scrub context remains
active, and the counting of detected IO errors (read, write or flush
errors) in all contexts remains active.
Cc: stable <stable@vger.kernel.org> # 3.5
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
With commit acce952b0, btrfs was changed to flag the filesystem with
BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
error happened like a write I/O errors of all mirrors.
In such situations, on unmount, the superblock is written in
btrfs_error_commit_super(). This is done with the intention to be able
to evaluate the error flag on the next mount. A warning is printed
in this case during the next mount and the log tree is ignored.
The issue is that it is possible that the superblock points to a root
that was not written (due to write I/O errors).
The result is that the filesystem cannot be mounted. btrfsck also does
not start and all the other btrfs-progs tools fail to start as well.
However, mount -o recovery is working well and does the right things
to recover the filesystem (i.e., don't use the log root, clear the
free space cache and use the next mountable root that is stored in the
root backup array).
This patch removes the writing of the superblock when
BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
flag in the mount function.
These lines can be used to reproduce the issue (using /dev/sdm):
SCRATCH_DEV=/dev/sdm
SCRATCH_MNT=/mnt
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
ls -alLF /dev/mapper/foo
mkfs.btrfs /dev/mapper/foo
mount /dev/mapper/foo $SCRATCH_MNT
echo bar > $SCRATCH_MNT/foo
sync
echo 0 25165824 error | dmsetup reload foo
dmsetup resume foo
ls -alF $SCRATCH_MNT
touch $SCRATCH_MNT/1
ls -alF $SCRATCH_MNT
sleep 35
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
dmsetup resume foo
sleep 1
umount $SCRATCH_MNT
btrfsck /dev/mapper/foo
dmsetup remove foo
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>