OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Catalin Marinas	f3d447a97f	arm64: Do not include asm/unistd32.h in asm/unistd.h This patch only includes asm/unistd32.h where necessary and removes its inclusion in the asm/unistd.h file. The __SYSCALL_COMPAT guard is dropped. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com>	2012-10-11 10:39:08 +01:00
Catalin Marinas	4ed27ecfca	arm64: Remove unused definitions from asm/unistd32.h This patch removes the compat __NR_* definitions from the unistd32.h file and only keeps those that are used by the AArch64 kernel with a new __NR_compat_* prefix. The additional wrapper definitions in arch/arm64/kernel/sys32.S have been removed and the actual wrapper names included in the asm/unistd32.h file. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com>	2012-10-11 10:39:08 +01:00
Gong Tao	d23b5799b6	openrisc: mask interrupts in irq_mask_ack function or1k_pic_mask_ack was failing to actually mask the IRQ. Signed-off-by: Gong Tao <gongtao0607@gmail.com> Signed-off-by: Jonas Bonn <jonas@southpole.se>	2012-10-11 11:27:26 +02:00
Jonas Bonn	8eea8a6a9a	openrisc: fix typos in comments and warnings Signed-off-by: Jonas Bonn <jonas@southpole.se>	2012-10-11 11:27:25 +02:00
Jonas Bonn	f248ef1cd3	openrisc: PIC should act on domain-local irqs Now that IRQ domains are in use, we should be acting on domain-local IRQ numbers (hwirq) instead of 'global' ones. Signed-off-by: Jonas Bonn <jonas@southpole.se>	2012-10-11 11:27:25 +02:00
Vladimir Murzin	9b76beb071	openrisc: Make cpu_relax() invoke barrier() Make cpu_relax() invoke barrier() to be the same as other arches. Signed-off-by: Vladimir Murzin <murzin.v@gmail.com> Signed-off-by: Jonas Bonn <jonas@southpole.se>	2012-10-11 11:27:19 +02:00
Ralf Baechle	35bafbee4b	Merge tag 'disintegrate-mips-20121009' of git://git.infradead.org/users/dhowells/linux-headers into mips-for-linux-next UAPI Disintegration 2012-10-09 Patchwork: https://patchwork.linux-mips.org/patch/4414/	2012-10-11 11:15:03 +02:00
Thomas Bogendoerfer	49a94e9482	MIPS: SNI: Switch RM400 serial to SCCNXP driver The new SCCNXP driver supports the SC2681 chips used in RM400 machines. We now use the new driver instead of the old SC26xx driver. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4417/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:14:13 +02:00
Ralf Baechle	fd9e8392c3	MIPS: Remove unused empty_bad_pmd_table[] declaration. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:14:12 +02:00
Ralf Baechle	2551aebc67	MIPS: MT: Remove kspd. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:14:12 +02:00
Ralf Baechle	2eaaac508a	MIPS: Malta: Fix section mismatch. LD arch/mips/pci/built-in.o WARNING: arch/mips/pci/built-in.o(.devinit.text+0x2a0): Section mismatch in reference from the function malta_piix_func0_fixup() to the variable .init.data:pci_irq The function __devinit malta_piix_func0_fixup() references a variable __initdata pci_irq. If pci_irq is only used by malta_piix_func0_fixup then annotate pci_irq with a matching annotation. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:14:12 +02:00
Ralf Baechle	3efd5a0db5	MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets. Originally added in 05b541489c48e7fbeec19a92acf8683230750d0a [Merge with Linux 2.5.5.] over 10 years ago but never been used. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:11:20 +02:00
Manuel Lauss	851d4f5d38	MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code. The PB1100/1500 are similar to their DB-cousins but with a few more devices on the bus. This patch adds PB1100/1500 support to the existing DB1100/1500 code. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: lnux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4338/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:11:20 +02:00
Manuel Lauss	24e8c1a611	MIPS: Alchemy: merge PB1550 support into DB1550 code The PB1550 is more or less a DB1550 without the PCI IDE controller, a more complicated (read: configurable) Flash setup and some other minor changes. Like the DB1550 it can be automatically detected by reading the CPLD ID register bits. This patch adds PB1550 detection and setup to the DB1550 code. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4337/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:11:20 +02:00
Manuel Lauss	bd8510df88	MIPS: Alchemy: Single kernel for DB1200/1300/1550 Combine support for the DB1200/PB1200, DB1300 and DB1550 boards into a single kernel image. defconfig-generated image verified on DB1200, DB1300 and DB1550. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4335/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:11:20 +02:00
David Daney	748e787eb6	MIPS: Optimize TLB refill for RI/XI configurations. We don't have to do a separate shift to eliminate the software bits, just rotate them into the fill and they will be ignored. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4294/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:11:20 +02:00
Ralf Baechle	981ef0de49	MIPS: proc: Cleanup printing of ASEs. The number of %s was just getting ridiculous. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:10:43 +02:00
Ralf Baechle	475032564e	MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required. Most supported systems currently hardwire cpu_has_dsp to 0, so we also can disable support for cpu_has_dsp2 resulting in a slightly smaller kernel. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:10:43 +02:00
Steven J. Hill	ee80f7c73d	MIPS: Add detection of DSP ASE Revision 2. [ralf@linux-mips.org: This patch really only detects the ASE and passes its existence on to userland via /proc/cpuinfo. The DSP ASE Rev 2. adds new resources but no resources that would need management by the kernel.] Signed-off-by: Steven J. Hill <sjhill@mips.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4165/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:05:03 +02:00
David Daney	f59a2d22a0	MIPS: Optimize pgd_init and pmd_init On a dual issue processor GCC generates code that saves a couple of clock cycles per loop if we rearrange things slightly. Checking for p != end saves a SLTU per loop, moving the increment to the middle can let it dual issue on multi-issue processors. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4249/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:04:35 +02:00
Al Cooper	a7911a8fd1	MIPS: perf: Add perf functionality for BMIPS5000 Add hardware performance counter support to kernel "perf" code for BMIPS5000. The BMIPS5000 performance counters are similar to MIPS MTI cores, so the changes were mostly made in perf_event_mipsxx.c which is typically for MTI cores. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4109/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:04:34 +02:00
Al Cooper	399aaa2568	MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMP Split the Kconfig option CONFIG_MIPS_MT_SMP into CONFIG_MIPS_MT_SMP and CONFIG_MIPS_PERF_SHARED_TC_COUNTERS so some of the code used for performance counters that are shared between threads can be used for MIPS cores that are not MT_SMP. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4108/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:04:34 +02:00
Al Cooper	ecb8ee8a89	MIPS: perf: Remove unnecessary #ifdef The #ifdef for CONFIG_HW_PERF_EVENTS is not needed because the Makefile will only compile the module if this config option is set. This means that the code under #else would never be compiled. This may have been done to leave the original broken code around for reference, but the FIXME comment above the code already shows the broken code. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4107/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:04:34 +02:00
Al Cooper	da4b62cd67	MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt) The PCI (Program Counter Interrupt) bit in the "cause" register is mandatory for MIPS32R2 cores, but has also been added to some R1 cores (BMIPS5000). This change adds a cpu feature bit to make it easier to check for and use this feature. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4106/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:04:34 +02:00
Al Cooper	c5600b2dd9	MIPS: perf: Change the "mips_perf_event" table unsupported indicator. Change the indicator from 0xffffffff in the "event_id" member to zero in the "cntr_mask" member. This removes the need to initialize entries that are unsupported. This also solves a problem where the number of entries in the table was increased based on a global enum used for all platforms, but the new unsupported entries were not added for mips. This was leaving new table entries of all zeros that were not marked UNSUPPORTED. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4110/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:02:41 +02:00
David Daney	485172b3df	MIPS: Align swapper_pg_dir to 64K for better TLB Refill code. We can save an instruction in the TLB Refill path for kernel mappings by aligning swapper_pg_dir on a 64K boundary. The address of swapper_pg_dir can be generated with a single LUI instead of LUI/{D}ADDUI. The alignment of __init_end is bumped up to 64K so there are no holes between it and swapper_pg_dir, which is placed at the very beginning of .bss. The alignment of invalid_pmd_table and invalid_pte_table can be relaxed to PAGE_SIZE. We do this by using __page_aligned_bss, which has the added benefit of eliminating alignment holes in .bss. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Cc: linux-arch@vger.kernel.org, Cc: linux-kernel@vger.kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Patchwork: https://patchwork.linux-mips.org/patch/4220/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:02:40 +02:00
David Daney	c87728ca82	vmlinux.lds.h: Allow architectures to add sections to the front of .bss Follow-on MIPS patch will put an object here that needs 64K alignment to minimize padding. For those architectures that don't define BSS_FIRST_SECTIONS, there is no change. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Cc: linux-arch@vger.kernel.org, Cc: linux-kernel@vger.kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Patchwork: https://patchwork.linux-mips.org/patch/4221/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:02:37 +02:00
Joshua Kinard	b4f2a17ba9	Improve atomic.h robustness I've maintained this patch, originally from Thiemo Seufer in 2004, for a really long time, but I think it's time for it to get a look at for possible inclusion. I have had no problems with it across various SGI systems over the years. To quote the post here: http://www.linux-mips.org/archives/linux-mips/2004-12/msg00000.html "the atomic functions use so far memory references for the inline assembler to access the semaphore. This can lead to additional instructions in the ll/sc loop, because newer compilers don't expand the memory reference any more but leave it to the assembler. The appended patch uses registers instead, and makes the ll/sc arguments more explicit. In some cases it will lead also to better register scheduling because the register isn't bound to an output any more." Signed-off-by: Joshua Kinard <kumba@gentoo.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4029/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-10-11 11:02:36 +02:00
Dmitry Torokhov	0cc8d6a9d2	Merge branch 'next' into for-linus Prepare second set of updates for 3.7 merge window (Wacom driver update and patches extending number of input minors).	2012-10-11 00:45:21 -07:00
NeilBrown	72f36d5972	md: refine reporting of resync/reshape delays. If 'resync_max' is set to 0 (as is often done when starting a reshape, so that mdadm can remain in control during a sensitive period), and if the reshape request is initially delayed because another array using the same array is resyncing or reshaping etc, then user-space cannot easily tell when the delay changes from being due to a conflicting reshape, to being due to resync_max = 0. So introduce a new state: (curr_resync == 3) to reflect this, make sure it is visible both via /proc/mdstat and via the "sync_completed" sysfs attribute, and ensure that the event transition from one delay state to the other is properly notified. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 14:25:57 +11:00
NeilBrown	e56108d65f	md/raid5: be careful not to resize_stripes too big. When a RAID5 is reshaping, conf->raid_disks is increased before mddev->delta_disks becomes zero. This can result in check_reshape calling resize_stripes with a number that is too large. This particularly happens when md_check_recovery calls ->check_reshape(). If we use ->previous_raid_disks, we don't risk this. Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 14:24:13 +11:00
NeilBrown	db07d85ef6	md: make sure manual changes to recovery checkpoint are saved. If you make an array bigger but suppress resync of the new region with mdadm --grow /dev/mdX --size=max --assume-clean then stop the array before anything is written to it, the effect of the "--assume-clean" is lost and the array will resync the new space when restarted. So ensure that we update the metadata in the case. Reported-by: Sebastian Riemer <sebastian.riemer@profitbricks.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 14:22:17 +11:00
Dan Carpenter	91502f099d	md/raid10: use correct limit variable Clang complains that we are assigning a variable to itself. This should be using bad_sectors like the similar earlier check does. Bug has been present since 3.1-rc1. It is minor but could conceivably cause corruption or other bad behaviour. Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 14:20:58 +11:00
NeilBrown	48c26ddc9f	md: writing to sync_action should clear the read-auto state. In some cases arrays are started in 'read-auto' state, wherein nothing gets written to any device until the array is written to. The purpose of this is to make accidental auto-assembly of the wrong arrays less of a risk, and to allow arrays to be started to read suspend-to-disk images without actually changing anything (as might happen if the array were dirty and a resync seemed necessary). Explicitly writing the 'sync_action' for a read-auto array currently doesn't clear the read-auto state, so the sync action doesn't happen, which can be confusing. So allow any successful write to sync_action to clear any read-auto state. Reported-by: Alexander Kühn <alexander.kuehn@nagilum.de> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 14:19:39 +11:00
Jianpeng Ma	7f7583d420	md: change resync_mismatches to atomic64_t to avoid races Now that multiple threads can handle stripes, it is safer to use an atomic64_t for resync_mismatches, to avoid update races. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 14:17:59 +11:00
Hiroaki SHIMODA	8edc0e624d	e1000e: Change wthresh to 1 to avoid possible Tx stalls This patch originated from Hiroaki SHIMODA but has been modified by Intel with some minor cleanups and additional commit log text. Denys Fedoryshchenko and others reported Tx stalls on e1000e with BQL enabled. The issue was root-caused to hardware delays. They were introduced because some of the e1000e hardware with transmit writeback bursting enabled waits until the driver does an explicit flush OR there are WTHRESH descriptors to write back. Sometimes the delays in question were on the order of seconds, causing visible lag for ssh sessions and unacceptable tx completion latency, especially for BQL enabled kernels. To avoid possible Tx stalls, change WTHRESH back to 1. The current plan is to investigate a method for re-enabling WTHRESH while not harming BQL, but those patches will be later for net-next if they work. Please enqueue for stable since v3.3 as this bug was introduced in commit `3f0cfa3bc1` Author: Tom Herbert <therbert@google.com> Date: Mon Nov 28 16:33:16 2011 +0000 e1000e: Support for byte queue limits Changes to e1000e to use byte queue limits. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> CC: eric.dumazet@gmail.com CC: therbert@google.com Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:59:18 -04:00
David S. Miller	959859d2fe	Merge branch 'uapi-for-3.7' of git://gitorious.org/linux-can/linux-can Marc Kleine-Budde says: ==================== this pull request for net, i.e. the v3.7 release cycle, contains the patch by David Howells to move the UAPI related headers for the CAN subsystem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:57:16 -04:00
stephen hemminger	68aaed54e7	ipv4: fix route mark sparse warning Sparse complains about RTA_MARK which should be host order according to include file and usage in iproute. net/ipv4/route.c:2223:46: warning: incorrect type in argument 3 (different base types) net/ipv4/route.c:2223:46: expected restricted __be32 [usertype] value net/ipv4/route.c:2223:46: got unsigned int [unsigned] [usertype] flowic_mark Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:54:59 -04:00
Ian Campbell	6a8ed462f1	xen: netback: handle compound page fragments on transmit. An SKB paged fragment can consist of a compound page with order > 0. However the netchannel protocol deals only in PAGE_SIZE frames. Handle this in netbk_gop_frag_copy and xen_netbk_count_skb_slots by iterating over the frames which make up the page. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:50:45 -04:00
Sarveshwar Bandi	6caab7b054	bridge: Pull ip header into skb->data before looking into ip header. If lower layer driver leaves the ip header in the skb fragment, it needs to be first pulled into skb->data before inspecting ip header length or ip version number. Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:50:45 -04:00
Dan Carpenter	435f08a721	isdn: fix a wrapping bug in isdn_ppp_ioctl() "protos" is an array of unsigned longs and "i" is the number of bits in an unsigned long so we need to use 1UL as well to prevent the shift from wrapping around. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:50:45 -04:00
NeilBrown	1ed850f356	md/raid5: make sure to_read and to_write never go negative. to_read and to_write are part of the result of analysing a stripe before handling it. Their use is to avoid some loops and tests if the values are known to be zero. Thus it is not a problem if they are a little bit larger than they should be. So decrementing them in handle_failed_stripe serves little value, and due to races it could cause some loops to be skipped incorrectly. So remove those decrements. Reported-by: "Jianpeng Ma" <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:50:13 +11:00
Alexander Lyakas	a7854487cd	md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write. Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Suggested-by: Yair Hershko <yair@zadarastorage.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:50:12 +11:00
NeilBrown	b97390aec4	md/raid5: protect debug message against NULL dereference. The pr_debug in add_stripe_bio could race with something changing *bip, so it is best to hold the lock until after the pr_debug. Reported-by: "Jianpeng Ma" <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:50:12 +11:00
NeilBrown	143c4d0573	md/raid5: add some missing locking in handle_failed_stripe. We really should hold the stripe_lock while accessing 'toread' else we could race with add_stripe_bio and corrupt a list. Reported-by: "Jianpeng Ma" <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:50:12 +11:00
Shaohua Li	9e44476851	MD: raid5 avoid unnecessary zero page for trim We want to avoid zeroing the discarded dev page, because it's useless for discard. But if we don't zero it, another read/write can hit such a page in the cache and will get inconsistent data. To avoid zeroing the page, we don't set the R5_UPTODATE flag after construction is done. In this way, the discard write request is still issued and finished, but a read will not hit the page. If the stripe gets accessed soon, we need to reread the stripe, but since the chance is low, the reread isn't a big deal. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:49:49 +11:00
Shaohua Li	620125f2bf	MD: raid5 trim support Discard for raid4/5/6 has limitation. If discard request size is small, we do discard for one disk, but we need calculate parity and write parity disk. To correctly calculate parity, zero_after_discard must be guaranteed. Even it's true, we need do discard for one disk but write another disks, which makes the parity disks wear out fast. This doesn't make sense. So an efficient discard for raid4/5/6 should discard all data disks and parity disks, which requires the write pattern to be (A, A+chunk_size, A+chunk_size*2...). If A's size is smaller than chunk_size, such pattern is almost impossible in practice. So in this patch, I only handle the case that A's size equals to chunk_size. That is discard request should be aligned to stripe size and its size is multiple of stripe size. Since we can only handle request with specific alignment and size (or part of the request fitting stripes), we can't guarantee zero_after_discard even zero_after_discard is true in low level drives. The block layer doesn't send down correctly aligned requests even correct discard alignment is set, so I must filter out. For raid4/5/6 parity calculation, if data is 0, parity is 0. So if zero_after_discard is true for all disks, data is consistent after discard. Otherwise, data might be lost. Let's consider a scenario: discard a stripe, write data to one disk and write parity disk. The stripe could be still inconsistent till then depending on using data from other data disks or parity disks to calculate new parity. If the disk is broken, we can't restore it. So in this patch, we only enable discard support if all disks have zero_after_discard. If discard fails in one disk, we face the similar inconsistent issue above. The patch will make discard follow the same path as normal write request. If discard fails, a resync will be scheduled to make the data consistent. This isn't good to have extra writes, but data consistency is important. If a subsequent read/write request hits raid5 cache of a discarded stripe, the discarded dev page should have zero filled, so the data is consistent. This patch will always zero dev page for discarded request stripe. This isn't optimal because discard request doesn't need such payload. Next patch will avoid it. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:49:05 +11:00
Jianpeng Ma	582e2e056a	md/bitmap: Don't use IS_ERR to judge alloc_page(). Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:45:36 +11:00
NeilBrown	7ad4d4a68a	md/raid1: Don't release reference to device while handling read error. When we get a read error, we arrange for raid1d to handle it. Currently we release the reference on the device. This can result in conf->mirrors[read_disk].rdev being NULL in fix_read_error, if the device happens to get removed before the read error is handled. So instead keep the reference until the read error has been fully handled. Reported-by: hank <pyu@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:44:30 +11:00
Michael Wang	fd177481b4	raid: replace list_for_each_continue_rcu with new interface This patch replaces list_for_each_continue_rcu() with list_for_each_entry_continue_rcu() to save a few lines of code and allow removing list_for_each_continue_rcu(). Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: NeilBrown <neilb@suse.de>	2012-10-11 13:43:21 +11:00

334401 Commits, All Branches