This fixes a race which happens by freeing an object on the stack.
Quoting Julius:
> The issue is
> that it calls usbnet_terminate_urbs() before that, which temporarily
> installs a waitqueue in dev->wait in order to be able to wait on the
> tasklet to run and finish up some queues. The waiting itself looks
> okay, but the access to 'dev->wait' is totally unprotected and can
> race arbitrarily. I think in this case usbnet_bh() managed to succeed
> it's dev->wait check just before usbnet_terminate_urbs() sets it back
> to NULL. The latter then finishes and the waitqueue_t structure on its
> stack gets overwritten by other functions halfway through the
> wake_up() call in usbnet_bh().
The fix is to just not allocate the data structure on the stack.
As dev->wait is abused as a flag it also takes a runtime PM change
to fix this bug.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Reported-by: Grant Grundler <grundler@google.com>
Tested-by: Grant Grundler <grundler@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current error handling of virtqueue_kick() was wrong in two places:
- The skb were freed immediately when virtqueue_kick() fail during
xmit. This may lead double free since the skb was not detached from
the virtqueue.
- try_fill_recv() returns false when virtqueue_kick() fail. This will
lead unnecessary rescheduling of refill work.
Actually, it's safe to just ignore the kick failure in those two
places. So this patch fixes this by partially revert commit
6797590118.
Fixes 6797590118
(virtio_net: verify if virtqueue_kick() succeeded).
Cc: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we allocated anon_inode_inode in anon_inodefs_mount. This is
somewhat fragile as if that function ever gets called again, it will
overwrite anon_inode_inode pointer. So move the initialization of
anon_inode_inode to anon_inode_init().
Signed-off-by: Jan Kara <jack@suse.cz>
[ Further simplified on suggestion from Dave Jones ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we were on a non-optimus device, we'd return -EINVAL, this would
lead to the over engineered runtime pm system to go into an error
state, subsequent get_sync's would fail, so we'd never be able
to open the device again.
(like really get_sync shouldn't fail if the device isn't powered
down).
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
this stops the device from being deleted before all the dma-bufs
on it are freed, this fixes an oops when you unplug a udl device while
it has imported a buffer from another device.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some applications didn't expect recvmsg() on a non blocking socket
could return -EINTR. This possibility was added as a side effect
of commit b3ca9b02b0 ("net: fix multithreaded signal handling in
unix recv routines").
To hit this bug, you need to be a bit unlucky, as the u->readlock
mutex is usually held for very small periods.
Fixes: b3ca9b02b0 ("net: fix multithreaded signal handling in unix recv routines")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Petazzoni says:
====================
net: mvneta: fix usage as a module
The following set of two patches fix the usage of the mvneta driver
when built as a module, and used in RGMII configurations. It is
somewhat similar to a previous fix that was made by Arnaud Patard, but
which was limited to SGMII configurations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta driver currently uses of_iomap(), which has two drawbacks:
it doesn't request the resource, and it isn't devm-style so some error
handling is needed.
This commit switches to use devm_ioremap_resource() instead, which
automatically requests the resource (so the I/O registers region shows
up properly in /proc/iomem), and also is devm-style, which allows to
get rid of some error handling to unmap the I/O registers region.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5445eaf309 ('mvneta: Try to fix mvneta when compiled as
module') fixed the mvneta driver to make it work properly when loaded
as a module in SGMII configuration, which was tested successful by the
author on the Armada XP OpenBlocks AX3, which uses SGMII.
However, it turns out that the Armada XP GP, which uses RGMII, is
affected by a similar problem: its SERDES configuration is lost when
mvneta is loaded as a module, because this configuration is set by the
bootloader, and then lost because the clock is gated by the clock
framework until the mvneta driver is loaded again and the clock is
re-enabled.
However, it turns out that for the RGMII case, setting the SERDES
configuration is not sufficient: the PCS enable bit in the
MVNETA_GMAC_CTRL_2 register must also be set, like in the SGMII
configuration.
Therefore, this commit reworks the SGMII/RGMII initialization: the
only difference between the two now is a different SERDES
configuration, all the rest is identical.
In detail, to achieve this, the commit:
* Renames MVNETA_SGMII_SERDES_CFG to MVNETA_SERDES_CFG because it is
not specific to SGMII, but also used on RGMII configurations.
* Adds a MVNETA_RGMII_SERDES_PROTO definition, that must be used as
the MVNETA_SERDES_CFG value in RGMII configurations.
* Removes the mvneta_gmac_rgmii_set() and mvneta_port_sgmii_config()
functions, and instead directly do the SGMII/RGMII configuration in
mvneta_port_up(), from where those functions where called. It is
worth mentioning that mvneta_gmac_rgmii_set() had an 'enable'
parameter that was always passed as '1', so it was pretty useless.
* Reworks the mvneta_port_up() function to set the MVNETA_SERDES_CFG
register to the appropriate value depending on the RGMII vs. SGMII
configuration. It also unconditionally set the PCS_ENABLE bit (was
already done for SGMII, but is now also needed for RGMII), and sets
the PORT_RGMII bit (which was already done for both SGMII and
RGMII).
This commit was successfully tested with mvneta compiled as a module,
on both the OpenBlocks AX3 (SGMII configuration) and the Armada XP GP
(RGMII configuration).
Reported-by: Steve McIntyre <steve@einval.com>
Cc: stable@vger.kernel.org # 3.11.x: 5445eaf309 mvneta: Try to fix mvneta when compiled as module
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bit 3 of the MVNETA_GMAC_CTRL_2 is actually used to enable the PCS,
not the PSC: there was a typo in the name of the define, which this
commit fixes.
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The cypress PS/2 trackpad models supported by the cypress_ps2 driver
emulate BTN_RIGHT events in firmware based on the finger position, as part
of this no motion events are sent when the finger is in the button area.
The INPUT_PROP_BUTTONPAD property is there to indicate to userspace that
BTN_RIGHT events should be emulated in userspace, which is not necessary
in this case.
When INPUT_PROP_BUTTONPAD is advertised userspace will wait for a motion
event before propagating the button event higher up the stack, as it needs
current abs x + y data for its BTN_RIGHT emulation. Since in the
cypress_ps2 pads don't report motion events in the button area, this means
that clicks in the button area end up being ignored, so
INPUT_PROP_BUTTONPAD actually causes problems for these touchpads, and
removing it fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=76341
Reported-by: Adam Williamson <awilliam@redhat.com>
Tested-by: Adam Williamson <awilliam@redhat.com>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Including hardware acceleration features in vlan_features breaks
stacked vlans (Q-in-Q) by marking the bottom vlan interface as
capable of acceleration. This causes one of the tags to be lost
and the packets are sent with a sing vlan header.
CC: Nithin Nayak Sujir <nsujir@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 10ddceb22b (ip_tunnel:multicast process cause panic due
to skb->_skb_refdst NULL pointer) removed dst-drop call from
ip-tunnel-recv.
Following commit reintroduce dst-drop and fix the original bug by
checking loopback packet before releasing dst.
Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
my slides for the event trigger tutorial. I booted a 3.14-rc7 kernel
to perform what I wanted to teach and cut and paste it into my slides.
When I tried the traceon event trigger with a condition attached to it
(turns tracing on only if a field of the trigger event matches a condition
set by the user), nothing happened. Tracing would not turn on. I stopped
working on my presentation in order to find what was wrong.
It ended up being the way trace event triggers work when they have
conditions. Instead of copying the fields, the condition code just
looks at the fields that were copied into the ring buffer. This works
great, unless tracing is off. That's because when the event is reserved
on the ring buffer, the ring buffer returns a NULL pointer, this tells
the tracing code that the ring buffer is disabled. This ends up being
a problem for the traceon trigger if it is using this information to
check its condition.
Luckily the code that checks if tracing is on returns the ring buffer
to use (because the ring buffer is determined by the event file
also passed to that field). I was able to easily solve this bug by
checking in that helper function if the returned ring buffer entry
is NULL, and if so, also check the file flag if it has a trace event
trigger condition, and if so, to pass back a temp ring buffer to use.
This will allow the trace event trigger condition to still test the
event fields, but nothing will be recorded.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTMtD2AAoJEKQekfcNnQGuwrUH/juk/qKY7T/nq98iMleOD+LR
iTQyWh9zvHdku/3ZlK5eJqla2TJUctM5Sf9WhvV/iXJ2/9m5qnekZ6gCoraGVpG4
qfKZAiHdGU0sFdp2NIt1uIHcqjYc8dg2KGajEKGNr7E2F3fvSTfNZV+WKAnmGqlE
8ICsoYJ/Uv4kh18QQJexbyd7jXWE8tzxxS9UzPvCtCiSd1xf2yCAS5OlwPYafgiv
Egbjqq7IUCJfVb0h/YfgPbEW69sjlxOulzn42Nii+UOx9uv1RTdtXdivyCXDWfX3
Que0Do757oTvpW/22s9T3mpUtYLCB+6B2GV+ljqbLCZ5pRywc/GDCU//D1AXKEI=
=dKWH
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v3.14-rc7-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"While on my flight to Linux Collaboration Summit, I was working on my
slides for the event trigger tutorial. I booted a 3.14-rc7 kernel to
perform what I wanted to teach and cut and paste it into my slides.
When I tried the traceon event trigger with a condition attached to it
(turns tracing on only if a field of the trigger event matches a
condition set by the user), nothing happened. Tracing would not turn
on. I stopped working on my presentation in order to find what was
wrong.
It ended up being the way trace event triggers work when they have
conditions. Instead of copying the fields, the condition code just
looks at the fields that were copied into the ring buffer. This works
great, unless tracing is off. That's because when the event is
reserved on the ring buffer, the ring buffer returns a NULL pointer,
this tells the tracing code that the ring buffer is disabled. This
ends up being a problem for the traceon trigger if it is using this
information to check its condition.
Luckily the code that checks if tracing is on returns the ring buffer
to use (because the ring buffer is determined by the event file also
passed to that field). I was able to easily solve this bug by
checking in that helper function if the returned ring buffer entry is
NULL, and if so, also check the file flag if it has a trace event
trigger condition, and if so, to pass back a temp ring buffer to use.
This will allow the trace event trigger condition to still test the
event fields, but nothing will be recorded"
* tag 'trace-fixes-v3.14-rc7-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix traceon trigger condition to actually turn tracing on
While working on my tutorial for 2014 Linux Collaboration Summit
I found that the traceon trigger did not work when conditions were
used. The other triggers worked fine though. Looking into it, it
is because of the way the triggers use the ring buffer to store
the fields it will use for the condition. But if tracing is off, nothing
is stored in the buffer, and the tracepoint exits before calling the
trigger to test the condition. This is fine for all the triggers that
only work when tracing is on, but for traceon trigger that is to
work when tracing is off, nothing happens.
The fix is simple, just use a temp ring buffer to record the event
if tracing is off and the event has a trace event conditional trigger
enabled. The rest of the tracepoint code will work just fine, but
the tracepoint wont be recorded in the other buffers.
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The previous commit removed the register_filesystem() call and the
associated error handling, but left the label for the error path that no
longer exists. Remove that too.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
anon_inodefs filesystem is a kernel internal filesystem userspace
shouldn't mess with. Remove registration of it so userspace cannot
even try to mount it (which would fail anyway because the filesystem is
MS_NOUSER).
This fixes an oops triggered by trinity when it tried mounting
anon_inodefs which overwrote anon_inode_inode pointer while other CPU
has been in anon_inode_getfile() between ihold() and d_instantiate().
Thus effectively creating dentry pointing to an inode without holding a
reference to it.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull nfsd fix frm Bruce Fields:
"J R Okajima sent this early and I was just slow to pass it along,
apologies. Fortunately it's a simple fix"
* 'nfsd-next' of git://linux-nfs.org/~bfields/linux:
nfsd: fix lost nfserrno() call in nfsd_setattr()
Pull vfs fixes from Al Viro:
"These four commits are obvious fixes (a couple of fdget_pos()-related
ones from Eric Biggers, prepend_name() fix, missing checks for false
negatives from __lookup_mnt() in fs/namei.c)"
For now I'm pulling just the four obvious fixes, there's another four
pending in Al's 'for-linus' branch wrt the mnt_hash list that were more
involved.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
rcuwalk: recheck mount_lock after mountpoint crossing attempts
make prepend_name() work correctly when called with negative *buflen
vfs: Don't let __fdget_pos() get FMODE_PATH files
vfs: atomic f_pos access in llseek()
This reverts commit a9c8e4beee.
PTEs in Xen PV guests must contain machine addresses if _PAGE_PRESENT
is set and pseudo-physical addresses is _PAGE_PRESENT is clear.
This is because during a domain save/restore (migration) the page
table entries are "canonicalised" and uncanonicalised". i.e., MFNs are
converted to PFNs during domain save so that on a restore the page
table entries may be rewritten with the new MFNs on the destination.
This canonicalisation is only done for PTEs that are present.
This change resulted in writing PTEs with MFNs if _PAGE_PROTNONE (or
_PAGE_NUMA) was set but _PAGE_PRESENT was clear. These PTEs would be
migrated as-is which would result in unexpected behaviour in the
destination domain. Either a) the MFN would be translated to the
wrong PFN/page; b) setting the _PAGE_PRESENT bit would clear the PTE
because the MFN is no longer owned by the domain; or c) the present
bit would not get set.
Symptoms include "Bad page" reports when munmapping after migrating a
domain.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org> [3.12+]
Xen balloon driver will update ballooned out pages' P2M entries to point
to scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
page while non-preemptible", kmap_flush_unused was moved after updating
P2M table. In that case for 32 bit PV guest we might end up with
P2M X -----> S (S is mfn of balloon scratch page)
M2P Y -----> X (Y is mfn in persistent kmap entry)
kmap_flush_unused() iterates through all the PTEs in the kmap address
space, using pte_to_page() to obtain the page. If the p2m and the m2p
are inconsistent the incorrect page is returned. This will clear
page->address on the wrong page which may cause subsequent oopses if
that page is currently kmap'ed.
Move the flush back between get_page and __set_phys_to_machine to fix
this.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: stable@vger.kernel.org # 3.12+
Pull parisc updates from Helge Deller:
- revert parts of the latest patch regarding font selection with STICON
console
- wire up the utimes() syscall for parisc
- remove the unused parisc tmpalias code and unnecessary arch*relax
defines
* 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: locks: remove redundant arch_*_relax operations
parisc: wire up sys_utimes
parisc: Remove unused CONFIG_PARISC_TMPALIAS code
partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over built-in ROM fonts
Pull sparc fixes from David Miller:
1) Do serial locking in a way that makes things clear that these are
IRQ spinlocks.
2) Conversion to generic idle loop broke first generation Niagara
machines, need to have %pil interrupts enabled during cpu yield
hypervisor call.
3) Do not use magic constants for iterations over tsb tables, from Doug
Wilson.
4) Fix erroneous truncation of 64-bit system call return values to
32-bit. From Dave Kleikamp.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Make sure %pil interrupts are enabled during hypervisor yield.
sparc64:tsb.c:use array size macro rather than number
sparc64: don't treat 64-bit syscall return codes as 32-bit
sparc: serial: Clean up the locking for -rt
Pull networking fixes from David Miller:
1) OpenVswitch's lookup_datapath() returns error pointers, so don't
check against NULL. From Jiri Pirko.
2) pfkey_compile_policy() code path tries to do a GFP_KERNEL allocation
under RCU locks, fix by using GFP_ATOMIC when necessary. From
Nikolay Aleksandrov.
3) phy_suspend() indirectly passes uninitialized data into the ethtool
get wake-on-land implementations. Fix from Sebastian Hesselbarth.
4) CPSW driver unregisters CPTS twice, fix from Benedikt Spranger.
5) If SKB allocation of reply packet fails, vxlan's arp_reduce() defers
a NULL pointer. Fix from David Stevens.
6) IPV6 neigh handling in vxlan doesn't validate the destination
address properly, and it builds a packet with the src and dst
reversed. Fix also from David Stevens.
7) Fix spinlock recursion during subscription failures in TIPC stack,
from Erik Hugne.
8) Revert buggy conversion of davinci_emac to devm_request_irq, from
Chrstian Riesch.
9) Wrong flags passed into forwarding database netlink notifications,
from Nicolas Dichtel.
10) The netpoll neighbour soliciation handler checks wrong ethertype,
needs to be ETH_P_IPV6 rather than ETH_P_ARP. Fix from Li RongQing.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
tipc: fix spinlock recursion bug for failed subscriptions
vxlan: fix nonfunctional neigh_reduce()
net: davinci_emac: Fix rollback of emac_dev_open()
net: davinci_emac: Replace devm_request_irq with request_irq
netpoll: fix the skb check in pkt_is_ns
net: micrel : ks8851-ml: add vdd-supply support
ip6mr: fix mfc notification flags
ipmr: fix mfc notification flags
rtnetlink: fix fdb notification flags
tcp: syncookies: do not use getnstimeofday()
netlink: fix setsockopt in mmap examples in documentation
openvswitch: Correctly report flow used times for first 5 minutes after boot.
via-rhine: Disable device in error path
ATHEROS-ATL1E: Convert iounmap to pci_iounmap
vxlan: fix potential NULL dereference in arp_reduce()
cnic: Update version to 2.5.20 and copyright year.
cnic,bnx2i,bnx2fc: Fix inconsistent use of page size
cnic: Use proper ulp_ops for per device operations.
net: cdc_ncm: fix control message ordering
ipv6: ip6_append_data_mtu do not handle the mtu of the second fragment properly
...
If a topology event subscription fails for any reason, such as out
of memory, max number reached or because we received an invalid
request the correct behavior is to terminate the subscribers
connection to the topology server. This is currently broken and
produces the following oops:
[27.953662] tipc: Subscription rejected, illegal request
[27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
[27.957066] lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
[27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5
[27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
[27.961430] ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
[27.962292] ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
[27.963152] ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
[27.964023] Call Trace:
[27.964292] [<ffffffff815c0207>] dump_stack+0x45/0x56
[27.964874] [<ffffffff815beec5>] spin_dump+0x8c/0x91
[27.965420] [<ffffffff815beeeb>] spin_bug+0x21/0x26
[27.965995] [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140
[27.966631] [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20
[27.967256] [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
[27.968051] [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc]
[27.968722] [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc]
[27.969436] [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
[27.970209] [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc]
[27.970972] [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc]
[27.971633] [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0
[27.972267] [<ffffffff8105e869>] worker_thread+0x119/0x3a0
[27.972896] [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0
[27.973622] [<ffffffff810648af>] kthread+0xdf/0x100
[27.974168] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
[27.974893] [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0
[27.975466] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
The recursion occurs when subscr_terminate tries to grab the
subscriber lock, which is already taken by subscr_conn_msg_event.
We fix this by checking if the request to establish a new
subscription was successful, and if not we initiate termination of
the subscriber after we have released the subscriber lock.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VXLAN neigh_reduce() code is completely non-functional since
check-in. Specific errors:
1) The original code drops all packets with a multicast destination address,
even though neighbor solicitations are sent to the solicited-node
address, a multicast address. The code after this check was never run.
2) The neighbor table lookup used the IPv6 header destination, which is the
solicited node address, rather than the target address from the
neighbor solicitation. So neighbor lookups would always fail if it
got this far. Also for L3MISSes.
3) The code calls ndisc_send_na(), which does a send on the tunnel device.
The context for neigh_reduce() is the transmit path, vxlan_xmit(),
where the host or a bridge-attached neighbor is trying to transmit
a neighbor solicitation. To respond to it, the tunnel endpoint needs
to do a *receive* of the appropriate neighbor advertisement. Doing a
send, would only try to send the advertisement, encapsulated, to the
remote destinations in the fdb -- hosts that definitely did not do the
corresponding solicitation.
4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the
isrouter flag in the advertisement. This has nothing to do with whether
or not the target is a router, and generally won't be set since the
tunnel endpoint is bridging, not routing, traffic.
The patch below creates a proxy neighbor advertisement to respond to
neighbor solicitions as intended, providing proper IPv6 support for neighbor
reduction.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Riesch says:
====================
net: davinci_emac: Fix interrupt requests and error handling
since commit 6892b41d97 (Linux 3.11) the
davinci_emac driver is broken. After doing ifconfig down, ifconfig up,
requesting the interrupts for the driver fails. The interface remains dead
until the board is rebooted.
The first patch in this patchset reverts commit
6892b41d97 partially and makes the driver
useable again.
During the work on the first patch, a number of bugs in the error handling
of the driver's ndo_open code were found. The second patch fixes these bugs.
I believe the first patch meets the rules for stable kernels, I therefore added
the stable tag to this patch. The second patch is just cleanup, the code
that is fixed by this patch is only executed in case of an error.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If an error occurs during the initialization in emac_dev_open() (the
driver's ndo_open function), interrupts, DMA descriptors etc. must be freed.
The current rollback code is buggy in several ways.
1) Freeing the interrupts. The current code will not free all interrupts
that were requested by the driver. Furthermore, the code tries to do a
platform_get_resource(priv->pdev, IORESOURCE_IRQ, -1) in its last
iteration.
This patch fixes these bugs.
2) Wrong order of err: and rollback: labels. If the setup of the PHY in
the code fails, the interrupts that have been requested before are
not freed:
request irq
if requesting irqs fails, goto rollback
setup phy
if phy setup fails, goto err
return 0
rollback:
free irqs
err:
This patch brings the code into the correct order.
3) The code calls napi_enable() and emac_int_enable(), but does not
undo both in case of an error.
This patch adds calls of emac_int_disable() and napi_disable() to the
rollback code.
4) RX DMA descriptors are not freed in case of an error: Right before
requesting the irqs, the function creates DMA descriptors for the
RX channel. These RX descriptors are never freed when we jump to either
rollback or err.
This patch adds code for freeing the DMA descriptors in the case of
an initialization error. This required a modification of
cpdma_ctrl_stop() in davinci_cpdma.c: We must be able to call this
function to free the DMA descriptors while the DMA channels are
in IDLE state (before cpdma_ctlr_start() was called).
Tested on a custom board with the Texas Instruments AM1808.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 6892b41d97
Author: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Date: Tue Jun 25 21:24:51 2013 +0530
net: davinci: emac: Convert to devm_* api
the call of request_irq is replaced by devm_request_irq and the call
of free_irq is removed. But since interrupts are requested in
emac_dev_open, doing ifconfig up/down on the board requests the
interrupts again each time, causing devm_request_irq to fail. The
interface is dead until the device is rebooted.
This patch reverts said commit partially: It changes the driver back
to use request_irq instead of devm_request_irq, puts free_irq back in
place, but keeps the remaining changes of the original patch.
Reported-by: Jon Ringle <jon@ringle.org>
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neighbor Solicitation is ipv6 protocol, so we should check
skb->protocol with ETH_P_IPV6
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Cc: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In arch_cpu_idle() we must enable %pil based interrupts before
potentially invoking the hypervisor cpu yield call.
As per the Hypervisor API documentation for cpu_yield:
Interrupts which are blocked by some mechanism other that
pstate.ie (for example %pil) are not guaranteed to cause
a return from this service.
It seems that only first generation Niagara chips are hit by this
bug. My best guess is that later chips implement this in hardware
and wake up anyways from %pil events, whereas in first generation
chips the yield is implemented completely in hypervisor code and
requires %pil to be enabled in order to wake properly from this
call.
Fixes: 87fa05aeb3 ("sparc: Use generic idle loop")
Reported-by: Fabio M. Di Nitto <fabbione@fabbione.net>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes a build break due to the undeclared use of irq_of_parse_and_map()
and of_iomap(). This build break was apparently introduced while the
driver was unbuildable due to the bug fixed by
62c19c9d29 ("i2c: Remove usage of
orphaned symbol OF_I2C"). When 62c19c was added in v3.14-rc7,
the driver was enabled again, breaking the powerpc mpc85xx_defconfig
and mpc85xx_smp_defconfig.
62c19c is marked for stable, so this should go there as well.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Few platforms use external regulator to keep the ethernet MAC supplied.
So, request and enable the regulator for driver functionality.
Fixes: 66fda75f47 (regulator: core: Replace direct ops->disable usage)
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Suggested-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the arch_{spin,read,write}_relax macros default to cpu_relax(),
remove the redundant definitions for parisc.
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Helge Deller <deller@gmx.de>
We seem to be nearly the only platform which does not provide the
sys_utimes syscall. Adding it now makes our life much easier with
userspace applications (like dietlibc and e2fsprogs) since we then
behave like all other platforms too and don't need extra patches which
are hard to get upstream anyway because we are not a mainstream
architecture.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v3.13
The attached change removes the unused and experimental
CONFIG_PARISC_TMPALIAS code. It doesn't work and I don't believe it will
ever be used.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
STI console is used on parisc and m68k HP machines. This patch partly reverts
my previous commit and as such restores the fonts for the m68k machines.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v3.13
We can get false negative from __lookup_mnt() if an unrelated vfsmount
gets moved. In that case legitimize_mnt() is guaranteed to fail,
and we will fall back to non-RCU walk... unless we end up running
into a hard error on a filesystem object we wouldn't have reached
if not for that false negative. IOW, delaying that check until
the end of pathname resolution is wrong - we should recheck right
after we attempt to cross the mountpoint. We don't need to recheck
unless we see d_mountpoint() being true - in that case even if
we have just raced with mount/umount, we can simply go on as if
we'd come at the moment when the sucker wasn't a mountpoint; if we
run into a hard error as the result, it was a legitimate outcome.
__lookup_mnt() returning NULL is different in that respect, since
it might've happened due to operation on completely unrelated
mountpoint.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In all callchains leading to prepend_name(), the value left in *buflen
is eventually discarded unused if prepend_name() has returned a negative.
So we are free to do what prepend() does, and subtract from *buflen
*before* checking for underflow (which turns into checking the sign
of subtraction result, of course).
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit bd2a31d522 ("get rid of fget_light()") introduced the
__fdget_pos() function, which returns the resulting file pointer and
fdput flags combined in an 'unsigned long'. However, it also changed the
behavior to return files with FMODE_PATH set, which shouldn't happen
because read(), write(), lseek(), etc. aren't allowed on such files.
This commit restores the old behavior.
This regression actually had no effect on read() and write() since
FMODE_READ and FMODE_WRITE are not set on file descriptors opened with
O_PATH, but it did cause lseek() on a file descriptor opened with O_PATH
to fail with ESPIPE rather than EBADF.
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Commit 9c225f2655 ("vfs: atomic f_pos accesses as per POSIX") changed
several system calls to use fdget_pos() instead of fdget(), but missed
sys_llseek(). Fix it.
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Only two patches this time, one to fix ethernet probe order on at91 (better
fix with proper device aliasing will be done for 3.15, this is stop-gap), and
one update to MAINTAINERS due to Freescale moving their repo to kernel.org.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTLRrzAAoJEIwa5zzehBx3RR8P/1IdwjlFC7ait4LxUvSfpPEw
laHatndu1IBc+SoUDlPqddvEZlbPGelSRRUXg/5RbRwedkZRvI1gLE3KmqbxCLW3
SmdnSbhZlFY55c4Qjf1EWbRvVs/eQWADBowxd7xHHH17Rr/nKNrRxXME0tSjtScm
uN0ZrhjJPfdRIo7pYI29CV7E1hy+9WKKEDO6mWUu7aC/R5n8L5fPthNiLiuDE4e4
rILenZAUhxhHGrn+NBy2LXm2QuhS2T38OYKDGm4wCRnzUJphN7dHAwH8xz+n8yTl
mRN3shGrzaPgNt1UR8cBrEWhUjFem7n8XBV7dgaGTD6LZvUbThLtY6g1PK2bOlHp
qnm99UHRPjMpQXg19ssbXuYOcRFyvDxBwCNYUwTbyGy5fb5XmpiTzyR4n0j2LFkU
Svv4H1UUaSeicnBZwheB6idY6fd5yGU6pMfdsWUgMS03Gbpt9v71DKwvM3ZJYTWh
A0ZSwOHdUf3wiEkxLkpVXA8kPBU2La17wEYV/US1M/ZCwk2t1UHmVAdtqpZZqng4
4uqkIBwJot5s/kbi+m5I/dx9elZYAhQU41bjVaUJab8rjqrADdTToRsLouEsjGfE
WuqGDhmYM4k+I88xxFXgA5LxcJHsX/bzFexwkUPCJmvJOz5qtUdU25NMw+fmNVhy
xeKODx1dWmajrFou0yjw
=Dh0X
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Only two patches this time, one to fix ethernet probe order on at91
(better fix with proper device aliasing will be done for 3.15, this is
stop-gap), and one update to MAINTAINERS due to Freescale moving their
repo to kernel.org"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: at91: fix network interface ordering for sama5d36
MAINTAINERS: update IMX kernel git tree
Pull drm fixes from Dave Airlie:
"Some final few intel fixes, all regressions, all stable cc, and one
exynos oops fixer.
The biggest is probably the intel display error irqs one, but it seems
to fix a few crashes on startup, and one use after free in drm core"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/exynos: Fix (more) freeing issues in exynos_drm_drv.c
drm/i915: Disable stolen memory when DMAR is active
Revert "drm/i915: don't touch the VDD when disabling the panel"
drm: Fix use-after-free in the shadow-attache exit code
drm/i915: Don't enable display error interrupts from the start
drm/i915: Fix scanline counter fixup on BDW
drm/i915: Add a workaround for HSW scanline counter weirdness
drm/i915: Fix PSR programming
Commit 7982e90c3a ("block: fix q->flush_rq NULL pointer crash on
dm-mpath flush") moved an allocation to blk_init_allocated_queue(), but
neglected to free that allocation on the error paths that follow.
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Srikar Dronamraju reports that commit b0c29f79ec ("futexes: Avoid
taking the hb->lock if there's nothing to wake up") causes java threads
getting stuck on futexes when runing specjbb on a power7 numa box.
The cause appears to be that the powerpc spinlocks aren't using the same
ticket lock model that we use on x86 (and other) architectures, which in
turn result in the "spin_is_locked()" test in hb_waiters_pending()
occasionally reporting an unlocked spinlock even when there are pending
waiters.
So this reinstates Davidlohr Bueso's original explicit waiter counting
code, which I had convinced Davidlohr to drop in favor of figuring out
the pending waiters by just using the existing state of the spinlock and
the wait queue.
Reported-and-tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Original-code-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
array index in the trace event format bogus. He supplied an elegant solution
that uses __stringify() and also removes the need for the event_storage
and event_storage_mutex and also cuts off a few K of overhead from
the trace events.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTK6cxAAoJEKQekfcNnQGu+3oIAIPGwsevXcVNlmqLwtsoUhu2
d0uMJYBpi+aQ/stenhkAThzY/5/O0SN2o+uyacSJKDo3WayF9fxGTvOHVbJhvmLF
YsX6oQLVmzqPrq7BGJTvglv4+RYf+HkV1MAb/1iacA7sFtd7jVpUiPvLlQ3CEwph
kNqdmoFT16iyE1snUviE0GmZmrdOqZBjwC1Ys+oSbaycRSFnvmDjYAGYot5tJfU5
gYgpkeJ8J3bxHOGzNCRgmpLQNR3P1HzailPQVi51We14FlzSwOTwuKDtpf8WwzXV
0fIEkdTU3+K62XxVw5/YQ5o/PpFKO/J5dSPjFe7PF2e6hCTTOABcK5foSSP1KSU=
=559y
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull trace fix from Steven Rostedt:
"Vaibhav Nagarnaik discovered that since 3.10 a clean-up patch made the
array index in the trace event format bogus.
He supplied an elegant solution that uses __stringify() and also
removes the need for the event_storage and event_storage_mutex and
also cuts off a few K of overhead from the trace events"
* tag 'trace-fixes-v3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix array size mismatch in format string
Add remove_linear_migration_ptes_from_nonlinear(), to fix an interesting
little include/linux/swapops.h:131 BUG_ON(!PageLocked) found by trinity:
indicating that remove_migration_ptes() failed to find one of the
migration entries that was temporarily inserted.
The problem comes from remap_file_pages()'s switch from vma_interval_tree
(good for inserting the migration entry) to i_mmap_nonlinear list (no good
for locating it again); but can only be a problem if the remap_file_pages()
range does not cover the whole of the vma (zap_pte() clears the range).
remove_migration_ptes() needs a file_nonlinear method to go down the
i_mmap_nonlinear list, applying linear location to look for migration
entries in those vmas too, just in case there was this race.
The file_nonlinear method does need rmap_walk_control.arg to do this;
but it never needed vma passed in - vma comes from its own iteration.
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesse Gross says:
====================
Open vSwitch
Four small fixes for net/3.14. I realize that these are late in the
cycle - just got back from vacation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>