Commit Graph

440862 Commits

Author SHA1 Message Date
Feng Wu 97ec8c067d KVM: Add SMAP support when setting CR4
This patch adds SMAP handling logic when setting CR4 for guests

Thanks a lot to Paolo Bonzini for his suggestion to use the branchless
way to detect SMAP violation.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:34 -03:00
Feng Wu 56d6efc2de KVM: Remove SMAP bit from CR4_RESERVED_BITS
This patch removes SMAP bit from CR4_RESERVED_BITS.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:33 -03:00
Daniel Borkmann 362d52040c Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
This reverts commit ef2820a735 ("net: sctp: Fix a_rwnd/rwnd management
to reflect real state of the receiver's buffer") as it introduced a
serious performance regression on SCTP over IPv4 and IPv6, though a not
as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs.

Current state:

[root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 17:56:21 GMT
Connecting to host 192.168.241.3, port 5201
      Cookie: Lab200slot2.1397238981.812898.548918
[  4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.09   sec  20.8 MBytes   161 Mbits/sec
[  4]   1.09-2.13   sec  10.8 MBytes  86.8 Mbits/sec
[  4]   2.13-3.15   sec  3.57 MBytes  29.5 Mbits/sec
[  4]   3.15-4.16   sec  4.33 MBytes  35.7 Mbits/sec
[  4]   4.16-6.21   sec  10.4 MBytes  42.7 Mbits/sec
[  4]   6.21-6.21   sec  0.00 Bytes    0.00 bits/sec
[  4]   6.21-7.35   sec  34.6 MBytes   253 Mbits/sec
[  4]   7.35-11.45  sec  22.0 MBytes  45.0 Mbits/sec
[  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
[  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
[  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
[  4]  11.45-12.51  sec  16.0 MBytes   126 Mbits/sec
[  4]  12.51-13.59  sec  20.3 MBytes   158 Mbits/sec
[  4]  13.59-14.65  sec  13.4 MBytes   107 Mbits/sec
[  4]  14.65-16.79  sec  33.3 MBytes   130 Mbits/sec
[  4]  16.79-16.79  sec  0.00 Bytes    0.00 bits/sec
[  4]  16.79-17.82  sec  5.94 MBytes  48.7 Mbits/sec
(etc)

[root@Lab200slot2 ~]#  iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 19:08:41 GMT
Connecting to host 2001:db8:0:f101::1, port 5201
      Cookie: Lab200slot2.1397243321.714295.2b3f7c
[  4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201
Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   169 MBytes  1.42 Gbits/sec
[  4]   1.00-2.00   sec   201 MBytes  1.69 Gbits/sec
[  4]   2.00-3.00   sec   188 MBytes  1.58 Gbits/sec
[  4]   3.00-4.00   sec   174 MBytes  1.46 Gbits/sec
[  4]   4.00-5.00   sec   165 MBytes  1.39 Gbits/sec
[  4]   5.00-6.00   sec   199 MBytes  1.67 Gbits/sec
[  4]   6.00-7.00   sec   163 MBytes  1.36 Gbits/sec
[  4]   7.00-8.00   sec   174 MBytes  1.46 Gbits/sec
[  4]   8.00-9.00   sec   193 MBytes  1.62 Gbits/sec
[  4]   9.00-10.00  sec   196 MBytes  1.65 Gbits/sec
[  4]  10.00-11.00  sec   157 MBytes  1.31 Gbits/sec
[  4]  11.00-12.00  sec   175 MBytes  1.47 Gbits/sec
[  4]  12.00-13.00  sec   192 MBytes  1.61 Gbits/sec
[  4]  13.00-14.00  sec   199 MBytes  1.67 Gbits/sec
(etc)

After patch:

[root@Lab200slot2 ~]#  iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64
Time: Mon, 14 Apr 2014 16:40:48 GMT
Connecting to host 192.168.240.3, port 5201
      Cookie: Lab200slot2.1397493648.413274.65e131
[  4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   240 MBytes  2.02 Gbits/sec
[  4]   1.00-2.00   sec   239 MBytes  2.01 Gbits/sec
[  4]   2.00-3.00   sec   240 MBytes  2.01 Gbits/sec
[  4]   3.00-4.00   sec   239 MBytes  2.00 Gbits/sec
[  4]   4.00-5.00   sec   245 MBytes  2.05 Gbits/sec
[  4]   5.00-6.00   sec   240 MBytes  2.01 Gbits/sec
[  4]   6.00-7.00   sec   240 MBytes  2.02 Gbits/sec
[  4]   7.00-8.00   sec   239 MBytes  2.01 Gbits/sec

With the reverted patch applied, the SCTP/IPv4 performance is back
to normal on latest upstream for IPv4 and IPv6 and has same throughput
as 3.4.2 test kernel, steady and interval reports are smooth again.

Fixes: ef2820a735 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer")
Reported-by: Peter Butler <pbutler@sonusnet.com>
Reported-by: Dongsheng Song <dongsheng.song@gmail.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Peter Butler <pbutler@sonusnet.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 16:26:48 -04:00
Steve Wise bfae232499 cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
Hardware needs the local device mac address to support hw loopback for
rdma loopback connections.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 16:26:48 -04:00
Daniel Borkmann 8c482cdc35 net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has
been wrongly decoded by commit a8fc927780 ("sk-filter: Add ability to
get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS
although it should have been decoded as BPF_LD|BPF_W|BPF_ABS.

In practice, this should not have much side-effect though, as such
conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse
operation PR_GET_SECCOMP will only return the current seccomp mode, but
not the filter itself. Since the transition to the new BPF infrastructure,
it's also not used anymore, so we can simply remove this as it's
unreachable.

Fixes: a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 16:26:47 -04:00
Daniel Borkmann 2eac764832 seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
Linus reports that on 32-bit x86 Chromium throws the following seccomp
resp. audit log messages:

  audit: type=1326 audit(1397359304.356:28108): auid=500 uid=500
gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0
syscall=172 compat=0 ip=0xb2dd9852 code=0x30000

  audit: type=1326 audit(1397359304.356:28109): auid=500 uid=500
gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=5
compat=0 ip=0xb2dd9852 code=0x50000

These audit messages are being triggered via audit_seccomp() through
__secure_computing() in seccomp mode (BPF) filter with seccomp return
codes 0x30000 (== SECCOMP_RET_TRAP) and 0x50000 (== SECCOMP_RET_ERRNO)
during filter runtime. Moreover, Linus reports that x86_64 Chromium
seems fine.

The underlying issue that explains this is that the implementation of
populate_seccomp_data() is wrong. Our seccomp data structure sd that
is being shared with user ABI is:

  struct seccomp_data {
    int nr;
    __u32 arch;
    __u64 instruction_pointer;
    __u64 args[6];
  };

Therefore, a simple cast to 'unsigned long *' for storing the value of
the syscall argument via syscall_get_arguments() is just wrong as on
32-bit x86 (or any other 32bit arch), it would result in storing a0-a5
at wrong offsets in args[] member, and thus i) could leak stack memory
to user space and ii) tampers with the logic of seccomp BPF programs
that read out and check for syscall arguments:

  syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);

Tested on 32-bit x86 with Google Chrome, unfortunately only via remote
test machine through slow ssh X forwarding, but it fixes the issue on
my side. So fix it up by storing args in type correct variables, gcc
is clever and optimizes the copy away in other cases, e.g. x86_64.

Fixes: bd4cf0ed33 ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 16:26:47 -04:00
David S. Miller 14ed4a5bcb Merge branch 'qlcnic'
Shahed Shaikh says:

====================
qlcnic: Bug fixes

This patch series contains following bug fixes -

* Send INIT_NIC_FUNC mailbox command as first mailbox
* Fix a panic because of uninitialized delayed_work.
* Fix inconsistent calculation of max rings count.
* Fix PVID configuration issue. Driver needs to clear older
  PVID before adding new one.
* Fix QLogic application/driver interface by packing vNIC information
  array.
* Fix a crash when user tries to disable SR-IOV while VFs are
  still assigned to VMs.

Please apply to net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:58 -04:00
Manish Chopra 696f1943a1 qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
o While disabling SR-IOV when VFs are assigned to VMs causes host crash
  so return -EPERM when user request to disable SR-IOV using pci sysfs in
  case of VFs are assigned to VMs.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:53 -04:00
Jitendra Kalsaria 4f03022777 qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
o Application expect vNIC number as the array index but driver interface
return configuration in array index form.

o Pack the vNIC information array in the buffer such that application can
access it using vNIC number as the array index.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:52 -04:00
Jitendra Kalsaria a78b6da89f qlcnic: Fix PVID configuration on eSwitch port.
Clear older PVID before adding a newer PVID to the eSwicth port

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:52 -04:00
Shahed Shaikh 7b546842b1 qlcnic: Fix max ring count calculation
Do not read max rings count from qlcnic_get_nic_info(). Use driver defined
values for 82xx adapters. In case of 83xx adapters, use minimum of firmware
provided and driver defined values.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:52 -04:00
Sucheta Chakraborty 4d52e1e8d1 qlcnic: Fix to send INIT_NIC_FUNC as first mailbox.
o INIT_NIC_FUNC should be first mailbox sent. Sending DCB capability and
  parameter query commands after that command.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:52 -04:00
Sucheta Chakraborty 463518a0cb qlcnic: Fix panic due to uninitialzed delayed_work struct in use.
o AEN event was being received before initializing delayed_work struct
  and handlers for it. This was resulting in crash. This patch fixes it.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:43:52 -04:00
David S. Miller 677df2f4ae Merge branch 'be2net'
Sathya Perla says:

====================
be2net: patch set

Patch 1/2 is a v2 of a patch that was submitted earlier (as a part of a
different patch-set). v2 incorporates a suggestion given by David Laight
for how long to poll for pending TX completions while disabling a device.

Patch 2/2 fixes a crash in be_remove()->be_close()
path after be2net has aborted an EEH error recovery
due to a permanant failure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:41:42 -04:00
Kalesh AP e1ad8e33d2 be2net: Fix invocation of be_close() after be_clear()
In the EEH error recovery path, when a permanent failure occurs,
we clean up adapter structure (i.e. destroy queues etc) by calling
be_clear() and return PCI_ERS_RESULT_DISCONNECT.
After this the stack tries to remove device from bus and calls
be_remove() which invokes netdev_unregister()->be_close().
be_close() operating on destroyed queues results in a
NULL dereference.

This patch fixes this problem by introducing a flag to keep track
of the setup state.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:41:37 -04:00
Vasundhara Volam 1a3d0717f6 be2net: Fix to reap TX compls till HW doesn't respond for some time
be_close() currently waits for a max of 200ms to receive all pending
TX compls. This timeout value was roughly calculated based on 10G
transmission speeds and the TX queue depth. This timeout may not be
enough when the link is operating at lower speeds or in multi-channel/SR-IOV
configs with TX-rate limiting setting.

It is hard to calculate a "proper timeout value" that works in all
configurations.  This patch solves this problem by continuing to reap
TX completions till the HW is completely silent for a period of 10ms or
a HW error is detected.

v2: implements the new scheme (as suggested by David Laight) instead of
just waiting longer than 200ms for reaping all completions.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:41:37 -04:00
Amir Vadai e1a5ddc506 net/mlx4_core: Defer VF initialization till PF is fully initialized
Fix in commit [1] is not sufficient since a deferred VF initialization
could happen after pci_enable_sriov() is finished, but before the PF is
fully initialized.
Need to prevent VFs from initializing till the PF is fully ready and
comm channel is operational.

[1] - 9798935 "net/mlx4_core: mlx4_init_slave() shouldn't access comm
      channel before PF is ready"

CC: Stuart Hayes <Stuart_Hayes@Dell.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 13:24:42 -04:00
Daniel J Blueman 77d149c4eb bnx2: Don't build unused suspend/resume functions not enabled
When CONFIG_PM_SLEEP isn't enabled, bnx2_suspend/resume are unused; don't
build them when they aren't used.

Signed-off-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 12:40:00 -04:00
Eric Dumazet 30f78d8ebf ipv6: Limit mtu to 65575 bytes
Francois reported that setting big mtu on loopback device could prevent
tcp sessions making progress.

We do not support (yet ?) IPv6 Jumbograms and cook corrupted packets.

We must limit the IPv6 MTU to (65535 + 40) bytes in theory.

Tested:

ifconfig lo mtu 70000
netperf -H ::1

Before patch : Throughput :   0.05 Mbits

After patch : Throughput : 35484 Mbits

Reported-by: Francois WELLENREITER <f.wellenreiter@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 12:39:59 -04:00
Patrick McHardy b855d416dc netfilter: nf_tables: fix nft_cmp_fast failure on big endian for size < 4
nft_cmp_fast is used for equality comparisions of size <= 4. For
comparisions of size < 4 byte a mask is calculated that is applied to
both the data from userspace (during initialization) and the register
value (during runtime). Both values are stored using (in effect) memcpy
to a memory area that is then interpreted as u32 by nft_cmp_fast.

This works fine on little endian since smaller types have the same base
address, however on big endian this is not true and the smaller types
are interpreted as a big number with trailing zero bytes.

The mask therefore must not include the lower bytes, but the higher bytes
on big endian. Add a helper function that does a cpu_to_le32 to switch
the bytes on big endian. Since we're dealing with a mask of just consequitive
bits, this works out fine.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-14 10:38:02 +02:00
Andrey Vagin ee214d54bf netfilter: nf_conntrack: initialize net.ct.generation
[  251.920788] INFO: trying to register non-static key.
[  251.921386] the code is fine but needs lockdep annotation.
[  251.921386] turning off the locking correctness validator.
[  251.921386] CPU: 2 PID: 15715 Comm: socket_listen Not tainted 3.14.0+ #294
[  251.921386] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  251.921386]  0000000000000000 000000009d18c210 ffff880075f039b8 ffffffff816b7ecd
[  251.921386]  ffffffff822c3b10 ffff880075f039c8 ffffffff816b36f4 ffff880075f03aa0
[  251.921386]  ffffffff810c65ff ffffffff810c4a85 00000000fffffe01 ffffffffa0075172
[  251.921386] Call Trace:
[  251.921386]  [<ffffffff816b7ecd>] dump_stack+0x45/0x56
[  251.921386]  [<ffffffff816b36f4>] register_lock_class.part.24+0x38/0x3c
[  251.921386]  [<ffffffff810c65ff>] __lock_acquire+0x168f/0x1b40
[  251.921386]  [<ffffffff810c4a85>] ? trace_hardirqs_on_caller+0x105/0x1d0
[  251.921386]  [<ffffffffa0075172>] ? nf_nat_setup_info+0x252/0x3a0 [nf_nat]
[  251.921386]  [<ffffffff816c1215>] ? _raw_spin_unlock_bh+0x35/0x40
[  251.921386]  [<ffffffffa0075172>] ? nf_nat_setup_info+0x252/0x3a0 [nf_nat]
[  251.921386]  [<ffffffff810c7272>] lock_acquire+0xa2/0x120
[  251.921386]  [<ffffffffa008ab90>] ? ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
[  251.921386]  [<ffffffffa0055989>] __nf_conntrack_confirm+0x129/0x410 [nf_conntrack]
[  251.921386]  [<ffffffffa008ab90>] ? ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
[  251.921386]  [<ffffffffa008ab90>] ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
[  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
[  251.921386]  [<ffffffff815d8c5a>] nf_iterate+0xaa/0xc0
[  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
[  251.921386]  [<ffffffff815d8d14>] nf_hook_slow+0xa4/0x190
[  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
[  251.921386]  [<ffffffff815e98f2>] ip_output+0x92/0x100
[  251.921386]  [<ffffffff815e8df9>] ip_local_out+0x29/0x90
[  251.921386]  [<ffffffff815e9240>] ip_queue_xmit+0x170/0x4c0
[  251.921386]  [<ffffffff815e90d5>] ? ip_queue_xmit+0x5/0x4c0
[  251.921386]  [<ffffffff81601208>] tcp_transmit_skb+0x498/0x960
[  251.921386]  [<ffffffff81602d82>] tcp_connect+0x812/0x960
[  251.921386]  [<ffffffff810e3dc5>] ? ktime_get_real+0x25/0x70
[  251.921386]  [<ffffffff8159ea2a>] ? secure_tcp_sequence_number+0x6a/0xc0
[  251.921386]  [<ffffffff81606f57>] tcp_v4_connect+0x317/0x470
[  251.921386]  [<ffffffff8161f645>] __inet_stream_connect+0xb5/0x330
[  251.921386]  [<ffffffff8158dfc3>] ? lock_sock_nested+0x33/0xa0
[  251.921386]  [<ffffffff810c4b5d>] ? trace_hardirqs_on+0xd/0x10
[  251.921386]  [<ffffffff81078885>] ? __local_bh_enable_ip+0x75/0xe0
[  251.921386]  [<ffffffff8161f8f8>] inet_stream_connect+0x38/0x50
[  251.921386]  [<ffffffff8158b157>] SYSC_connect+0xe7/0x120
[  251.921386]  [<ffffffff810e3789>] ? current_kernel_time+0x69/0xd0
[  251.921386]  [<ffffffff810c4a85>] ? trace_hardirqs_on_caller+0x105/0x1d0
[  251.921386]  [<ffffffff810c4b5d>] ? trace_hardirqs_on+0xd/0x10
[  251.921386]  [<ffffffff8158c36e>] SyS_connect+0xe/0x10
[  251.921386]  [<ffffffff816caf69>] system_call_fastpath+0x16/0x1b
[  312.014104] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=60003 jiffies, g=42359, c=42358, q=333)
[  312.015097] INFO: Stall ended before state dump start

Fixes: 93bb0ceb75 ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-14 10:35:28 +02:00
Mathias Krause 05ab8f2647 filter: prevent nla extensions to peek beyond the end of the message
The BPF_S_ANC_NLATTR and BPF_S_ANC_NLATTR_NEST extensions fail to check
for a minimal message length before testing the supplied offset to be
within the bounds of the message. This allows the subtraction of the nla
header to underflow and therefore -- as the data type is unsigned --
allowing far to big offset and length values for the search of the
netlink attribute.

The remainder calculation for the BPF_S_ANC_NLATTR_NEST extension is
also wrong. It has the minuend and subtrahend mixed up, therefore
calculates a huge length value, allowing to overrun the end of the
message while looking for the netlink attribute.

The following three BPF snippets will trigger the bugs when attached to
a UNIX datagram socket and parsing a message with length 1, 2 or 3.

 ,-[ PoC for missing size check in BPF_S_ANC_NLATTR ]--
 | ld	#0x87654321
 | ldx	#42
 | ld	#nla
 | ret	a
 `---

 ,-[ PoC for the same bug in BPF_S_ANC_NLATTR_NEST ]--
 | ld	#0x87654321
 | ldx	#42
 | ld	#nlan
 | ret	a
 `---

 ,-[ PoC for wrong remainder calculation in BPF_S_ANC_NLATTR_NEST ]--
 | ; (needs a fake netlink header at offset 0)
 | ld	#0
 | ldx	#42
 | ld	#nlan
 | ret	a
 `---

Fix the first issue by ensuring the message length fulfills the minimal
size constrains of a nla header. Fix the second bug by getting the math
for the remainder calculation right.

Fixes: 4738c1db15 ("[SKFILTER]: Add SKF_ADF_NLATTR instruction")
Fixes: d214c7537b ("filter: add SKF_AD_NLATTR_NEST to look for nested..")
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13 23:31:55 -04:00
Julian Anastasov 91146153da ipv4: return valid RTA_IIF on ip route get
Extend commit 13378cad02
("ipv4: Change rt->rt_iif encoding.") from 3.6 to return valid
RTA_IIF on 'ip route get ... iif DEVICE' instead of rt_iif 0
which is displayed as 'iif *'.

inet_iif is not appropriate to use because skb_iif is not set.
Use the skb->dev->ifindex instead.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13 23:23:41 -04:00
Thomas Petazzoni cc6ca3023f Revert "net: mvneta: fix usage as a module on RGMII configurations"
This reverts commit e3a8786c10. While
this commit allows to use the mvneta driver as a module on some
configurations, it breaks other configurations even if mvneta is used
built-in.

This breakage is due to the fact that on some RGMII platforms, the PCS
bit has to be set, and on some other platforms, it has to be
cleared. At the moment, we lack informations to know exactly the
significance of this bit (the datasheet only says "enables PCS"), and
so we can't produce a patch that will work on all platforms at this
point. And since this change is breaking the network completely for
many users, it's much better to revert it for now. We'll come back
later with a proper fix that takes into account all platforms.

Basically:

 * Armada XP GP is configured as RGMII-ID, and needs the PCS bit to be
   set.
 * Armada 370 Mirabox is configured as RGMII-ID, and needs the PCS bit
   to be cleared.

And at the moment, we don't know how to make the distinction between
those two cases. One hint is that the Armada XP GP appears in fact to
be using a QSGMII connection with the PHY (Quad-SGMII), but
configuring it as SGMII doesn't work, while RGMII-ID works. This needs
more investigation, but in the mean time, let's unbreak the network
for all those users.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reported-by: Arnaud Ebalard <arno@natisbad.org>
Reported-by: Alexander Reuter <Alexander.Reuter@gmx.net>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=73401
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13 23:13:13 -04:00
Wei Yang befdf8978a net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()
pci_match_id() just match the static pci_device_id, which may return NULL if
someone binds the driver to a device manually using
/sys/bus/pci/drivers/.../new_id.

This patch wrap up a helper function __mlx4_remove_one() which does the tear
down function but preserve the drv_data. Functions like
mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out
releasing drvdata.

Fixes: 97a5221 "net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset".

CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Amir Vadai <amirv@mellanox.com>
CC: Jack Morgenstein <jackm@dev.mellanox.co.il>
CC: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13 23:12:15 -04:00
Wang, Xiaoming b04c461902 net: ipv4: current group_info should be put after using.
Plug a group_info refcount leak in ping_init.
group_info is only needed during initialization and
the code failed to release the reference on exit.
While here move grabbing the reference to a place
where it is actually needed.

Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com>
Signed-off-by: xiaoming wang <xiaoming.wang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13 22:52:58 -04:00
Linus Torvalds c9eaa447e7 Linux 3.15-rc1 2014-04-13 14:18:35 -07:00
Geert Uytterhoeven f7c1d07420 mm: Initialize error in shmem_file_aio_read()
Some versions of gcc even warn about it:

  mm/shmem.c: In function ‘shmem_file_aio_read’:
  mm/shmem.c:1414: warning: ‘error’ may be used uninitialized in this function

If the loop is aborted during the first iteration by one of the two
first break statements, error will be uninitialized.

Introduced by commit 6e58e79db8 ("introduce copy_page_to_iter, kill
loop over iovec in generic_file_aio_read()").

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-13 14:10:26 -07:00
Geert Uytterhoeven e686bd8dc5 cifs: Use min_t() when comparing "size_t" and "unsigned long"
On 32 bit, size_t is "unsigned int", not "unsigned long", causing the
following warning when comparing with PAGE_SIZE, which is always "unsigned
long":

  fs/cifs/file.c: In function ‘cifs_readdata_to_iov’:
  fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a cast

Introduced by commit 7f25bba819 ("cifs_iovec_read: keep iov_iter
between the calls of cifs_readdata_to_iov()"), which changed the
signedness of "remaining" and the code from min_t() to min().

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-13 14:10:26 -07:00
Linus Torvalds bf3a340738 Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
Pull slab changes from Pekka Enberg:
 "The biggest change is byte-sized freelist indices which reduces slab
  freelist memory usage:

    https://lkml.org/lkml/2013/12/2/64"

* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  mm: slab/slub: use page->list consistently instead of page->lru
  mm/slab.c: cleanup outdated comments and unify variables naming
  slab: fix wrongly used macro
  slub: fix high order page allocation problem with __GFP_NOFAIL
  slab: Make allocations with GFP_ZERO slightly more efficient
  slab: make more slab management structure off the slab
  slab: introduce byte sized index for the freelist of a slab
  slab: restrict the number of objects in a slab
  slab: introduce helper functions to get/set free object
  slab: factor out calculate nr objects in cache_estimate
2014-04-13 13:28:13 -07:00
Linus Torvalds 321d03c867 Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull misc kbuild changes from Michal Marek:
 "Here is the non-critical part of kbuild:
   - One bogus coccinelle check removed, one check fixed not to suggest
     the obsolete PTR_RET macro
   - scripts/tags.sh does not index the generated *.mod.c files
   - new objdiff tool to list differences between two versions of an
     object file
   - A fix for scripts/bootgraph.pl"

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  scripts/coccinelle: Use PTR_ERR_OR_ZERO
  scripts/bootgraph.pl: Add graphic header
  scripts: objdiff: detect object code changes between two commits
  Coccicheck: Remove memcpy to struct assignment test
  scripts/tags.sh: Ignore *.mod.c
2014-04-12 18:22:27 -07:00
Mikulas Patocka fd1232b214 sym53c8xx_2: Set DID_REQUEUE return code when aborting squeue
This patch fixes I/O errors with the sym53c8xx_2 driver when the disk
returns QUEUE FULL status.

When the controller encounters an error (including QUEUE FULL or BUSY
status), it aborts all not yet submitted requests in the function
sym_dequeue_from_squeue.

This function aborts them with DID_SOFT_ERROR.

If the disk has full tag queue, the request that caused the overflow is
aborted with QUEUE FULL status (and the scsi midlayer properly retries
it until it is accepted by the disk), but the sym53c8xx_2 driver aborts
the following requests with DID_SOFT_ERROR --- for them, the midlayer
does just a few retries and then signals the error up to sd.

The result is that disk returning QUEUE FULL causes request failures.

The error was reproduced on 53c895 with COMPAQ BD03685A24 disk
(rebranded ST336607LC) with command queue 48 or 64 tags.  The disk has
64 tags, but under some access patterns it return QUEUE FULL when there
are less than 64 pending tags.  The SCSI specification allows returning
QUEUE FULL anytime and it is up to the host to retry.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12 18:02:16 -07:00
Paul Mackerras 18aa0da33e powerpc: Don't try to set LPCR unless we're in hypervisor mode
Commit 8f619b5429 ("powerpc/ppc64: Do not turn AIL (reloc-on
interrupts) too early") added code to set the AIL bit in the LPCR
without checking whether the kernel is running in hypervisor mode.  The
result is that when the kernel is running as a guest (i.e., under
PowerKVM or PowerVM), the processor takes a privileged instruction
interrupt at that point, causing a panic.  The visible result is that
the kernel hangs after printing "returning from prom_init".

This fixes it by checking for hypervisor mode being available before
setting LPCR.  If we are not in hypervisor mode, we enable relocation-on
interrupts later in pSeries_setup_arch using the H_SET_MODE hcall.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12 17:58:48 -07:00
Davidlohr Bueso d7e8af1afe futex: update documentation for ordering guarantees
Commits 11d4616bd0 ("futex: revert back to the explicit waiter
counting code") and 69cd9eba38 ("futex: avoid race between requeue and
wake") changed some of the finer details of how we think about futexes.
One was a late fix and the other a consequence of overlooking the whole
requeuing logic.

The first change caused our documentation to be incorrect, and the
second made us aware that we need to explicitly add more details to it.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12 17:57:51 -07:00
Linus Torvalds 454fd351f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull yet more networking updates from David Miller:

 1) Various fixes to the new Redpine Signals wireless driver, from
    Fariya Fatima.

 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from
    Dmitry Petukhov.

 3) UFO and TSO packets differ in whether they include the protocol
    header in gso_size, account for that in skb_gso_transport_seglen().
   From Florian Westphal.

 4) If VLAN untagging fails, we double free the SKB in the bridging
    output path.  From Toshiaki Makita.

 5) Several call sites of sk->sk_data_ready() were referencing an SKB
    just added to the socket receive queue in order to calculate the
    second argument via skb->len.  This is dangerous because the moment
    the skb is added to the receive queue it can be consumed in another
    context and freed up.

    It turns out also that none of the sk->sk_data_ready()
    implementations even care about this second argument.

    So just kill it off and thus fix all these use-after-free bugs as a
    side effect.

 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti.

 7) pktgen needs to do locking properly for LLTX devices, from Daniel
    Borkmann.

 8) xen-netfront driver initializes TX array entries in RX loop :-) From
    Vincenzo Maffione.

 9) After refactoring, some tunnel drivers allow a tunnel to be
    configured on top itself.  Fix from Nicolas Dichtel.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
  vti: don't allow to add the same tunnel twice
  gre: don't allow to add the same tunnel twice
  drivers: net: xen-netfront: fix array initialization bug
  pktgen: be friendly to LLTX devices
  r8152: check RTL8152_UNPLUG
  net: sun4i-emac: add promiscuous support
  net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
  net: ipv6: Fix oif in TCP SYN+ACK route lookup.
  drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts
  drivers: net: cpsw: discard all packets received when interface is down
  net: Fix use after free by removing length arg from sk_data_ready callbacks.
  Drivers: net: hyperv: Address UDP checksum issues
  Drivers: net: hyperv: Negotiate suitable ndis version for offload support
  Drivers: net: hyperv: Allocate memory for all possible per-pecket information
  bridge: Fix double free and memory leak around br_allowed_ingress
  bonding: Remove debug_fs files when module init fails
  i40evf: program RSS LUT correctly
  i40evf: remove open-coded skb_cow_head
  ixgb: remove open-coded skb_cow_head
  igbvf: remove open-coded skb_cow_head
  ...
2014-04-12 17:31:22 -07:00
Linus Torvalds fd18f00dd9 blackfin updates for Linux 3.15
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJTSI28AAoJEJommM3PjknH4aEP/iw75wO2c9L0tY4GxKiHBEwq
 koyXb9alWR1wlHRx6ja5v1V4I6zXk2Ft8MfxoGbze0XNY3Ih4ErUqZMFgt2oiJen
 lfilY2F0oNrcAxHn/Q5VQnKLa8P3qrPgzzIu9zkPMYmfyMHn2bGXesmQ/JITod8J
 ljcciWH4PvYU27aHZJRCUhcsRfBVa5XaWNd5sjRyBGmkuRrgbxy5spJ8ORukElrR
 yzRrd4qPSRNY7jHt39FYJDk3mAU5TZZZKGk++l0dpTUuwkS65TqsoqWtbb88MNR+
 JAaZ5AEAxC0NWDYDDOP/UPi89K56OphqwwgK9sgLXcVX1iGhl/VKT5isfli3pqA8
 aSNDOrYsdOx2a1te6H1sU6aoTq7X9Eb0NHlBYGkwIYnCF20m5rLy6TuFmupLPtD0
 sdYF3YIUWgqyomcBJWV0+kqr9jqB85w2QQvHBFHkirAoZ3tgtWiXF75bh6zOuquB
 ADQ0ZRYcDDIJ0IA3unKIi0JyEWKySYzuWQgc8rfMbKvYKQHHkiEBsALjvwusJ3Zj
 tEmctpVBb1mxyRi90m+IJjx63PyJpUYDRaV1f7/E4Q3kyNQ2k6sD22nT512Sl6lA
 y3OzfO72hr59DhuG8FcPrAb7nYr7BO8CWgotPfnzt+EaK/0mNa2AZPIkJ9vGXRSF
 26X31FJxjN/xbsELC+TP
 =xfcQ
 -----END PGP SIGNATURE-----

Merge tag 'blackfin-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/realmz6/blackfin-linux

Pull blackfin updates from Steven Miao:
 "Code cleanup, some previously ignored patches, and bug fixes"

* tag 'blackfin-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/realmz6/blackfin-linux:
  blackfin: cleanup board files
  bf609: clock: drop unused clock bit set/clear functions
  Blackfin: bf537: rename "CONFIG_ADT75"
  Blackfin: bf537: rename "CONFIG_AD7314"
  Blackfin: bf537: rename ad2s120x ->ad2s1200
  blackfin: bf537: fix typo "CONFIG_SND_SOC_ADV80X_MODULE"
  blackfin: dma: current count mmr is read only
  bfin_crc: Move architecture independant crc header file out of the blackfin folder.
  bf54x: drop unuesd HOST status,control,timeout registers bit define macros
  blackfin: portmux: cleanup head file
  Blackfin: remove "config IP_CHECKSUM_L1"
  blackfin: Remove GENERIC_GPIO config option again
  blackfin:Use generic /proc/interrupts implementation
  blackfin: bf60x: fix typo "CONFIG_PM_BFIN_WAKE_PA15_POL"
2014-04-12 17:26:45 -07:00
Linus Torvalds de0c9cf96a Several remoteproc cleanup patches coming from Jingoo Han, Julia Lawall and Uwe Kleine-König.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJTR+rtAAoJELLolMlTRIoMIccQAMPvi0aYD/BvqGFm/7540Iqx
 A8RfG93V223V6qs79eqmSg+OjmowK/m9GNLblvNupzG1a5lOyPaMWyhCLP+3BbpT
 wHuFZJVb35jSn+pHoRDKPsu2a0hf1VTFpGToJKEsWKDW12zQfUlNrDpWxiEoqilO
 JBCyE/2/WwYvdceWhUnfbWJlABI6Y4t8yc99UlH77fjApu6B4jg6YVLEydgIKgoG
 rZJkae67WK6xHp4ORSB4VCwEs4sV3Ze7OED4a5VA/2Vr+IeGnzjmzADLJFiOjQOA
 4Wh/dI7wayDTJC7O2b8pOtmG+fIk0Cn3JEybMi3eDx/SenuCUs2gSQBaB0qh+pNw
 1Vc/2xjD/ro9HgLBpTk2+ioUZb6N5ZQuhGMZsLrJNM/ebNcuyznCct0cuhUKDFyW
 DcSZ7TIrl49lhXxyGhhKhLjDrqJ4Wq93/Tp9NGG2EIeePqix9A1oukDOgSTRRe7s
 +Gd12CQw0bF/z7jrvPYK1/TG+Per8AfnAf9xmB5KidkGFB+dx+wFAPKNyZAsPITX
 XuANJ0KVsO1XN/jJTa2Qo3XDcKTTb4setZZQZv08LHWk3ArzS+g6jVOfNhWf3tZ0
 I/c6UrymgfVIzjGe3UPJfA8CqZB0p4qdKe1Qt0pu17HpUmv5x1iy/XuUACfGzA/S
 PtnTN3jbuuy+vLRWBn5S
 =2+pm
 -----END PGP SIGNATURE-----

Merge tag 'remoteproc-3.15-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remoteproc

Pull remoteproc cleanups from Ohad Ben-Cohen:
 "Several remoteproc cleanup patches coming from Jingoo Han, Julia
  Lawall and Uwe Kleine-König"

* tag 'remoteproc-3.15-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remoteproc:
  remoteproc/ste_modem: staticize local symbols
  remoteproc/davinci: simplify use of devm_ioremap_resource
  remoteproc/davinci: drop needless devm_clk_put
2014-04-12 17:23:12 -07:00
Linus Torvalds 09c9b61d5d LLVMLinux Patches for v3.15
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iEYEABECAAYFAlNFtEkACgkQuseO5dulBZVcPwCdFWY81hKqQaKHaSPFh9m+n1lt
 yY0An2VZZGrFSkj162POFy1P2sPpGw5p
 =LRFZ
 -----END PGP SIGNATURE-----

Merge tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel

Pull llvm patches from Behan Webster:
 "These are some initial updates to support compiling the kernel with
  clang.

  These patches have been through the proper reviews to the best of my
  ability, and have been soaking in linux-next for a few weeks.  These
  patches by themselves still do not completely allow clang to be used
  with the kernel code, but lay the foundation for other patches which
  are still under review.

  Several other of the LLVMLinux patches have been already added via
  maintainer trees"

* tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel:
  x86: LLVMLinux: Fix "incomplete type const struct x86cpu_device_id"
  x86 kbuild: LLVMLinux: More cc-options added for clang
  x86, acpi: LLVMLinux: Remove nested functions from Thinkpad ACPI
  LLVMLinux: Add support for clang to compiler.h and new compiler-clang.h
  LLVMLinux: Remove warning about returning an uninitialized variable
  kbuild: LLVMLinux: Fix LINUX_COMPILER definition script for compilation with clang
  Documentation: LLVMLinux: Update Documentation/dontdiff
  kbuild: LLVMLinux: Adapt warnings for compilation with clang
  kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang
2014-04-12 17:00:40 -07:00
Linus Torvalds 141eaccd01 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the target pending updates for v3.15-rc1.  Apologies in
  advance for waiting until the second to last day of the merge window
  to send these out.

  The highlights this round include:

   - iser-target support for T10 PI (DIF) offloads (Sagi + Or)
   - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
   - Pass in transport supported PI at session initialization (Sagi + MKP + nab)
   - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
   - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
   - Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
   - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)

  Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
  metadata into KVM guest have been left-out for now, as there where a
  few comments from MST + Paolo that where not able to be addressed in
  time for v3.15.  Please expect this feature for v3.16-rc1"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
  ib_srpt: Use correct ib_sg_dma primitives
  target/tcm_fc: Rename ft_tport_create to ft_tport_get
  target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
  target/tcm_fc: Rename structs and list members for clarity
  target/tcm_fc: Limit to 1 TPG per wwn
  target/tcm_fc: Don't export ft_lport_list
  target/tcm_fc: Fix use-after-free of ft_tpg
  target: Add check to prevent Abort Task from aborting itself
  target: Enable READ_STRIP emulation in target_complete_ok_work
  target/sbc: Add sbc_dif_read_strip software emulation
  target: Enable WRITE_INSERT emulation in target_execute_cmd
  target/sbc: Add sbc_dif_generate software emulation
  target/sbc: Only expose PI read_cap16 bits when supported by fabric
  target/spc: Only expose PI mode page bits when supported by fabric
  target/spc: Only expose PI inquiry bits when supported by fabric
  target: Pass in transport supported PI at session initialization
  target/iblock: Fix double bioset_integrity_free bug
  Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
  target/rd: T10-Dif: RAM disk is allocating more space than required.
  iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
  ...
2014-04-12 16:51:08 -07:00
Linus Torvalds 9309444906 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "A series of bug fix patches for v3.15-rc1.  Most are just driver
  fixes.  There are some changes at remote controller core level, fixing
  some definitions on a new API added for Kernel v3.15.

  It also adds the missing include at include/uapi/linux/v4l2-common.h,
  to allow its compilation on userspace, as pointed by you"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (24 commits)
  [media] gpsca: remove the risk of a division by zero
  [media] stk1160: warrant a NUL terminated string
  [media] v4l: ti-vpe: retain v4l2_buffer flags for captured buffers
  [media] v4l: ti-vpe: Set correct field parameter for output and capture buffers
  [media] v4l: ti-vpe: zero out reserved fields in try_fmt
  [media] v4l: ti-vpe: Fix initial configuration queue data
  [media] v4l: ti-vpe: Use correct bus_info name for the device in querycap
  [media] v4l: ti-vpe: report correct capabilities in querycap
  [media] v4l: ti-vpe: Allow usage of smaller images
  [media] v4l: ti-vpe: Use video_device_release_empty
  [media] v4l: ti-vpe: Make sure in job_ready that we have the needed number of dst_bufs
  [media] lgdt3305: include sleep functionality in lgdt3304_ops
  [media] drx-j: use customise option correctly
  [media] m88rs2000: fix sparse static warnings
  [media] r820t: fix size and init values
  [media] rc-core: remove generic scancode filter
  [media] rc-core: split dev->s_filter
  [media] rc-core: do not change 32bit NEC scancode format for now
  [media] rtl28xxu: remove duplicate ID 0458:707f Genius TVGo DVB-T03
  [media] xc2028: add missing break to switch
  ...
2014-04-12 16:18:17 -07:00
Linus Torvalds 07f5fef981 NTB driver bug fixes to address issues in list traversal, skb leak in
ntb_netdev, a typo, and a leak of msix entries in the error path.
 Clean ups of the event handling logic, as well as a overall style
 cleanup.  Finally, the driver was converted to use the new
 pci_enable_msix_range logic (and the refactoring to go along with it).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJTQxbAAAoJEG5mS6x6i9Ijq7oP/0pYUCGKYv3ElQuyatPnDI8q
 rSi93Q7WVYpv92y+uhal/TldVR5LvKKnPBf700TwUpj752X1XhlcFCw2TE9kHH1c
 VRicAZypLl4+5ob5QhFD7mRl+gEbq3JHLwSPffM26V5dm3mMqm24PRQPm9RN2uzO
 MRL/WEgShGM1pJQhIx+Ti+35/PxV67dw/rysvoUtPIuB8DfoIU7UBYEJ9BrzfyCP
 AXSjJmrR92ES/HX4w1kcXl+FA6sF/UR/YmnAW9uO6YDhxiIs1RNpiqJEJUettOHI
 7SxF4tFv8Pi5Heuf4ASflF1bPOFOLUt698W3+JMWaD0bLgF5IDI9hTvtbfOhy7Ag
 zqsRCipZ3dVJ8EE012IA978rjqbr8qrHfQK2jnD0JALWf/t8RHBzo3IjfRu+c6W+
 aRlkqu2BxZO0V3tFLoJ7uQrjp23hrERjRJi8s5hjOqwMwgB3SlVsaMVJDkeuR/46
 diQTkxjylhlQgdJb6xlSiM948wfcMt5pIFxWQppHFk8ouCOX1+q9u4cEipU+nvT+
 dcnc5mqERqMMjsHke3uzsHtG8OlPbbtHjzCKXT1/7NPhUn2x84AsOEkiX+qas5nK
 sabgOWZtawz1Nl3x+gKR6OBGtXDdLlj7JTsqphPa3mr3RBqx8v8VVYYtqHVCiy8N
 mtc/701fHOKdrBtk54Wb
 =vDei
 -----END PGP SIGNATURE-----

Merge tag 'ntb-3.15' of git://github.com/jonmason/ntb

Pull PCIe non-transparent bridge fixes and features from Jon Mason:
 "NTB driver bug fixes to address issues in list traversal, skb leak in
  ntb_netdev, a typo, and a leak of msix entries in the error path.
  Clean ups of the event handling logic, as well as a overall style
  cleanup.  Finally, the driver was converted to use the new
  pci_enable_msix_range logic (and the refactoring to go along with it)"

* tag 'ntb-3.15' of git://github.com/jonmason/ntb:
  ntb: Use pci_enable_msix_range() instead of pci_enable_msix()
  ntb: Split ntb_setup_msix() into separate BWD/SNB routines
  ntb: Use pci_msix_vec_count() to obtain number of MSI-Xs
  NTB: Code Style Clean-up
  NTB: client event cleanup
  ntb: Fix leakage of ntb_device::msix_entries[] array
  NTB: Fix typo in setting one translation register
  ntb_netdev: Fix skb free issue in open
  ntb_netdev: Fix list_for_each_entry exit issue
2014-04-12 16:16:39 -07:00
Linus Torvalds 96c57ade7e ceph: fix pr_fmt() redefinition
The vfs merge caused a latent bug to show up:

   In file included from fs/ceph/super.h:4:0,
                    from fs/ceph/ioctl.c:3:
   include/linux/ceph/ceph_debug.h:4:0: warning: "pr_fmt" redefined [enabled by default]
    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
    ^
   In file included from include/linux/kernel.h:13:0,
                    from include/linux/uio.h:12,
                    from include/linux/socket.h:7,
                    from include/uapi/linux/in.h:22,
                    from include/linux/in.h:23,
                    from fs/ceph/ioctl.c:1:
   include/linux/printk.h:214:0: note: this is the location of the previous definition
    #define pr_fmt(fmt) fmt
    ^

where the reason is that <linux/ceph_debug.h> is included much too late
for the "pr_fmt()" define.

The include of <linux/ceph_debug.h> needs to be the first include in the
file, but fs/ceph/ioctl.c had for some reason missed that, and it wasn't
noticeable until some unrelated header file changes brought in an
indirect earlier include of <linux/kernel.h>.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-12 15:39:53 -07:00
Linus Torvalds 5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencie between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
David S. Miller eda43ce039 Merge branch 'tunnels'
Nicolas Dichtel says:

====================
tunnels: don't allow to add the same tunnel twice

This series fixes the check of an existing tunnel with the same
parameters when a new tunnel is added.  I've checked all users of
ip_tunnel_newlink(): gre, gretap, ipip and vti. The bug exists only
for gre and vti.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-12 17:03:20 -04:00
Nicolas Dichtel 8d89dcdf80 vti: don't allow to add the same tunnel twice
Before the patch, it was possible to add two times the same tunnel:
ip l a vti1 type vti remote 10.16.0.121 local 10.16.0.249 key 41
ip l a vti2 type vti remote 10.16.0.121 local 10.16.0.249 key 41

It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the
argument dev->type, which was set only later (when calling ndo_init handler
in register_netdevice()). Let's set this type in the setup handler, which is
called before newlink handler.

Introduced by commit b9959fd3b0 ("vti: switch to new ip tunnel code").

CC: Cong Wang <amwang@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-12 17:03:11 -04:00
Nicolas Dichtel 5a4552752d gre: don't allow to add the same tunnel twice
Before the patch, it was possible to add two times the same tunnel:
ip l a gre1 type gre remote 10.16.0.121 local 10.16.0.249
ip l a gre2 type gre remote 10.16.0.121 local 10.16.0.249

It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the
argument dev->type, which was set only later (when calling ndo_init handler
in register_netdevice()). Let's set this type in the setup handler, which is
called before newlink handler.

Introduced by commit c544193214 ("GRE: Refactor GRE tunneling code.").

CC: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-12 17:03:11 -04:00
Vincenzo Maffione 810d8cedcc drivers: net: xen-netfront: fix array initialization bug
This patch fixes the initialization of an array used in the TX
datapath that was mistakenly initialized together with the
RX datapath arrays. An out of range array access could happen
when RX and TX rings had different sizes.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-12 16:49:06 -04:00
David S. Miller dcfba949ba Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to e1000, e1000e, igb, igbvf, ixgb, ixgbe,
ixgbevf and i40evf.

Mark fixes an issue with ixgbe and ixgbevf by adding a bit to indicate
when workqueues have been initialized.  This permits the register read
error handling from attempting to use them prior to that, which also
generates warnings.  Checking for a detected removal after initializing
the work queues allows the probe function to return an error without
getting the workqueue involved.  Further, if the error_detected
callback is entered before the workqueues are initialized, exit without
recovery since the device initialization was so truncated.

Francois Romieu provides several patches to all the drivers to remove
the open coded skb_cow_head.

Jakub Kicinski provides a fix for igb where last_rx_timestamp should be
updated only when Rx time stamp is read.

Mitch provides a fix for i40evf where a recent change broke the RSS LUT
programming causing it to be programmed with all 0's.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-12 16:36:44 -04:00
Linus Torvalds 0a7418f5f5 This includes the final patch to clean up and fix the issue with the
design of tracepoints and how a user could register a tracepoint
 and have that tracepoint not be activated but no error was shown.
 
 The design was for an out of tree module but broke in tree users.
 The clean up was to remove the saving of the hash table of tracepoint
 names such that they can be enabled before they exist (enabling
 a module tracepoint before that module is loaded). This added more
 complexity than needed. The clean up was to remove that code and
 just enable tracepoints that exist or fail if they do not.
 
 This removed a lot of code as well as the complexity that it brought.
 As a side effect, instead of registering a tracepoint by its name,
 the tracepoint needs to be registered with the tracepoint descriptor.
 This removes having to duplicate the tracepoint names that are
 enabled.
 
 The second patch was added that simplified the way modules were
 searched for.
 
 This cleanup required changes that were in the 3.15 queue as well as
 some changes that were added late in the 3.14-rc cycle. This final
 change waited till the two were merged in upstream and then the
 change was added and full tests were run. Unfortunately, the
 test found some errors, but after it was already submitted to the
 for-next branch and not to be rebased. Sparse errors were detected
 by Fengguang Wu's bot tests, and my internal tests discovered that
 the anonymous union initialization triggered a bug in older gcc compilers.
 Luckily, there was a bugzilla for the gcc bug which gave a work around
 to the problem. The third and fourth patch handled the sparse error
 and the gcc bug respectively.
 
 A final patch was tagged along to fix a missing documentation for
 the README file.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTR+pwAAoJEKQekfcNnQGuvfoH/A4XZu4/1h2ZuKhzGi6lrrWr
 +zHUQ+JmGiAYRziQFwr2t/gqJ2vmDfHJnbDjKi6Emx8JcxesHas6CQOWps4zEic0
 dwYSQjvuGNGFIFt+7I0K1OxfVVdt2PQ2lVrB5WgYdbash5J4Bi+09QBv0RbUKheo
 37dKSeN3pbsuQsR70OTVP8laG3dA9IbHW7PsKnxIEB5zeIUHUBME/QdPPj/CuJwk
 wxZjXC2dbc3rdRlQjTVtWV3ZkGgZJB0k+JxjvZTA0N6u8Hj8LiFPuNawzf7ceBHx
 gc++57+WuMW0f0X/ar5/+3UPGFQKMSvKmdxIQCnWXQz5seTYYKDEx7mTH22fxgg=
 =OgeQ
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.15-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:
 "This includes the final patch to clean up and fix the issue with the
  design of tracepoints and how a user could register a tracepoint and
  have that tracepoint not be activated but no error was shown.

  The design was for an out of tree module but broke in tree users.  The
  clean up was to remove the saving of the hash table of tracepoint
  names such that they can be enabled before they exist (enabling a
  module tracepoint before that module is loaded).  This added more
  complexity than needed.  The clean up was to remove that code and just
  enable tracepoints that exist or fail if they do not.

  This removed a lot of code as well as the complexity that it brought.
  As a side effect, instead of registering a tracepoint by its name, the
  tracepoint needs to be registered with the tracepoint descriptor.
  This removes having to duplicate the tracepoint names that are
  enabled.

  The second patch was added that simplified the way modules were
  searched for.

  This cleanup required changes that were in the 3.15 queue as well as
  some changes that were added late in the 3.14-rc cycle.  This final
  change waited till the two were merged in upstream and then the change
  was added and full tests were run.  Unfortunately, the test found some
  errors, but after it was already submitted to the for-next branch and
  not to be rebased.  Sparse errors were detected by Fengguang Wu's bot
  tests, and my internal tests discovered that the anonymous union
  initialization triggered a bug in older gcc compilers.  Luckily, there
  was a bugzilla for the gcc bug which gave a work around to the
  problem.  The third and fourth patch handled the sparse error and the
  gcc bug respectively.

  A final patch was tagged along to fix a missing documentation for the
  README file"

* tag 'trace-3.15-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Add missing function triggers dump and cpudump to README
  tracing: Fix anonymous unions in struct ftrace_event_call
  tracepoint: Fix sparse warnings in tracepoint.c
  tracepoint: Simplify tracepoint module search
  tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints
2014-04-12 13:06:10 -07:00
Linus Torvalds 0b747172dc Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris.

* git://git.infradead.org/users/eparis/audit: (28 commits)
  AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
  audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
  audit: do not cast audit_rule_data pointers pointlesly
  AUDIT: Allow login in non-init namespaces
  audit: define audit_is_compat in kernel internal header
  kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
  sched: declare pid_alive as inline
  audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
  syscall_get_arch: remove useless function arguments
  audit: remove stray newline from audit_log_execve_info() audit_panic() call
  audit: remove stray newlines from audit_log_lost messages
  audit: include subject in login records
  audit: remove superfluous new- prefix in AUDIT_LOGIN messages
  audit: allow user processes to log from another PID namespace
  audit: anchor all pid references in the initial pid namespace
  audit: convert PPIDs to the inital PID namespace.
  pid: get pid_t ppid of task in init_pid_ns
  audit: rename the misleading audit_get_context() to audit_take_context()
  audit: Add generic compat syscall support
  audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
  ...
2014-04-12 12:38:53 -07:00