Commit Graph

591611 Commits

Author SHA1 Message Date
David S. Miller 4b307a8edb Merge branch 'bpf-direct-pkt-access'
Alexei Starovoitov says:

====================
bpf: introduce direct packet access

This set of patches introduce 'direct packet access' from
cls_bpf and act_bpf programs (which are root only).

Current bpf programs use LD_ABS, LD_INS instructions which have
to do 'if (off < skb_headlen)' for every packet access.
It's ok for socket filters, but too slow for XDP, since single
LD_ABS insn consumes 3% of cpu. Therefore we have to amortize the cost
of length check over multiple packet accesses via direct access
to skb->data, data_end pointers.

The existing packet parser typically look like:
  if (load_half(skb, offsetof(struct ethhdr, h_proto)) != ETH_P_IP)
     return 0;
  if (load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)) != IPPROTO_UDP ||
      load_byte(skb, ETH_HLEN) != 0x45)
     return 0;
  ...
with 'direct packet access' the bpf program becomes:
   void *data = (void *)(long)skb->data;
   void *data_end = (void *)(long)skb->data_end;
   struct eth_hdr *eth = data;
   struct iphdr *iph = data + sizeof(*eth);

   if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end)
      return 0;
   if (eth->h_proto != htons(ETH_P_IP))
      return 0;
   if (iph->protocol != IPPROTO_UDP || iph->ihl != 5)
      return 0;
   ...
which is more natural to write and significantly faster.
See patch 6 for performance tests:
21Mpps(old) vs 24Mpps(new) with just 5 loads.
For more complex parsers the performance gain is higher.

The other approach implemented in [1] was adding two new instructions
to interpreter and JITs and was too hard to use from llvm side.
The approach presented here doesn't need any instruction changes,
but the verifier has to work harder to check safety of the packet access.

Patch 1 prepares the code and Patch 2 adds new checks for direct
packet access and all of them are gated with 'env->allow_ptr_leaks'
which is true for root only.
Patch 3 improves search pruning for large programs.
Patch 4 wires in verifier's changes with net/core/filter side.
Patch 5 updates docs
Patches 6 and 7 add tests.

[1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:55 -04:00
Alexei Starovoitov 883e44e4de samples/bpf: add verifier tests
add few tests for "pointer to packet" logic of the verifier

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:54 -04:00
Alexei Starovoitov 65d472fb00 samples/bpf: add 'pointer to packet' tests
parse_simple.c - packet parser exapmle with single length check that
filters out udp packets for port 9

parse_varlen.c - variable length parser that understand multiple vlan headers,
ipip, ipip6 and ip options to filter out udp or tcp packets on port 9.
The packet is parsed layer by layer with multitple length checks.

parse_ldabs.c - classic style of packet parsing using LD_ABS instruction.
Same functionality as parse_simple.

simple = 24.1Mpps per core
varlen = 22.7Mpps
ldabs  = 21.4Mpps

Parser with LD_ABS instructions is slower than full direct access parser
which does more packet accesses and checks.

These examples demonstrate the choice bpf program authors can make between
flexibility of the parser vs speed.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:54 -04:00
Alexei Starovoitov f9c8d19d6c bpf: add documentation for 'direct packet access'
explain how verifier checks safety of packet access
and update email addresses.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:54 -04:00
Alexei Starovoitov db58ba4592 bpf: wire in data and data_end for cls_act_bpf
allow cls_bpf and act_bpf programs access skb->data and skb->data_end pointers.
The bpf helpers that change skb->data need to update data_end pointer as well.
The verifier checks that programs always reload data, data_end pointers
after calls to such bpf helpers.
We cannot add 'data_end' pointer to struct qdisc_skb_cb directly,
since it's embedded as-is by infiniband ipoib, so wrapper struct is needed.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:54 -04:00
Alexei Starovoitov 735b433397 bpf: improve verifier state equivalence
since UNKNOWN_VALUE type is weaker than CONST_IMM we can un-teach
verifier its recognition of constants in conditional branches
without affecting safety.
Ex:
if (reg == 123) {
  .. here verifier was marking reg->type as CONST_IMM
     instead keep reg as UNKNOWN_VALUE
}

Two verifier states with UNKNOWN_VALUE are equivalent, whereas
CONST_IMM_X != CONST_IMM_Y, since CONST_IMM is used for stack range
verification and other cases.
So help search pruning by marking registers as UNKNOWN_VALUE
where possible instead of CONST_IMM.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:54 -04:00
Alexei Starovoitov 969bf05eb3 bpf: direct packet access
Extended BPF carried over two instructions from classic to access
packet data: LD_ABS and LD_IND. They're highly optimized in JITs,
but due to their design they have to do length check for every access.
When BPF is processing 20M packets per second single LD_ABS after JIT
is consuming 3% cpu. Hence the need to optimize it further by amortizing
the cost of 'off < skb_headlen' over multiple packet accesses.
One option is to introduce two new eBPF instructions LD_ABS_DW and LD_IND_DW
with similar usage as skb_header_pointer().
The kernel part for interpreter and x64 JIT was implemented in [1], but such
new insns behave like old ld_abs and abort the program with 'return 0' if
access is beyond linear data. Such hidden control flow is hard to workaround
plus changing JITs and rolling out new llvm is incovenient.

Therefore allow cls_bpf/act_bpf program access skb->data directly:
int bpf_prog(struct __sk_buff *skb)
{
  struct iphdr *ip;

  if (skb->data + sizeof(struct iphdr) + ETH_HLEN > skb->data_end)
      /* packet too small */
      return 0;

  ip = skb->data + ETH_HLEN;

  /* access IP header fields with direct loads */
  if (ip->version != 4 || ip->saddr == 0x7f000001)
      return 1;
  [...]
}

This solution avoids introduction of new instructions. llvm stays
the same and all JITs stay the same, but verifier has to work extra hard
to prove safety of the above program.

For XDP the direct store instructions can be allowed as well.

The skb->data is NET_IP_ALIGNED, so for common cases the verifier can check
the alignment. The complex packet parsers where packet pointer is adjusted
incrementally cannot be tracked for alignment, so allow byte access in such cases
and misaligned access on architectures that define efficient_unaligned_access

[1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:53 -04:00
Alexei Starovoitov 1a0dc1ac1d bpf: cleanup verifier code
cleanup verifier code and prepare it for addition of "pointer to packet" logic

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:53 -04:00
Linus Torvalds 3f86ba5d0c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "This contains two fixes: a boot fix for older SGI/UV systems, and an
  APIC calibration fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
  x86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init
2016-05-06 12:59:27 -07:00
David S. Miller 95aef7cecb Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-05-05

This series contains updates to i40e and i40evf.

The theme behind this series is code reduction, yeah!  Jesse provides
most of the changes starting with a refactor of the interpretation of
a tunnel which lets us start using the hardware's parsing.  Removed
the packet split receive routine and ancillary code in preparation
for the Rx-refactor.  The refactor of the receive routine,
aligns the receive routine with the one in ixgbe which was highly
optimized.  The hardware supports a 16 byte descriptor for receive,
but the driver was never using it in production.  There was no performance
benefit to the real driver of 16 byte descriptors, so drop a whole lot
of complexity while getting rid of the code.  Fixed a bug where while
changing the number of descriptors using ethtool, the driver did not
test the limits of the system memory before permanently assuming it
would be able to get receive buffer memory.

Mitch fixes a memory leak of one page each time the driver is opened by
allocating the correct number of receive buffers and do not fiddle with
next_to_use in the VF driver.

Arnd Bergmann fixed a indentation issue by adding the appropriate
curly braces in i40e_vc_config_promiscuous_mode_msg().

Julia Lawall fixed an issue found by Coccinelle, where i40e_client_ops
structure can be const since it is never modified.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 15:55:30 -04:00
Tedd Ho-Jeong An a0af53b511 Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b]
This patch adds support for Intel Bluetooth device 8265 also known
as Windstorm Peak (WsP).

T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#=  6 Spd=12   MxCh= 0
D:  Ver= 2.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=8087 ProdID=0a2b Rev= 0.10
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms

Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-05-06 21:52:35 +02:00
David Ahern b3b4663c97 net: vrf: Create FIB tables on link create
Tables have to exist for VRFs to function. Ensure they exist
when VRF device is created.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 15:51:47 -04:00
Sudarsana Reddy Kalluru 8e0ddc040a qede: prevent chip hang when increasing channels
qede requires qed to provide enough resources to accommodate 16 combined
channels, but that upper-bound isn't actually being enforced by it.
Instead, qed inform back to qede how many channels can be opened based on
available resources - but that calculation doesn't really take into account
the resources requested by qede; Instead it considers other FW/HW available
resources.

As a result, if a user would increase the number of channels to more than
16 [e.g., using ethtool] the chip would hang.

This change increments the resources requested by qede to 64 combined
channels instead of 16; This value is an upper bound on the possible
available channels [due to other FW/HW resources].

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 15:50:33 -04:00
David Ahern 1d2f7b2d95 net: ipv6: tcp reset, icmp need to consider L3 domain
Responses for packets to unused ports are getting lost with L3 domains.

IPv4 has ip_send_unicast_reply for sending TCP responses which accounts
for L3 domains; update the IPv6 counterpart tcp_v6_send_response.
For icmp the L3 master check needs to be moved up in icmp6_send
to properly respond to UDP packets to a port with no listener.

Fixes: ca254490c8 ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 15:49:07 -04:00
Jon Maxwell f37bd0cced cnic: call cp->stop_hw() in cnic_start_hw() on allocation failure
We recently had a system crash in the cnic module. Vmcore analysis confirmed
that "ip link up" was executed which failed due to an allocation failure
because of memory fragmentation. Futher analysis revealed that the cnic irq
vector was still allocated after the "ip link up" that failed. When
"ip link down" was executed it called free_msi_irqs() which crashed the system
because the cnic irq was still inuse.

PANIC: "kernel BUG at drivers/pci/msi.c:411!"

The code execution was:

cnic_netdev_event()
if (event == NETDEV_UP) {
.
.
       ▹       if (!cnic_start_hw(dev))
cnic_start_hw()
calls cnic_cm_open() which failed with -ENOMEM
cnic_start_hw() then took the err1 path:

err1:↩
       cp->free_resc(dev);↩ <---- frees resources but not irq vector
       pci_dev_put(dev->pcidev);↩
       return err;↩
}↩

This returns control back to cnic_netdev_event() but now the cnic irq vector
is still allocated even although cnic_cm_open() failed. The next
"ip link down" while trigger the crash.

The cnic_start_hw() routine is not handling the allocation failure correctly.
Fix this by checking whether CNIC_DRV_STATE_HANDLES_IRQ flag is set indicating
that the hardware has been started in cnic_start_hw(). If it has then call
cp->stop_hw() which frees the cnic irq vector and cnic resources. Otherwise
just maintain the previous behaviour and free cnic resources.

I reproduced this by injecting an ENOMEM error into cnic_cm_alloc_mem()s return
code.

# ip link set dev enpX down
# ip link set dev enpX up <--- hit's allocation failure
# ip link set dev enpX down <--- crashes here

With this patch I confirmed there was no crash in the reproducer.

Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 15:44:54 -04:00
Linus Torvalds 01ec716761 Power management and ACPI fixes for v4.6-rc7
- Fix for a recent regression in the intel_pstate driver causing
    it to fail to restore the HWP (HW-managed P-states) configuration
    of the boot CPU after suspend-to-RAM (Rafael Wysocki).
 
  - Fix for two recent regressions in the intel_pstate driver, one
    that can trigger a divide by zero if the driver is accessed via
    sysfs before it manages to take the first sample and one causing
    it to fail to update a structure field used in a trace point, so
    the information coming from it is less useful (Rafael Wysocki).
 
  - Fix for a problem in the sti-cpufreq driver introduced during
    the 4.5 cycle that causes it to break CPU PM in multi-platform
    kernels by registering cpufreq-dt (which subsequently doesn't
    work) unconditionally and preventing the driver that would
    actually work from registering (Sudeep Holla).
 
  - Stable-candidate fix for an ARM64 cpuidle issue causing idle
    state usage counters to be incorrectly updated for idle states
    that were not entered due to errors (James Morse).
 
  - Fix for a recently introduced issue in the OPP (Operating
    Performance Points) framework causing it to print bogus error
    messages for missing optional regulators (Viresh Kumar).
 
  - Fix for a recently introduced issue in the generic device
    properties framework that may cause it to attempt to dereferece
    and invalid pointer in some cases (Heikki Krogerus).
 
  - Fix for a deadlock in the ACPICA core that may be triggered
    by device (eg. Thunderbolt) hotplug (Prarit Bhargava).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXLIwPAAoJEILEb/54YlRxT+wP/ROEo/r5IaRZ2k8cphWjsiKk
 k9eDuWBL2KZ29ikghXs/vVY2fMbtQkaDT5h57imsUEKoEzI3MlYA3OkQyffFOcsY
 dz/9EnG6K9Efi6VS1dS1tNCgl45aIeHLCqlVPOBCZ9TwSoAERdNJGqItJdS2YKIA
 +C1LGrWl4UiJ95AOof9PHfKfnWxrnRbpIsB2PbxD0Swe5vfskrHoRWGOAMLJIwpF
 7NvEJ15fryDIvlMR/ggNrg2L2piOu1fJl2kVZYWZTb/u+qAO3utxTQN4y++zTSNb
 LAN78Hq/nJu156SSioO9fLa0wPaU+k2OChfWXtlMsTDK+L5EQz4G3pJwi5FA8QTD
 nfeZNC9VgqfP4LtqWw05h/AOw4A0XUeuwB8Edbc+WG5twzULqDhS57jew4A4xX8d
 jOsvK5syygnR+/rExWc0NWSmCH0g1u6mCUWXQuocfSb/oOEcUGq5RSixRNRfmJUq
 9XNF3hbp7W/Vnp9GWT30Md+CenrEtQXFK8ZQtg0ckBl+b5bEqKYs6FXGqCkUmjZy
 Qgt5sqxgdLWtslS3vSu1/mdryeaLmXNO6c6wueSPMmLyYODEoIHSSka9N9O0Inwv
 d106p7gUy3/ETamC3lbnyHkUrAru74Qh8rErKpqaRLkKfcIq7YCB073fxbqlamzz
 X4n8a1H37LefLqmKwIbF
 =pU+A
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "Fixes for problems introduced or discovered recently (intel_pstate,
  sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
  generic device properties framework) and one fix for a hotplug-related
  deadlock in ACPICA that's been there forever, but is nasty enough.

  Specifics:

   - Fix for a recent regression in the intel_pstate driver causing it
     to fail to restore the HWP (HW-managed P-states) configuration of
     the boot CPU after suspend-to-RAM (Rafael Wysocki).

   - Fix for two recent regressions in the intel_pstate driver, one that
     can trigger a divide by zero if the driver is accessed via sysfs
     before it manages to take the first sample and one causing it to
     fail to update a structure field used in a trace point, so the
     information coming from it is less useful (Rafael Wysocki).

   - Fix for a problem in the sti-cpufreq driver introduced during the
     4.5 cycle that causes it to break CPU PM in multi-platform kernels
     by registering cpufreq-dt (which subsequently doesn't work)
     unconditionally and preventing the driver that would actually work
     from registering (Sudeep Holla).

   - Stable-candidate fix for an ARM64 cpuidle issue causing idle state
     usage counters to be incorrectly updated for idle states that were
     not entered due to errors (James Morse).

   - Fix for a recently introduced issue in the OPP (Operating
     Performance Points) framework causing it to print bogus error
     messages for missing optional regulators (Viresh Kumar).

   - Fix for a recently introduced issue in the generic device
     properties framework that may cause it to attempt to dereferece and
     invalid pointer in some cases (Heikki Krogerus).

   - Fix for a deadlock in the ACPICA core that may be triggered by
     device (eg Thunderbolt) hotplug (Prarit Bhargava)"

* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / OPP: Remove useless check
  ACPICA: Dispatcher: Update thread ID for recursive method calls
  intel_pstate: Fix intel_pstate_get()
  cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
  cpufreq: st: enable selective initialization based on the platform
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
  device property: Avoid potential dereferences of invalid pointers
2016-05-06 11:58:45 -07:00
Linus Torvalds 17d25a337b Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "This contains a single fix that fixes a nohz tick stopping bug when
  mixed-poliocy SCHED_FIFO and SCHED_RR tasks are present on a runqueue"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()
2016-05-06 11:53:27 -07:00
Linus Torvalds 18fb92c30c Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree contains two fixes: new Intel CPU model numbers and an
  AMD/iommu uncore PMU driver fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
  perf/x86: Add model numbers for Kabylake CPUs
2016-05-06 11:40:24 -07:00
Linus Torvalds cade818463 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "This tree contains three fixes: a console spam fix, a file pattern fix
  and a sysfb_efi fix for a bug that triggered on older ThinkPads"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sysfb_efi: Fix valid BAR address range check
  x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT
  MAINTAINERS: Remove asterisk from EFI directory names
2016-05-06 11:33:02 -07:00
Linus Torvalds 83a395d332 Merge branch 'parisc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
 "Patch from Dmitry V Levin to fix a kernel crash when a straced process
  calls the (invalid) syscall which is equal to value of __NR_Linux_syscalls"

* 'parisc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
2016-05-06 11:27:05 -07:00
Linus Torvalds dd287690b0 ARC fixes for 4.6-rc7
- Fix for PTE truncation in PAE40 builds
  - Fix for big endian IO accessors lacking IO barrier
  - Allow HIGHMEM to work with low physical addresses
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXLIohAAoJEGnX8d3iisJeXLcQALwte9aeySA+r1X7S/OkW6uw
 kU+kAYqjxnjgIy4Zy+BmQwFMTxJpvb7NV9ZHvLWNFXWlyZEBAbDQz8LgLpuAekx4
 nbyk5KvvSKKkWP19T0/pCr19mQC06qKWzEr91zZ/lqxDwZKR6GIlbnlT7ugB25gH
 9dKSUww17YsS2nVvlq0d0nbs+rFP/4fE9O8RW096bo6zOi5j6dzP7jJy6AdG1Czq
 +UQUcFdkIkJLwoOaamur49azNmpmdR8goOgz6kzj8NK2J/cQx5698YTArR9c2xSw
 86TAr1yw4gnz9M+vinNrfY+mxs+n3YZTG//tVbuhL5nvGKDM6T1lkuyJPbiJHHDQ
 NxULX60NYOP32Xl5/fnCGoIDL30rLwmCgn1T2uZONVnx9e0Ai51TJE3T7Qlux/3X
 t/BlqzWQRIWOfR6jAtS+Gi3KM4fRwRNAeU4ed/kUqyXQSoRUNXOPof35+h7O7dSO
 dvrUnhBtBXsHCaDRRLfxjDzNqV9T2aIr/zZLM3rkBw79SPVIBHdV1hjLt0IxpMrn
 ttwhjVQXahD+MUGxLqS7efwdyHzjtOB/D1wW/6Fg6n/ircvTcIziV4pThHCFTxs9
 zWgmGgtD2XWSj0fhNXqe+AyjW1a6ZiH3GjMfV9cBbAsydFmUDCws8kWMTQOUSRNw
 1l/L0DIsBZzGlnredJzJ
 =bf8d
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "Late in the cycle, but this has fixes for couple of issues: a PAE40
  boot crash and Arnd spotting lack of barriers in BE io-accessors.

  The 3rd patch for enabling highmem in low physical mem ;-) honestly is
  more than a "fix" but its been in works for some time, seems to be
  stable in testing and enables 2 of our customers to go forward with
  4.6 kernel.

   - Fix for PTE truncation in PAE40 builds
   - Fix for big endian IO accessors lacking IO barrier
   - Allow HIGHMEM to work with low physical addresses"

* tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: support HIGHMEM even without PAE40
  ARC: Fix PAE40 boot failures due to PTE truncation
  ARC: Add missing io barriers to io{read,write}{16,32}be()
2016-05-06 11:14:38 -07:00
Linus Torvalds 4883d11e06 powerpc fixes for 4.6 #4
- Fix bad inline asm constraint in create_zero_mask() from Anton Blanchard
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXLEx6AAoJEFHr6jzI4aWAQskQAL9rtQtxhJQnCE97832iIgWn
 +YEef53mSTX7CEpoQqGY/2PUAOeQ8MPteewNaO5spZC2Vv34bXuHtv0q1uhiVJ6K
 Q0t1LSbm60rv9BIuO8rDSJmSGnSljI1bczd0WVsTHuYLyFrVPtRZaDBJT9Bh2JTQ
 +K/GE3o4R1X5pnUlZ4UqCcN7KwLrqHkQrmEcsSdUQMJ4s4jZaxS1KP9Kzq/M+mjy
 L9K4SyXneMyCgiUCIdu3hN5A7Vb+2NauGmCP/eV6x4+M/Bq5JC4vNDUcCKE0IBLF
 bNP5nvMKBBZOHlwxVWo9d2UWPeyEoIz4sKiZ26AE+urCYGhRMHb4M/Mh0qUbru13
 CwZ1BwmZnyYs4pY8INbZJ+TvLcFgROfxxeUlT0KHStlnOrIbmgeKPWI/nymsL6n8
 uHgSc8SnhskNW0n4NhWKVPD5iKqeWV2spFr5aTQ2tLDi7s2rT2oZjl5/GvKSIt+o
 NlmkL2LpbIOftRQ2j+EhHPq/acV8+d20yjN8Zul0tSTNuFeSk6rXyXhEiYXfU+Ao
 lPjkheNJZ9Oq0MAW0dvqWK4DrK31lhnZYcauO84cSA39sGa9BhHFe1E9XXMDSDwf
 YFs7ihMcbap1pejcf7waJJ5axCwN1SgxUGH1lCuGmOAypftbQT5TpB7m52ctAS5J
 vbsUZ8Z0W6OwsfSiKc/G
 =IkNh
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "Fix bad inline asm constraint in create_zero_mask() from Anton
  Blanchard"

* tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix bad inline asm constraint in create_zero_mask()
2016-05-06 11:05:07 -07:00
Linus Torvalds 659a182327 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Fixes for i915, amdgpu/radeon and imx.

  The IMX fix is for an autoloading regression found in Fedora.  The
  radeon fixes, are the same fix to amdgpu/radeon to avoid a hardware
  lockup in some circumstances with a bad mode, and a double free bug I
  took a few hours chasing down the other morning.

  The i915 fixes are across the board, all stable material, and fixing
  some hangs and suspend/resume issues, along with a live status
  regressions"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
  drm/amdgpu: make sure vertical front porch is at least 1
  drm/radeon: make sure vertical front porch is at least 1
  drm/amdgpu: set metadata pointer to NULL after freeing.
  drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
  drm/i915: Fake HDMI live status
  drm/i915: Fix eDP low vswing for Broadwell
  drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  drm/i915: Fix system resume if PCI device remained enabled
  drm/i915: Avoid stalling on pending flips for legacy cursor updates
2016-05-06 10:59:53 -07:00
Linus Lüssing 856ce5d083 bridge: fix igmp / mld query parsing
With the newly introduced helper functions the skb pulling is hidden
in the checksumming function - and undone before returning to the
caller.

The IGMP and MLD query parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header.

If there is a querier somewhere else, then this either causes
the multicast snooping to stay disabled even though it could be
enabled. Or, if we have the querier enabled too, then this can
create unnecessary IGMP / MLD query messages on the link.

Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.

Fixes: 9afd85c9e4 ("net: Export IGMP/MLD message validation code")
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 12:55:13 -04:00
Dmitry V. Levin f0b22d1bb2 parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
Do not load one entry beyond the end of the syscall table when the
syscall number of a traced process equals to __NR_Linux_syscalls.
Similar bug with regular processes was fixed by commit 3bb457af4f
("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").

This bug was found by strace test suite.

Cc: stable@vger.kernel.org
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2016-05-06 15:09:07 +02:00
Rafael J. Wysocki 5f2f88e330 Merge branches 'pm-opp-fixes', 'pm-cpufreq-fixes' and 'pm-cpuidle-fixes'
* pm-opp-fixes:
  PM / OPP: Remove useless check

* pm-cpufreq-fixes:
  intel_pstate: Fix intel_pstate_get()
  cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
  cpufreq: st: enable selective initialization based on the platform

* pm-cpuidle-fixes:
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
2016-05-06 13:16:22 +02:00
Rafael J. Wysocki 7c21b38ca9 Merge branches 'acpica-fixes' and 'device-properties-fixes'
* acpica-fixes:
  ACPICA: Dispatcher: Update thread ID for recursive method calls

* device-properties-fixes:
  device property: Avoid potential dereferences of invalid pointers
2016-05-06 13:15:52 +02:00
Pablo Neira Ayuso bbe848ec1a Merge tag 'ipvs2-for-v4.7' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Simon Horman says:

====================
Second Round of IPVS Updates for v4.7

please consider these enhancements to the IPVS. They allow its DoS
mitigation strategy effective in conjunction with the SIP persistence
engine.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-05-06 11:50:52 +02:00
Chen Yu 886123fb3a x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;

Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
(35.5), the ratio bits are bit 8-15.

Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
TSC calibration and the Local APIC timer frequency to be incorrect.

Fix this problem by masking 0xff instead.

[ tglx: Massaged changelog ]

Fixes: 7da7c15613 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable@vger.kernel.org
Cc: Bin Gao <bin.gao@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 11:50:50 +02:00
Florian Westphal 0a93aaedc4 netfilter: conntrack: use a single expectation table for all namespaces
We already include netns address in the hash and compare the netns pointers
during lookup, so even if namespaces have overlapping addresses entries
will be spread across the expectation table.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-05-06 11:50:01 +02:00
Florian Westphal a9a083c387 netfilter: conntrack: make netns address part of expect hash
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-05-06 11:50:01 +02:00
Florian Westphal 03d7dc5cdf netfilter: conntrack: check netns when walking expect hash
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-05-06 11:50:01 +02:00
Marco Angaroni 698e2a8dca ipvs: make drop_entry protection effective for SIP-pe
DoS protection policy that deletes connections to avoid out of memory is
currently not effective for SIP-pe plus OPS-mode for two reasons:
  1) connection templates (holding SIP call-id) are always skipped in
     ip_vs_random_dropentry()
  2) in_pkts counter (used by drop_entry algorithm) is not incremented
     for connection templates

This patch addresses such problems with the following changes:
  a) connection templates associated (via their dest) to virtual-services
     configured in OPS mode are included in ip_vs_random_dropentry()
     monitoring. This applies to SIP-pe over UDP (which requires OPS mode),
     but is more general principle: when OPS is controlled by templates
     memory can be used only by templates themselves, since OPS conns are
     deleted after packet is forwarded.
  b) OPS connections, if controlled by a template, cause increment of
     in_pkts counter of their template. This is already happening but only
     in case director is in master-slave mode (see ip_vs_sync_conn()).

Signed-off-by: Marco Angaroni <marcoangaroni@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2016-05-06 16:26:23 +09:00
Julia Lawall 3949c4ac8c i40e: constify i40e_client_ops structure
The i40e_client_ops structure is never modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 23:32:59 -07:00
Arnd Bergmann ce927db487 i40e: fix misleading indentation
Newly added code in i40e_vc_config_promiscuous_mode_msg() is indented
in a way that gcc rightly complains about:

drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c: In function 'i40e_vc_config_promiscuous_mode_msg':
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1543:4: error: this 'if' clause does not guard... [-Werror=misleading-indentation]
    if (f->vlan >= 0 && f->vlan <= I40E_MAX_VLANID)
    ^~
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1550:5: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if'
     aq_err = pf->hw.aq.asq_last_status;

From the context, it looks like the aq_err assignment was meant to be
inside of the conditional expression, so I'm adding the appropriate
curly braces now.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5676a8b9cd ("i40e: Add VF promiscuous mode driver support")
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 23:25:34 -07:00
Jesse Brandeburg 147e81ec75 i40e: Test memory before ethtool alloc succeeds
When testing on systems with very limited amounts of RAM, a bug was
found where, while changing the number of descriptors using ethtool,
the driver didn't test the limits of system memory before permanently
assuming it would be able to get receive buffer memory.

Work around this issue by pre-allocation of the receive buffer
memory, in the "ghost" ring, which is then used during reinit
using the new ring length.

Change-Id: I92d7a5fb59a6c884b2efdd1ec652845f101c3359
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 23:17:07 -07:00
Mitch Williams b163098ea1 i40evf: Allocate Rx buffers properly
Allocate the correct number of RX buffers, and don't fiddle with
next_to_use. The common RX code handles all of this. This fixes a memory
leak of one page each time the driver is opened.

Change-Id: Id06eca353086e084921f047acad28c14745684ee
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 23:07:30 -07:00
Jesse Brandeburg bec60fc42b i40e/i40evf: Remove unused hardware receive descriptor code
The hardware supports a 16 byte descriptor for receive, but the
driver was never using it in production.  There was no performance
benefit to the real driver of 16 byte descriptors, so drop a whole
lot of complexity while getting rid of the code.

Also since the previous patch made us use no-split mode all the
time, drop any support in the driver for any other value in dtype
and assume it is always zero (aka no-split).

Hooray for code removal!

Change-ID: I2257e902e4dad84a07b94db6d2e6f4ce69b27bc0
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 22:59:54 -07:00
Jesse Brandeburg ab9ad98eb5 i40evf: refactor receive routine
This is part 2 of the Rx refactor series, just including
changes to i40evf.

This refactor aligns the receive routine with the one in
ixgbe which was highly optimized.  This reduces the code
we have to maintain and allows for (hopefully) more readable
and maintainable RX hot path.

In order to do this:
- consolidate the receive path into a single function that doesn't
  use packet split but *does* use pages for Rx buffers.
- remove the old _1buf routine
- consolidate several routines into helper functions
- remove VF ethtool control over packet split
- remove priv_flags interface since it is unused

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 22:42:58 -07:00
Jesse Brandeburg 19b85e677d i40evf: Drop packet split receive routine
As part of preparation for the rx-refactor, remove the
packet split receive routine and ancillary code.

Some of the split related context set up code stays in
i40e_virtchnl_pf.c in case an older VF driver tries to load
and still wants to use packet split.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 22:31:23 -07:00
Jesse Brandeburg 1a557afc4d i40e: Refactor receive routine
This is part 1 of the Rx refactor series, just including
changes to i40e.

This refactor aligns the receive routine with the one in
ixgbe which was highly optimized.  This reduces the code
we have to maintain and allows for (hopefully) more readable
and maintainable RX hot path.

In order to do this:
- consolidate the receive path into a single function that doesn't
  use packet split but *does* use pages for Rx buffers.
- remove the old _1buf routine
- consolidate several routines into helper functions
- remove ethtool control over packet split

Change-ID: I5ca100721de65992aa0114f8b4bac844b84758e0
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 21:53:16 -07:00
Linus Torvalds 9caa7e7848 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "14 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  byteswap: try to avoid __builtin_constant_p gcc bug
  lib/stackdepot: avoid to return 0 handle
  mm: fix kcompactd hang during memory offlining
  modpost: fix module autoloading for OF devices with generic compatible property
  proc: prevent accessing /proc/<PID>/environ until it's ready
  mm/zswap: provide unique zpool name
  mm: thp: kvm: fix memory corruption in KVM with THP enabled
  MAINTAINERS: fix Rajendra Nayak's address
  mm, cma: prevent nr_isolated_* counters from going negative
  mm: update min_free_kbytes from khugepaged after core initialization
  huge pagecache: mmap_sem is unlocked when truncation splits pmd
  rapidio/mport_cdev: fix uapi type definitions
  mm: memcontrol: let v2 cgroups follow changes in system swappiness
  mm: thp: correct split_huge_pages file permission
2016-05-05 20:48:35 -07:00
Nikolay Aleksandrov 31ca0458a6 net: bridge: fix old ioctl unlocked net device walk
get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
calls which aren't called with rtnl held. The comment above says that it is
called with rtnl but that is not really the case.
Here's a sample output from a test ASSERT_RTNL() which I put in
get_bridge_ifindices and executed "brctl show":
[  957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
[  957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G        W  O
4.6.0-rc4+ #157
[  957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[  957.423009]  0000000000000000 ffff880058adfdf0 ffffffff8138dec5
0000000000000400
[  957.423009]  ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
0000000000000001
[  957.423009]  00007ffec1a444b0 0000000000000400 ffff880053c19130
0000000000008940
[  957.423009] Call Trace:
[  957.423009]  [<ffffffff8138dec5>] dump_stack+0x85/0xc0
[  957.423009]  [<ffffffffa05ead32>]
br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
[  957.423009]  [<ffffffff81515beb>] sock_ioctl+0x22b/0x290
[  957.423009]  [<ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
[  957.423009]  [<ffffffff8126c159>] SyS_ioctl+0x79/0x90
[  957.423009]  [<ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1

Since it only reads bridge ifindices, we can use rcu to safely walk the net
device list. Also remove the wrong rtnl comment above.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:32:30 -04:00
Ian Campbell dedc58e067 VSOCK: do not disconnect socket when peer has shutdown SEND only
The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.

Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.

I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:31:29 -04:00
Daniel Jurgens 82d69203df net/mlx4_en: Fix endianness bug in IPV6 csum calculation
Use htons instead of unconditionally byte swapping nexthdr.  On a little
endian systems shifting the byte is correct behavior, but it results in
incorrect csums on big endian architectures.

Fixes: f8c6455bb0 ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Carol Soto <clsoto@us.ibm.com>
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:26:14 -04:00
Haggai Abramovsky 73898db043 net/mlx4: Avoid wrong virtual mappings
The dma_alloc_coherent() function returns a virtual address which can
be used for coherent access to the underlying memory.  On some
architectures, like arm64, undefined behavior results if this memory is
also accessed via virtual mappings that are not coherent.  Because of
their undefined nature, operations like virt_to_page() return garbage
when passed virtual addresses obtained from dma_alloc_coherent().  Any
subsequent mappings via vmap() of the garbage page values are unusable
and result in bad things like bus errors (synchronous aborts in ARM64
speak).

The mlx4 driver contains code that does the equivalent of:
vmap(virt_to_page(dma_alloc_coherent)), this results in an OOPs when the
device is opened.

Prevent Ethernet driver to run this problematic code by forcing it to
allocate contiguous memory. As for the Infiniband driver, at first we
are trying to allocate contiguous memory, but in case of failure roll
back to work with fragmented memory.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reported-by: David Daney <david.daney@cavium.com>
Tested-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:23:05 -04:00
Linus Torvalds 43a3e837e2 mailmap: add John Paul Adrian Glaubitz
Apparently patchwork ended up truncating the full name.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 20:07:14 -07:00
Jesse Brandeburg 04b3b77981 i40e/i40evf: Remove reference to ring->dtype
As part of the rx-refactor, the dtype variable in the i40e_ring
struct is no longer used, so remove it.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 18:59:23 -07:00
Jesse Brandeburg b32bfa1724 i40e: Drop packet split receive routine
As part of preparation for the rx-refactor, remove the
packet split receive routine and ancillary code.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 18:52:06 -07:00
Jesse Brandeburg f8a952cb40 i40e/i40evf: Refactor tunnel interpretation
Refactor the interpretation of a tunnel.  This removes
some code and lets us start using the hardware's parsing.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-05 18:24:10 -07:00