After timer removal this just calls nf_ct_delete so remove the __ prefix
version and make nf_ct_kill a shorthand for nf_ct_delete.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If we evicted a large fraction of the scanned conntrack entries re-schedule
the next gc cycle for immediate execution.
This triggers during tests where load is high, then drops to zero and
many connections will be in TW/CLOSE state with < 30 second timeouts.
Without this change it will take several minutes until conntrack count
comes back to normal.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Conntrack gc worker to evict stale entries.
GC happens once every 5 seconds, but we only scan at most 1/64th of the
table (and not more than 8k) buckets to avoid hogging cpu.
This means that a complete scan of the table will take several minutes
of wall-clock time.
Considering that the gc run will never have to evict any entries
during normal operation because those will happen from packet path
this should be fine.
We only need gc to make sure userspace (conntrack event listeners)
eventually learn of the timeout, and for resource reclaim in case the
system becomes idle.
We do not disable BH and cond_resched for every bucket so this should
not introduce noticeable latencies either.
A followup patch will add a small change to speed up GC for the extreme
case where most entries are timed out on an otherwise idle system.
v2: Use cond_resched_rcu_qs & add comment wrt. missing restart on
nulls value change in gc worker, suggested by Eric Dumazet.
v3: don't call cancel_delayed_work_sync twice (again, Eric).
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When dumping we already have to look at the entire table, so we might
as well toss those entries whose timeout value is in the past.
We also look at every entry during resize operations.
However, eviction there is not as simple because we hold the
global resize lock so we can't evict without adding a 'expired' list
to drop from later. Considering that resizes are very rare it doesn't
seem worth doing it.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
With stats enabled this eats 80 bytes on x86_64 per nf_conn entry, as
Eric Dumazet pointed out during netfilter workshop 2016.
Eric also says: "Another reason was the fact that Thomas was about to
change max timer range [..]" (500462a9de, 'timers: Switch to
a non-cascading wheel').
Remove the timer and use a 32bit jiffies value containing timestamp until
entry is valid.
During conntrack lookup, even before doing tuple comparision, check
the timeout value and evict the entry in case it is too old.
The dying bit is used as a synchronization point to avoid races where
multiple cpus try to evict the same entry.
Because lookup is always lockless, we need to bump the refcnt once
when we evict, else we could try to evict already-dead entry that
is being recycled.
This is the standard/expected way when conntrack entries are destroyed.
Followup patches will introduce garbage colliction via work queue
and further places where we can reap obsoleted entries (e.g. during
netlink dumps), this is needed to avoid expired conntracks from hanging
around for too long when lookup rate is low after a busy period.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The reliable event delivery mode currently (ab)uses the DYING bit to
detect which entries on the dying list have to be skipped when
re-delivering events from the eache worker in reliable event mode.
Currently when we delete the conntrack from main table we only set this
bit if we could also deliver the netlink destroy event to userspace.
If we fail we move it to the dying list, the ecache worker will
reattempt event delivery for all confirmed conntracks on the dying list
that do not have the DYING bit set.
Once timer is gone, we can no longer use if (del_timer()) to detect
when we 'stole' the reference count owned by the timer/hash entry, so
we need some other way to avoid racing with other cpu.
Pablo suggested to add a marker in the ecache extension that skips
entries that have been unhashed from main table but are still waiting
for the last reference count to be dropped (e.g. because one skb waiting
on nfqueue verdict still holds a reference).
We do this by adding a tristate.
If we fail to deliver the destroy event, make a note of this in the
eache extension. The worker can then skip all entries that are in
a different state. Either they never delivered a destroy event,
e.g. because the netlink backend was not loaded, or redelivery took
place already.
Once the conntrack timer is removed we will now be able to replace
del_timer() test with test_and_set_bit(DYING, &ct->status) to avoid
racing with other cpu that tries to evict the same conntrack.
Because DYING will then be set right before we report the destroy event
we can no longer skip event reporting when dying bit is set.
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In case nf_conntrack_tuple_taken did not find a conflicting entry
check that all entries in this hash slot were tested and restart
in case an entry was moved to another chain.
Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: ea781f197d ("netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2016-08-29
This series contains updates to fm10k only.
Jake provides all the changes in this series starting with fixes an issue
where VF devices may fail during an unbind/bind and we will never zero
the reference counter for the pci_dev structure. Updated the hot path
to use SW counters instead of checking for hardware Tx pending for
possible transmit hangs, which will improve performance. Fixed the NAPI
budget accounting so that fm10k_poll will return actual work done,
capped at (budget - 1) instead of returning 0. Added a check to ensure
that the device is in the normal IO state before continuing to probe,
which allows us to give a more descriptive message of what is wrong
in the case of uncorrectable AER error. In preparation for adding Geneve
Rx offload support, refactored the current VXLAN offload flow to be a bit
more generic. Added support for receive offloads on one Geneve tunnel.
Ensure that other bits in the RXQCTL register do not get cleared, to
make sure that bits related to queue ownership are maintained. Fixed
an issue in queue ownership assignment which casued a race condition
between the PF and the VF such that potentially a VF could cause FUM
fault errors due to normal PF/VF driver behavior.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
while moving xattrs to expand the extended inode. Also add some
sanity checks to the block group descriptors to make sure we don't end
up overwriting the superblock.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJXw7i2AAoJEPL5WVaVDYGj96gH/A8rNgx7BoqPx3kanVEamblT
tM0X9JcEGmKHN4enRts2b78EWbR0/U0SOP92+fg9SSq2MDJ0/kdaKLWmbUwx8jUi
B7HMEqCprlCdigK7wwt3xF+6edyZRhtzlWy3bhxJ40f0KT5CuriSQbxogr931uKl
hUKW2h5JtUqHtINzTt4oWjVm8xwrScxuYHYAcpw0G42ZzfO6xQOzQdowcx4m3cE9
PrtTbU5MwW8/wgsdLiClScQq30MK/GCbHh5heyRt1BcNo9+MDsZDOgdavh9StfnW
Bl1N6zwRtRBJNcpKWfTfwU4NTIvStCTyA8BJgKgE95YIHDsstJVl4MO7ot25qbM=
=pXe+
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Fix bugs that could cause kernel deadlocks or file system corruption
while moving xattrs to expand the extended inode.
Also add some sanity checks to the block group descriptors to make
sure we don't end up overwriting the superblock"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: avoid deadlock when expanding inode size
ext4: properly align shifted xattrs when expanding inodes
ext4: fix xattr shifting when expanding inodes part 2
ext4: fix xattr shifting when expanding inodes
ext4: validate that metadata blocks do not overlap superblock
ext4: reserve xattr index for the Hurd
Pull networking fixes from David Miller:
1) Segregate namespaces properly in conntrack dumps, from Liping Zhang.
2) tcp listener refcount fix in netfilter tproxy, from Eric Dumazet.
3) Fix timeouts in qed driver due to xmit_more, from Yuval Mintz.
4) Fix use-after-free in tcp_xmit_retransmit_queue().
5) Userspace header fixups (use of __u32, missing includes, etc.) from
Mikko Rapeli.
6) Further refinements to fragmentation wrt gso and tunnels, from
Shmulik Ladkani.
7) Trigger poll correctly for zero length UDP packets, from Eric
Dumazet.
8) TCP window scaling fix, also from Eric Dumazet.
9) SLAB_DESTROY_BY_RCU is not relevant any more for UDP sockets.
10) Module refcount leak in qdisc_create_dflt(), from Eric Dumazet.
11) Fix deadlock in cp_rx_poll() of 8139cp driver, from Gao Feng.
12) Memory leak in rhashtable's alloc_bucket_locks(), from Eric Dumazet.
13) Add new device ID to alx driver, from Owen Lin.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
Add Killer E2500 device ID in alx driver.
net: smc91x: fix SMC accesses
Documentation: networking: dsa: Remove platform device TODO
net/mlx5: Increase number of ethtool steering priorities
net/mlx5: Add error prints when validate ETS failed
net/mlx5e: Fix memory leak if refreshing TIRs fails
net/mlx5e: Add ethtool counter for TX xmit_more
net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ
net/mlx5e: Don't wait for SQ completions on close
net/mlx5e: Don't post fragmented MPWQE when RQ is disabled
net/mlx5e: Don't wait for RQ completions on close
net/mlx5e: Limit UMR length to the device's limitation
rhashtable: fix a memory leak in alloc_bucket_locks()
sfc: fix potential stack corruption from running past stat bitmask
team: loadbalance: push lacpdus to exact delivery
net: hns: dereference ppe_cb->ppe_common_cb if it is non-null
8139cp: Fix one possible deadloop in cp_rx_poll
i40e: Change some init flow for the client
Revert "phy: IRQ cannot be shared"
net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
...
Remove module related code from two drivers that are only configurable as
built-in.
intel_pmic_gpio:
- Make explicitly non-modular
platform/olpc:
- Make ec explicitly non-modular
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXw8nIAAoJEKbMaAwKp364M/gH/iWSc0YIZppWjAF3brN8Vgv/
PSKzG2BJHiDV/LYsGBBmgPTcT7Y56Oun88MkVCdTx/oKuTYY0CzMd5FE1rpNJoyz
8kKy8Vi4PIj4qqAZA2//8GSY7iD/wX+9Zm2e42Hjo2gXxGF2RyrbLQue9C9vHrWr
3WyCZEeN/gaED6G7tKMOdPQnqAJFdqjvAhsvKU/dedWIWcgB9mee6drQUrJBuWfO
qnLUQrQXVGXQO3xOaPbJmusVhzrbUm4qlhb6cW86S/lNFp6JmN9XmgxULt/cHewE
TIdhODBDYgbrBDOzViQ4QVXZprq6P2EXzawXK9EWjNswZODJW5hIAgwPOizTgLo=
=GL9U
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v4.8-4' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver fixes from Darren Hart:
"Remove module related code from two drivers that are only configurable
as built-in: intel_pmic_gpio and platform/olpc"
* tag 'platform-drivers-x86-v4.8-4' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
intel_pmic_gpio: Make explicitly non-modular
platform/olpc: Make ec explicitly non-modular
Andrew Donnellan (1):
cxl: use pcibios_free_controller_deferred() when removing vPHBs
Andrzej Hajda (1):
powerpc/powernv/pci: fix iterator signedness
Boqun Feng (1):
powerpc, hotplug: Avoid to touch non-existent cpumasks.
Christophe Leroy (1):
powerpc: sysdev: cpm: fix gpio save_regs functions
Cyril Bur (1):
powerpc: signals: Discard transaction state from signal frames
Guenter Roeck (1):
powerpc: cputhreads: Add missing include file
Markus Elfring (3):
drivers/macintosh: Delete owner assignment
powerpc/512x: Delete unnecessary assignment for the field "owner"
powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"
Mauricio Faria de Oliveira (1):
powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)
Michael Ellerman (1):
powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
Mukesh Ojha (1):
powerpc/powernv : Drop reference added by kset_find_obj()
Nicholas Piggin (3):
powerpc/pseries: PACA save area fix for general exception vs MCE
powerpc/pseries: PACA save area fix for MCE vs MCE
powerpc/tm: do not use r13 for tabort_syscall
Paolo Bonzini (1):
powerpc: move hmi.c to arch/powerpc/kvm/
Paul Gortmaker (1):
powerpc: migrate exception table users off module.h and onto extable.h
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXw7jmAAoJEHM62YSLdExeu6YP+gMNWkW4CDjvIo8OsVMhYKqX
RP3c6xa8APcizfdxssKMwX+STP6nfEzRUMCsDMmChSSpb1ZtSNdwM2t2zmlLDe8r
wKHN7Chqw1EJsnIT6TLa79fvHft03MyGQCTjeRvr54aFcTVJHuAgZnTeuejFBrtN
hHCsO/pEv+KjgGlDkN+UvEMUs0quI5Ri8kRJpDxO6il1agcPpeYFYB3ksIcgwaVH
0X/MwLJbzENmiDLhtb3S0jQWGMijmpPbcz5b3z36gIpTxddEPnC9UhYCdc924V/C
sREuku7rihXDEda6B/qNpk8kOxXxIkE/dh5eWVMyWVDHIXJA6hy4z0zpoF9xe4Ba
H2/ERyCfsP3LqonOpDOWKs/tkTAmMXOwdfgr+geoc4vDsbR2X8jyjHqb3bdj7J2A
YtcjmQ/w9HCnzC0weme1+0xR2tfyjUciNOdNoRHP/shLAObzc14xXlpg0IZ3eou2
oozGqkHnGoA6SrsqezF0UmsAf3884BcVscY1N45p1s8ulMRkyHTgPHM8KnGpHoHq
1zgUZji2F4VFgjGrMJyMZ82c3usP3/N65d2nkvRv/QzkmdTat4OmXLiB0Qky26Wy
gW7GsyRpGdEvW/KxhLxuobxPw7xg4mGQ3MdUMWzL3Df17IuRquuFhjljeQIUsQDE
I7NVfZATWJ7mqJQU2s5L
=+1I0
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Ben Herrenschmidt:
"This was meant to be sent early last week, but I has a change pending
on one of the fixes and other things made me forget all about. Ugh.
We have some misc fixes for powerpc 4.8. Some trivial bits and some
regressions, and a trivial cleanup or two that I saw no point in
letting rot in patchwork"
* tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: signals: Discard transaction state from signal frames
powerpc/powernv : Drop reference added by kset_find_obj()
powerpc/tm: do not use r13 for tabort_syscall
powerpc: move hmi.c to arch/powerpc/kvm/
powerpc: sysdev: cpm: fix gpio save_regs functions
powerpc/pseries: PACA save area fix for MCE vs MCE
powerpc/pseries: PACA save area fix for general exception vs MCE
powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
powerpc, hotplug: Avoid to touch non-existent cpumasks.
powerpc: migrate exception table users off module.h and onto extable.h
powerpc/powernv/pci: fix iterator signedness
powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)
cxl: use pcibios_free_controller_deferred() when removing vPHBs
powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner"
powerpc/512x: Delete unnecessary assignment for the field "owner"
drivers/macintosh: Delete owner assignment
powerpc: cputhreads: Add missing include file
Attribute array it87_attributes_in lacks its NULL terminator,
causing random behavior when operating on the attribute group.
Fixes: 5292971563 ("hwmon: (it87) Use is_visible for voltage sensors")
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
When the PF assigns a new MAC address to a VF it uses the base address
registers to store the MAC address. This allows a VF which loads after
this setup the ability to get the initial address without having to wait
for a mailbox message. Unfortunately to do this, the PF must take queue
ownership away from the VF, which can cause fault errors when there is
already an active VF driver.
This queue ownership assignment causes race condition between the PF and
the VF such that potentially a VF can cause FUM fault errors due to
normal PF/VF driver behavior.
It is not safe to simply allow the PF to write the base address
registers without taking queue ownership back as the PF must also
disable the queues, and this would impact active VF use. The current
code is safe because the queue ownership will prevent the VF from
actually writing but does trigger the FUM fault.
We can do better by simply avoiding the register write process when
a mailbox message suffices. If the message can be sent over the mailbox,
then we will not perform the queue ownership assignment and we won't
update the base address to be the same as the MAC address.
We do still have to write the TXQCTL registers in order to update the
VID of the queue. This is necessary because the TXQCTL register is
read-only from the VF, and thus the VF cannot do this for itself. This
register does not need to wait for the Tx queue to be disabled and is
safe for the PF to write during normal VF operation, so we move this
write to the top of the function above the mailbox message. Without
this, the TXQCTL register would be misconfigured and cause the VF to Tx
hang.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ensure that other bits in the RXQCTL register do not get cleared. This
ensures that bits related to queue ownership are maintained.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to how we handle VXLAN offload, enable support for a single
Geneve tunnel.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In preparation for adding Geneve Rx offload support, refactor the
current VXLAN offload flow to be a bit more generic so that it will be
easier to add the new Geneve code. The fm10k hardware supports one VXLAN
and one Geneve tunnel, so we will eventually treat the VXLAN and Geneve
tunnels identically. To this end, factor out the code that handles the
current list so that we can use the generic flow for both tunnels in the
next patch.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the event of a surprise remove, we expect the driver to go down,
which includes calling .stop_hw(). However, this function will return an
error because the queues won't appear to cleanly disable. Prevent this
and avoid the unnecessary checks by just returning when
FM10K_REMOVED(hw->hw_addr) is true.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the event of an uncorrectable AER error occurring when the driver has
not loaded, the recovery routines are not done. This is done because
future loads of the driver may not be aware of the IO state and may not
be able to recover at all. In this case, when we next load the driver it
fails due to what appears to be a surprise remove event. Instead, add
a check to ensure that the device is in the normal IO state before
continuing to probe. This allows us to give a more descriptive message
of what is wrong.
Without this change, the driver will attempt to probe up to our first
call of .reset_hw() which will be unable to read registers and act as if
a surprise remove event occurred.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When fm10k_poll fully cleans rings it returns 0. This is incorrect as it
messes up the budget accounting in the core NAPI code. Fix this by
returning actual work done, capped at budget - 1 since the core doesn't
expect a return of the full budget when the driver modifies the NAPI
status.
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
While technically not needed, as all our uses of ACCESS_ONCE are scalar
types, we already use READ_ONCE in a few places, and for code
readability we can swap all the uses of the older ACCESS_ONCE into
READ_ONCE.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The function is only used in fm10k_ethtool.c, so make it static.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A previous patch added support to check for hardware Tx pending in the
fm10k_down routine. This support was intended to ensure that we
accurately check what the hardware state is. However, checking for Tx
hangs in this manor during the hotpath results in a large performance
hit. Avoid this by making the hotpath check use the SW counters instead.
Fixes: a0f53cf49cb0 ("fm10k: use actual hardware registers when checking for pending Tx", 2016-06-08)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A previous patch removed the pci_disable_device() call in
.io_error_detected. This call corresponded to a pci_enable_device_mem()
call within .io_slot_reset handler. Change the call here to
a pci_reenable_device() so that it does not increment and leak the
enable_cnt reference count for the device. Without this change, VF
devices may fail during an unbind/bind, and we'll never zero the
reference counter for the pci_dev structure.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The Kconfig entry controlling compilation of this code is:
drivers/platform/x86/Kconfig:config GPIO_INTEL_PMIC
drivers/platform/x86/Kconfig: bool "Intel PMIC GPIO support"
...meaning that it currently is not being built as a module by anyone.
Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.
We delete the MODULE_LICENSE tag etc. since all that information
was (or is now) contained at the top of the file in the comments.
We don't replace module.h with init.h since the file already has that.
Cc: Alek Du <alek.du@intel.com>
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
The Kconfig entry controlling compilation of this code is:
arch/x86/Kconfig:config OLPC
arch/x86/Kconfig: bool "One Laptop Per Child support"
...meaning that it currently is not being built as a module by anyone.
Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.
We delete the MODULE_LICENSE tag etc. since all that information
was (or is now) contained at the top of the file in the comments.
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
The addition of VLAN support caused a possible use of uninitialized
data if we encounter a zero TCA_FLOWER_KEY_ETH_TYPE key, as pointed
out by "gcc -Wmaybe-uninitialized":
net/sched/cls_flower.c: In function 'fl_change':
net/sched/cls_flower.c:366:22: error: 'ethertype' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This changes the code to only set the ethertype field if it
was nonzero, as before the patch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 9399ae9a6c ("net_sched: flower: Add vlan support")
Cc: Hadar Hen Zion <hadarh@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The newly added reset logic uses helper functions for the MMIO that
may fail. However, when the read operation fails, we end up writing
back uninitialized data to the register, as gcc warns:
drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c: In function 'xgene_enet_link_state':
drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c:213:2: error: 'data' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c:209:6: note: 'data' was declared here
u32 data;
We already print a warning to the console log if that happens,
the best alternative that I can see is skip the rest of the reset
sequence if the register value cannot be read: Most likely the
write would fail as well, and if it succeeded, worse things could
happen.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 3eb7cb9dc9 ("drivers: net: xgene: XFI PCS reset when link is down")
Cc: Fushen Chen <fchen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enhances ethtool link mode bitmap to include
missing interface modes for 1G/10G speeds
Changes:
1000baseX is the mode introduced to cover all 1G Fiber cases.
All modes under 1000BaseX i.e. 1000BASE-SX, 1000BASE-LX, 1000BASE-LX10
and 1000BASE-BX10 are not explicitly defined at this moment.
10G CR,SR,LR and ER link modes are included for 10G speed..
Issue:
ethtool on 1G/10G SFP port reports Base-T
as this port supports 1000baseX,10G CR, SR and LR modes.
root@tor-02$ ethtool swp1
Settings for swp1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Current message level: 0x00000000 (0)
Link detected: yes
After fix:
root@tor-02$ ethtool swp1
Settings for swp1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseX/Full
10000baseCR/Full
10000baseSR/Full
10000baseLR/Full
10000baseER/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Current message level: 0x00000000 (0)
Link detected: yes
Signed-off-by: Vidya Sagar Ravipati <vidya@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After resume from hibernate on arm64, any amd-xgbe devices that were
running when we hibernated are reported as down, even when it is not.
Re-plugging the cables does not cause the interface to come back, the
link must be marked as down then up via 'ip set link' using the serial
console.
This happens because the device has been power-cycled and possibly
re-initialised by firmware, whereas the driver's memory structures have
been restored from the hibernate image and the two do not agree.
Schedule a restart of the device after powerup in case the world changed
while we were asleep.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When TCP operates in lossy environments (between 1 and 10 % packet
losses), many SACK blocks can be exchanged, and I noticed we could
drop them on busy senders, if these SACK blocks have to be queued
into the socket backlog.
While the main cause is the poor performance of RACK/SACK processing,
we can try to avoid these drops of valuable information that can lead to
spurious timeouts and retransmits.
Cause of the drops is the skb->truesize overestimation caused by :
- drivers allocating ~2048 (or more) bytes as a fragment to hold an
Ethernet frame.
- various pskb_may_pull() calls bringing the headers into skb->head
might have pulled all the frame content, but skb->truesize could
not be lowered, as the stack has no idea of each fragment truesize.
The backlog drops are also more visible on bidirectional flows, since
their sk_rmem_alloc can be quite big.
Let's add some room for the backlog, as only the socket owner
can selectively take action to lower memory needs, like collapsing
receive queues or partial ofo pruning.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b70661c708 ("net: smc91x: use run-time configuration on all ARM
machines") broke some ARM platforms through several mistakes. Firstly,
the access size must correspond to the following rule:
(a) at least one of 16-bit or 8-bit access size must be supported
(b) 32-bit accesses are optional, and may be enabled in addition to
the above.
Secondly, it provides no emulation of 16-bit accesses, instead blindly
making 16-bit accesses even when the platform specifies that only 8-bit
is supported.
Reorganise smc91x.h so we can make use of the existing 16-bit access
emulation already provided - if 16-bit accesses are supported, use
16-bit accesses directly, otherwise if 8-bit accesses are supported,
use the provided 16-bit access emulation. If neither, BUG(). This
exactly reflects the driver behaviour prior to the commit being fixed.
Since the conversion incorrectly cut down the available access sizes on
several platforms, we also need to go through every platform and fix up
the overly-restrictive access size: Arnd assumed that if a platform can
perform 32-bit, 16-bit and 8-bit accesses, then only a 32-bit access
size needed to be specified - not so, all available access sizes must
be specified.
This likely fixes some performance regressions in doing this: if a
platform does not support 8-bit accesses, 8-bit accesses have been
emulated by performing a 16-bit read-modify-write access.
Tested on the Intel Assabet/Neponset platform, which supports only 8-bit
accesses, which was broken by the original commit.
Fixes: b70661c708 ("net: smc91x: use run-time configuration on all ARM machines")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 83c0afaec7 ("net: dsa: Add new binding implementation"),
the shortcomings of the dsa platform device have been addressed, remove
that TODO item.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in dev_warn message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in dev_warn message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in dev_err message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert says:
====================
strp: Generalize stream parser to work with other socket types
Add a read_sock protocol operation function that allows something like
tcp_read_sock to be called for other protocol types.
Specific changes in this patch set:
- Add read_sock function to proto_ops. This has the same signature as
tcp_read_sock. sk_read_actor_t is also defined in net.h.
- Set peek_len and read_sock proto_op functions for TCPv4 and TCPv6
stream ops.
- Remove references to tcp in strparser.
- Call peek_len and read_sock operations from strparser instead of
calling TCP specific functions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
kcm and strparser need to work with any type of stream socket not just
TCP. Eliminate references to TCP and call generic proto_ops functions of
read_sock and peek_len. Also in strp_init check if the socket support
the proto_ops read_sock and peek_len.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In inet_stream_ops we set read_sock to tcp_read_sock and peek_len to
tcp_peek_len (which is just a stub function that calls tcp_inq).
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new function in proto_ops structure. This includes moving the
typedef got sk_read_actor into net.h and removing the definition from
tcp.h.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox 100G mlx5 fixes 2016-08-29
This series contains some bug fixes for the mlx5 core and mlx5
ethernet driver.
From Saeed, Fix UMR to consider hardware translation table field
size limitation when calculating the maximum number of MTTs required
by the driver. Three patches to speed-up netdevice close time by
serializing channel (SQs & RQs) destruction rather than issuing and
waiting for hardware interrupts to free them.
From Eran, Fix ethtool ring parameter reporting for striding RQ layout.
Add error prints on ETS validation failure.
From Kamal, Fix memory leak on error flow.
From Maor, Fix ethtool steering priorities number.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ethtool has 11 flow tables, each flow table has its own priority.
Increase the number of priorities to be aligned with the number of flow
tables.
Fixes: 1174fce8d1 ('net/mlx5e: Support l3/l4 flow type specs in ethtool flow steering')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon set ETS failure due to user invalid input, add error prints to
specify the exact error to the user.
Fixes: cdcf11212b ('net/mlx5e: Validate BW weight values of ETS')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free 'in' command object also when mlx5_core_modify_tir fails.
Fixes: 724b2aa151 ("net/mlx5e: TIRs management refactoring")
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>