mvneta_tx() dereferences skb to get skb->len too late,
as hardware might have completed the transmit and TX completion
could have freed the skb from another cpu.
Fixes: 71f6d1b31f ("net: mvneta: replace Tx timer with a real interrupt")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.
The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.
In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.
The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.
Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.
No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.
This fix needs to be applied to stable kernels starting with 3.10.
Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
In sky2_change_mtu setting B0_IMSK to 0 may be delayed due to PCI write posting
which could result in irqs being still active when synchronize_irq is called.
Since we are not prepared to handle any further irqs after synchronize_irq
(our resources are freed after that) force the write by a consecutive read from
the same register.
Similar situation in sky2_all_down: Here we disabled irqs by a write to B0_IMSK
but did not ensure that this write took place before synchronize_irq. Fix that
too.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of a spurious interrupt dont forget to reenable the interrupts that
have been masked by reading the interrupt source register.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Acked-by: Mirko Lindner <mlindner@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In pxa168_eth_open() the irqs are enabled before napi. This opens a tiny time
window in which the irq handler is processed, disables irqs but then is not able
to schedule the not yet activated napi, leaving irqs disabled forever (since
irqs are reenabled in napi poll function).
Fix this race by activating napi before irqs are activated.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
If sky2->tx_le = pci_alloc_consistent() or sky2->tx_ring = kcalloc() in
sky2_alloc_buffers() fails, sky2->rx_ring = kcalloc() will never be called.
In this error case handling, sky2_rx_clean() is called from within
sky2_free_buffers().
In sky2_rx_clean() we find the following:
...
memset(sky2->rx_le, 0, RX_LE_BYTES);
...
This results in a memset using a NULL pointer and will crash the system.
Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch to a random RSS key rather than a fixed one.
Using netdev_rss_key_fill helper also ensures that all ports share
a common key.
See also commit 960fb622f8.
Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is to ensure the net_device's dev.parent is set before we used it
in dma_zalloc_coherent() from init_hash_table().
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ATM, txq_reclaim will dequeue and free an skb for each tx desc released
by the hw that has TX_LAST_DESC set. However, in case of TSO, each
hw desc embedding the last part of a segment has TX_LAST_DESC set,
losing the one-to-one 'last skb frag'/'TX_LAST_DESC set' correspondance,
which causes data corruption.
Fix this by checking TX_ENABLE_INTERRUPT instead of TX_LAST_DESC, and
warn when trying to dequeue from an empty txq (which can be symptomatic
of releasing skbs prematurely).
Fixes: 3ae8f4e0b9 ('net: mv643xx_eth: Implement software TSO')
Reported-by: Slawomir Gajzner <slawomir.gajzner@gmail.com>
Reported-by: Julien D'Ascenzio <jdascenzio@yahoo.fr>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Ian Campbell <ijc@hellion.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
we are allocating memory using kzalloc for struct mvpp2_prs_entry,
but later when we are getting error we were just returning the error
value without releasing the memory.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use phy_print_status() to report a change in the PHY status.
The current message is not verbose enough, so this commit improves
it by using the generic status message.
After this change, the kernel reports PHY status down and up events as:
mvneta f1070000.ethernet eth0: Link is Down
mvneta f1070000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.
Signed-off-by: Varka Bhadram <varkab@cdac.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
With properly using libphy PHYs now, remove the in-driver PHY
mangling.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marvell Ethernet IP supports PHY negotiation driven by HW. This
fundamentally clashes with libphy (software) driven negotiation and
also cannot cope with quirky PHYs. Therefore, always disable any HW
negotiation features and properly use libphy's phy_device.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current libphy handling in pxa168_eth lacks proper phy_connect. Prepare
to fix this by first moving phy properties from platform_data to private
driver data.
Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If NO_DMA=y:
drivers/built-in.o: In function `rxq_deinit':
pxa168_eth.c:(.text+0x2a2f2e): undefined reference to `dma_free_coherent'
drivers/built-in.o: In function `txq_reclaim':
pxa168_eth.c:(.text+0x2a3044): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `txq_deinit':
pxa168_eth.c:(.text+0x2a310a): undefined reference to `dma_free_coherent'
drivers/built-in.o: In function `txq_init':
pxa168_eth.c:(.text+0x2a3226): undefined reference to `dma_alloc_coherent'
drivers/built-in.o: In function `rxq_init':
pxa168_eth.c:(.text+0x2a32d4): undefined reference to `dma_alloc_coherent'
drivers/built-in.o: In function `init_hash_table':
pxa168_eth.c:(.text+0x2a3354): undefined reference to `dma_alloc_coherent'
drivers/built-in.o: In function `rxq_refill':
pxa168_eth.c:(.text+0x2a345a): undefined reference to `dma_map_single'
drivers/built-in.o: In function `rxq_process':
pxa168_eth.c:(.text+0x2a39cc): undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `pxa168_eth_remove':
pxa168_eth.c:(.text+0x2a3b84): undefined reference to `dma_free_coherent'
drivers/built-in.o: In function `pxa168_eth_start_xmit':
pxa168_eth.c:(.text+0x2a3e8a): undefined reference to `dma_map_single'
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signedness bugs may occur when using signed char for bitops,
depending on if the highest bit is ever used.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PXA168_ETH need HAS_IOMEM, so depend on it, the related error (with
allmodconfig under um):
CC [M] drivers/net/ethernet/marvell/pxa168_eth.o
drivers/net/ethernet/marvell/pxa168_eth.c: In function ‘pxa168_eth_probe’:
drivers/net/ethernet/marvell/pxa168_eth.c:1605:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
iounmap(pep->base);
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a dependency to COMPILE_TEST so that the driver can be compiled for
test purposes.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Berlin SoCs have an Ethernet controller compatible with the pxa168.
Allow these SoCs to use the pxa168_eth driver.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch rework the way the MAC address is retrieved. The MAC address
can now, in addition to being random, be set in the device tree or
retrieved from the Ethernet controller MAC address registers. The
probing function will try to get a MAC address in the following order:
- From the device tree.
- From the Ethernet controller MAC address registers.
- Generate a random one.
This patch also adds a function to read the MAC address from the
Ethernet Controller registers.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When changing the MAC address, in addition to updating the dev_addr in
the net_device structure, this patch also update the MAC address
registers (high and low) of the Ethernet controller with the new MAC.
The address stored in these registers is used for IEEE 802.3x Ethernet
flow control, which is already enabled.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
IEEE 802.3x Ethernet flow control is disabled when bit (1 << 2) is set
in the port status register. Fix the flow control detection in the link
event handling function which was relying on the opposite assumption.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the device tree support to the pxa168_eth driver.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up a bit the pxa168_eth driver before adding the device tree
support.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the much more common pr_warn instead of pr_warning.
Other miscellanea:
o Typo fixes submiting/submitting
o Coalesce formats
o Realign arguments
o Add missing terminating '\n' to formats
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/marvell/mvneta.c: In function 'mvneta_skb_tx_csum':
drivers/net/ethernet/marvell/mvneta.c:1374:3: error: implicit declaration of function 'vlan_get_protocol' [-Werror=implicit-function-declaration]
__be16 l3_proto = vlan_get_protocol(skb);
^
Reporeted-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver doesn't appear to support vlan acceleration at
all. However, it does claim to support TSO and IP checksums
for vlan devices. Thus any configured vlan device would
end up passing down partial checksums or TSO frames.
The driver also uses the value from skb->protocol to
determine TSO and checksum offload information, but assumes
that skb->protocol holds the l3 protocol information.
As a result, vlan traffic with partial checksums or TSO
will fail those checks and TSO will not happen.
Fix this by using vlan_get_protocol() helper.
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miscellaneous
- Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT7PyAAAoJEFmIoMA60/r8kjQQALr/8oEfZoVcjgCb7waWOr25
hUTnrI6GBIAh/50hoBiPq0ouPCAKVv66+CUhuhFkLP7oJz+rMU0B9hfUvdLfmCpH
7ppaallkllT9nPFIr7h5RUWLXsoQyuHmCYmSrUCcnlT2LPgU0dN72YWElLisEM6Z
Pldg3933xyIQaCWviHjGEjWb7NvC+JY4pTkV5iyqGgU8Ale/eFYtLLSfdBEjIbGv
VDirYZmKELYeuncZPrTAsp4IENRMZn702wwDakMSODVMEWtJB5h4yrBawqQDlFP5
9ztIX6n9p9zkdVKbYZlx/Xwv6SYEnYXLxauVQMSO3Nck7Z10R5Ud+5uuCg/6mWH8
AQI4UV5bbJcg7zHgocTG9XLFLFPoPtD2JT6k6UT1LeUAiAOqcSzhRO+/qJBmJOWZ
Zv+EHXPlxBrl0zNifut6ZQrY17teuItVtmha70a/9W3PjnIx3KecqLcTwdTvDsOY
IAyH8WMZrBKpPpsczSmfE93i2Z1QRS91HEAOeSMxl/98dcDTdllYZS7spjoDll2f
xmpGDbpriLSCu2XsGHfTC9RbqA7CyuFlHggJSQDkT/5Esli0sCs7eweTuK3RVvPu
t6bUHK3yElb6x9qMZhb5q6l72wSMlGMishTdaxEHmqrEA8PtaIFodmVX2T/Zel5n
GHN6bysPqDItNR2v/3JX
=jJGu
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas:
"Part two of the PCI changes for v3.17:
- Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine)
It's a mechanical change that removes uses of the
DEFINE_PCI_DEVICE_TABLE macro. I waited until later in the merge
window to reduce conflicts, but it's possible you'll still see a few"
* tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
Pull networking fixes from David Miller:
"Several networking final fixes and tidies for the merge window:
1) Changes during the merge window unintentionally took away the
ability to build bluetooth modular, fix from Geert Uytterhoeven.
2) Several phy_node reference count bug fixes from Uwe Kleine-König.
3) Fix ucc_geth build failures, also from Uwe Kleine-König.
4) Fix klog false positivies when netlink messages go to network
taps, by properly resetting the network header. Fix from Daniel
Borkmann.
5) Sizing estimate of VF netlink messages is too small, from Jiri
Benc.
6) New APM X-Gene SoC ethernet driver, from Iyappan Subramanian.
7) VLAN untagging is erroneously dependent upon whether the VLAN
module is loaded or not, but there are generic dependencies that
matter wrt what can be expected as the SKB enters the stack.
Make the basic untagging generic code, and do it unconditionally.
From Vlad Yasevich.
8) xen-netfront only has so many slots in it's transmit queue so
linearize packets that have too many frags. From Zoltan Kiss.
9) Fix suspend/resume PHY handling in bcmgenet driver, from Florian
Fainelli"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (55 commits)
net: bcmgenet: correctly resume adapter from Wake-on-LAN
net: bcmgenet: update UMAC_CMD only when link is detected
net: bcmgenet: correctly suspend and resume PHY device
net: bcmgenet: request and enable main clock earlier
net: ethernet: myricom: myri10ge: myri10ge.c: Cleaning up missing null-terminate after strncpy call
xen-netfront: Fix handling packets on compound pages with skb_linearize
net: fec: Support phys probed from devicetree and fixed-link
smsc: replace WARN_ON() with WARN_ON_SMP()
xen-netback: Don't deschedule NAPI when carrier off
net: ethernet: qlogic: qlcnic: Remove duplicate object file from Makefile
wan: wanxl: Remove typedefs from struct names
m68k/atari: EtherNEC - ethernet support (ne)
net: ethernet: ti: cpmac.c: Cleaning up missing null-terminate after strncpy call
hdlc: Remove typedefs from struct names
airo_cs: Remove typedef local_info_t
atmel: Remove typedef atmel_priv_ioctl
com20020_cs: Remove typedef com20020_dev_t
ethernet: amd: Remove typedef local_info_t
net: Always untag vlan-tagged traffic on input.
drivers: net: Add APM X-Gene SoC ethernet driver support.
...
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to
meet kernel coding style guidelines. This issue was reported by checkpatch.
A simplified version of the semantic patch that makes this change is as
follows (http://coccinelle.lip6.fr/):
// <smpl>
@@
identifier i;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer z;
@@
- DEFINE_PCI_DEVICE_TABLE(i)
+ const struct pci_device_id i[]
= z;
// </smpl>
[bhelgaas: add semantic patch]
Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Remove the now unnecessary memset too.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If there is a "phy" handle the probe function returns with holding a
reference to that node. Make sure that in the fixed phy case there is
also held a reference to yield a consistant state.
Also add the corresponding of_node_put in the error path and the remove
function.
Fixes: 83895bedee ("net: mvneta: add support for fixed links")
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements the ->ndo_do_ioctl() operation so that the
PHY-related ioctl() calls can work from userspace, which allows
applications like mii-tool or mii-diag to do their job.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is similar to commit 4d12bc63ab ("net: mvneta: fix
operation in 10 Mbit/s mode"), but this time for the mvpp2 driver. The
driver was properly taking into account the 1 Gbit/s and 100 Mbit/s
speeds, but not the 10 Mbit/s, which was handled as 100
Mbit/s. However, the MVPP2_GMAC_CONFIG_MII_SPEED bit in the
MVPP2_GMAC_AUTONEG_CONFIG register must remain cleared to allow 10
Mbit/s operation. This commit therefore fixes 10 Mbit/s operation.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all the users of mvpp2_bm_bufs_free() have been fixed, we can safely
clean the function prototype.
The function is always called to release all the buffers in a BM pool, and
the number of buffers freed is not needed. Therefore, we change the return
to a void, and remove the "num" parameter. This is a cosmetic change, to
make the code slightly cleaner.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a call to mvpp2_bm_bufs_free(), the caller usually wants to know
if the function successfully freed the requested number. However, this
cannot be done by looking into the BM pool count, because the current
buffer count was updated by mvpp2_bm_bufs_free().
In fact, the current callers of mvpp2_bm_bufs_free() use it to release
all the buffers in the pool, so we can fix this by simply checking
if the pool is not empty.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the network interfaces that are not configured by the bootloader
(using e.g. tftp or ping) can detect the link status but are unable to
transmit data.
The network controller has a functionality that allows the hardware to
continuously poll the PHY and directly update the MAC configuration accordingly
(speed, duplex, etc.). However, this doesn't work well with phylib's
software-based polling and updating MAC configuration in the driver's callback.
This commit fixes this issue by:
1. Setting MVPP2_PHY_AN_STOP_SMI0_MASK in MVPP2_PHY_AN_CFG0_REG in
mvpp2_init(), which disables the harware polling feature.
2. Disabling MVPP2_GMAC_PCS_ENABLE_MASK bit in MVPP2_GMAC_CTRL_2_REG in
mvpp2_port_mii_set() for port types other than SGMII.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This bit was originally wrong, the correct value is BIT(1), so fix it.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return -ENODEV from the no ports enabled error handling
case instead of 0.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
The proper string for this license is "GPL v2", instead of "GPLv2".
This commit fixes that.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds a new network driver for the network controller in Marvell
Armada 375 SoC.
Given the controller is very different from the ones in the other Marvell
SoCs that use the mv643xx_eth (Kirkwood, Orion, Discovery) and mvneta
(Armada 370/38x/XP) drivers, a new driver is needed.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
[Ezequiel: coding style cleanup]
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit fixes the command value generated for CSUM calculation
when running in big endian mode. The Ethernet protocol ID for IP was
being unconditionally byte-swapped in the layer 3 protocol check (with
swab16), which caused the mvneta driver to not function correctly in
big endian mode. This patch byte-swaps the ID conditionally with
htons.
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Thomas Fitzsimmons <fitzsim@fitzsim.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Maggie Mae Roxas, the mvneta driver doesn't behave
properly in 10 Mbit/s mode. This is due to a misconfiguration of the
MVNETA_GMAC_AUTONEG_CONFIG register: bit MVNETA_GMAC_CONFIG_MII_SPEED
must be set for a 100 Mbit/s speed, but cleared for a 10 Mbit/s speed,
which the driver was not properly doing. This commit adjusts that by
setting the MVNETA_GMAC_CONFIG_MII_SPEED bit only in 100 Mbit/s mode,
and relying on the fact that all the speed related bits of this
register are cleared at the beginning of the mvneta_adjust_link()
function.
This problem exists since c5aff18204 ("net: mvneta: driver for
Marvell Armada 370/XP network unit") which is the commit that
introduced the mvneta driver in the kernel.
Cc: <stable@vger.kernel.org> # v3.8+
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Reported-by: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Cc: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added FUJITSU SIEMENS A8NE-FM to the list of 32bit DMA boards
>From Tomi O.:
After I added an entry to this MB into the skge.c
driver in order to enable the mentioned 64bit dma disable quirk,
the network data corruptions ended and everything is fine again.
Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The buffers for the TSO headers belong to a DMA coherent region which is
allocated at ndo_open() time, and released at ndo_stop() time.
Therefore, and contrary to the TSO payload descriptor buffers, the TSO header
buffers don't need to be unmapped. This commit adds a check to detect a
TSO header buffer and explicitly prevent the unmap.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After adding proper stop/wake thresholds, we can expect a queue to never
be full and drop the NETDEV_TX_BUSY return path. In any case, if the queue
cannot accommodate a TSO packet, the packet would be discarded.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently small MSS values may require too many TSO descriptors for
the default queue size. This commit prevents this situation by fixing
the maximum supported TSO number of segments to 100 and by setting a
minimum Tx queue size. The minimum Tx queue size is set so that at
least 2 worst-case skb can be accommodated.
In addition, the queue stop and wake thresholds values are adjusted
accordingly. The queue is stopped when there's room for only 1 worst-case
skb and waked when the number of descriptors is half that value.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit fixes the current dropped packet count by doing it properly,
increasing the count when a packet is discarded; i.e. the packet is not
processed and the driver returns NETDEV_TX_OK.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The buffers for the TSO headers belong to a DMA coherent region which is
allocated at ndo_open() time, and released at ndo_stop() time.
Therefore, and contrary to the TSO payload descriptor buffers, the TSO header
buffers don't need to be unmapped. This commit adds a check to detect a
TSO header buffer and explicitly prevent the unmap.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Tx descriptor release code currently calls dma_unmap_single() and
dev_kfree_skb_any() if the descriptor is associated with a non-NULL skb.
This is true only for the last fragment of the packet.
This is wrong, however, since every descriptor buffer is DMA mapped and needs
to be unmapped.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently small MSS values may require too many TSO descriptors for
the default queue size. This commit prevents this situation by fixing
the maximum supported TSO number of segments to 100 and by setting a
minimum Tx queue size. The minimum Tx queue size is set so that at
least 2 worst-case skb can be accommodated.
In addition, the queue stop and wake thresholds values are adjusted
accordingly. The queue is stopped when there's room for only 1 worst-case
skb and waked when the number of descriptors is half that value.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver has no need for a custom NAPI weigth. Use the default
one, which has the same value.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'weight' field is only used to pass the weigth to napi initialization
function. This commit removes the field, and instead uses a fixed value to
initialize the napi context.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit makes use of devm_kmalloc_array() for memory allocation and the
recently introduced devm_mdiobus_alloc() API to simplify driver's code.
While here, remove a redundant out of memory error message.
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver does not support multiple rx queues, and so it's a waste
of resources to have a default number larger than one (1).
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_prepare_mac_addr_change and eth_commit_mac_addr_change, instead
of manually checking and storing the MAC address, which makes the
code slightly more robust. This fixes the lack of valid MAC address check
in the driver's .ndo_set_mac_address hook.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit cleans-up mvneta_init(), which initializes the hardware
and allocates the rx/qx queues. The queue allocation is simplified
by using devm_kcalloc instead of kzalloc. The unused phy_addr parameter
is removed. While here, the 'hal' references in the comments are removed.
This commit makes no functionality change.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit checks the return code of mvneta_setup_txq() call
in mvneta_change_mtu(). Also, use the netdevice pointer directly
instead of dereferencing the port structure. While here, let's
fix a tiny comment typo.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A tiny clean-up to improve readability. This commit makes no functionality
change.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mv643xx_eth_adjust_link() is only used to call mv643xx_adjust_pscr().
This commit renames the latter to the former, and therefore removes the extra
and useless function.
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the TSO helper API has been introduced, this commit makes use
of it to add support for software TSO in this driver.
This feature allows to improve outbound throughput performance significantly.
Running iperf tests shows a 30% improvement, tested on a Kirkwood Openblocks
A6 board.
$ ethtool -K eth0 tso off
$ iperf -c 192.168.0.45 -t 3
------------------------------------------------------------
Client connecting to 192.168.0.45, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.159 port 46389 connected with 192.168.0.45 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 3.0 sec 217 MBytes 607 Mbits/sec
$ ethtool -K eth0 tso on
$ iperf -c 192.168.0.45 -t 3
------------------------------------------------------------
Client connecting to 192.168.0.45, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.0.159 port 46390 connected with 192.168.0.45 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 3.0 sec 336 MBytes 938 Mbits/sec
This commit is just an example of the usage of the TSO API, it works fine
but needs some more work. In particular, the descriptor unmapping path must
avoid unmapping the TSO headers.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using dma_map_single() instead of skb_frag_dma_map() allows to unmap
all the descriptors using dma_unmap_single(). This change allows
to introduce software TSO in a less intrusive way.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to ease the addition of new features, let's factorize the
feature list.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As specified in the datasheet, the driver can set the "L4Chk_Mode" flag
(bit 10) in the Tx descriptor command/status to specify that a frame is not
IP fragmented and that the controller is in charge of generating the TCP/IP
checksum. This must be used together with the "GL4chk" flag (bit 17).
These two flags allow to avoid setting the initial TCP checksum in the l4i_chk
field of the Tx descriptor, which is needed to support software TSO.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the code more readable by moving the initial checksum setup
and the command/status preparation to its own function.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the TSO helper API has been introduced, this commit makes use
of it to implement the TSO in this driver.
Using iperf to test and vmstat to check the CPU usage, shows a substantial
CPU usage drop when TSO is on (~15% vs. ~25%). HTTP-based tests performed
by Willy Tarreau have shown performance improvements.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rework mvneta_tx() so that the code that performs the final handling
before a sk_buff is transmitted is done only if the numbers of fragments
processed if positive.
This is preparation work to add the support for software TSO.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to ease the addition of new features, let's factorize the
feature list.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the introduction of of_phy_register_fixed_link(), this patch
introduces fixed link support in the mvneta driver, for Marvell Armada
370/XP SOCs.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: get rid of SET_ETHTOOL_OPS
Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone.
This does that.
Mostly done via coccinelle script:
@@
struct ethtool_ops *ops;
struct net_device *dev;
@@
- SET_ETHTOOL_OPS(dev, ops);
+ dev->ethtool_ops = ops;
Compile tested only, but I'd seriously wonder if this broke anything.
Suggested-by: Dave Miller <davem@davemloft.net>
Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following commit:
commit 9ec36cafe4
Author: Rob Herring <robh@kernel.org>
Date: Wed Apr 23 17:57:41 2014 -0500
of/irq: do irq resolution in platform_get_irq
changed platform_get_irq() which now returns EINVAL and EPROBE_DEFER,
in addition to ENXIO. If there's no interrupt for mvmdio, platform_get_irq()
returns EINVAL, but we currently check only for ENXIO.
Fix this by looking for a positive integer, which is the proper way of
validating a virtual interrupt number.
While at it, add a proper handling for the deferral probe case.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Reviewed-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5445eaf309 ('mvneta: Try to fix mvneta when compiled as
module') fixed the mvneta driver to make it work properly when loaded
as a module in SGMII configuration, which was tested successful by the
author on the Armada XP OpenBlocks AX3, which uses SGMII.
However, some other platforms, namely the Armada XP GP don't use
SGMII, but a QSGMII connection between the MAC and the PHY, and this
case was not supported by the mvneta driver, which was relying on
configuration put in place by the bootloader. While this works when
the mvneta driver is built-in (because clocks are not gated), it
breaks when mvneta is built as a module, because the clock is gated
(all configuration is lost) and then re-enabled when the mvneta driver
is loaded.
In order to support all of RGMII, SGMII and QSGMII, this commit
reworks how the PHY interface configuration is done, and simplifies
it: it removes the mvneta_port_sgmii_config() and
mvneta_gmac_rgmii_set() functions, which were strange because
mvneta_gmac_rgmii_set() was called in all cases, even for SGMII
configurations. Also, the mvneta_gmac_rgmii_set() function was taking
a boolean as argument, which was always true.
Instead, all the PHY interface configuration logic is moved into the
mvneta_port_power_up() function, in a much simpler 'switch' construct,
with four cases:
- QSGMII: the RGMIIEn bit, the PCSEn bit in GMAC_CTRL_2 are set, and
the SERDES is configured in QSGMII. Technically speaking,
configuring the SERDES of the first port would be sufficient, but
it is simpler to do it on all ports.
- SGMII: the RGMIIEn bit, the PCSEn bit in GMAC_CTRL_2 are set, and
the SERDES is configured as SGMII.
- RGMII: the RGMIIEn bit in GMAC_CTRL_2 is set. The PCSEn bit is kept
cleared, and no SERDES configuration is done, because RGMII is not
using SERDES lanes.
- other: an error is returned. For this reason, the
mvneta_port_power_up() now returns an int instead of nothing, and
the return value is checked by mvneta_probe().
This has been successfully tested on:
* Armada XP DB, which has two RGMII and two SGMII connections
* Armada XP GP, which uses QSGMII for its four interfaces
* Armada 370 Mirabox, which has two RGMII connections
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit e3a8786c10. While
this commit allows to use the mvneta driver as a module on some
configurations, it breaks other configurations even if mvneta is used
built-in.
This breakage is due to the fact that on some RGMII platforms, the PCS
bit has to be set, and on some other platforms, it has to be
cleared. At the moment, we lack informations to know exactly the
significance of this bit (the datasheet only says "enables PCS"), and
so we can't produce a patch that will work on all platforms at this
point. And since this change is breaking the network completely for
many users, it's much better to revert it for now. We'll come back
later with a proper fix that takes into account all platforms.
Basically:
* Armada XP GP is configured as RGMII-ID, and needs the PCS bit to be
set.
* Armada 370 Mirabox is configured as RGMII-ID, and needs the PCS bit
to be cleared.
And at the moment, we don't know how to make the distinction between
those two cases. One hint is that the Armada XP GP appears in fact to
be using a QSGMII connection with the PHY (Quad-SGMII), but
configuring it as SGMII doesn't work, while RGMII-ID works. This needs
more investigation, but in the mean time, let's unbreak the network
for all those users.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reported-by: Arnaud Ebalard <arno@natisbad.org>
Reported-by: Alexander Reuter <Alexander.Reuter@gmx.net>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=73401
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/marvell/mvneta.c
The mvneta.c conflict is a case of overlapping changes,
a conversion to devm_ioremap_resource() vs. a conversion
to netdev_alloc_pcpu_stats.
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta driver currently uses of_iomap(), which has two drawbacks:
it doesn't request the resource, and it isn't devm-style so some error
handling is needed.
This commit switches to use devm_ioremap_resource() instead, which
automatically requests the resource (so the I/O registers region shows
up properly in /proc/iomem), and also is devm-style, which allows to
get rid of some error handling to unmap the I/O registers region.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
orion_mdio_reset() does nothing useful and is optional for the MDIO bus
code, so let's just remove it.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta driver currently uses of_iomap(), which has two drawbacks:
it doesn't request the resource, and it isn't devm-style so some error
handling is needed.
This commit switches to use devm_ioremap_resource() instead, which
automatically requests the resource (so the I/O registers region shows
up properly in /proc/iomem), and also is devm-style, which allows to
get rid of some error handling to unmap the I/O registers region.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5445eaf309 ('mvneta: Try to fix mvneta when compiled as
module') fixed the mvneta driver to make it work properly when loaded
as a module in SGMII configuration, which was tested successful by the
author on the Armada XP OpenBlocks AX3, which uses SGMII.
However, it turns out that the Armada XP GP, which uses RGMII, is
affected by a similar problem: its SERDES configuration is lost when
mvneta is loaded as a module, because this configuration is set by the
bootloader, and then lost because the clock is gated by the clock
framework until the mvneta driver is loaded again and the clock is
re-enabled.
However, it turns out that for the RGMII case, setting the SERDES
configuration is not sufficient: the PCS enable bit in the
MVNETA_GMAC_CTRL_2 register must also be set, like in the SGMII
configuration.
Therefore, this commit reworks the SGMII/RGMII initialization: the
only difference between the two now is a different SERDES
configuration, all the rest is identical.
In detail, to achieve this, the commit:
* Renames MVNETA_SGMII_SERDES_CFG to MVNETA_SERDES_CFG because it is
not specific to SGMII, but also used on RGMII configurations.
* Adds a MVNETA_RGMII_SERDES_PROTO definition, that must be used as
the MVNETA_SERDES_CFG value in RGMII configurations.
* Removes the mvneta_gmac_rgmii_set() and mvneta_port_sgmii_config()
functions, and instead directly do the SGMII/RGMII configuration in
mvneta_port_up(), from where those functions where called. It is
worth mentioning that mvneta_gmac_rgmii_set() had an 'enable'
parameter that was always passed as '1', so it was pretty useless.
* Reworks the mvneta_port_up() function to set the MVNETA_SERDES_CFG
register to the appropriate value depending on the RGMII vs. SGMII
configuration. It also unconditionally set the PCS_ENABLE bit (was
already done for SGMII, but is now also needed for RGMII), and sets
the PORT_RGMII bit (which was already done for both SGMII and
RGMII).
This commit was successfully tested with mvneta compiled as a module,
on both the OpenBlocks AX3 (SGMII configuration) and the Armada XP GP
(RGMII configuration).
Reported-by: Steve McIntyre <steve@einval.com>
Cc: stable@vger.kernel.org # 3.11.x: 5445eaf309 mvneta: Try to fix mvneta when compiled as module
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bit 3 of the MVNETA_GMAC_CTRL_2 is actually used to enable the PCS,
not the PSC: there was a typo in the name of the define, which this
commit fixes.
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace dev_kfree_skb with dev_kfree_skb_any in sky2_xmit_frame that
can be called in hard irq and other contexts.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Replace dev_kfree_skb with dev_kfree_skb_any skge_xmit_free that can
be called in hard irq and other contexts, on the path that
handles dropped packets.
Replace dev_kfree_skb with dev_consume_skb_any in skge_tx_done that can
be called in hard irq and other contexts, on the path that handles
successfully transmitted skbs.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Replace dev_kfree_skb with dev_kfree_skb_any in mv643xx_eth_xmit and
txq_submit_skb that can be called in hard irq and other contexts,
on paths where the skbs are dropped.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Processing any incoming packets with a with a napi budget of 0
is incorrect driver behavior.
This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the bh safe variant with the hard irq safe variant.
We need a hard irq safe variant to deal with netpoll transmitting
packets from hard irq context, and we need it in most if not all of
the places using the bh safe variant.
Except on 32bit uni-processor the code is exactly the same so don't
bother with a bh variant, just have a hard irq safe variant that
everyone can use.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver reads the mac address from the device registers which would
need to have been programmed by the bootloader. This patch adds
the ability to pull the mac from devicetree via the pci device dt node.
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Cc: netdev@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Changes since v2:
- eliminated use of stack tmpaddr per feedback
Changes since v1:
- simplified based on feedback
- fixed formatting
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/bonding/bond_3ad.h
drivers/net/bonding/bond_main.c
Two minor conflicts in bonding, both of which were overlapping
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
With the introduction of the support for Armada 375 and Armada 38x,
the hidden Kconfig option MACH_ARMADA_370_XP is being renamed to
MACH_MVEBU_V7. Therefore, the dependency that was used for the mvneta
driver can no longer work. This commit replaces this dependency by a
dependency on PLAT_ORION, which is used similarly for the mv643xx_eth
driver.
In addition to this, it takes this opportunity to adjust the
description and help text to indicate that the driver can is also used
for Armada 38x. Note that Armada 375 cannot use this driver as it has
a completely different networking unit, which will require a separate
driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are many drivers calling alloc_percpu() to allocate pcpu stats
and then initializing ->syncp. So just introduce a helper function for them.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function return parameter is not used in mvneta_tx_done_gbe(),
where the function is called. This patch makes the function return
void.
Reviewed-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvneta_tx_done_gbe() return value and third parameter are no more
used. This patch changes the function prototype and removes a useless
variable where the function is called.
Reviewed-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
calling dma_map_single()/dma_unmap_single() is quite expensive compared
to copying a small packet. So let's copy short frames and keep the buffers
mapped. We set the limit to 256 bytes which seems to give good results both
on the XP-GP board and on the AX3/4.
The Rx small packet rate increased by 16.4% doing this, from 486kpps to
573kpps. It is worth noting that even the call to the function
dma_sync_single_range_for_cpu() is expensive (300 ns) although less
than dma_unmap_single(). Without it, the packet rate raises to 711kpps
(+24% more). Thus on systems where coherency from device to CPU is
guaranteed by a snoop control unit, this patch should provide even more
gains, and probably rx_copybreak could be increased.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of build_skb() to allocate frags on the RX path. When frag size
is lower than a page size, we can use netdev_alloc_frag(), and we fall back
to kmalloc() for larger sizes. The frag size is stored into the mvneta_port
struct. The alloc/free functions check the frag size to decide what alloc/
free method to use. MTU changes are safe because the MTU change function
stops the device and clears the queues before applying the change.
With this patch, I observed a reproducible 2% performance improvement on
HTTP-based benchmarks, and 5% on small packet RX rate.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the mvneta driver tries to prefetch the current Rx
descriptor during read. Tests have shown that prefetching the
next one instead increases general performance by about 1% on
HTTP traffic.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
At several places, we already know the value of the rx status but
we call functions which dereference the pointer again to get it
and don't need the descriptor for anything else. Simplify this
task by replacing the rx desc pointer by the status word itself.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make mvneta_rxq_fill() use mvneta_rx_refill() instead of using
duplicate code.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mvneta_txq_bufs_free() calls mvneta_tx_done_policy() with
a non-null cause to retrieve the pointer to the next queue to process.
There are useless tests on the return queue number and on the pointer,
all of which are well defined within a known limited set. This code
path is fast, although not critical. Removing 3 tests here that the
compiler could not optimize (verified) is always desirable.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now the mvneta driver doesn't handle Tx IRQ, and relies on two
mechanisms to flush Tx descriptors : a flush at the end of mvneta_tx()
and a timer. If a burst of packets is emitted faster than the device
can send them, then the queue is stopped until next wake-up of the
timer 10ms later. This causes jerky output traffic with bursts and
pauses, making it difficult to reach line rate with very few streams.
A test on UDP traffic shows that it's not possible to go beyond 134
Mbps / 12 kpps of outgoing traffic with 1500-bytes IP packets. Routed
traffic tends to observe pauses as well if the traffic is bursty,
making it even burstier after the wake-up.
It seems that this feature was inherited from the original driver but
nothing there mentions any reason for not using the interrupt instead,
which the chip supports.
Thus, this patch enables Tx interrupts and removes the timer. It does
the two at once because it's not really possible to make the two
mechanisms coexist, so a split patch doesn't make sense.
First tests performed on a Mirabox (Armada 370) show that less CPU
seems to be used when sending traffic. One reason might be that we now
call the mvneta_tx_done_gbe() with a mask indicating which queues have
been done instead of looping over all of them.
The same UDP test above now happily reaches 987 Mbps / 87.7 kpps.
Single-stream TCP traffic can now more easily reach line rate. HTTP
transfers of 1 MB objects over a single connection went from 730 to
840 Mbps. It is even possible to go significantly higher (>900 Mbps)
by tweaking tcp_tso_win_divisor.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Arnaud Ebalard <arno@natisbad.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marvell has not published the chip's datasheet yet, so it's very hard
to find the relevant bits to manipulate to change the IRQ behaviour.
Fortunately, these bits are described in the proprietary LSP patch set
which is publicly available here :
http://www.plugcomputer.org/downloads/mirabox/
So let's put them back in the driver in order to reduce the burden of
current and future maintenance.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a queue timeout is reported, we can oops because of some
schedules while the caller is atomic, as shown below :
mvneta d0070000.ethernet eth0: tx timeout
BUG: scheduling while atomic: bash/1528/0x00000100
Modules linked in: slhttp_ethdiv(C) [last unloaded: slhttp_ethdiv]
CPU: 2 PID: 1528 Comm: bash Tainted: G WC 3.13.0-rc4-mvebu-nf #180
[<c0011bd9>] (unwind_backtrace+0x1/0x98) from [<c000f1ab>] (show_stack+0xb/0xc)
[<c000f1ab>] (show_stack+0xb/0xc) from [<c02ad323>] (dump_stack+0x4f/0x64)
[<c02ad323>] (dump_stack+0x4f/0x64) from [<c02abe67>] (__schedule_bug+0x37/0x4c)
[<c02abe67>] (__schedule_bug+0x37/0x4c) from [<c02ae261>] (__schedule+0x325/0x3ec)
[<c02ae261>] (__schedule+0x325/0x3ec) from [<c02adb97>] (schedule_timeout+0xb7/0x118)
[<c02adb97>] (schedule_timeout+0xb7/0x118) from [<c0020a67>] (msleep+0xf/0x14)
[<c0020a67>] (msleep+0xf/0x14) from [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194)
[<c01dcbe5>] (mvneta_stop_dev+0x21/0x194) from [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24)
[<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24) from [<c024afc7>] (dev_watchdog+0x18b/0x1c4)
[<c024afc7>] (dev_watchdog+0x18b/0x1c4) from [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c)
[<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c) from [<c0020cad>] (run_timer_softirq+0x115/0x170)
[<c0020cad>] (run_timer_softirq+0x115/0x170) from [<c001ccb9>] (__do_softirq+0xbd/0x1a8)
[<c001ccb9>] (__do_softirq+0xbd/0x1a8) from [<c001cfad>] (irq_exit+0x61/0x98)
[<c001cfad>] (irq_exit+0x61/0x98) from [<c000d4bf>] (handle_IRQ+0x27/0x60)
[<c000d4bf>] (handle_IRQ+0x27/0x60) from [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8)
[<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8) from [<c000fba9>] (__irq_usr+0x49/0x60)
Ben Hutchings attempted to propose a better fix consisting in using a
scheduled work for this, but while it fixed this panic, it caused other
random freezes and panics proving that the reset sequence in the driver
is unreliable and that additional fixes should be investigated.
When sending multiple streams over a link limited to 100 Mbps, Tx timeouts
happen from time to time, and the driver correctly recovers only when the
function is disabled.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stats writers are mvneta_rx() and mvneta_tx(). They don't lock anything
when they update the stats, and as a result, it randomly happens that
the stats freeze on SMP if two updates happen during stats retrieval.
This is very easily reproducible by starting two HTTP servers and binding
each of them to a different CPU, then consulting /proc/net/dev in loops
during transfers, the interface should immediately lock up. This issue
also randomly happens upon link state changes during transfers, because
the stats are collected in this situation, but it takes more attempts to
reproduce it.
The comments in netdevice.h suggest using per_cpu stats instead to get
rid of this issue.
This patch implements this. It merges both rx_stats and tx_stats into
a single "stats" member with a single syncp. Both mvneta_rx() and
mvneta_rx() now only update the a single CPU's counters.
In turn, mvneta_get_stats64() does the summing by iterating over all CPUs
to get their respective stats.
With this change, stats are still correct and no more lockup is encountered.
Note that this bug was present since the first import of the mvneta
driver. It might make sense to backport it to some stable trees. If
so, it depends on "d33dc73 net: mvneta: increase the 64-bit rx/tx stats
out of the hot path".
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Better count packets and bytes in the stack and on 32 bit then
accumulate them at the end for once. This saves two memory writes
and two memory barriers per packet. The incoming packet rate was
increased by 4.7% on the Openblocks AX3 thanks to this.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.
This covers everything under drivers/net except for wireless, which
has been submitted separately.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On archs like S390 or um this driver cannot build nor work.
Make it depend on HAS_IOMEM to bypass build failures.
drivers/built-in.o: In function `orion_mdio_probe':
drivers/net/ethernet/marvell/mvmdio.c:228: undefined reference to `devm_ioremap'
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.c
ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.
qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.
Signed-off-by: David S. Miller <davem@davemloft.net>
This version corrects the whitespace issue.
orion_mdio_wait_ready uses wait_event_timeout to wait for the
SMI interrupt to fire. wait_event_timeout waits for between
"timeout - 1" and "timeout" jiffies. In this case a 1ms timeout
when HZ is 1000 results in a wait of 0 to 1 jiffies, causing
premature timeouts.
This fix ensures a minimum timeout of 2 jiffies, ensuring
wait_event_timeout will always wait at least 1 jiffie.
Issue reported by Nicolas Schichan.
Tested-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using phydev, it should be phy_start/phy_stop'ed properly. This
driver doesn't do that, so add the corresponding calls to port_start/
stop respectively.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of open-coding a PHY reset through the MII BMCR register, use
phy_init_hw() which does this for us and ensures that PHY device fixups
are also applied. We also remove a call to ethernet_phy_reset() which is
now unncessary since phy_attach() calls phy_attach_direct() which in
turns calls phy_init_hw().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of open-coding a PHY reset through the MII BMCR register, use
phy_init_hw() which does that for us and will also make sure that PHY
fixups are applied if required. We also remove a call to phy_reset()
due to the following sequence of calls in the driver:
phy_scan()
-> phy_connect()
-> phy_connect_direct()
-> phy_attach_direct()
-> phy_init_hw()
and we only have a call to phy_init() after phy_scan().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge 'net' into 'net-next' to get the AF_PACKET bug fix that
Daniel's direct transmit changes depend upon.
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code unmaps the DMA mapping created for rx skb_buff's by
using the data_size as the the mapping size. This is wrong since the
correct size to specify should match the size used to create the mapping.
This commit removes the following DMA_API_DEBUG warning:
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:887 check_unmap+0x3a8/0x860()
mvneta d0070000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eb80000] [map size=1600 bytes] [unmap size=66 bytes]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.21-01444-ga88ae13-dirty #92
[<c0013600>] (unwind_backtrace+0x0/0xf8) from [<c0010fb8>] (show_stack+0x10/0x14)
[<c0010fb8>] (show_stack+0x10/0x14) from [<c001afa0>] (warn_slowpath_common+0x48/0x68)
[<c001afa0>] (warn_slowpath_common+0x48/0x68) from [<c001b01c>] (warn_slowpath_fmt+0x30/0x40)
[<c001b01c>] (warn_slowpath_fmt+0x30/0x40) from [<c018d0fc>] (check_unmap+0x3a8/0x860)
[<c018d0fc>] (check_unmap+0x3a8/0x860) from [<c018d734>] (debug_dma_unmap_page+0x64/0x70)
[<c018d734>] (debug_dma_unmap_page+0x64/0x70) from [<c0233f78>] (mvneta_rx+0xec/0x468)
[<c0233f78>] (mvneta_rx+0xec/0x468) from [<c023436c>] (mvneta_poll+0x78/0x16c)
[<c023436c>] (mvneta_poll+0x78/0x16c) from [<c02db468>] (net_rx_action+0x94/0x160)
[<c02db468>] (net_rx_action+0x94/0x160) from [<c0021e68>] (__do_softirq+0xe8/0x1d0)
[<c0021e68>] (__do_softirq+0xe8/0x1d0) from [<c0021ff8>] (do_softirq+0x4c/0x58)
[<c0021ff8>] (do_softirq+0x4c/0x58) from [<c0022228>] (irq_exit+0x58/0x90)
[<c0022228>] (irq_exit+0x58/0x90) from [<c000e7c8>] (handle_IRQ+0x3c/0x94)
[<c000e7c8>] (handle_IRQ+0x3c/0x94) from [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4)
[<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) from [<c000dc20>] (__irq_svc+0x40/0x50)
Exception stack(0xc04f1f70 to 0xc04f1fb8)
1f60: c1fe46f8 00000000 00001d92 00001d92
1f80: c04f0000 c04f0000 c04f84a4 c03e081c c05220e7 00000001 c05220e7 c04f0000
1fa0: 00000000 c04f1fb8 c000eaf8 c004c048 60000113 ffffffff
[<c000dc20>] (__irq_svc+0x40/0x50) from [<c004c048>] (cpu_startup_entry+0x54/0x128)
[<c004c048>] (cpu_startup_entry+0x54/0x128) from [<c04c1a14>] (start_kernel+0x29c/0x2f0)
[<c04c1a14>] (start_kernel+0x29c/0x2f0) from [<00008074>] (0x8074)
---[ end trace d4955f6acd178110 ]---
Mapped at:
[<c018d600>] debug_dma_map_page+0x4c/0x11c
[<c0235d6c>] mvneta_setup_rxqs+0x398/0x598
[<c0236084>] mvneta_open+0x40/0x17c
[<c02dbbd4>] __dev_open+0x9c/0x100
[<c02dbe58>] __dev_change_flags+0x7c/0x134
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several files refer to an old address for the Free Software Foundation
in the file header comment. Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.
CC: Santosh Raspatur <santosh@chelsio.com>
CC: Dimitris Michailidis <dm@chelsio.com>
CC: Michael Chan <mchan@broadcom.com>
CC: Santiago Leon <santil@linux.vnet.ibm.com>
CC: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
CC: Olof Johansson <olof@lixom.net>
CC: Manish Chopra <manish.chopra@qlogic.com>
CC: Sony Chacko <sony.chacko@qlogic.com>
CC: Rajesh Borundia <rajesh.borundia@qlogic.com>
CC: Nicolas Pitre <nico@fluxnic.net>
CC: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"
1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.
We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.
In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.
2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.
3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.
5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.
6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.
7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.
8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.
9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.
11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.
12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.
13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.
14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.
15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.
16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.
18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.
19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.
20) Fix route leak in VTI tunnel transmit, from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...
We assume that "mp->phy" can be NULL a couple lines before the
dereference.
Fixes: 1cce16d37d ('net: mv643xx_eth: Add missing phy_addr_set in DT mode')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull core locking changes from Ingo Molnar:
"The biggest changes:
- add lockdep support for seqcount/seqlocks structures, this
unearthed both bugs and required extra annotation.
- move the various kernel locking primitives to the new
kernel/locking/ directory"
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
block: Use u64_stats_init() to initialize seqcounts
locking/lockdep: Mark __lockdep_count_forward_deps() as static
lockdep/proc: Fix lock-time avg computation
locking/doc: Update references to kernel/mutex.c
ipv6: Fix possible ipv6 seqlock deadlock
cpuset: Fix potential deadlock w/ set_mems_allowed
seqcount: Add lockdep functionality to seqcount/seqlock structures
net: Explicitly initialize u64_stats_sync structures for lockdep
locking: Move the percpu-rwsem code to kernel/locking/
locking: Move the lglocks code to kernel/locking/
locking: Move the rwsem code to kernel/locking/
locking: Move the rtmutex code to kernel/locking/
locking: Move the semaphore core to kernel/locking/
locking: Move the spinlock code to kernel/locking/
locking: Move the lockdep code to kernel/locking/
locking: Move the mutex code to kernel/locking/
hung_task debugging: Add tracepoint to report the hang
x86/locking/kconfig: Update paravirt spinlock Kconfig description
lockstat: Report avg wait and hold times
lockdep, x86/alternatives: Drop ancient lockdep fixup message
...
In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.
The u64_stats_sync structure, uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.
This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers. But the benefit of ensuring correctness makes
this worth while.
Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers thoughts on how to best merge this change along with
the seqcount lockdep enablement.
Feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Simon Horman <horms@verge.net.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.
However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.
Tested on Kirkwood.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checking if MAC address is valid using is_valid_ether_addr() is already done in
of_get_mac_address().
Signed-off-by: Luka Perkov <luka@openwrt.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amend the documentation in the mvmdio driver to note the fact
that it is now used by both the mvneta and mv643xx_eth drivers.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make only a single call to mutex_unlock in orion_mdio_write.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace manual poll of MVMDIO_SMI_READ_VALID with a call to
orion_mdio_wait_ready. This ensures a consistent timeout,
eliminates a busy loop, and allows for use of interrupts on
systems that support them.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amend orion_mdio_wait_ready so that the same timeout is used when
polling or using wait_event_timeout. Set the timeout to 1ms.
Replace udelay with usleep_range to avoid a busy loop, and set the
polling interval range as 45us to 55us, so that the first sleep
will be enough in almost all cases.
Generate the same log message at timeout when polling or using
wait_event_timeout.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
include/linux/netdevice.h
net/core/sock.c
Trivial merge issues.
Removal of "extern" for functions declaration in netdevice.h
at the same time "const" was added to an argument.
Two parallel line additions in net/core/sock.c
Signed-off-by: David S. Miller <davem@davemloft.net>
DT-based mv643xx_eth probes and creates platform_devices for the
port devices on its own. To allow fixups for ports based on the
device_node, we need to set .of_node of the corresponding device
with the correct node.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The periodic statistics timer gets started at port _probe() time, but
is stopped on _stop() only. In a modular environment, this can cause
the timer to access already deallocated memory, if the module is unloaded
without starting the eth device. To fix this, we add the timer right
before the port is started, instead of at _probe() time.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each port driver installs a periodic timer to update port statistics
by calling mib_counters_update. As mib_counters_update is also called
from non-timer context, we should not reschedule the timer there but
rather move it to timer-only context.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the memset/memcpy uses of 6 to ETH_ALEN
where appropriate.
Also convert some struct definitions and u8 array
declarations of [6] to ETH_ALEN.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In my patch c194992cbe ("skge: fix
broken driver") I didn't fix the skge bug correctly. The value of the
new mapping (not old) was passed to pci_unmap_single.
If we enable CONFIG_DMA_API_DEBUG, it results in this warning:
WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:986 check_sync+0x4c4/0x580()
skge 0000:02:07.0: DMA-API: device driver tries to sync DMA memory it has
not allocated [device address=0x000000023a0096c0] [size=1536 bytes]
This patch makes the skge driver pass the correct value to
pci_unmap_single and fixes the warning. It copies the old descriptor to
on-stack variable "ee" and unmaps it if mapping of the new descriptor
succeeded.
This patch should be backported to 3.11-stable.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch 136d8f377e broke the skge driver.
Note this part of the patch:
+ if (skge_rx_setup(skge, e, nskb, skge->rx_buf_size) < 0) {
+ dev_kfree_skb(nskb);
+ goto resubmit;
+ }
+
pci_unmap_single(skge->hw->pdev,
dma_unmap_addr(e, mapaddr),
dma_unmap_len(e, maplen),
PCI_DMA_FROMDEVICE);
skb = e->skb;
prefetch(skb->data);
- skge_rx_setup(skge, e, nskb, skge->rx_buf_size);
The function skge_rx_setup modifies e->skb to point to the new skb. Thus,
after this change, the new buffer, not the old, is returned to the
networking stack.
This bug is present in kernels 3.11, 3.11.1 and 3.12-rc1. The patch should
be queued for 3.11-stable.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Vasiliy Glazov <vascom2@gmail.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
net/bridge/br_multicast.c
net/ipv6/sit.c
The conflicts were minor:
1) sit.c changes overlap with change to ip_tunnel_xmit() signature.
2) br_multicast.c had an overlap between computing max_delay using
msecs_to_jiffies and turning MLDV2_MRC() into an inline function
with a name using lowercase instead of uppercase letters.
3) stmmac had two overlapping changes, one which conditionally allocated
and hooked up a dma_cfg based upon the presence of the pbl OF property,
and another one handling store-and-forward DMA made. The latter of
which should not go into the new of_find_property() basic block.
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements the ->ndo_do_ioctl() operation so that the
PHY-related ioctl() calls can work from userspace, which allows
applications like mii-tool or mii-diag to do their job.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit fixes a long-standing bug that has been reported by many
users: on some Armada 370 platforms, only the network interface that
has been used in U-Boot to tftp the kernel works properly in
Linux. The other network interfaces can see a 'link up', but are
unable to transmit data. The reports were generally made on the Armada
370-based Mirabox, but have also been given on the Armada 370-RD
board.
The network MAC in the Armada 370/XP (supported by the mvneta driver
in Linux) has a functionality that allows it to continuously poll the
PHY and directly update the MAC configuration accordingly (speed,
duplex, etc.). The very first versions of the driver submitted for
review were using this hardware mechanism, but due to this, the driver
was not integrated with the kernel phylib. Following reviews, the
driver was changed to use the phylib, and therefore a software based
polling. In software based polling, Linux regularly talks to the PHY
over the MDIO bus, and sees if the link status has changed. If it's
the case then the adjust_link() callback of the driver is called to
update the MAC configuration accordingly.
However, it turns out that the adjust_link() callback was not
configuring the hardware in a completely correct way: while it was
setting the speed and duplex bits correctly, it wasn't telling the
hardware to actually take into account those bits rather than what the
hardware-based PHY polling mechanism has concluded. So, in fact the
adjust_link() callback was basically a no-op.
However, the network happened to be working because on the network
interfaces used by U-Boot for tftp on Armada 370 platforms because the
hardware PHY polling was enabled by the bootloader, and left enabled
by Linux. However, the second network interface not used for tftp (or
both network interfaces if the kernel is loaded from USB, NAND or SD
card) didn't had the hardware PHY polling enabled.
This patch fixes this situation by:
(1) Making sure that the hardware PHY polling is disabled by clearing
the MVNETA_PHY_POLLING_ENABLE bit in the MVNETA_UNIT_CONTROL
register in the driver ->probe() function.
(2) Making sure that the duplex and speed selections made by the
adjust_link() callback are taken into account by clearing the
MVNETA_GMAC_AN_SPEED_EN and MVNETA_GMAC_AN_DUPLEX_EN bits in the
MVNETA_GMAC_AUTONEG_CONFIG register.
This patch has been tested on Armada 370 Mirabox, and now both network
interfaces are usable after boot.
[ Problem introduced by commit c5aff18 ("net: mvneta: driver for
Marvell Armada 370/XP network unit") ]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Jochen De Smet <jochen.armkernel@leahnim.org>
Cc: Peter Sanford <psanford@nearbuy.io>
Cc: Ethan Tuttle <ethan@ethantuttle.com>
Cc: Chény Yves-Gael <yves@cheny.fr>
Cc: Ryan Press <ryan@presslab.us>
Cc: Simon Guinot <simon.guinot@sequanux.org>
Cc: vdonnefort@lacie.com
Cc: stable@vger.kernel.org
Acked-by: Jason Cooper <jason@lakedaemon.net>
Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
Tested-by: Yves-Gael Cheny <yves@cheny.fr>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the wrapper function for retrieving the platform data instead of
accessing dev->platform_data directly. This is a cosmetic change
to make the code simpler and enhance the readability.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__GFP_ZERO is an uncommon flag and perhaps is better
not used. static inline dma_zalloc_coherent exists
so convert the uses of dma_alloc_coherent with __GFP_ZERO
to the more common kernel style with zalloc.
Remove memset from the static inline dma_zalloc_coherent
and add just one use of __GFP_ZERO instead.
Trivially reduces the size of the existing uses of
dma_zalloc_coherent.
Realign arguments as appropriate.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DMA sync should sync the whole receive buffer, not just
part of it. Fixes log messages dma_sync_check.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following is needed as well to fix warning/error about shifting a 32 bit
value 32 bits which occurs if building on 32 bit platform caused by conversion
to using dma_addr_t
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This old driver never checked for DMA mapping errors.
Causing splats with the new DMA mapping checks:
WARNING: at lib/dma-debug.c:937 check_unmap+0x47b/0x930()
skge 0000:01:09.0: DMA-API: device driver failed to check map
Add checks and unwind code.
Reported-by: poma <pomidorabelisima@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the mvneta driver is compiled as module, the clock is disabled before
it's loading. This will reset the registers values and all configuration
made by the bootloader.
This patch sets the "sgmii serdes configuration" register to a magical value
found in:
https://github.com/yellowback/ubuntu-precise-armadaxp/blob/master/arch/arm/mach-armadaxp/armada_xp_family/ctrlEnv/mvCtrlEnvLib.c
With this change, the interrupts are working/generated and ethernet is
working.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: David S. Miller <davem@davemloft.net>