Update the dates for files that have been added to in 2012-2013.
Drop the 'Solarstorm' brand name that's still lingering here.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This adds support for the EF10 network controller architecture and the
SFC9100 family, starting with SFC9120 'Farmingdale', and bumps the
driver version to 4.0.
New features in the SFC9100 family include:
- Flexible allocation of internal resources to PCIe physical and virtual
functions under firmware control
- RX event merging to reduce DMA writes at high packet rates
- Integrated RX timestamping
- PIO buffers for lower TX latency
- Firmware-driven data path that supports additional offload features
and filter types
- Delivery of packets between functions and to multiple recipients,
allowing firmware to implement a vswitch
- Multiple RX flow hash (RSS) contexts with their own hash keys and
indirection tables
- 40G MAC (single port only)
...not all of which are enabled in this initial driver or the initial
firmware release.
Much of the new code is by Jon Cooper.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The TX path firmware for EF10 supports 'option descriptors' to control
offloads and various other features. Add a flag and field for these
in struct efx_tx_buffer, and don't treat them as DMA descriptors on
completion.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On EF10, the firmware will initiate a queue flush in certain
error cases. We need to accept that flush events might appear
at any time after a queue has been initialised, not just when
we try to flush them.
We can handle Falcon-architecture in just the same way.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
RX DMA scatter is always enabled on EF10. Adjust the common RX
completion handling to allow for this.
RX completion events on EF10 include the length used from a single
descriptor, not the cumulative length used. Add a field to struct
efx_rx_queue to hold the cumulative length.
[bwh: Also fix a related comment]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Define a flag for struct efx_rx_buffer and efx_rx_packet() that
indicates packet length must be read from the prefix. If this
is set, read the length in __efx_rx_packet() (when the prefix
should have arrived in cache).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Add a counter for TX merged completion events.
This is implemented in the common TX path, because the NIC event
handlers only know how many descriptors were completed, not how many
packets.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
EF10 uses an entirely different RX prefix format from Falcon-arch.
Extend struct efx_nic_type to describe this.
[bwh: Also replace the magic numbers used for the Falcon-arch RX prefix]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Various hardware statistics that are available for Siena are
unavailable or meaningless for Falcon. Huntington adds further to the
NIC-type-specific statistics, as it has different MAC blocks from
Falcon/Siena.
All NIC types still provide most statistics by DMA, and use
little-endian byte order.
Therefore:
1. Add some general utility functions for reporting hardware statistics,
efx_nic_describe_stats() and efx_nic_update_stats().
2. Add an efx_nic_type::describe_stats operation to get the number and
names of statistics, implemented using efx_nic_describe_stats()
3. Change efx_nic_type::update_stats to store the core statistics
(struct rtnl_link_stats64) or full statistics (array of u64) in a
caller-provided buffer. Use efx_nic_update_stats() to aid in the
implementation.
4. Rename struct efx_ethtool_stat to struct efx_sw_stat_desc and
EFX_ETHTOOL_NUM_STATS to EFX_ETHTOOL_SW_STAT_COUNT.
5. Remove efx_nic::mac_stats and struct efx_mac_stats.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Merge the per-NIC-type MTD probe selection and struct efx_mtd_ops into
struct efx_nic_type. Move the implementations into the appropriate
source files.
Several NVRAM functions are now only called from MTD operations which
are now implemented in the same file (falcon.c or mcdi.c). There is no
need for them to be extern, or to be defined at all if CONFIG_SFC_MTD
is not enabled, so move them into the #ifdef CONFIG_SFC_MTD sections
in those files.
Most of the SPI-related definitions are also only used in falcon.c,
so move them there. Put the remainder of spi.h into nic.h (which
previously included it).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Currently we use struct efx_mtd to represent a physical NVRAM device
and struct efx_mtd_partition to represent a partition on that device.
But this only really makes sense for Falcon, as we don't know or care
whether MC-managed NVRAM partitions are on one or more physical
devices. It complicates iteration and provides little benefit.
Therefore:
- Replace the pointer to efx_mtd in mtd_info::priv with a pointer to efx_nic
- Move the falcon_spi_device pointer into the union in struct efx_mtd_partition
- Move the device name to efx_mtd_partition::dev_type_name
- Move the efx_mtd_ops pointer to efx_nic::mtd_ops
- Make efx_nic::mtd_list a list of partitions
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On Falcon we implement MAC filtering requested by the stack using the
MAC wrapper's single unicast filter and multicast hash filter. Siena
is very similar, though MAC configuration is mediated by the MC.
Since MCDI operations may sleep, reconfiguration is deferred from
ndo_set_rx_mode to a work item. However, it still updates the private
variables describing the filter state synchronously. Contrary to
comments, the later use of these variables is not protected using the
address lock, resulting in race conditions.
Move the state update to a new function
efx_farch_filter_sync_rx_mode() and make the Falcon-arch MAC
configuration functions call that, so that its use is consistently
serialised by the mac_lock.
Invert and rename the promiscuous flag to the more accurate
unicast_filter, and comment that both this and multicast_hash are
not used on EF10.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Aside from accelerated RFS, there is almost nothing that can be shared
between the filter table implementations for the Falcon architecture
and EF10.
Move the few shared functions into efx.c and rx.c and the rest into
farch.c. Introduce efx_nic_type operations for the implementation and
inline wrapper functions that call these.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Move the common state from struct efx_filter_state into struct efx_nic.
Rename struct efx_filter_state to efx_farch_filter_state and change
the type of efx_nic::filter_state to void *.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
EF10 functions don't have a fixed BAR size, and the minimum is not
large enough for all the queues we might want to allocate. We have to
find out the BAR size at run-time, and therefore phys_addr_channels
and mem_map_size cannot be defined per-NIC-type.
Change efx_nic_type::mem_map_size to a function pointer which is
called to find the wanted memory map size (before probe).
Replace efx_nic_type::phys_addr_channels with efx_nic::max_channels,
to be initialised by the probe function.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
MCDI v2 adds a second header dword with wider command and length
fields. It also defines extra error codes.
Change the fallback error number for unknown MCDI error codes from EIO
to EPROTO. EIO is treated as indicating the MCDI transport has failed
and we need to reset the function, which is rather drastic.
v2 error codes and lengths don't fit into completion events, so for a
v2-capable transport, always read the response header rather then
using the event fields.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Add efx_nic_type operations for the many efx_nic functions that need
to be implemented different on EF10. For now, change most of the
existing efx_nic_*() functions into inline wrappers. As a later step,
we may be able to improve branch prediction for operations used on the
fast path by copying the pointers into each queue/channel structure.
Move the Falcon/Siena implementations to new file farch.c and rename
the functions and static data to use a prefix of 'efx_farch_'.
Move efx_may_push_tx_desc() to nic.h, as the EF10 TX code will also
use it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Currently efx_stop_datapath() will try to flush our DMA queues (if DMA
is enabled), then finalise software and hardware state for each queue.
However, for EF10 we must ask the MC to finalise each queue, which
implicitly starts flushing it, and then wait for the flush events.
We therefore need to delegate more of this to the NIC type.
Combine all the hardware operations into a new NIC-type operation
efx_nic_type::fini_dmaq, and call this before tearing down the
software state and buffers for all the DMA queues.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
rx_queue::enabled guards refill, so rename it to reflect that. Clear
it at the start of the queue teardown process rather than waiting for
the RX queue to be flushed.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
There are many problems with the current efx_stop_interrupts() and
efx_start_interrupts():
1. On Siena, it is unsafe to disable the master IRQ enable bit
(DRV_INT_EN_KER) while any IRQ sources are enabled.
2. On EF10 there is no master IRQ enable bit, so we cannot expect to
defer IRQs without tearing down event queues. (Though I don't think
we will need to keep any event queues around while the device is down,
as we do for VFDI on Siena.)
3. synchronize_irq() only waits for a running IRQ handler to finish,
not for any propagation through IRQ controllers. Therefore an IRQ may
still be received and handled after efx_stop_interrupts() returns.
IRQ handlers can then race with channel reallocation.
To fix this:
a. Introduce a software IRQ enable flag. So long as this is clear,
IRQ handlers will only acknowledge IRQs and not touch the channel
structures.
b. Define a new struct efx_msi_context as the context for MSIs. This
is never reallocated and is sufficient to find the software enable
flag and the channel structure. It also includes the channel/IRQ
name, which was previously separated out as it must also not be
reallocated.
c. Split efx_{start,stop}_interrupts() into
efx_{,soft_}_{enable,disable}_interrupts(). The 'soft' functions
don't touch the hardware master enable flag (if it exists) and don't
reinitialise or tear down channels with the keep_eventq flag set.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
efx_process_channel_now() is unneeded since self-tests can rely on
normal NAPI polling. Remove it and all calls to it.
efx_channel::work_pending and efx_channel_processed() are also
unneeded (the latter being the same as efx_nic_eventq_read_ack()).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On EF10, the firmware is in charge of allocating buffer table entries.
Change struct efx_special_buffer to use a struct efx_buffer member,
so that it can be used with efx_nic_{alloc,free}_buffer() in that
case.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Move the lowest layer (transport) of the current MCDI code to
per-NIC-type operations.
Introduce a new structure and efx_nic member for MCDI-specific data.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
PCI legacy interrupts are level-triggered, and we cannot mask them up
on an isolated device. Instead, disable the IRQ at the controller
until we have recovered.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We should not use net_device::dev_id to indicate the port number, as
this affects the way the local part of IPv6 addresses is normally
generated.
This field was intended for use where multiple devices may share a
single assigned MAC address and need to have different IPv6 addresses.
Siena's two ports each have their own MAC addresses.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
efx_start_datapath() asserts that we can fit 2 RX scatter buffers plus
a software structure, each appropriately aligned, into a single page.
Where L1_CACHE_BYTES == 256 and PAGE_SIZE == 4096, which is the case
on s390, this assertion fails.
The current scatter buffer size is also not a multiple of 64 or 128,
which are more common cache line sizes. If we can make both the start
and end of a scatter buffer cache-aligned, this will reduce the need
for read-modify-write operations on inter- processor links.
Fix the alignment by reducing EFX_RX_USR_BUF_SIZE to 2048 - 256 ==
1792. (We could use 2048 - L1_CACHE_BYTES, but EFX_RX_USR_BUF_SIZE
also affects user-level networking where a larger amount of
housekeeping data may be needed. Although this version of the driver
does not support user-level networking, I prefer to keep scattering
behaviour consistent with the out-of-tree version.)
This still doesn't fix the s390 build because like most architectures
it has NET_IP_ALIGN == 2. When NET_IP_ALIGN != 0 we cannot achieve
cache line alignment at either the start or end of a scatter buffer,
so there is actually no point in padding the buffers to a multiple of
the cache line size. All we need is 4-byte alignment of the network
header, so do that.
Adjust the assertions accordingly.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The two architectures that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
(powerpc and x86) now both define NET_IP_ALIGN as 0, so there is no
need for this optimisation any more.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocating 2 buffers per page is insanely inefficient when MTU is 1500
and PAGE_SIZE is 64K (as it usually is on POWER). Allocate as many as
we can fit, and choose the refill batch size at run-time so that we
still always use a whole page at once.
[bwh: Fix loop condition to allow for compound pages; rebase]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On POWER systems, DMA mapping/unmapping operations are very expensive.
These changes reduce these costs by trying to reuse DMA mapped pages.
After all the buffers associated with a page have been processed and
passed up, the page is placed into a ring (if there is room). For
each page that is required for a refill operation, a page in the ring
is examined to determine if its page count has fallen to 1, ie. the
kernel has released its reference to these packets. If this is the
case, the page can be immediately added back into the RX descriptor
ring, without having to re-map it for DMA.
If the kernel is still holding a reference to this page, it is removed
from the ring and unmapped for DMA. Then a new page, which can
immediately be used by RX buffers in the descriptor ring, is allocated
and DMA mapped.
The time a page needs to spend in the recycle ring before the kernel
has released its page references is based on the number of buffers
that use this page. As large pages can hold more RX buffers, the RX
recycle ring can be shorter. This reduces memory usage on POWER
systems, while maintaining the performance gain achieved by recycling
pages, following the driver change to pack more than two RX buffers
into large pages.
When an IOMMU is not present, the recycle ring can be small to reduce
memory usage, since DMA mapping operations are inexpensive.
With a small recycle ring, attempting to refill the descriptor queue
with more buffers than the equivalent size of the recycle ring could
ultimately lead to memory leaks if page entries in the recycle ring
were overwritten. To prevent this, the check to see if the recycle
ring is full is changed to check if the next entry to be written is
NULL.
[bwh: Combine and rebase several commits so this is complete
before the following buffer-packing changes. Remove module
parameter.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Enable RX DMA scattering iff an RX buffer large enough for the current
MTU will not fit into a single page and the NIC supports DMA
scattering for kernel-mode RX queues.
On Falcon and Siena, the RX_USR_BUF_SIZE field is used as the DMA
limit for both all RX queues with scatter enabled. Set it to 1824,
matching what Onload uses now.
Maintain a statistic for frames truncated due to lack of descriptors
(rx_nodesc_trunc). This is distinct from rx_frm_trunc which may be
incremented when scattering is disabled and implies an over-length
frame.
Whenever an MTU change causes scattering to be turned on or off,
update filters that point to the PF queues, but leave others
unchanged, as VF drivers assume scattering is off.
Add n_frags parameters to various functions, and make them iterate:
- efx_rx_packet()
- efx_recycle_rx_buffers()
- efx_rx_mk_skb()
- efx_rx_deliver()
Make efx_handle_rx_event() responsible for updating
efx_rx_queue::removed_count.
Change the RX pipeline state to a starting ring index and number of
fragments, and make __efx_rx_packet() responsible for clearing it.
Based on earlier versions by David Riddoch and Jon Cooper.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Adjust rx_buf->page_offset when we eat the RX hash prefix. Remove
efx_rx_buf_offset(), which is now redundant.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The Linux side of EEH is triggered by MMIO reads, but this
driver's data path does not issue any MMIO reads (except in
legacy interrupt mode). Therefore add a monitor function
to poll EEH periodically.
When preparing to reset the device based on our own error
detection, also poll EEH and defer to its recovery mechanism
if appropriate.
[bwh: Use a separate condition for the initial link poll; fix some
style errors]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Remove more dead code, and make efx_ptp_rx() pull the data it
needs into the header area.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Instead of having efx_ptp_rx() call netif_receive_skb() for an invalid
PTP packet, make it return false for rejected packets and have
efx_rx_deliver() pass them up.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We assume that the mapping between DMA and virtual addresses is done
on whole pages, so we can find the page offset of an RX buffer using
the lower bits of the DMA address. However, swiotlb maps in units of
2K, breaking this assumption.
Add an explicit page_offset field to struct efx_rx_buffer.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We sometimes hit a "failed to flush" timeout on some TX queues, but the
flushes have completed and the flush completion events seem to go missing.
In this case, we can check the TX_DESC_PTR_TBL register and drain the
queues if the flushes had finished.
[bwh: Minor fixes to coding style]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Receiving pause frames can block TX queue flushes. Earlier changes
work around this by reconfiguring the MAC during flushes for VFs, but
during flushes for the PF we would only change the fc_disable counter.
Unless the MAC is reconfigured for some other reason during the flush
(which I would not expect to happen) this had no effect at all.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Where a PTP clock driver is associated with a net or PHY driver, it
should be enabled automatically whenever that driver is enabled.
Therefore:
- Make PTP clock drivers select rather than depending on PTP_1588_CLOCK
- Remove separate boolean options for PTP clock drivers that are built
as part of net driver modules. (This also fixes cases where the PTP
subsystem is wrongly forced to be built-in.)
- Set 'default y' for PTP clock drivers that depend on specific net
drivers but are built separately
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are now standard functions for dealing with little-endian bit
arrays, so use them instead of our own implementations.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>