It was possible to call hif_stop() twice through
the ath10k_htc_connect_init() timeout failure path,
which could lead to a double free_irq() kernel splat
in the multiple MSI interrupt case.
Re-order the init sequence to avoid this problem.
HTC stop shouldn't stop HIF implicitly since it
doesn't implicitly start it. Since the re-ordering
required some functions to be split, removed or
renamed, rename a few more functions to make more
sense while at it.
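As a rough userspace illustration of the ordering rule above (this is not
the driver code; every demo_* name is a hypothetical stand-in), the sketch
below keeps start and stop symmetric: the HTC stop never touches HIF, and
the failure path stops HIF exactly once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the HIF and HTC layers; not the ath10k API. */
struct demo_dev {
	bool hif_started;
	bool htc_started;
	int irq_free_calls;	/* counts simulated free_irq() calls */
};

static void demo_hif_start(struct demo_dev *d)
{
	d->hif_started = true;
}

static void demo_hif_stop(struct demo_dev *d)
{
	/* A second stop would correspond to a double free_irq(). */
	assert(d->hif_started);
	d->irq_free_calls++;
	d->hif_started = false;
}

/* HTC stop only undoes what HTC start did; it leaves HIF alone. */
static void demo_htc_stop(struct demo_dev *d)
{
	d->htc_started = false;
}

static int demo_core_start(struct demo_dev *d, bool htc_connect_times_out)
{
	demo_hif_start(d);
	d->htc_started = true;

	if (htc_connect_times_out)
		goto err_htc_stop;

	return 0;

err_htc_stop:
	/* Unwind in reverse order; each layer is stopped by its owner once. */
	demo_htc_stop(d);
	demo_hif_stop(d);
	return -1;
}
```

With this layout the timeout path ends with irq_free_calls equal to 1.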
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It is inefficient to grab irqsave spinlocks for
skb lists on each queue/dequeue action.
Using rx_ring.lock and tx_lock allows the use of
lighter bh spinlock functions, and moving the locking
upwards allows the spinlocks to be toggled less often.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Going through the full htc tx path for htt tx is a
waste of resources. By skipping it, it's possible
to easily submit scatter-gather to the pci hif for
reduced host cpu load and improved performance.
The new approach uses dma pool to store the
following metadata for each tx request:
* msdu fragment list
* htc header
* htt tx command
The htt tx command contains an msdu prefetch.
Instead of copying it, the original mapped msdu
address is used to submit a second scatter-gather
item to hif to make a complete htt tx command.
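A minimal sketch of how the two scatter-gather items could be laid out,
assuming a per-request metadata block as described above. The struct
layout, the sizes and all demo_* names are placeholders for illustration
and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder sizes; the real headers define the actual layouts. */
#define DEMO_FRAG_LIST_LEN	16
#define DEMO_HTC_HDR_LEN	8
#define DEMO_HTT_CMD_LEN	32

/* Per-request metadata, as it might sit in one dma pool element. */
struct demo_txbuf {
	uint8_t frags[DEMO_FRAG_LIST_LEN];	/* msdu fragment list */
	uint8_t htc_hdr[DEMO_HTC_HDR_LEN];	/* htc header */
	uint8_t htt_cmd[DEMO_HTT_CMD_LEN];	/* htt tx command */
};

struct demo_sg_item {
	uint32_t paddr;	/* device-visible address */
	uint16_t len;	/* bytes to transfer */
};

/*
 * Item 0 covers the htc header plus htt tx command inside the metadata
 * block. Item 1 points straight at the already mapped msdu, so the
 * prefetch bytes are not copied.
 */
static void demo_build_sg(struct demo_sg_item items[2], uint32_t txbuf_paddr,
			  uint32_t msdu_paddr, uint16_t msdu_len,
			  uint16_t prefetch_max)
{
	assert(items != NULL);

	items[0].paddr = txbuf_paddr + offsetof(struct demo_txbuf, htc_hdr);
	items[0].len = DEMO_HTC_HDR_LEN + DEMO_HTT_CMD_LEN;

	items[1].paddr = msdu_paddr;
	items[1].len = msdu_len < prefetch_max ? msdu_len : prefetch_max;
}
```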
The htt tx command itself hands over dma mapped
pointers to msdus and completion of the command
itself doesn't mean the frame has been sent and
can be unmapped/freed. This is why htc tx
completion is skipped for htt tx as all tx related
resources are freed upon htt tx completion
indication event (which also implicitly means htt
tx command itself was completed).
Since each htt tx request now effectively consists
of 2 copy engine items, CE_HTT_H2T_MSG_SRC_NENTRIES
is updated to allow a maximum of
TARGET_10X_NUM_MSDU_DESC msdus to be queued. This
keeps the tx path resource management simple.
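The sizing argument above can be sketched as simple arithmetic. The
descriptor count used here is a placeholder, not the value of
TARGET_10X_NUM_MSDU_DESC, and the sketch ignores any ring slots the copy
engine may reserve for itself.

```c
#include <assert.h>

/* Placeholder for illustration; not the driver's descriptor count. */
#define DEMO_NUM_MSDU_DESC		1000u
/* Each htt tx request uses one item for metadata and one for the msdu. */
#define DEMO_CE_ITEMS_PER_HTT_TX	2u

/* Smallest source ring that can hold every msdu descriptor at once. */
static unsigned int demo_min_src_nentries(unsigned int num_msdu_desc)
{
	return num_msdu_desc * DEMO_CE_ITEMS_PER_HTT_TX;
}

/* Nonzero if the ring never fills before the msdu descriptors run out. */
static int demo_ring_can_hold_all(unsigned int nentries,
				  unsigned int num_msdu_desc)
{
	return nentries >= demo_min_src_nentries(num_msdu_desc);
}
```

If the ring is at least this large, running out of msdu descriptors is the
only limit the tx path has to check.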
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
There's no real benefit from using them. DMA-API
already provides debugging. Some skbuffs are
already mapped directly with DMA-API since wrapper
arguments were insufficient and extending them
would be pointless.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
10.1.389 firmware calculates the number of
outstanding HTT TX completions differently. This
led to FW crashes on 10.1.389 while the main
firmware branch was unaffected.
The patch makes sure ath10k doesn't queue up more
MSDUs than it should.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This is still the only way to submit mgmt frames in case
of 10.X firmware.
This patch introduces a wmi_mgmt_tx queue because a
WMI command can block. This is a problem for
ath10k_tx_htt(), since it's called from atomic context.
The skb queue and worker are introduced to move the mgmt
frame handling out of the .tx callback context so it does
not block.
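The queue-and-worker split can be illustrated with a small userspace
sketch. It is single-threaded and uses plain integers in place of skbs;
the queue depth, the drop-when-full policy and all demo_* names are
assumptions made for the example, not the driver's behavior.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_PENDING 4	/* hypothetical queue depth */

struct demo_mgmt_queue {
	int frames[DEMO_MAX_PENDING];	/* frame ids stand in for skbs */
	unsigned int head;
	unsigned int count;
};

/* Records what the "blocking" send saw, so the order can be inspected. */
static int demo_sent_log[16];
static int demo_sent_count;

/* Stand-in for a WMI command that may sleep. */
static void demo_wmi_send_blocking(int frame_id)
{
	assert(demo_sent_count < 16);
	demo_sent_log[demo_sent_count++] = frame_id;
}

/* Called from the tx path: only enqueues, never waits. */
static bool demo_tx_enqueue(struct demo_mgmt_queue *q, int frame_id)
{
	if (q->count == DEMO_MAX_PENDING)
		return false;	/* queue full, caller must handle the drop */

	q->frames[(q->head + q->count) % DEMO_MAX_PENDING] = frame_id;
	q->count++;
	return true;
}

/* Worker body: drains the queue in FIFO order where blocking is allowed. */
static int demo_mgmt_work(struct demo_mgmt_queue *q)
{
	int sent = 0;

	while (q->count > 0) {
		int frame_id = q->frames[q->head];

		q->head = (q->head + 1) % DEMO_MAX_PENDING;
		q->count--;
		demo_wmi_send_blocking(frame_id);
		sent++;
	}
	return sent;
}
```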
Signed-off-by: Bartosz Markowski <bartosz.markowski@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Instead of allocating sk_buff for a mere 16-byte
tx fragment list buffer use headroom of the
original msdu sk_buff.
This decreases CPU cache pressure and improves
performance.
Measured improvement on AP135 is 560mbps ->
590mbps of UDP TX bridging traffic.
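The headroom idea can be sketched with a plain buffer that mimics push and
pull on a packet buffer. This is a simplified userspace model with
hypothetical demo_* names; it is not sk_buff and makes no claim about how
the driver reserves or restores the space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUF_SIZE		256
#define DEMO_FRAG_LIST_LEN	16	/* size of the tx fragment list */

struct demo_buf {
	uint8_t storage[DEMO_BUF_SIZE];
	size_t data_off;	/* start of valid data; bytes before it are headroom */
	size_t len;		/* length of valid data */
};

static void demo_buf_init(struct demo_buf *b, size_t headroom,
			  const void *payload, size_t len)
{
	assert(headroom + len <= DEMO_BUF_SIZE);
	b->data_off = headroom;
	b->len = len;
	memcpy(b->storage + headroom, payload, len);
}

/* Claim n bytes of headroom in front of the data; NULL if there is none. */
static uint8_t *demo_buf_push(struct demo_buf *b, size_t n)
{
	if (b->data_off < n)
		return NULL;
	b->data_off -= n;
	b->len += n;
	return b->storage + b->data_off;
}

/* Give the n leading bytes back, restoring the original payload view. */
static void demo_buf_pull(struct demo_buf *b, size_t n)
{
	assert(n <= b->len);
	b->data_off += n;
	b->len -= n;
}
```

The fragment list then lives in memory that was already allocated with the
frame, so no second allocation is needed.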
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Use a saner goto scheme for failure handling. Also
group operations more sensibly.
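For reference, the usual shape of such a goto scheme is sketched below in
userspace C. The resources and demo_* names are hypothetical; the point is
only that each label undoes one step and the labels run in reverse order
of setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical resources acquired in order during setup. */
struct demo_ctx {
	void *ring;
	void *pool;
	bool irq_requested;
};

static int demo_setup(struct demo_ctx *c, bool fail_irq)
{
	int ret;

	c->ring = malloc(64);
	if (!c->ring) {
		ret = -1;
		goto err;
	}

	c->pool = malloc(64);
	if (!c->pool) {
		ret = -1;
		goto err_free_ring;
	}

	if (fail_irq) {		/* simulated failure of the last step */
		ret = -2;
		goto err_free_pool;
	}
	c->irq_requested = true;

	return 0;

err_free_pool:
	free(c->pool);
	c->pool = NULL;
err_free_ring:
	free(c->ring);
	c->ring = NULL;
err:
	return ret;
}

/* Normal teardown mirrors the setup order in reverse. */
static void demo_teardown(struct demo_ctx *c)
{
	assert(c != NULL);
	c->irq_requested = false;
	free(c->pool);
	c->pool = NULL;
	free(c->ring);
	c->ring = NULL;
}
```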
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Until now all MSDU transfer related structures
were freed only when all resources were unreferenced.
Now the HTC transfer and the HTT transfer are each
freed independently.
This yields a much simpler ath10k_skb_cb and
should enable parallel pipe processing
(which is currently serialized in the
ath10k_pci_process_ce routine).
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This reduces the number of memory accesses and
hopefully contributes to better performance in the
future.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
New firmware comes with a new HTT protocol version.
In 3.0 the separate mgmt tx command has been
removed. All traffic is to be pushed through the data
tx (tx_frm) command with a twist - FW seems to not
be able (yet?) to access the tx fragment table, so for
management frames the frame pointer is passed
directly.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
HW supports L3/L4 tx checksum offloading.
This should reduce CPU load and improve
performance on slow host machines.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This reduces the number of allocations and simplifies
memory management.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This reduces the number of allocations and simplifies
memory management.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Here's a new mac80211 driver for Qualcomm Atheros 802.11ac QCA98xx devices.
A major difference from ath9k is that the device now runs
firmware, which is why we had to implement a new driver.
The wiki page for the driver is:
http://wireless.kernel.org/en/users/Drivers/ath10k
The driver has had many authors; they are listed here alphabetically:
Bartosz Markowski <bartosz.markowski@tieto.com>
Janusz Dziedzic <janusz.dziedzic@tieto.com>
Kalle Valo <kvalo@qca.qualcomm.com>
Marek Kwaczynski <marek.kwaczynski@tieto.com>
Marek Puzyniak <marek.puzyniak@tieto.com>
Michal Kazior <michal.kazior@tieto.com>
Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>