OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Wei Liu	836fbaf459	xen-netback: use skb_is_gso in xenvif_start_xmit In `5bd076708` ("Xen-netback: Fix issue caused by using gso_type wrongly") we use skb_is_gso to determine if we need an extra slot to accommodate the SKB. There's similar error in interface.c. Change that to use skb_is_gso as well. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Annie Li <annie.li@oracle.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Paul Durrant <paul.durrant@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:36:32 -04:00
Annie Li	5bd0767086	Xen-netback: Fix issue caused by using gso_type wrongly Current netback uses gso_type to check whether the skb contains gso offload, and this is wrong. Gso_size is the right one to check gso existence, and gso_type is only used to check gso type. Some skbs contains nonzero gso_type and zero gso_size, current netback would treat these skbs as gso and create wrong response for this. This also causes ssh failure to domu from other server. V2: use skb_is_gso function as Paul Durrant suggested Signed-off-by: Annie Li <annie.li@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 21:57:50 -04:00
Zoltan Kiss	e9275f5e2d	xen-netback: Aggregate TX unmap operations Unmapping causes TLB flushing, therefore we should make it in the largest possible batches. However we shouldn't starve the guest for too long. So if the guest has space for at least two big packets and we don't have at least a quarter ring to unmap, delay it for at most 1 milisec. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:57:21 -05:00
Zoltan Kiss	093507885a	xen-netback: Timeout packets in RX path A malicious or buggy guest can leave its queue filled indefinitely, in which case qdisc start to queue packets for that VIF. If those packets came from an another guest, it can block its slots and prevent shutdown. To avoid that, we make sure the queue is drained in every 10 seconds. The QDisc queue in worst case takes 3 round to flush usually. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:57:15 -05:00
Zoltan Kiss	e3377f36ca	xen-netback: Handle guests with too many frags Xen network protocol had implicit dependency on MAX_SKB_FRAGS. Netback has to handle guests sending up to XEN_NETBK_LEGACY_SLOTS_MAX slots. To achieve that: - create a new skb - map the leftover slots to its frags (no linear buffer here!) - chain it to the previous through skb_shinfo(skb)->frag_list - map them - copy and coalesce the frags into a brand new one and send it to the stack - unmap the 2 old skb's pages It's also introduces new stat counters, which help determine how often the guest sends a packet with more than MAX_SKB_FRAGS frags. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:35 -05:00
Zoltan Kiss	1bb332af4c	xen-netback: Add stat counters for zerocopy These counters help determine how often the buffers had to be copied. Also they help find out if packets are leaked, as if "sent != success + fail", there are probably packets never freed up properly. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work properly and malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:35 -05:00
Zoltan Kiss	62bad3199a	xen-netback: Remove old TX grant copy definitons and fix indentations These became obsolete with grant mapping. I've left intentionally the indentations in this way, to improve readability of previous patches. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work properly and malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:35 -05:00
Zoltan Kiss	f53c3fe8da	xen-netback: Introduce TX grant mapping This patch introduces grant mapping on netback TX path. It replaces grant copy operations, ditching grant copy coalescing along the way. Another solution for copy coalescing is introduced in "xen-netback: Handle guests with too many frags", older guests and Windows can broke before that patch applies. There is a callback (xenvif_zerocopy_callback) from core stack to release the slots back to the guests when kfree_skb or skb_orphan_frags called. It feeds a separate dealloc thread, as scheduling NAPI instance from there is inefficient, therefore we can't do dealloc from the instance. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:35 -05:00
Zoltan Kiss	3e2234b314	xen-netback: Handle foreign mapped pages on the guest RX path RX path need to know if the SKB fragments are stored on pages from another domain. Logically this patch should be after introducing the grant mapping itself, as it makes sense only after that. But to keep bisectability, I moved it here. It shouldn't change any functionality here. xenvif_zerocopy_callback and ubuf_to_vif are just stubs here, they will be introduced properly later on. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:35 -05:00
Zoltan Kiss	121fa4b777	xen-netback: Minor refactoring of netback code This patch contains a few bits of refactoring before introducing the grant mapping changes: - introducing xenvif_tx_pending_slots_available(), as this is used several times, and will be used more often - rename the thread to vifX.Y-guest-rx, to signify it does RX work from the guest point of view Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:34 -05:00
Zoltan Kiss	8f13dd9612	xen-netback: Use skb->cb for pending_idx Storing the pending_idx at the first byte of the linear buffer never looked good, skb->cb is a more proper place for this. It also prevents the header to be directly grant copied there, and we don't have the pending_idx after we copied the header here, so it's time to change it. It also introduces helpers for the RX side Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-07 15:56:34 -05:00
Zoltan Kiss	9ab9831b4c	xen-netback: Fix Rx stall due to race condition The recent patch to fix receive side flow control (11b57f90257c1d6a91cee720151b69e0c2020cf6: xen-netback: stop vif thread spinning if frontend is unresponsive) solved the spinning thread problem, however caused an another one. The receive side can stall, if: - [THREAD] xenvif_rx_action sets rx_queue_stopped to true - [INTERRUPT] interrupt happens, and sets rx_event to true - [THREAD] then xenvif_kthread sets rx_event to false - [THREAD] rx_work_todo doesn't return true anymore Also, if interrupt sent but there is still no room in the ring, it take quite a long time until xenvif_rx_action realize it. This patch ditch that two variable, and rework rx_work_todo. If the thread finds it can't fit more skb's into the ring, it saves the last slot estimation into rx_last_skb_slots, otherwise it's kept as 0. Then rx_work_todo will check if: - there is something to send to the ring (like before) - there is space for the topmost packet in the queue I think that's more natural and optimal thing to test than two bool which are set somewhere else. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-05 16:24:08 -08:00
Paul Durrant	2721637c1c	xen-netback: use new skb_checksum_setup function Use skb_checksum_setup to set up partial checksum offsets rather then a private implementation. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-14 14:24:19 -08:00
Paul Durrant	11b57f9025	xen-netback: stop vif thread spinning if frontend is unresponsive The recent patch to improve guest receive side flow control (`ca2f09f2`) had a slight flaw in the wait condition for the vif thread in that any remaining skbs in the guest receive side netback internal queue would prevent the thread from sleeping. An unresponsive frontend can lead to a permanently non-empty internal queue and thus the thread will spin. In this case the thread should really sleep until the frontend becomes responsive again. This patch adds an extra flag to the vif which is set if the shared ring is full and cleared when skbs are drained into the shared ring. Thus, if the thread runs, finds the shared ring full and can make no progress the flag remains set. If the flag remains set then the thread will sleep, regardless of a non-empty queue, until the next event from the frontend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-09 23:05:46 -05:00
David S. Miller	56a4342dfe	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c net/ipv6/ip6_tunnel.c net/ipv6/ip6_vti.c ipv6 tunnel statistic bug fixes conflicting with consolidation into generic sw per-cpu net stats. qlogic conflict between queue counting bug fix and the addition of multiple MAC address support. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-06 17:37:45 -05:00
Josh Boyer	f35f76ee76	xen-netback: Include header for vmalloc Commit `ac3d5ac277` ("xen-netback: fix guest-receive-side array sizes") added calls to vmalloc and vfree in the interface.c file without including <linux/vmalloc.h>. This causes build failures if the -Werror=implicit-function-declaration flag is passed. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-05 20:34:36 -05:00
Paul Durrant	ac3d5ac277	xen-netback: fix guest-receive-side array sizes The sizes chosen for the metadata and grant_copy_op arrays on the guest receive size are wrong; - The meta array is needlessly twice the ring size, when we only ever consume a single array element per RX ring slot - The grant_copy_op array is way too small. It's sized based on a bogus assumption: that at most two copy ops will be used per ring slot. This may have been true at some point in the past but it's clear from looking at start_new_rx_buffer() that a new ring slot is only consumed if a frag would overflow the current slot (plus some other conditions) so the actual limit is MAX_SKB_FRAGS grant_copy_ops per ring slot. This patch fixes those two sizing issues and, because grant_copy_ops grows so much, it pulls it out into a separate chunk of vmalloc()ed memory. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-29 22:31:30 -05:00
Paul Durrant	b89587a7af	xen-netback: add gso_segs calculation netback already has code which parses IPv4 and v6 headers to set up checksum offsets and these are always applied to GSO packets being sent from frontends. It's therefore suboptimal that GSOs are being marked SKB_GSO_DODGY to defer the gso_segs calculation when netback already has all necessary information to hand to do the calculation. This patch adds that calculation. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 15:11:49 -05:00
Wei Yongjun	0c8d087c04	xen-netback: fix some error return code 'err' is overwrited to 0 after maybe_pull_tail() call, so the error code was not set if skb_partial_csum_set() call failed. Fix to return error -EPROTO from those error handling case instead of 0. Fixes: `d52eb0d46f` ('xen-netback: make sure skb linear area covers checksum field') Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 14:58:47 -05:00
David S. Miller	143c905494	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c drivers/net/macvtap.c Both minor merge hassles, simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-18 16:42:06 -05:00
Wei Yongjun	7022ef8b2a	xen-netback: fix fragments error handling in checksum_setup_ip() Fix to return -EPROTO error if fragments detected in checksum_setup_ip(). Fixes: `1431fb31ec` ('xen-netback: fix fragment detection in checksum setup') Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-17 16:18:18 -05:00
Paul Durrant	a3314f3d40	xen-netback: fix gso_prefix check There is a mistake in checking the gso_prefix mask when passing large packets to a guest. The wrong shift is applied to the bit - the raw skb gso type is used rather then the translated one. This leads to large packets being handed to the guest without the GSO metadata. This patch fixes the check. The mistake manifested as errors whilst running Microsoft HCK large packet offload tests between a pair of Windows 8 VMs. I have verified this patch fixes those errors. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-12 15:47:18 -05:00
Paul Durrant	d9601a36ff	xen-netback: napi: don't prematurely request a tx event This patch changes the RING_FINAL_CHECK_FOR_REQUESTS in xenvif_build_tx_gops to a check for RING_HAS_UNCONSUMED_REQUESTS as the former call has the side effect of advancing the ring event pointer and therefore inviting another interrupt from the frontend before the napi poll has actually finished, thereby defeating the point of napi. The event pointer is updated by RING_FINAL_CHECK_FOR_REQUESTS in xenvif_poll, the napi poll function, if the work done is less than the budget i.e. when actually transitioning back to interrupt mode. Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-12 13:35:38 -05:00
Paul Durrant	10574059ce	xen-netback: napi: fix abuse of budget netback seems to be somewhat confused about the napi budget parameter. The parameter is supposed to limit the number of skbs processed in each poll, but netback has this confused with grant operations. This patch fixes that, properly limiting the work done in each poll. Note that this limit makes sure we do not process any more data from the shared ring than we intend to pass back from the poll. This is important to prevent tx_queue potentially growing without bound. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-12 13:35:38 -05:00
Paul Durrant	d52eb0d46f	xen-netback: make sure skb linear area covers checksum field skb_partial_csum_set requires that the linear area of the skb covers the checksum field. The checksum setup code in netback was only doing that pullup in the case when the pseudo header checksum was being recalculated though. This patch makes that pullup unconditional. (I pullup the whole transport header just for simplicity; the requirement is only for the check field but in the case of UDP this is the last field in the header and in the case of TCP it's the last but one). The lack of pullup manifested as failures running Microsoft HCK network tests on a pair of Windows 8 VMs and it has been verified that this patch fixes the problem. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-11 16:46:24 -05:00
Paul Durrant	ca2f09f2b2	xen-netback: improve guest-receive-side flow control The way that flow control works without this patch is that, in start_xmit() the code uses xenvif_count_skb_slots() to predict how many slots xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek' counter which it then uses to determine if the shared ring has that amount of space available by checking whether 'req_prod' has passed that value. If the ring doesn't have space the tx queue is stopped. xenvif_gop_skb() will then consume slots and update 'req_cons' and issue responses, updating 'rsp_prod' as it goes. The frontend will consume those responses and post new requests, by updating req_prod. So, req_prod chases req_cons which chases rsp_prod, and can never exceed that value. Thus if xenvif_count_skb_slots() ever returns a number of slots greater than xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot possibly achieve (since it's limited by the 'real' req_cons) and, if this happens enough times, req_cons_peek gets more than a ring size ahead of req_cons and the tx queue then remains stopped forever waiting for an unachievable amount of space to become available in the ring. Having two routines trying to calculate the same value is always going to be fragile, so this patch does away with that. All we essentially need to do is make sure that we have 'enough stuff' on our internal queue without letting it build up uncontrollably. So start_xmit() makes a cheap optimistic check of how much space is needed for an skb and only turns the queue off if that is unachievable. net_rx_action() is the place where we could do with an accurate predicition but, since that has proven tricky to calculate, a cheap worse-case (but not too bad) estimate is all we really need since the only thing we must prevent is xenvif_gop_skb() consuming more slots than are available. Without this patch I can trivially stall netback permanently by just doing a large guest to guest file copy between two Windows Server 2008R2 VMs on a single host. Patch tested with frontends in: - Windows Server 2008R2 - CentOS 6.0 - Debian Squeeze - Debian Wheezy - SLES11 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Annie Li <annie.li@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:33:12 -05:00
David S. Miller	34f9f43710	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge 'net' into 'net-next' to get the AF_PACKET bug fix that Daniel's direct transmit changes depend upon. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 20:20:14 -05:00
Jeff Kirsher	adf8d3ff6e	drivers/net/*: Fix FSF address in file headers Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Paul Mackerras <paulus@samba.org> CC: Ian Campbell <ian.campbell@citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Rusty Russell <rusty@rustcorp.com.au> CC: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 12:37:55 -05:00
Paul Durrant	1431fb31ec	xen-netback: fix fragment detection in checksum setup The code to detect fragments in checksum_setup() was missing for IPv4 and too eager for IPv6. (It transpires that Windows seems to send IPv6 packets with a fragment header even if they are not a fragment - i.e. offset is zero, and M bit is not set). This patch also incorporates a fix to callers of maybe_pull_tail() where skb->network_header was being erroneously added to the length argument. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> cc: David Miller <davem@davemloft.net> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-05 20:31:40 -05:00
Paul Durrant	67fa36609f	xen-netback: clear vif->task on disconnect xenvif_start_xmit() relies on checking vif->task for NULL to determine whether the vif is ready to accept packets. The task thread is stopped in xenvif_disconnect() but task is not set to NULL. Thus, on a re-connect the check will give a false positive. Also since commit `ea732dff5c` (Handle backend state transitions in a more robust way) it should not be possible for xenvif_connect() to be called if the vif is already connected so change the check of vif->tx_irq to a BUG_ON() and also add a BUG_ON(vif->task). Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-03 11:49:30 -05:00
Andy Whitcroft	ae5e8127b7	xen-netback: include definition of csum_ipv6_magic We are now using csum_ipv6_magic, include the appropriate header. Avoids the following error: drivers/net/xen-netback/netback.c:1313:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration] tcph->check = ~csum_ipv6_magic(&ipv6h->saddr, Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-28 18:38:06 -05:00
David Vrabel	db739ef37f	xen-netback: stop the VIF thread before unbinding IRQs If the VIF thread is still running after unbinding the Tx and Rx IRQs in xenvif_disconnect(), the thread may attempt to raise an event which will BUG (as the irq is unbound). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-21 13:09:43 -05:00
David S. Miller	394efd19d5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be.h drivers/net/netconsole.c net/bridge/br_private.h Three mostly trivial conflicts. The net/bridge/br_private.h conflict was a function signature (argument addition) change overlapping with the extern removals from Joe Perches. In drivers/net/netconsole.c we had one change adjusting a printk message whilst another changed "printk(KERN_INFO" into "pr_info(". Lastly, the emulex change was a new inline function addition overlapping with Joe Perches's extern removals. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-04 13:48:30 -05:00
Wei Liu	059dfa6a93	xen-netback: use jiffies_64 value to calculate credit timeout time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jason Luan <jianhai.luan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-29 00:24:49 -04:00
Paul Durrant	82cada22a0	xen-netback: enable IPv6 TCP GSO to the guest This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate extra or prefix segments to pass the large packet to the frontend. New xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled to determine if the frontend is capable of handling such packets. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-17 15:35:17 -04:00
Paul Durrant	a946858768	xen-netback: handle IPv6 TCP GSO packets from the guest This patch adds a xenstore feature flag, festure-gso-tcpv6, to advertise that netback can handle IPv6 TCP GSO packets. It creates SKB_GSO_TCPV6 skbs if the frontend passes an extra segment with the new type XEN_NETIF_GSO_TYPE_TCPV6 added to netif.h. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-17 15:35:17 -04:00
Paul Durrant	7365bcfa32	xen-netback: Unconditionally set NETIF_F_RXCSUM There is no mechanism to insist that a guest always generates a packet with good checksum (at least for IPv4) so we must handle checksum offloading from the guest and hence should set NETIF_F_RXCSUM. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-17 15:35:16 -04:00
Paul Durrant	2eba61d55e	xen-netback: add support for IPv6 checksum offload from guest For performance of VM to VM traffic on a single host it is better to avoid calculation of TCP/UDP checksum in the sending frontend. To allow this this patch adds the code necessary to set up partial checksum for IPv6 packets and xenstore flag feature-ipv6-csum-offload to advertise that fact to frontends. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-17 15:35:16 -04:00
Paul Durrant	146c8a77d2	xen-netback: add support for IPv6 checksum offload to guest Check xenstore flag feature-ipv6-csum-offload to determine if a guest is happy to accept IPv6 packets with only partial checksum. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-17 15:35:14 -04:00
David Vrabel	dc62ccaccf	xen-netback: transition to CLOSED when removing a VIF If a guest is destroyed without transitioning its frontend to CLOSED, the domain becomes a zombie as netback was not grant unmapping the shared rings. When removing a VIF, transition the backend to CLOSED so the VIF is disconnected if necessary (which will unmap the shared rings etc). This fixes a regression introduced by `279f438e36` (xen-netback: Don't destroy the netdev until the vif is shut down). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Paul Durrant <Paul.Durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:15:51 -04:00
Paul Durrant	ea732dff5c	xen-netback: Handle backend state transitions in a more robust way When the frontend state changes netback now specifies its desired state to a new function, set_backend_state(), which transitions through any necessary intermediate states. This fixes an issue observed with some old Windows frontend drivers where they failed to transition through the Closing state and netback would not behave correctly. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:13:28 -04:00
Paul Durrant	279f438e36	xen-netback: Don't destroy the netdev until the vif is shut down Without this patch, if a frontend cycles through states Closing and Closed (which Windows frontends need to do) then the netdev will be destroyed and requires re-invocation of hotplug scripts to restore state before the frontend can move to Connected. Thus when udev is not in use the backend gets stuck in InitWait. With this patch, the netdev is left alone whilst the backend is still online and is only de-registered and freed just prior to destroying the vif (which is also nicely symmetrical with the netdev allocation and registration being done during probe) so no re-invocation of hotplug scripts is required. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-19 14:03:51 -04:00
David Vrabel	6e43fc04a6	xen-netback: count number required slots for an skb more carefully When a VM is providing an iSCSI target and the LUN is used by the backend domain, the generated skbs for direct I/O writes to the disk have large, multi-page skb->data but no frags. With some lengths and starting offsets, xen_netbk_count_skb_slots() would be one short because the simple calculation of DIV_ROUND_UP(skb_headlen(), PAGE_SIZE) was not accounting for the decisions made by start_new_rx_buffer() which does not guarantee responses are fully packed. For example, a skb with length < 2 pages but which spans 3 pages would be counted as requiring 2 slots but would actually use 3 slots. skb->data: \| 1111\|222222222222\|3333 \| Fully packed, this would need 2 slots: \|111122222222\|22223333 \| But because the 2nd page wholy fits into a slot it is not split across slots and goes into a slot of its own: \|1111 \|222222222222\|3333 \| Miscounting the number of slots means netback may push more responses than the number of available requests. This will cause the frontend to get very confused and report "Too many frags/slots". The frontend never recovers and will eventually BUG. Fix this by counting the number of required slots more carefully. In xen_netbk_count_skb_slots(), more closely follow the algorithm used by xen_netbk_gop_skb() by introducing xen_netbk_count_frag_slots() which is the dry-run equivalent of netbk_gop_frag_copy(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-12 23:22:13 -04:00
Kees Cook	a9677bc024	xen-netback: fix possible format string flaw This makes sure a format string cannot accidentally leak into the kthread_run() call. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-12 17:20:03 -04:00
Wei Liu	7376419a46	xen-netback: rename functions As we move to 1:1 model and melt xen_netbk and xenvif together, it would be better to use single prefix for all functions in xen-netback. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 01:18:04 -04:00
Wei Liu	b3f980bd82	xen-netback: switch to NAPI + kthread 1:1 model This patch implements 1:1 model netback. NAPI and kthread are utilized to do the weight-lifting job: - NAPI is used for guest side TX (host side RX) - kthread is used for guest side RX (host side TX) Xenvif and xen_netbk are made into one structure to reduce code size. This model provides better scheduling fairness among vifs. It is also prerequisite for implementing multiqueue for Xen netback. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 01:18:04 -04:00
Wei Liu	43e9d19432	xen-netback: remove page tracking facility The data flow from DomU to DomU on the same host in current copying scheme with tracking facility: copy DomU --------> Dom0 DomU \| ^ \|____________________________\| copy The page in Dom0 is a page with valid MFN. So we can always copy from page Dom0, thus removing the need for a tracking facility. copy copy DomU --------> Dom0 -------> DomU Simple iperf test shows no performance regression (obviously we copy twice either way): W/ tracking: ~5.3Gb/s W/o tracking: ~5.4Gb/s Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Matt Wilson <msw@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-29 01:18:04 -04:00
Wei Liu	8ef2c3bca5	xen-netback: xenbus.c: use more current logging styles Convert one printk to pr_<level>. Add a missing newline in several places to avoid message interleaving, coalesce formats, reflow modified lines to 80 columns. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 00:52:55 -07:00
Joe Perches	383eda32b8	xen: Use more current logging styles Instead of mixing printk and pr_<level> forms, just use pr_<level> Miscellaneous changes around these conversions: Add a missing newline to avoid message interleaving, coalesce formats, reflow modified lines to 80 columns. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 13:31:25 -07:00
Dan Carpenter	07cc61bfc0	xen-netback: double free on unload There is a typo here, "i" vs "j", so we would crash on module_exit(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 00:24:57 -07:00
David S. Miller	d98cae64e4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/wireless/ath/ath9k/Kconfig drivers/net/xen-netback/netback.c net/batman-adv/bat_iv_ogm.c net/wireless/nl80211.c The ath9k Kconfig conflict was a change of a Kconfig option name right next to the deletion of another option. The xen-netback conflict was overlapping changes involving the handling of the notify list in xen_netbk_rx_action(). Batman conflict resolution provided by Antonio Quartulli, basically keep everything in both conflict hunks. The nl80211 conflict is a little more involved. In 'net' we added a dynamic memory allocation to nl80211_dump_wiphy() to fix a race that Linus reported. Meanwhile in 'net-next' the handlers were converted to use pre and post doit handlers which use a flag to determine whether to hold the RTNL mutex around the operation. However, the dump handlers to not use this logic. Instead they have to explicitly do the locking. There were apparent bugs in the conversion of nl80211_dump_wiphy() in that we were not dropping the RTNL mutex in all the return paths, and it seems we very much should be doing so. So I fixed that whilst handling the overlapping changes. To simplify the initial returns, I take the RTNL mutex after we try to allocate 'tb'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 16:49:39 -07:00
Jan Beulich	94f950c406	xen-netback: don't de-reference vif pointer after having called xenvif_put() When putting vif-s on the rx notify list, calling xenvif_put() must be deferred until after the removal from the list and the issuing of the notification, as both operations dereference the pointer. Changing this got me to notice that the "irq" variable was effectively unused (and was of too narrow type anyway). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-13 01:25:24 -07:00
Wei Liu	e1f00a69ec	xen-netback: split event channels support for Xen backend driver Netback and netfront only use one event channel to do TX / RX notification, which may cause unnecessary wake-up of processing routines. This patch adds a new feature called feature-split-event-channels to netback, enabling it to handle TX and RX events separately. Netback will use tx_irq to notify guest for TX completion, rx_irq for RX notification. If frontend doesn't support this feature, tx_irq equals to rx_irq. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-23 18:40:37 -07:00
Wei Liu	b103f358d9	xen-netback: enable user to unload netback module This patch enables user to unload netback module, which is useful when user wants to upgrade to a newer netback module without rebooting the host. Netfront cannot handle netback removal event. As we cannot fix all possible frontends we add module get / put along with vif get / put to avoid mis-unloading of netback. To unload netback module, user needs to shutdown all VMs or migrate them to another host or unplug all vifs before hand. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>¬ Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-17 18:23:07 -07:00
Wei Liu	f1db320ec5	xen-netback: remove dead code The array mmap_pages is never touched in the initialization function. This is remnant of mapping mechanism, which does not exist upstream. In current upstream code this array only tracks usage of pages inside netback. Those pages are allocated when contructing a SKB and passed directly to network subsystem. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-17 18:23:07 -07:00
Wei Liu	376414945d	xen-netback: better names for thresholds This patch only changes some names to avoid confusion. In this patch we have: MAX_SKB_SLOTS_DEFAULT -> FATAL_SKB_SLOTS_DEFAULT max_skb_slots -> fatal_skb_slots #define XEN_NETBK_LEGACY_SLOTS_MAX XEN_NETIF_NR_SLOTS_MIN The fatal_skb_slots is the threshold to determine whether a packet is malicious. XEN_NETBK_LEGACY_SLOTS_MAX is the maximum slots a valid packet can have at this point. It is defined to be XEN_NETIF_NR_SLOTS_MIN because that's guaranteed to be supported by all backends. Suggested-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-02 16:50:08 -04:00
Wei Liu	59ccb4ebbc	xen-netback: avoid allocating variable size array on stack Tune xen_netbk_count_requests to not touch working array beyond limit, so that we can make working array size constant. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-02 16:50:08 -04:00
Wei Liu	ac69c26e7a	xen-netback: remove redundent parameter in netbk_count_requests Tracking down from the caller, first_idx is always equal to vif->tx.req_cons. Remove it to avoid confusion. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-02 16:50:08 -04:00
Wei Liu	03393fd5cc	xen-netback: don't disconnect frontend when seeing oversize packet Some frontend drivers are sending packets > 64 KiB in length. This length overflows the length field in the first slot making the following slots have an invalid length. Turn this error back into a non-fatal error by dropping the packet. To avoid having the following slots having fatal errors, consume all slots in the packet. This does not reopen the security hole in XSA-39 as if the packet as an invalid number of slots it will still hit fatal error case. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-22 15:37:01 -04:00
Wei Liu	2810e5b9a7	xen-netback: coalesce slots in TX path and fix regressions This patch tries to coalesce tx requests when constructing grant copy structures. It enables netback to deal with situation when frontend's MAX_SKB_FRAGS is larger than backend's MAX_SKB_FRAGS. With the help of coalescing, this patch tries to address two regressions avoid reopening the security hole in XSA-39. Regression 1. The reduction of the number of supported ring entries (slots) per packet (from 18 to 17). This regression has been around for some time but remains unnoticed until XSA-39 security fix. This is fixed by coalescing slots. Regression 2. The XSA-39 security fix turning "too many frags" errors from just dropping the packet to a fatal error and disabling the VIF. This is fixed by coalescing slots (handling 18 slots when backend's MAX_SKB_FRAGS is 17) which rules out false positive (using 18 slots is legit) and dropping packets using 19 to `max_skb_slots` slots. To avoid reopening security hole in XSA-39, frontend sending packet using more than max_skb_slots is considered malicious. The behavior of netback for packet is thus: 1-18 slots: valid 19-max_skb_slots slots: drop and respond with an error max_skb_slots+ slots: fatal error max_skb_slots is configurable by admin, default value is 20. Also change variable name from "frags" to "slots" in netbk_count_requests. Please note that RX path still has dependency on MAX_SKB_FRAGS. This will be fixed with separate patch. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-22 15:37:01 -04:00
Jason Wang	bea8933647	xen-netback: switch to use skb_partial_csum_set() Switch to use skb_partial_csum_set() to simplify the codes. Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-12 14:58:33 -04:00
stephen hemminger	9eaee8beee	xen-netback: fix sparse warning Fix warning about 0 used as NULL. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-10 23:23:42 -04:00
Jason Wang	40893fd0fd	net: switch to use skb_probe_transport_header() Switch to use the new help skb_probe_transport_header() to do the l4 header probing for untrusted sources. For packets with partial csum, the header should already been set by skb_partial_csum_set(). Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-27 12:48:31 -04:00
Jason Wang	f9ca8f7439	netback: set transport header before passing it to kernel Currently, for the packets receives from netback, before doing header check, kernel just reset the transport header in netif_receive_skb() which pretends non l4 header. This is suboptimal for precise packet length estimation (introduced in 1def9238: net_sched: more precise pkt_len computation) which needs correct l4 header for gso packets. The patch just reuse the header probed by netback for partial checksum packets and tries to use skb_flow_dissect() for other cases, if both fail, just pretend no l4 header. Cc: Eric Dumazet <edumazet@google.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-26 12:44:44 -04:00
Wei Liu	27f852282a	xen-netback: remove skb in xen_netbk_alloc_page This variable is never used. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-25 12:19:47 -04:00
David S. Miller	629821d9b0	Revert "xen: netback: remove redundant xenvif_put" This reverts commit `d37204566a`. This change is incorrect, as per Jan Beulich: ==================== But this is wrong from all we can tell, we discussed this before (Wei pointed to the discussion in an earlier reply). The core of it is that the put here parallels the one in netbk_tx_err(), and the one in xenvif_carrier_off() matches the get from xenvif_connect() (which normally would be done on the path coming through xenvif_disconnect()). ==================== And a previous discussion of this issue is at: http://marc.info/?l=xen-devel&m=136084174026977&w=2 Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-19 13:04:34 -05:00
Andrew Jones	d37204566a	xen: netback: remove redundant xenvif_put netbk_fatal_tx_err() calls xenvif_carrier_off(), which does a xenvif_put(). As callers of netbk_fatal_tx_err should only have one reference to the vif at this time, then the xenvif_put in netbk_fatal_tx_err is one too many. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-19 00:51:09 -05:00
David S. Miller	6338a53a2b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net into net Pull in 'net' to take in the bug fixes that didn't make it into 3.8-final. Also, deal with the semantic conflict of the change made to net/ipv6/xfrm6_policy.c A missing rt6->n neighbour release was added to 'net', but in 'net-next' we no longer cache the neighbour entries in the ipv6 routes so that change is not appropriate there. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-18 23:34:21 -05:00
David Vrabel	3e55f8b306	xen-netback: cancel the credit timer when taking the vif down If the credit timer is left armed after calling xen_netbk_remove_xenvif(), then it may fire and attempt to schedule the vif which will then oops as vif->netbk == NULL. This may happen both in the fatal error path and during normal disconnection from the front end. The sequencing during shutdown is critical to ensure that: a) vif->netbk doesn't become unexpectedly NULL; and b) the net device/vif is not freed. 1. Mark as unschedulable (netif_carrier_off()). 2. Synchronously cancel the timer. 3. Remove the vif from the schedule list. 4. Remove it from it netback thread group. 5. Wait for vif->refcnt to become 0. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reported-by: Christopher S. Aker <caker@theshore.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-14 13:16:49 -05:00
David Vrabel	35876b5ffc	xen-netback: correctly return errors from netbk_count_requests() netbk_count_requests() could detect an error, call netbk_fatal_tx_error() but return 0. The vif may then be used afterwards (e.g., in a call to netbk_tx_error(). Since netbk_fatal_tx_error() could set vif->refcnt to 1, the vif may be freed immediately after the call to netbk_fatal_tx_error() (e.g., if the vif is also removed). Netback thread Xenwatch thread ------------------------------------------- netbk_fatal_tx_err() netback_remove() xenvif_disconnect() ... free_netdev() netbk_tx_err() Oops! Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Christopher S. Aker <caker@theshore.net> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-14 13:16:49 -05:00
David S. Miller	fd5023111c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Synchronize with 'net' in order to sort out some l2tp, wireless, and ipv6 GRE fixes that will be built on top of in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-08 18:02:14 -05:00
Ian Campbell	b9149729eb	netback: correct netbk_tx_err to handle wrap around. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:29:29 -05:00
Ian Campbell	4cc7c1cb7b	xen/netback: free already allocated memory on failure in xen_netbk_get_requests Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:29:28 -05:00
Matthew Daley	7d5145d8eb	xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop. Signed-off-by: Matthew Daley <mattjd@gmail.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:29:28 -05:00
Ian Campbell	48856286b6	xen/netback: shutdown the ring if it contains garbage. A buggy or malicious frontend should not be able to confuse netback. If we spot anything which is not as it should be then shutdown the device and don't try to continue with the ring in a potentially hostile state. Well behaved and non-hostile frontends will not be penalised. As well as making the existing checks for such errors fatal also add a new check that ensures that there isn't an insane number of requests on the ring (i.e. more than would fit in the ring). If the ring contains garbage then previously is was possible to loop over this insane number, getting an error each time and therefore not generating any more pending requests and therefore not exiting the loop in xen_netbk_tx_build_gops for an externded period. Also turn various netdev_dbg calls which no precipitate a fatal error into netdev_err, they are rate limited because the device is shutdown afterwards. This fixes at least one known DoS/softlockup of the backend domain. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:29:28 -05:00
Matt Wilson	4a633a602c	xen-netback: allow changing the MAC address of the interface Sometimes it is useful to be able to change the MAC address of the interface for netback devices. For example, when using ebtables it may be useful to be able to distinguish traffic from different interfaces without depending on the interface name. Reported-by: Nikita Borzykh <sample.n@gmail.com> Reported-by: Paul Harvey <stockingpaul@hotmail.com> Cc: netdev@vger.kernel.org Cc: xen-devel@lists.xen.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Matt Wilson <msw@amazon.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-23 13:42:20 -05:00
Ian Campbell	6a8ed462f1	xen: netback: handle compound page fragments on transmit. An SKB paged fragment can consist of a compound page with order > 0. However the netchannel protocol deals only in PAGE_SIZE frames. Handle this in netbk_gop_frag_copy and xen_netbk_count_skb_slots by iterating over the frames which make up the page. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 22:50:45 -04:00
Konrad Rzeszutek Wilk	ae1659ee6b	Merge branch 'xenarm-for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm into stable/for-linus-3.7 * 'xenarm-for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: arm: introduce a DTS for Xen unprivileged virtual machines MAINTAINERS: add myself as Xen ARM maintainer xen/arm: compile netback xen/arm: compile blkfront and blkback xen/arm: implement alloc/free_xenballooned_pages with alloc_pages/kfree xen/arm: receive Xen events on ARM xen/arm: initialize grant_table on ARM xen/arm: get privilege status xen/arm: introduce CONFIG_XEN on ARM xen: do not compile manage, balloon, pci, acpi, pcpu and cpu_hotplug on ARM xen/arm: Introduce xen_ulong_t for unsigned long xen/arm: Xen detection and shared_info page mapping docs: Xen ARM DT bindings xen/arm: empty implementation of grant_table arch specific functions xen/arm: sync_bitops xen/arm: page.h definitions xen/arm: hypercalls arm: initial Xen support Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-09-26 16:43:35 -04:00
Andres Lagar-Cavilla	c571898ffc	xen/gndev: Xen backend support for paged out grant targets V4. Since Xen-4.2, hvm domains may have portions of their memory paged out. When a foreign domain (such as dom0) attempts to map these frames, the map will initially fail. The hypervisor returns a suitable errno, and kicks an asynchronous page-in operation carried out by a helper. The foreign domain is expected to retry the mapping operation until it eventually succeeds. The foreign domain is not put to sleep because itself could be the one running the pager assist (typical scenario for dom0). This patch adds support for this mechanism for backend drivers using grant mapping and copying operations. Specifically, this covers the blkback and gntdev drivers (which map foreign grants), and the netback driver (which copies foreign grants). * Add a retry method for grants that fail with GNTST_eagain (i.e. because the target foreign frame is paged out). * Insert hooks with appropriate wrappers in the aforementioned drivers. The retry loop is only invoked if the grant operation status is GNTST_eagain. It guarantees to leave a new status code different from GNTST_eagain. Any other status code results in identical code execution as before. The retry loop performs 256 attempts with increasing time intervals through a 32 second period. It uses msleep to yield while waiting for the next retry. V2 after feedback from David Vrabel: * Explicit MAX_DELAY instead of wrap-around delay into zero * Abstract GNTST_eagain check into core grant table code for netback module. V3 after feedback from Ian Campbell: * Add placeholder in array of grant table error descriptions for unrelated error code we jump over. * Eliminate single map and retry macro in favor of a generic batch flavor. * Some renaming. * Bury most implementation in grant_table.c, cleaner interface. V4 rebased on top of sync of Xen grant table interface headers. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> [v5: Fixed whitespace issues] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-09-21 09:23:51 -04:00
Stefano Stabellini	ca98163376	xen/arm: compile netback Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-08-08 17:21:23 +00:00
Annie Li	1e0b6eac6a	xen/netback: only non-freed SKB is queued into tx_queue After SKB is queued into tx_queue, it will be freed if request_gop is NULL. However, no dequeue action is called in this situation, it is likely that tx_queue constains freed SKB. This patch should fix this issue, and it is based on 3.5.0-rc4+. This issue is found through code inspection, no bug is seen with it currently. I run netperf test for several hours, and no network regression was found. Signed-off-by: Annie Li <annie.li@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 00:50:20 -07:00
Simon Graham	e26b203ede	xen/netback: Calculate the number of SKB slots required correctly When calculating the number of slots required for a packet header, the code was reserving too many slots if the header crossed a page boundary. Since netbk_gop_skb copies the header to the start of the page, the count of slots required for the header should be based solely on the header size. This problem is easy to reproduce if a VIF is bridged to a USB 3G modem device as the skb->data value always starts near the end of the first page. Signed-off-by: Simon Graham <simon.graham@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-24 16:22:53 -04:00
Joe Perches	e404decb0f	drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages alloc failures use dump_stack so emitting an additional out-of-memory message is an unnecessary duplication. Remove the allocation failure messages. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-31 16:20:21 -05:00
Linus Torvalds	90160371b3	Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits) xen/pciback: Expand the warning message to include domain id. xen/pciback: Fix "device has been assigned to X domain!" warning xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un\|]bind" xen/xenbus: don't reimplement kvasprintf via a fixed size buffer xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX. Xen: consolidate and simplify struct xenbus_driver instantiation xen-gntalloc: introduce missing kfree xen/xenbus: Fix compile error - missing header for xen_initial_domain() xen/netback: Enable netback on HVM guests xen/grant-table: Support mappings required by blkback xenbus: Use grant-table wrapper functions xenbus: Support HVM backends xen/xenbus-frontend: Fix compile error with randconfig xen/xenbus-frontend: Make error message more clear xen/privcmd: Remove unused support for arch specific privcmp mmap xen: Add xenbus_backend device xen: Add xenbus device driver xen: Add privcmd device driver xen/gntalloc: fix reference counts on multi-page mappings ...	2012-01-10 10:09:59 -08:00
stephen hemminger	813abbbaa3	xen-netback: make ops structs const All tables of function pointers should be const to make hacks more difficult. Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-05 13:23:00 -05:00
Jan Beulich	73db144b58	Xen: consolidate and simplify struct xenbus_driver instantiation The 'name', 'owner', and 'mod_name' members are redundant with the identically named fields in the 'driver' sub-structure. Rather than switching each instance to specify these fields explicitly, introduce a macro to simplify this. Eliminate further redundancy by allowing the drvname argument to DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from the ID table will be used for .driver.name). Also eliminate the questionable xenbus_register_{back,front}end() wrappers - their sole remaining purpose was the checking of the 'owner' field, proper setting of which shouldn't be an issue anymore when the macro gets used. v2: Restore DRV_NAME for the driver name in xen-pciback. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-01-04 17:01:17 -05:00
Daniel De Graaf	2a14b24406	xen/netback: Enable netback on HVM guests Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-12-20 17:07:50 -05:00
David S. Miller	959327c784	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2011-12-06 21:10:05 -05:00
Wei Liu	e34c0246d6	netback: fix typo in comment "variables a used" should be "variables are used". Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-06 13:20:56 -05:00
Wei Liu	16ecba2605	netback: remove redundant assignment New value for netbk->mmap_pages[pending_idx] is assigned in xen_netbk_alloc_page(), no need for a second assignment which exposes internal to the outside world. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-06 13:14:24 -05:00
Wei Liu	6b84bd1674	netback: Fix alert message. The original message in netback_init was 'kthread_run() fails', which should be 'kthread_create() fails'. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-06 00:32:39 -05:00
David S. Miller	6dec4ac4ee	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv4/inet_diag.c	2011-11-26 14:47:03 -05:00
Jan Beulich	5ccb3ea720	xen-netback: use correct index for invalidation in xen_netbk_tx_check_gop() Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-21 15:32:15 -05:00
Michał Mirosław	c8f44affb7	net: introduce and use netdev_features_t for device features sets v2: add couple missing conversions in drivers split unexporting netdev_fix_features() implemented %pNF convert sock::sk_route_(no?)caps Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-16 17:43:10 -05:00
Linus Torvalds	06d381484f	Merge branch 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: net: xen-netback: use API provided by xenbus module to map rings block: xen-blkback: use API provided by xenbus module to map rings xen: use generic functions instead of xen_{alloc, free}_vm_area()	2011-11-06 18:31:36 -08:00
David Vrabel	c9d6369978	net: xen-netback: use API provided by xenbus module to map rings The xenbus module provides xenbus_map_ring_valloc() and xenbus_map_ring_vfree(). Use these to map the Tx and Rx ring pages granted by the frontend. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-10-26 10:02:56 -04:00
Eric Dumazet	9e903e0852	net: add skb frag size accessors To ease skb->truesize sanitization, its better to be able to localize all references to skb frags size. Define accessors : skb_frag_size() to fetch frag size, and skb_frag_size_{set\|add\|sub}() to manipulate it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:10:46 -04:00
David S. Miller	88c5100c28	Merge branch 'master' of github.com:davem330/net Conflicts: net/batman-adv/soft-interface.c	2011-10-07 13:38:43 -04:00
Ian Campbell	ea066ad158	xen: netback: convert to SKB paged frag API. netback currently uses frag->page to store a temporary index reference while processing incoming requests. Since frag->page is to become opaque switch instead to using page_offset. Add a wrapper to tidy this up and propagate the fact that the indexes are only u16 through the code (this was already true in practice but unsigned long and in were inconsistently used as variable and parameter types) Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: xen-devel@lists.xensource.com Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-05 17:36:00 -04:00
David Vrabel	d0e5d83284	net: xen-netback: correctly restart Tx after a VM restore/migrate If a VM is saved and restored (or migrated) the netback driver will no longer process any Tx packets from the frontend. xenvif_up() does not schedule the processing of any pending Tx requests from the front end because the carrier is off. Without this initial kick the frontend just adds Tx requests to the ring without raising an event (until the ring is full). This was caused by `47103041e9` (net: xen-netback: convert to hw_features) which reordered the calls to xenvif_up() and netif_carrier_on() in xenvif_connect(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-03 14:15:46 -04:00

1 2 3 4

155 Commits