OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
David S. Miller	9286ae01ac	Merge branch 'sunvnet-napi' Sowmini Varadhan says: ==================== sunvnet: NAPIfy sunvnet This patchset converts the sunvnet driver to use the NAPI framework. Changes since v4 to Patch1: vnet_event accumulates LDC_EVENT_* bits into rx_event. vnet_event_napi() unrolls send_events() logic to process all rx_event bits. Changes since v5: Patch 1: use net_device.h definition for NAPI_POLL_WEIGHT. Drop sparclinux changes (patch3) per David Miller feedback. Patch 1 in the series addresses the packet-receive path: all the vnet_event() processing is moved into NAPI context. This patch is dependent on the sparc-next commit: "sparc64: Add vio_set_intr() to enable/disable Rx interrupts" (sparc commit id `ca605b7dd7`). Patch 2 uses RCU to fix race conditions between vnet_port_remove and paths that access/modify port-related state, such as vnet_start_xmit. Patch 3 leverages the NAPIfied Rx path, dropping superfluous usage of the irqsave/irqrestores on the vio.lock where possible. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 16:20:20 -04:00
Sowmini Varadhan	13b13dd97c	sunvnet: Remove irqsave/irqrestore on vio.lock After the NAPIfication of sunvnet, we no longer need to synchronize by doing irqsave/restore on vio.lock in the I/O fastpath. NAPI ->poll() is non-reentrant, so all RX processing occurs strictly in a serialized environment. TX reclaim is done in NAPI context, so the netif_tx_lock can be used to serialize critical sections between Tx and Rx paths. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 16:20:16 -04:00
Sowmini Varadhan	2a968dd8f7	sunvnet: Use RCU to synchronize port usage with vnet_port_remove() A vnet_port_remove could be triggered as a result of an ldm-unbind operation by the peer, module unload, or other changes to the inter-vnet-link configuration. When this is concurrent with vnet_start_xmit(), there are several race sequences possible, such as: thread 1: vnet_start_xmit -> tx_port_find: spin_lock_irqsave(&vp->lock..); ret = __tx_port_find(..); spin_unlock_irqrestore(&vp->lock..). thread 2: vio_remove -> .. -> vnet_port_remove: spin_lock_irqsave(&vp->lock..); cleanup; spin_unlock_irqrestore(&vp->lock..); kfree(port). thread 1: /* attempt to use ret will bomb */ This patch adds RCU locking for port access so that vnet_port_remove will correctly clean up port-related state. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Dwight Engen <dwight.engen@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 16:20:15 -04:00
Sowmini Varadhan	69088822ab	sunvnet: NAPIfy sunvnet Move Rx packet processing to the NAPI poll callback. Disable VIO interrupt and unconditionally go into NAPI context from vnet_event. Note that we want to minimize the number of LDC STOP/START messages sent. Specifically, do not send a STOP message if vnet_walk_rx does not read all the available descriptors because of the NAPI budget limitation. Instead, note the end index as part of port state, and resume from this index when the next poll callback is triggered. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com> Acked-by: Dwight Engen <dwight.engen@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 16:20:15 -04:00
David S. Miller	132fb57984	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-10-23 This series contains updates to i40e and i40evf. Jesse modifies the i40e driver to only notify the firmware on link up/down and qualified module events. Also simplified the job of managing link state by using the admin queue receive event for link events as a signal to tell the driver to update link state. Jeff (me) cleans up the inconsistent use of tabs for indentation in the admin queue command header file. Neerav converts the use of udelay() to usleep_range(). Anjali fixes a bug where receive would stop after some stress by adding a sleep and restart as well as moving the setting of flow control because it should be done at a PF level and not a VSI level. Mitch adds code to handle link events when updating the PF switch, which allows link information to be properly provided to VFs in all cases. Catherine adds driver support for 10GBaseT and bumps driver version. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 16:41:02 -04:00
Fabian Frederick	74bca138e1	net: llc: include linux/errno.h instead of asm/errno.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:51:42 -04:00
Fabian Frederick	75da1469f9	lapb: move EXPORT_SYMBOL after functions. See Documentation/CodingStyle Chapter 6 Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:51:42 -04:00
David S. Miller	5f3619f275	Merge branch 'berlin_ethernet' Sebastian Hesselbarth says: ==================== Marvell PXA168 libphy handling and Berlin Ethernet This patch series deals with removing an IP feature that can be found on all currently supported Marvell Ethernet IP (pxa168_eth, mv643xx_eth, mvneta). The MAC IP allows to automatically perform PHY auto-negotiation without software interaction. However, this feature (a) fundamentally clashes with the way libphy works and (b) is unable to deal with quirky PHYs that require special treatment. In this series, pxa168_eth driver is rewritten to completely disable that feature and properly deal with libphy provided PHYs. As usual, a branch on top of v3.18-rc1 can be found at git://git.infradead.org/users/hesselba/linux-berlin.git devel/bg2-bg2cd-eth-v2 Patches 1-5 should go through David's net tree, I'll pick up the DT patches 6-9. There have been some changes, compared to the RFT - added phy-connection-type property to BG2Q PHY DT node - bail out from pxa168_eth_adjust_link when there is no change in PHY parameters. Also, add a call to phy_print_status. compared to v1 - move phy-connection-type to ethernet node instead of PHY node Patch 1 adds support for Marvell 88E3016 FastEthernet PHY that is also integrated in Marvell Berlin BG2/BG2CD SoCs. Patch 2 allows to pass phy_interface_t on pxa168_eth platform_data that is only used by mach-mmp/gplug. From the board setup, I guessed gplug's PHY is connected via RMII. The patch still isn't even compile tested. Patches 3-5 prepare proper libphy handling and finally remove all in-driver PHY mangling related to the feature explained above. Patches 6-9 add corresponding ethernet DT nodes to BG2, BG2CD, add a phy-connection-type property to BG2Q and enable ethernet on BG2-based Sony NSZ-GS7. I have tested all this on GS7 successfully with ip=dhcp on 100M FD. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:49:25 -04:00
Sebastian Hesselbarth	9ff32fe1b9	net: pxa168_eth: Remove in-driver PHY mangling With properly using libphy PHYs now, remove the in-driver PHY mangling. Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:49:21 -04:00
Sebastian Hesselbarth	1a14913289	net: pxa168_eth: Remove HW auto-negotiation Marvell Ethernet IP supports PHY negotiation driven by HW. This fundamentally clashes with libphy (software) driven negotiation and also cannot cope with quirky PHYs. Therefore, always disable any HW negotiation features and properly use libphy's phy_device. Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:49:21 -04:00
Sebastian Hesselbarth	9d8ea73d3e	net: pxa168_eth: Prepare proper libphy handling Current libphy handling in pxa168_eth lacks proper phy_connect. Prepare to fix this by first moving phy properties from platform_data to private driver data. Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:49:20 -04:00
Sebastian Hesselbarth	e7de17abed	net: pxa168_eth: Provide phy_interface mode on platform_data The PXA168 Ethernet IP supports MII and RMII connection to its PHY. Currently, pxa168 platform_data does not provide a way to pass that and there is one user of pxa168 platform_data (mach-mmp/gplug). Given the pinctrl settings of gplug it uses RMII, so add and pass a corresponding phy_interface_t. Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:49:20 -04:00
Sebastian Hesselbarth	6b358aedce	phy: marvell: Add support for 88E3016 FastEthernet PHY Marvell 88E3016 is a FastEthernet PHY that also can be found in Marvell Berlin SoCs as integrated PHY. Tested-by: Antoine Ténart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:49:20 -04:00
Geert Uytterhoeven	d4c3363e84	natsemi/macsonic: Remove superfluous interrupt disable/restore As of commit `e4dc601bf9` ("m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()"), this is no longer needed. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:43:28 -04:00
Geert Uytterhoeven	7f30b7420b	cirrus/mac89x0: Remove superfluous interrupt disable/restore As of commit `e4dc601bf9` ("m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()"), this is no longer needed. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:43:28 -04:00
Rasmus Villemoes	00fd5d94c2	net: typhoon: Remove redundant casts Both image_data and typhoon_fw->data are const u8, so the cast to u8 is unnecessary and confusing. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: David Dillow <dave@thedillows.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:41:31 -04:00
Sébastien Barré	16704b129b	Removed unused function sctp_addr_is_valid() sctp_addr_is_valid() only appeared in its definition. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:37:21 -04:00
David S. Miller	fad71e4a11	Merge branch 'ipv6_route' Martin KaFai Lau says: ==================== ipv6: Reduce the number of fib6_lookup() calls from ip6_pol_route() This patch set is trying to reduce the number of fib6_lookup() calls from ip6_pol_route(). I have adapted davem's udpflood and kbench_mod test (https://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git) to support IPv6 and here is the result: Before: [root]# for i in $(seq 1 3); do time ./udpflood -l 20000000 -c 250 2401:face:face:face::2; done real 0m34.190s user 0m3.047s sys 0m31.108s real 0m34.635s user 0m3.125s sys 0m31.475s real 0m34.517s user 0m3.034s sys 0m31.449s [root]# insmod ip6_route_kbench.ko oif=2 src=2401:face:face:face::1 dst=2401:face:face:face::2 [ 660.160976] ip6_route_kbench: ip6_route_output tdiff: 933 [ 660.207261] ip6_route_kbench: ip6_route_output tdiff: 988 [ 660.253492] ip6_route_kbench: ip6_route_output tdiff: 896 [ 660.298862] ip6_route_kbench: ip6_route_output tdiff: 898 After: [root]# for i in $(seq 1 3); do time ./udpflood -l 20000000 -c 250 2401:face:face:face::2; done real 0m32.695s user 0m2.925s sys 0m29.737s real 0m32.636s user 0m3.007s sys 0m29.596s real 0m32.797s user 0m2.866s sys 0m29.898s [root]# insmod ip6_route_kbench.ko oif=2 src=2401:face:face:face::1 dst=2401:face:face:face::2 [ 881.220793] ip6_route_kbench: ip6_route_output tdiff: 684 [ 881.253477] ip6_route_kbench: ip6_route_output tdiff: 640 [ 881.286867] ip6_route_kbench: ip6_route_output tdiff: 630 [ 881.320749] ip6_route_kbench: ip6_route_output tdiff: 653 /**************************** udpflood.c ***************************/ /* It is an adaptation of the Eric Dumazet's and David Miller's * udpflood tool, by adding IPv6 support. 
*/ typedef uint32_t u32; static int debug = 0; /* Allow -fstrict-aliasing */ typedef union sa_u { struct sockaddr_storage a46; struct sockaddr_in a4; struct sockaddr_in6 a6; } sa_u; static int usage(void) { printf("usage: udpflood [ -l count ] [ -m message_size ] [ -c num_ip_addrs ] IP_ADDRESS\n"); return -1; } static u32 get_last32h(const sa_u *sa) { if (sa->a46.ss_family == PF_INET) return ntohl(sa->a4.sin_addr.s_addr); else return ntohl(sa->a6.sin6_addr.s6_addr32[3]); } static void set_last32h(sa_u *sa, u32 last32h) { if (sa->a46.ss_family == PF_INET) sa->a4.sin_addr.s_addr = htonl(last32h); else sa->a6.sin6_addr.s6_addr32[3] = htonl(last32h); } static void print_saddr(const sa_u *sa, const char *msg) { char buf[64]; if (!debug) return; switch (sa->a46.ss_family) { case PF_INET: inet_ntop(PF_INET, &(sa->a4.sin_addr.s_addr), buf, sizeof(buf)); break; case PF_INET6: inet_ntop(PF_INET6, &(sa->a6.sin6_addr), buf, sizeof(buf)); break; } printf("%s: %s\n", msg, buf); } static int send_packets(const sa_u *sa, size_t num_addrs, int count, int msg_sz) { char *msg = malloc(msg_sz); sa_u saddr; u32 start_addr32h, end_addr32h, cur_addr32h; int fd, i, err; if (!msg) return -ENOMEM; memset(msg, 0, msg_sz); memcpy(&saddr, sa, sizeof(saddr)); cur_addr32h = start_addr32h = get_last32h(&saddr); end_addr32h = start_addr32h + num_addrs; fd = socket(saddr.a46.ss_family, SOCK_DGRAM, 0); if (fd < 0) { perror("socket"); err = fd; goto out_nofd; } /* connect to avoid the kernel spending time in figuring * out the source address (i.e pin the src address) */ err = connect(fd, (struct sockaddr *) &saddr, sizeof(saddr)); if (err < 0) { perror("connect"); goto out; } print_saddr(&saddr, "start_addr"); for (i = 0; i < count; i++) { print_saddr(&saddr, "sendto"); err = sendto(fd, msg, msg_sz, 0, (struct sockaddr *)&saddr, sizeof(saddr)); if (err < 0) { perror("sendto"); goto out; } if (++cur_addr32h >= end_addr32h) cur_addr32h = start_addr32h; set_last32h(&saddr, 
cur_addr32h); } err = 0; out: close(fd); out_nofd: free(msg); return err; } int main(int argc, char **argv, char **envp) { int port, msg_sz, count, num_addrs, ret; sa_u start_addr; port = 6000; msg_sz = 32; count = 10000000; num_addrs = 1; while ((ret = getopt(argc, argv, "dl:s:p:c:")) >= 0) { switch (ret) { case 'l': sscanf(optarg, "%d", &count); break; case 's': sscanf(optarg, "%d", &msg_sz); break; case 'p': sscanf(optarg, "%d", &port); break; case 'c': sscanf(optarg, "%d", &num_addrs); break; case 'd': debug = 1; break; case '?': return usage(); } } if (num_addrs < 1) return usage(); if (!argv[optind]) return usage(); start_addr.a4.sin_port = htons(port); if (inet_pton(PF_INET, argv[optind], &start_addr.a4.sin_addr)) start_addr.a46.ss_family = PF_INET; else if (inet_pton(PF_INET6, argv[optind], &start_addr.a6.sin6_addr.s6_addr)) start_addr.a46.ss_family = PF_INET6; else return usage(); return send_packets(&start_addr, num_addrs, count, msg_sz); } /*************** ip6_route_kbench_mod.c ***************/ /* We can't just use "get_cycles()" as on some platforms, such * as sparc64, that gives system cycles rather than cpu clock * cycles. 
*/ static inline unsigned long long get_tick(void) { unsigned long long t; __asm__ __volatile__("rd %%tick, %0" : "=r" (t)); return t; } static inline unsigned long long get_tick(void) { unsigned long long t; rdtscll(t); return t; } static inline unsigned long long get_tick(void) { return get_cycles(); } static int flow_oif = DEFAULT_OIF; static int flow_iif = DEFAULT_IIF; static u32 flow_mark = DEFAULT_MARK; static struct in6_addr flow_dst_ip_addr; static struct in6_addr flow_src_ip_addr; static int flow_tos = DEFAULT_TOS; static char dst_string[64]; static char src_string[64]; module_param_string(dst, dst_string, sizeof(dst_string), 0); module_param_string(src, src_string, sizeof(src_string), 0); static int __init flow_setup(void) { if (dst_string[0] && !in6_pton(dst_string, -1, &flow_dst_ip_addr.s6_addr[0], -1, NULL)) { pr_info("cannot parse \"%s\"\n", dst_string); return -1; } if (src_string[0] && !in6_pton(src_string, -1, &flow_src_ip_addr.s6_addr[0], -1, NULL)) { pr_info("cannot parse \"%s\"\n", dst_string); return -1; } return 0; } module_param_named(oif, flow_oif, int, 0); module_param_named(iif, flow_iif, int, 0); module_param_named(mark, flow_mark, uint, 0); module_param_named(tos, flow_tos, int, 0); static int warmup_count = DEFAULT_WARMUP_COUNT; module_param_named(count, warmup_count, int, 0); static void flow_init(struct flowi6 *fl6) { memset(fl6, 0, sizeof(*fl6)); fl6->flowi6_proto = IPPROTO_ICMPV6; fl6->flowi6_oif = flow_oif; fl6->flowi6_iif = flow_iif; fl6->flowi6_mark = flow_mark; fl6->flowi6_tos = flow_tos; fl6->daddr = flow_dst_ip_addr; fl6->saddr = flow_src_ip_addr; } static struct sk_buff *fake_skb_get(void) { struct ipv6hdr *hdr; struct sk_buff *skb; skb = alloc_skb(4096, GFP_KERNEL); if (!skb) { pr_info("Cannot alloc SKB for test\n"); return NULL; } skb->dev = __dev_get_by_index(&init_net, flow_iif); if (skb->dev == NULL) { pr_info("Input device (%d) does not exist\n", flow_iif); goto err; } skb_reset_mac_header(skb); skb_reset_network_header(skb); 
skb_reserve(skb, MAX_HEADER + sizeof(struct ipv6hdr)); hdr = ipv6_hdr(skb); hdr->priority = 0; hdr->version = 6; memset(hdr->flow_lbl, 0, sizeof(hdr->flow_lbl)); hdr->payload_len = htons(sizeof(struct icmp6hdr)); hdr->nexthdr = IPPROTO_ICMPV6; hdr->saddr = flow_src_ip_addr; hdr->daddr = flow_dst_ip_addr; skb->protocol = htons(ETH_P_IPV6); skb->mark = flow_mark; return skb; err: kfree_skb(skb); return NULL; } static void do_full_output_lookup_bench(void) { unsigned long long t1, t2, tdiff; struct rt6_info *rt; struct flowi6 fl6; int i; rt = NULL; for (i = 0; i < warmup_count; i++) { flow_init(&fl6); rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl6); if (IS_ERR(rt)) break; ip6_rt_put(rt); } if (IS_ERR(rt)) { pr_info("ip_route_output_key: err=%ld\n", PTR_ERR(rt)); return; } flow_init(&fl6); t1 = get_tick(); rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl6); t2 = get_tick(); if (!IS_ERR(rt)) ip6_rt_put(rt); tdiff = t2 - t1; pr_info("ip6_route_output tdiff: %llu\n", tdiff); } static void do_full_input_lookup_bench(void) { unsigned long long t1, t2, tdiff; struct sk_buff *skb; struct rt6_info *rt; int err, i; skb = fake_skb_get(); if (skb == NULL) goto out_free; err = 0; local_bh_disable(); for (i = 0; i < warmup_count; i++) { ip6_route_input(skb); rt = (struct rt6_info *)skb_dst(skb); err = (!rt || rt == init_net.ipv6.ip6_null_entry); skb_dst_drop(skb); if (err) break; } local_bh_enable(); if (err) { pr_info("Input route lookup fails\n"); goto out_free; } local_bh_disable(); t1 = get_tick(); ip6_route_input(skb); t2 = get_tick(); local_bh_enable(); rt = (struct rt6_info *)skb_dst(skb); err = (!rt || rt == init_net.ipv6.ip6_null_entry); skb_dst_drop(skb); if (err) { pr_info("Input route lookup fails\n"); goto out_free; } tdiff = t2 - t1; pr_info("ip6_route_input tdiff: %llu\n", tdiff); out_free: kfree_skb(skb); } static void do_full_lookup_bench(void) { if (!flow_iif) do_full_output_lookup_bench(); else do_full_input_lookup_bench(); } static 
void do_bench(void) { do_full_lookup_bench(); do_full_lookup_bench(); do_full_lookup_bench(); do_full_lookup_bench(); } static int __init kbench_init(void) { if (flow_setup()) return -EINVAL; pr_info("flow [IIF(%d),OIF(%d),MARK(0x%08x),D("IP6_FMT")," "S("IP6_FMT"),TOS(0x%02x)]\n", flow_iif, flow_oif, flow_mark, IP6_PRT(flow_dst_ip_addr), IP6_PRT(flow_src_ip_addr), flow_tos); if (!cpu_has_tsc) { pr_err("X86 TSC is required, but is unavailable.\n"); return -EINVAL; } pr_info("sizeof(struct rt6_info)==%zu\n", sizeof(struct rt6_info)); do_bench(); return -ENODEV; } static void __exit kbench_exit(void) { } module_init(kbench_init); module_exit(kbench_exit); MODULE_LICENSE("GPL"); ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:52 -04:00
Martin KaFai Lau	367efcb932	ipv6: Avoid redoing fib6_lookup() with reachable = 0 by saving fn This patch saves the fn before doing rt6_backtrack. Hence, without redoing the fib6_lookup(), saved_fn can be used to redo rt6_select() with RT6_LOOKUP_F_REACHABLE off. Some minor changes I think make sense to review as a single patch: * Remove the 'out:' goto label. * Remove the 'reachable' variable. Only use the 'strict' variable instead. After this patch, "failing ip6_ins_rt()" should be the only case that requires a redo of fib6_lookup(). Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:39 -04:00
Martin KaFai Lau	94c77bb41d	ipv6: Avoid redoing fib6_lookup() for RTF_CACHE hit case When there is a RTF_CACHE hit, no need to redo fib6_lookup() with reachable=0. Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:39 -04:00
Martin KaFai Lau	a3c00e46ef	ipv6: Remove BACKTRACK macro It is the prep work to reduce the number of calls to fib6_lookup(). The BACKTRACK macro could be hard-to-read and error-prone due to its side effects (mainly goto). This patch is to: 1. Replace BACKTRACK macro with a function (fib6_backtrack) with the following return values: * If it is backtrack-able, returns next fn for retry. * If it reaches the root, returns NULL. 2. The caller needs to decide if a backtrack is needed (by testing rt == net->ipv6.ip6_null_entry). 3. Rename the goto labels in ip6_pol_route() to make the next few patches easier to read. Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:39 -04:00
Kenjiro Nakayama	105970f608	net: Remove trailing whitespace in tcp.h icmp.c syncookies.c Remove trailing whitespace in tcp.h icmp.c syncookies.c Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:13:10 -04:00
Catherine Sullivan	e8720db1fb	i40e: Bump version Bump i40e version to 1.0.21. Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-By: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:05 -07:00
Akeem G Abodunrin	bf00b376d3	i40e: Moving variable declaration out of the loops Move the three variables out of the loop, so it only declares once. Change-ID: I436913777c7da3c16dc0031b59e3ffa61de74718 Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Patrick Lu <patrick.lu@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:05 -07:00
Mitch Williams	5960d33f91	i40e: Add 10GBaseT support Add driver support for 10GBaseT device. Change-ID: I4be6ed847ac0bddd220b9878a95c523b32038174 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:04 -07:00
Mitch Williams	a34a6711f8	i40e: process link events when setting up switch Add code to handle link events when updating the PF switch. This allows link information to be properly provided to VFs in all cases. Change-ID: If314c95f3d39259ef4c40a4a3b823381e28fb24f Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:04 -07:00
Anjali Singhai Jain	cafa2ee6fb	i40e: Fix a bug where Rx would stop after some time Move the setting of flow control because this should be done at a PF level, not a VSI level. Also add a sleep and restart to fix a bug where Rx would stop after some stress. Change-ID: I9a93d8c2ff27c39339eb00bc4ec1225e43900be0 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:03 -07:00
Neerav Parikh	f98a20068d	i40e/i40evf: Use usleep_range() instead of udelay() As per the Documentation/timers/timers-howto.txt it is preferred to use usleep_range() instead of udelay() if the delay value is > 10us in non-atomic contexts. So, replacing all the instances of udelay() with 10 or greater than 10 micro seconds delay in the driver and using usleep_range() instead. Change-ID: Iaa2ab499a4c26f6005e5d86cc421407ef9de16c7 Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:03 -07:00
Jeff Kirsher	8c570dcc8c	i40e/i40evf: Fix whitespace indentation This is one small step in making the indentation more consistent. If we truly want to align values, then use tabs rather than spaces. Change-ID: I12368bc77a52f296d1843fdcb67201a7d7cd4749 Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com>	2014-10-23 20:38:03 -07:00
Jesse Brandeburg	1e701e09d8	i40e: enable LSE poke and simplify link state The driver can do a simpler job of managing link state by simply using the admin queue receive event for link events as a doorbell that tells the driver to update link state. Additionally, add a workaround that will help make sure the link state in the hardware is consistent with the link state the driver is reporting by refreshing the link state every service task interval. Change-ID: Ib95b5b7b8cc016e97d8009f6363c9f9eed301444 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:02 -07:00
Jesse Brandeburg	7e2453fee8	i40e: mask phy events Tell the firmware what kind of link related events the driver is interested in. In this case, just link up/down and qualified module events are the ones the driver really cares about. Change-ID: If132c812c340c8e1927c2caf6d55185296b66201 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-10-23 20:38:02 -07:00
Haiyang Zhang	942396b019	hyperv: Fix the total_data_buflen in send path total_data_buflen is used by netvsc_send() to decide if a packet can be put into send buffer. It should also include the size of RNDIS message before the Ethernet frame. Otherwise, a message with total size bigger than send_section_size may be copied into the send buffer, and cause data corruption. [Request to include this patch to the Stable branches] Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 17:58:50 -04:00
David S. Miller	f765678e32	Merge branch 'amd-xgbe' Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver fixes 2014-10-22 The following series of patches includes fixes to the driver. - Properly handle feature changes via ethtool by using correctly sized variables - Perform proper napi packet counting and budget checking This patch series is based on net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 17:50:39 -04:00
Lendacky, Thomas	55ca6bcd73	amd-xgbe: Fix napi Rx budget accounting Currently the amd-xgbe driver increments the packets processed counter each time a descriptor is processed. Since a packet can be represented by more than one descriptor incrementing the counter in this way is not appropriate. Also, since multiple descriptors cause the budget check to be short circuited, sometimes the returned value from the poll function would be larger than the budget value resulting in a WARN_ONCE being triggered. Update the polling logic to properly account for the number of packets processed and exit when the budget value is reached. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 17:50:31 -04:00
Lendacky, Thomas	386f1c9650	amd-xgbe: Properly handle feature changes via ethtool The ndo_set_features callback function was improperly using an unsigned int to save the current feature value for features such as NETIF_F_RXCSUM. Since that feature is in the upper 32 bits of a 64 bit variable the result was always 0 making it not possible to actually turn off the hardware RX checksum support. Change the unsigned int type to the netdev_features_t type in order to properly capture the current value and perform the proper operation. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 17:50:31 -04:00
Philipp Zabel	81f35ffde0	net: fec: ptp: fix NULL pointer dereference if ptp_clock is not set Since commit `278d240478` (net: fec: ptp: Enable PPS output based on ptp clock) fec_enet_interrupt calls fec_ptp_check_pps_event unconditionally, which calls into ptp_clock_event. If fep->ptp_clock is NULL, ptp_clock_event tries to dereference the NULL pointer. Since on i.MX53 fep->bufdesc_ex is not set, fec_ptp_init is never called, and fep->ptp_clock is NULL, which reliably causes a kernel panic. This patch adds a check for fep->ptp_clock == NULL in fec_enet_interrupt. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 17:48:06 -04:00
Sathya Perla	9e7ceb0607	net: fix saving TX flow hash in sock for outgoing connections The commit "net: Save TX flow hash in sock and set in skbuf on xmit" introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate and record flow hash(sk_txhash) in the socket structure. sk_txhash is used to set skb->hash which is used to spread flows across multiple TXQs. But, the above routines are invoked before the source port of the connection is created. Because of this all outgoing connections that just differ in the source port get hashed into the same TXQ. This patch fixes this problem for IPv4/6 by invoking the above routines after the source port is available for the socket. Fixes: b73c3d0e4 ("net: Save TX flow hash in sock and set in skbuf on xmit") Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 16:14:29 -04:00
Li RongQing	789f202326	xfrm6: fix a potential use after free in xfrm6_policy.c pskb_may_pull() may change skb->data and make the nh and exthdr pointers obsolete, so recompute nh and exthdr. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 15:38:48 -04:00
LEROY Christophe	8751b12cd9	net: fs_enet: set back promiscuity mode after restart After interface restart (eg: after link disconnection/reconnection), the bridge function doesn't work anymore. This is due to the promiscuous mode being cleared by the restart. The mac-fcc already includes code to set the promiscuous mode back during the restart. This patch adds the same handling to mac-fec and mac-scc. Tested with bridge function on MPC885 with FEC. Reported-by: Germain Montoies <germain.montoies@c-s.fr> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 15:33:13 -04:00
Karl Beldan	a63ba13eec	net: tso: fix unaligned access to crafted TCP header in helper API The crafted header start address is from a driver supplied buffer, which one can reasonably expect to be aligned on a 4-bytes boundary. However ATM the TSO helper API is only used by ethernet drivers and the tcp header will then be aligned to a 2-bytes only boundary from the header start address. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 12:52:55 -04:00
Jon Cooper	8fc963515e	sfc: remove incorrect EFX_BUG_ON_PARANOID check write_count and insert_count can wrap around, making > check invalid. Fixes: `70b33fb0dd` ("sfc: add support for skb->xmit_more"). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 12:51:16 -04:00
Sabrina Dubroca	7c1c97d54f	net: sched: initialize bstats syncp Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp. Fixes: `22e0f8b932` "net: sched: make bstats per cpu and estimator RCU safe" Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 21:45:21 -04:00
Alexei Starovoitov	32bf08a625	bpf: fix bug in eBPF verifier while comparing for verifier state equivalency the comparison was missing a check for uninitialized register. Make sure it does so and add a testcase. Fixes: `f1bca824da` ("bpf: add search pruning optimization to verifier") Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 21:43:46 -04:00
Thomas Graf	78fd1d0ab0	netlink: Re-add locking to netlink_lookup() and seq walker The synchronize_rcu() in netlink_release() introduces unacceptable latency. Reintroduce minimal lookup so we can drop the synchronize_rcu() until socket destruction has been RCUfied. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 21:34:49 -04:00
Ying Xue	1a194c2d59	tipc: fix lockdep warning when intra-node messages are delivered When running tipcTC&tipcTS test suite, below lockdep unsafe locking scenario is reported: [ 1109.997854] [ 1109.997988] ================================= [ 1109.998290] [ INFO: inconsistent lock state ] [ 1109.998575] 3.17.0-rc1+ #113 Not tainted [ 1109.998762] --------------------------------- [ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 1109.998762] (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] {SOFTIRQ-ON-W} state was registered at: [ 1109.998762] [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80 [ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0 [ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80 [ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc] [ 1109.998762] [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc] [ 1109.998762] [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc] [ 1109.998762] [<ffffffff817676ee>] SYSC_connect+0xae/0xc0 [ 1109.998762] [<ffffffff81767b7e>] SyS_connect+0xe/0x10 [ 1109.998762] [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200 [ 1109.998762] [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f [ 1109.998762] irq event stamp: 241060 [ 1109.998762] hardirqs last enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0 [ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0 [ 1109.998762] softirqs last enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50 [ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0 [ 1109.998762] [ 1109.998762] other info that might help us debug this: [ 1109.998762] Possible unsafe locking scenario: [ 1109.998762] [ 1109.998762] CPU0 [ 1109.998762] ---- [ 1109.998762] 
lock(slock-AF_TIPC); [ 1109.998762] <Interrupt> [ 1109.998762] lock(slock-AF_TIPC); [ 1109.998762] [ 1109.998762] * DEADLOCK * [ 1109.998762] [ 1109.998762] 2 locks held by swapper/7/0: [ 1109.998762] #0: (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70 [ 1109.998762] #1: (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc] [ 1109.998762] [ 1109.998762] stack backtrace: [ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113 [ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 1109.998762] ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007 [ 1109.998762] ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001 [ 1109.998762] ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000 [ 1109.998762] Call Trace: [ 1109.998762] <IRQ> [<ffffffff81a209eb>] dump_stack+0x4e/0x68 [ 1109.998762] [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202 [ 1109.998762] [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50 [ 1109.998762] [<ffffffff810a406c>] mark_lock+0x28c/0x2f0 [ 1109.998762] [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0 [ 1109.998762] [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80 [ 1109.998762] [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10 [ 1109.998762] [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0 [ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30 [ 1109.998762] [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0 [ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90 [ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc] [ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0 [ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0 [ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80 [ 1109.998762] [<ffffffffa0011969>] ? 
tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc] [ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc] [ 1109.998762] [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc] [ 1109.998762] [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc] [ 1109.998762] [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70 [ 1109.998762] [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70 [ 1109.998762] [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0 [ 1109.998762] [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70 [ 1109.998762] [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0 [ 1109.998762] [<ffffffff81785518>] napi_gro_receive+0xd8/0x240 [ 1109.998762] [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530 [ 1109.998762] [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0 [ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30 [ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90 [ 1109.998762] [<ffffffff817842b1>] net_rx_action+0x141/0x310 [ 1109.998762] [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150 [ 1109.998762] [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0 [ 1109.998762] [<ffffffff8105a626>] irq_exit+0x96/0xc0 [ 1109.998762] [<ffffffff81a30d07>] do_IRQ+0x67/0x110 [ 1109.998762] [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f [ 1109.998762] <EOI> [<ffffffff8100d2b7>] ? default_idle+0x37/0x250 [ 1109.998762] [<ffffffff8100d2b5>] ? default_idle+0x35/0x250 [ 1109.998762] [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20 [ 1109.998762] [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0 [ 1109.998762] [<ffffffff81034c78>] start_secondary+0x188/0x1f0 When intra-node messages are delivered from one process to another process, tipc_link_xmit() doesn't disable BH before it directly calls tipc_sk_rcv() on process context to forward messages to destination socket. 
Meanwhile, if messages delivered by a remote node arrive at the node and their destination is also the same socket, tipc_sk_rcv() running in process context might be preempted by tipc_sk_rcv() running in BH context. As a result, the latter cannot obtain the socket lock because it is held by the former; however, the former has no chance to run because the latter now owns the CPU, so a deadlock happens. To avoid this, BH should always be disabled in tipc_sk_rcv(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:28:15 -04:00
Ying Xue	7b8613e0a1	tipc: fix a potential deadlock Locking dependency detected the possible unsafe locking scenario below: CPU0 CPU1 T0: tipc_named_rcv() tipc_rcv() T1: [grab nametbl write lock]* [grab node lock]* T2: tipc_update_nametbl() tipc_node_link_up() T3: tipc_nodesub_subscribe() tipc_nametbl_publish() T4: [grab node lock]* [grab nametbl write lock]* The opposite order of holding the nametbl write lock and the node lock on the above two paths may result in a deadlock. If we move the updating of the name table after link state named out of node lock, the reverse order of holding locks will be eliminated and, with it, the deadlock risk. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:28:15 -04:00
David S. Miller	73829bf6fe	Merge branch 'enic' Govindarajulu Varadarajan says: ==================== enic: Bug fixes This series fixes the following problem. Please apply this to net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:24:30 -04:00
Govindarajulu Varadarajan	39dc90c159	enic: Do not call napi_disable when preemption is disabled. In enic_stop, we disable preemption using local_bh_disable(). We disable preemption to wait for busy_poll to finish. napi_disable should not be called here as it might sleep. Move the napi_disable() call outside of the local_bh_disable() section. BUG: sleeping function called from invalid context at include/linux/netdevice.h:477 in_atomic(): 1, irqs_disabled(): 0, pid: 443, name: ifconfig INFO: lockdep is turned off. Preemption disabled at:[<ffffffffa029c5c4>] enic_rfs_flw_tbl_free+0x34/0xd0 [enic] CPU: 31 PID: 443 Comm: ifconfig Not tainted 3.17.0-netnext-05504-g59f35b8 #268 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 ffff8800dac10000 ffff88020b8dfcb8 ffffffff8148a57c 0000000000000000 ffff88020b8dfcd0 ffffffff8107e253 ffff8800dac12a40 ffff88020b8dfd10 ffffffffa029305b ffff88020b8dfd48 ffff8800dac10000 ffff88020b8dfd48 Call Trace: [<ffffffff8148a57c>] dump_stack+0x4e/0x7a [<ffffffff8107e253>] __might_sleep+0x123/0x1a0 [<ffffffffa029305b>] enic_stop+0xdb/0x4d0 [enic] [<ffffffff8138ed7d>] __dev_close_many+0x9d/0xf0 [<ffffffff8138ef81>] __dev_close+0x31/0x50 [<ffffffff813974a8>] __dev_change_flags+0x98/0x160 [<ffffffff81397594>] dev_change_flags+0x24/0x60 [<ffffffff814085fd>] devinet_ioctl+0x63d/0x710 [<ffffffff81139c16>] ? might_fault+0x56/0xc0 [<ffffffff81409ef5>] inet_ioctl+0x65/0x90 [<ffffffff813768e0>] sock_do_ioctl+0x20/0x50 [<ffffffff81376ebb>] sock_ioctl+0x20b/0x2e0 [<ffffffff81197250>] do_vfs_ioctl+0x2e0/0x500 [<ffffffff81492619>] ? sysret_check+0x22/0x5d [<ffffffff81285f23>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8109fe19>] ? trace_hardirqs_on_caller+0x119/0x270 [<ffffffff811974ac>] SyS_ioctl+0x3c/0x80 [<ffffffff814925ed>] system_call_fastpath+0x1a/0x1f Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:24:25 -04:00
Govindarajulu Varadarajan	b6931c9ba7	enic: fix possible deadlock in enic_stop/ enic_rfs_flw_tbl_free The following warning is shown when spinlock debug is enabled. This occurs when the enic_flow_may_expire timer function is running and enic_stop is called on the same CPU. Fix this by using spin_lock_bh(). ================================= [ INFO: inconsistent lock state ] 3.17.0-netnext-05504-g59f35b8 #268 Not tainted --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. ifconfig/443 [HC0[0]:SC0[0]:HE1:SE1] takes: (&(&enic->rfs_h.lock)->rlock){+.?...}, at: enic_rfs_flw_tbl_free+0x34/0xd0 [enic] {IN-SOFTIRQ-W} state was registered at: [<ffffffff810a25af>] __lock_acquire+0x83f/0x21c0 [<ffffffff810a45f2>] lock_acquire+0xa2/0xd0 [<ffffffff814913fc>] _raw_spin_lock+0x3c/0x80 [<ffffffffa029c3d5>] enic_flow_may_expire+0x25/0x130[enic] [<ffffffff810bcd07>] call_timer_fn+0x77/0x100 [<ffffffff810bd8e3>] run_timer_softirq+0x1e3/0x270 [<ffffffff8105f9ae>] __do_softirq+0x14e/0x280 [<ffffffff8105fdae>] irq_exit+0x8e/0xb0 [<ffffffff8103da0f>] smp_apic_timer_interrupt+0x3f/0x50 [<ffffffff81493742>] apic_timer_interrupt+0x72/0x80 [<ffffffff81018143>] default_idle+0x13/0x20 [<ffffffff81018a6a>] arch_cpu_idle+0xa/0x10 [<ffffffff81097676>] cpu_startup_entry+0x2c6/0x330 [<ffffffff8103b7ad>] start_secondary+0x21d/0x290 irq event stamp: 2997 hardirqs last enabled at (2997): [<ffffffff81491865>] _raw_spin_unlock_irqrestore+0x65/0x90 hardirqs last disabled at (2996): [<ffffffff814915e6>] _raw_spin_lock_irqsave+0x26/0x90 softirqs last enabled at (2968): [<ffffffff813b57a3>] dev_deactivate_many+0x213/0x260 softirqs last disabled at (2966): [<ffffffff813b5783>] dev_deactivate_many+0x1f3/0x260 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&enic->rfs_h.lock)->rlock); <Interrupt> lock(&(&enic->rfs_h.lock)->rlock); * DEADLOCK * Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:24:25 -04:00
David S. Miller	d10845fc85	Merge branch 'gso_encap_fixes' Florian Westphal says: ==================== net: minor gso encapsulation fixes The following series fixes a minor bug in the gso segmentation handlers when encapsulation offload is used. Theoretically this could cause kernel panic when the stack tries to software-segment such a GRE offload packet, but it looks like there is only one affected call site (tbf scheduler) and it handles NULL return value. I've included a followup patch to add IS_ERR_OR_NULL checks where needed. While looking into this, I also found that size computation of the individual segments is incorrect if skb->encapsulation is set. Please see individual patches for delta vs. v1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-20 12:38:19 -04:00
