OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Nicolas Dichtel	963b89e80d	sit: fix 4in4 + IPsec scenario Since commit `32b8a8e59c` "sit: add IPv4 over IPv4 support", tunnel->parms.iph.protocol is 0 when both 4in4 and 6in4 are setup, but xfrm_lookup() is called only when proto is != 0, thus we need to pass the real value. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 13:42:03 -07:00
David S. Miller	a77471ff70	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== Just one patch this time. 1) Drop packets when the matching SA is in larval state and add a statistic counter for that. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 13:23:13 -07:00
Alexey Brodkin	a4a1139b24	arc_emac: fix compile-time errors & warnings on PPC64 As reported by "kbuild test robot" there were some errors and warnings on attempt to build kernel with "make ARCH=powerpc allmodconfig". And this patch addresses both errors and warnings. Below is a list of introduced changes: 1. Fix compile-time errors (misspellings in "dma_unmap_single") on PPC. 2. Use DMA address instead of "skb->data" as a pointer to data buffer. This fixed warnings on pointer to int conversion on 64-bit systems. 3. Re-implemented initial allocation of Rx buffers in "arc_emac_open" in the same way they're re-allocated during operation (receiving packets). So once again DMA address could be used instead of "skb->data". 4. Explicitly use EMAC_BUFFER_SIZE for Rx buffers allocation. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: netdev@vger.kernel.org Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: Joe Perches <joe@perches.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Mischa Jonker <mjonker@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: linux-kernel@vger.kernel.org Cc: devicetree-discuss@lists.ozlabs.org Cc: Florian Fainelli <florian@openwrt.org> Cc: David Laight <david.laight@aculab.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 01:35:44 -07:00
Veaceslav Falico	8599b52e14	bonding: add an option to fail when any of arp_ip_target is inaccessible Currently, we fail only when all of the ips in arp_ip_target are gone. However, in some situations we might need to fail if even one host from arp_ip_target becomes unavailable. All situations, obviously, rely on the idea that we need completely functional network, with all interfaces/addresses working correctly. One real world example might be: vlans on top on bond (hybrid port). If bond and vlans have ips assigned and we have their peers monitored via arp_ip_target - in case of switch misconfiguration (trunk/access port), slave driver malfunction or tagged/untagged traffic dropped on the way - we will be able to switch to another slave. Though any other configuration needs that if we need to have access to all arp_ip_targets. This patch adds this possibility by adding a new parameter - arp_all_targets (both as a module parameter and as a sysfs knob). It can be set to: 0 or any (the default) - which works exactly as it's working now - the slave is up if any of the arp_ip_targets are up. 1 or all - the slave is up if all of the arp_ip_targets are up. This parameter can be changed on the fly (via sysfs), and requires the mode to be active-backup and arp_validate to be enabled (it obeys the arp_validate config on which slaves to validate). Internally it's done through: 1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's an array of jiffies, meaning that slave->target_last_arp_rx[i] is the last time we've received arp from bond->params.arp_targets[i] on this slave. 2) If we successfully validate an arp from bond->params.arp_targets[i] in bond_validate_arp() - update the slave->target_last_arp_rx[i] with the current jiffies value. 3) When getting slave's last_rx via slave_last_rx(), we return the oldest time when we've received an arp from any address in bond->params.arp_targets[]. If the value of arp_all_targets == 0 - we still work the same way as before. Also, update the documentation to reflect the new parameter. v3->v4: Kill the forgotten rtnl_unlock(), rephrase the documentation part to be more clear, don't fail setting arp_all_targets if arp_validate is not set - it has no effect anyway but can be easier to set up. Also, print a warning if the last arp_ip_target is removed while the arp_interval is on, but not the arp_validate. v2->v3: Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(), use the same initialization value for target_last_arp_rx[] as is used for the default last_arp_rx, to avoid useless interface flaps. Also, instead of failing to remove the last arp_ip_target just print a warning - otherwise it might break existing scripts. v1->v2: Correctly handle adding/removing hosts in arp_ip_target - we need to shift/initialize all slave's target_last_arp_rx. Also, don't fail module loading on arp_all_targets misconfiguration, just disable it, and some minor style fixes. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	d7d35c681f	bonding: doc: some details on backup slave arp validation Add some details to bonding documentation on how backup slave arp validation works. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	aeea64ac71	bonding: don't trust arp requests unless active slave really works Currently, if we receive any arp packet on a backup slave in active-backup mode and arp_validate enabled, we suppose that it's an arp request, swap source/target ip and try to validate it. This optimization gives us virtually no downtime in the most common situation (active and backup slaves are in the same broadcast domain and the active slave failed). However, if we can't reach the arp_ip_target(s), we end up in an endless loop of reselecting slaves, because we receive our arp requests, sent by the active slave, and think that backup slaves are up, thus selecting them as active and, again, sending arp requests, which fool our backup slaves. Fix this by not validating the swapped arp packets if the current active slave didn't receive any arp reply after it was selected as active. This way we will only accept arp requests if we know that the current active slave can actually reach arp_ip_target. v3->v4: Obey 80 lines and make checkpatch.pl happy, per Sergei's suggestion. v1->v3: No change. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	2c14610210	bonding: don't validate arp if we don't have to Currently, we validate all the incoming arps if arp_validate not 0. However, we don't have to validate backup slaves if arp_validate == active and vice versa, so return early in bond_arp_rcv() in these cases. It works correctly now because we verify arp_validate in slave_last_rx(), however we're just doing useless work in bond_arp_rcv(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	0afee4e8b9	bonding: don't add duplicate targets to arp_ip_target Print a warning and skip them. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:38 -07:00
Veaceslav Falico	87a7b84b58	bonding: add helper function bond_get_targets_ip(targets, ip) Add function bond_get_targets_ip(targets, ip) which searches through targets array of ips (arp_targets) and returns the position of first match. If ip == 0, returns the first free slot. On failure to find the ip or free slot, return -1. Use it to verify if the arp we've received is valid and in sysfs. v1->v2: Fix "[2/6] bonding: add helper function bond_get_targets_ip(targets, ip)", per Nikolay's advice, to verify if source ip != 0.0.0.0, otherwise we might update 'null' arp_ip_targets' last_rx. Also, address style. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:58:37 -07:00
Lad, Prabhakar	277e2a84c1	net: davinci_mdio: gaurd the DT code with IS_ENABLED(CONFIG_OF) guard the davinci_mdio_of_mtable table and davinci_mdio_probe_dt() with CONFIG_OF. Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:52:29 -07:00
Lad, Prabhakar	151328c828	net: davinci_emac: simplify the OF parser code This patch cleans up the OF parser code, removes unnecessary checks on of_property_read_*() and guards davinci_emac_of_match table with CONFIG_OF. Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:52:29 -07:00
Lad, Prabhakar	6892b41d97	net: davinci: emac: Convert to devm_* api Use devm_ioremap_resource instead of devm_request_mem_region()/devm_ioremap() and devm_request_irq() instead of request_irq(). This ensures more consistent error values and simplifies error paths. Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:52:29 -07:00
Cong Wang	7623757661	doc: fix some syntax errors in netlink mmap sample code Cc: Patrick McHardy <kaber@trash.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:47:02 -07:00
Vlad Yasevich	3e4f8b7873	macvtap: Perform GSO on forwarding path. When macvtap forwards skb to its tap, it needs to check if GSO needs to be performed. This is sometimes necessary when the HW device performed GRO, but the guest reading from the tap does not support it (ex: Windows 7). Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:45:23 -07:00
Vlad Yasevich	2be5c76794	macvtap: Let TUNSETOFFLOAD actually controll offload features. When the user issues TUNSETOFFLOAD ioctl, macvtap does not do anything other then to verify arguments. This patch adds functionality to allow users to actually control offload features. NETIF_F_GSO and NETIF_F_GRO are always on, but the rest of the features can be controlled. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:44:56 -07:00
Vlad Yasevich	ac4e4af1e5	macvtap: Consistently use rcu functions Currently macvtap uses rcu_bh functions in its user facing fuction macvtap_get_user() and macvtap_put_user(). However, its packet handlers use normal rcu as the rcu_read_lock() is taken in netif_receive_skb(). We can safely discontinue the usage or rcu with bh disabled. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:44:56 -07:00
Vlad Yasevich	441ac0fcaa	macvtap: Convert to using rtnl lock Macvtap uses a private lock to protect the relationship between macvtap_queue and macvlan_dev. The private lock is not needed since the relationship is managed by user via open(), release(), and dellink() calls. dellink() already happens under rtnl, so we can safely convert open() and release(), and use it in ioctl() as well. Suggested by Eric Dumazet. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:44:56 -07:00
Eliezer Tamir	2d48d67fa8	net: poll/select low latency socket support select/poll busy-poll support. Split sysctl value into two separate ones, one for read and one for poll. updated Documentation/sysctl/net.txt Add a new poll flag POLL_LL. When this flag is set, sock_poll will call sk_poll_ll if possible. sock_poll sets this flag in its return value to indicate to select/poll when a socket that can busy poll is found. When poll/select have nothing to report, call the low-level sock_poll again until we are out of time or we find something. Once the system call finds something, it stops setting POLL_LL, so it can return the result to the user ASAP. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:35:52 -07:00
Alexey Brodkin	e4f2379db6	ethernet/arc/arc_emac - Add new driver Driver for non-standard on-chip ethernet device ARC EMAC 10/100, instantiated in some legacy ARC (Synopsys) FPGA Boards such as ARCAngel4/ML50x. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: Joe Perches <joe@perches.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Mischa Jonker <mjonker@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-kernel@vger.kernel.org Cc: devicetree-discuss@lists.ozlabs.org Cc: Florian Fainelli <florian@openwrt.org> Cc: David Laight <david.laight@aculab.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:34:32 -07:00
Daniel Borkmann	62208f1245	net: sctp: simplify sctp_get_port No need to have an extra ret variable when we directly can return the value of sctp_get_port_local(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:05 -07:00
Daniel Borkmann	0a2fbac197	net: sctp: decouple cleaning some socket data from endpoint Rather instead of having the endpoint clean the garbage from the socket, use a sk_destruct handler sctp_destruct_sock(), that does the job for that when there are no more references on the socket. At least do this for our crypto transform through crypto_free_hash() that is allocated when in listening state. Also, perform sctp_put_port() only when sk is valid. At a later point in time we can still determine if there's an option of placing this into sk_prot->unhash() or sctp_endpoint_free() without any races. For now, leave it in sctp_endpoint_destroy() though. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Daniel Borkmann	b527fe6933	net: sctp: minor: sctp_seq_dump_local_addrs add missing newline A trailing newline has been forgotten to add into the WARN(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Daniel Borkmann	52db882f3f	net: sctp: migrate cookie life from timeval to ktime Currently, SCTP code defines its own timeval functions (since timeval is rarely used inside the kernel by others), namely tv_lt() and TIMEVAL_ADD() macros, that operate on SCTP cookie expiration. We might as well remove all those, and operate directly on ktime structures for a couple of reasons: ktime is available on all archs; complexity of ktime calculations depending on the arch is less than (reduces to a simple arithmetic operations on archs with BITS_PER_LONG == 64 or CONFIG_KTIME_SCALAR) or equal to timeval functions (other archs); code becomes more readable; macros can be thrown out. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Daniel Borkmann	d36f82b243	ktime: add ms_to_ktime() and ktime_add_ms() helpers Add two ktime helper functions that i) convert a given msec value to a ktime structure and ii) that adds a msec value to a ktime structure. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Daniel Borkmann	f072d7aba7	net: sctp: remove TEST_FRAME ifdef We do neither ship a test_frame.h, nor will this be compatible with the 2.5 out-of-tree lksctp kernel test suite anyway. So remove this artefact. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Jack Morgenstein	30e514a717	net/mlx4_core: Fail device init if num_vfs is negative Should not allow negative num_vfs Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com> Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Dotan Barak	674925edb4	net/mlx4_core: Add warning in case of command timeouts Warning prints when there are command timeout to help debugging future failures. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Dotan Barak	618fad954b	net/mlx4_core: Replace sscanf() with kstrtoint() It is not safe to use sscanf. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com> Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Dotan Barak	42f1e9020e	net/mlx4_en: Remove an unnecessary test Since this variable is now part of a structure and not allocated dynamically, this test is irrelevant now. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Yevgeny Petrilin	b944ebec78	net/mlx4_en: Add prints when TX timeout occurs Print a warning when a TX timeout is detected Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Eugenia Emantayev	0cc5c8bf11	net/mlx4_en: Fix a race between napi poll function and RX ring cleanup The RX rings were cleaned while there was still possible RX traffic completion handling. Change the sequance of events so that the port is closed and the QPs are being stopped before RX cleanup. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Eugenia Emantayev	9e19b54554	net/mlx4_en: Change log level from error to debug for vlan related messages The port vlan table size is 126 (used for IBoE) so after 126 we will not have space and the user need to see it only in debug print and not error. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Eugenia Emantayev	4801ae70d8	net/mlx4_en: Move register_netdev() to the end of initialization function To avoid a race between the open function and everything that happens after register_netdev() move it to be the last operation called. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Jack Morgenstein	6123db2ec5	net/mlx4_en: Do not query stats when device port is down There are no counters allocated to the eth device when the port is down, so this query is meaningless at that time. It also leads to querying incorrect counters (since the counter_index is not valid when the device port is down). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:38 -07:00
Dotan Barak	8850494a33	net/mlx4_en: Fix resource leak in error flow Wrong condition was used when calling iounmap. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:38 -07:00
Hannes Frederic Sowa	2b9651d72d	ipv6: remove old token ipv6 address as soon as possible If the tokenized ip address is re-set on an interface we depend on the arrival of a new router advertisment to call addrconf_verify to clean up the old address (which valid_lft is now set to 0). Old addresses can linger around for a longer time if e.g. the source of router advertisments vanishes. So, call addrconf_verify immediately after setting the new tokenized address to get rid of the old tokenized addresses. Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:28:24 -07:00
Hannes Frederic Sowa	876fd05ddb	ipv6: don't disable interface if last ipv6 address is removed The reason behind this change is that as soon as we delete the last ipv6 address of an interface we also lose the /proc/sys/net/ipv6/conf/<interface> directory. This seems to be a usability problem for me. I don't see any reason why we should shutdown ipv6 on that interface in such cases. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:23:03 -07:00
Hannes Frederic Sowa	b7b1bfce0b	ipv6: split duplicate address detection and router solicitation timer This patch splits the timers for duplicate address detection and router solicitations apart. The router solicitations timer goes into inet6_dev and the dad timer stays in inet6_ifaddr. The reason behind this patch is to reduce the number of unneeded router solicitations send out by the host if additional link-local addresses are created. Currently we send out RS for every link-local address on an interface. If the RS timer fires we pick a source address with ipv6_get_lladdr. This change could hurt people adding additional link-local addresses and specifying these addresses in the radvd clients section because we no longer guarantee that we use every ll address as source address in router solicitations. Cc: Flavio Leitner <fleitner@redhat.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David Stevens <dlstevens@us.ibm.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:23:03 -07:00
Eric Dumazet	51151a16a6	mlx4: allow order-0 memory allocations in RX path Signed-off-by: Eric Dumazet <edumazet@google.com> mlx4 exclusively uses order-2 allocations in RX path, which are likely to fail under memory pressure. We therefore drop frames more than needed. This patch tries order-3, order-2, order-1 and finally order-0 allocations to keep good performance, yet allow allocations if/when memory gets fragmented. By using larger pages, and avoiding unnecessary get_page()/put_page() on compound pages, this patch improves performance as well, lowering false sharing on struct page. Also use GFP_KERNEL allocations in initialization path, as allocating 12 MB (390 order-3 pages) can easily fail with GFP_ATOMIC. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:18:31 -07:00
David S. Miller	3bae9db9aa	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next Ben Hutchings says: ==================== 1. Make EEH recovery work when using legacy interrupts, from Alexandre Rames. 2. Enable accelerated RFS for VLAN-tagged flows, from Andy Lutomirski. 3. Improve performance for non-TCP (and particularly UDP) traffic, which regressed in 3.10 when we switched to always allocating paged RX buffers. Partly by Jon Cooper. 4. Some minor bug fixes to IOMMU detection, timestamping capabilities, and IRQ cleanup on the probe failure path. I've dropped the RX skb cache, which improved some benchmarks but perhaps needs some reworking to be more generally useful. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:11:41 -07:00
Sebastian Ott	6541aa52a0	qeth: use default napi weight Since commit `82dc3c63c6` "net: introduce NAPI_POLL_WEIGHT" network drivers receive a warning when they use napi weight higher than NAPI_POLL_WEIGHT. This patch reduces QETH_NAPI_WEIGHT from 128 to 64 (NAPI_POLL_WEIGHT). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:10:14 -07:00
Stefan Raspl	ede8867166	qeth: Fix crash on initial MTU size change When the initial MTU size is changed prior to any activity on the device (e.g. by attaching a z/VM vNIC already configured in Linux to a guestLAN), we call dev_kfree_skb_irq(NULL) which results in a kernel panic. Adding a proper check for NULL pointers to address this issue. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:10:14 -07:00
Ursula Braun	a0c98523d7	qeth: change default standard blkt settings for OSA blkt settings (or LAN idle settings) for an OSA Express card determine when and how often an OSA Express card tells the operating system about new incoming packets. The semantic of these settings has changed starting with OSA Express3. Currently the qeth standard settings apply to OSA Express2 and older generations of OSA Express cards, while new generations of OSA Express cards require extra coding of their reasonable default. To cover future OSA Express generations the qeth default standard blkt setting is now the desired setting for OSA generations starting with OSA Express3, while the fixed set of older OSA Express cards receives its blkt settings explicitly. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Reviewed-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:10:14 -07:00
Stefan Raspl	fe44014a82	qeth: Increase default MTU for OSA devices Increase the default MTU for real OSA devices in layer 2 mode to 1500 Bytes for increased compatibility. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:10:14 -07:00
Andy Shevchenko	819bc78f53	netiucv: remove unused macro If someone is interested to dump something they may consider to use print_hex_dump() or print_hex_dump_bytes() kernel helpers. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:10:14 -07:00
Yuval Mintz	c957d09ffd	bnx2x: Remove sparse and coccinelle warnings This patch solves several sparse issues as well as an unneeded semicolon found via coccinelle. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 02:46:05 -07:00
Eric Dumazet	6da334ee0c	ipv6: add include file to suppress sparse warnings commit `f88c91ddba` ("ipv6: statically link register_inet6addr_notifier()" added following sparse warnings : net/ipv6/addrconf_core.c:83:5: warning: symbol 'register_inet6addr_notifier' was not declared. Should it be static? net/ipv6/addrconf_core.c:89:5: warning: symbol 'unregister_inet6addr_notifier' was not declared. Should it be static? net/ipv6/addrconf_core.c:95:5: warning: symbol 'inet6addr_notifier_call_chain' was not declared. Should it be static? Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 02:44:05 -07:00
Eric Dumazet	7ae8639c9d	tcp: remove invalid __rcu annotation struct tcp_fastopen_context has a field named tfm, which is a pointer to a crypto_cipher structure. It currently has a __rcu annotation, which is not needed at all. tcp_fastopen_ctx is the pointer fetched by rcu_dereference(), but once we have a pointer to current tcp_fastopen_context, we do not use/need rcu_dereference() to access tfm. This fixes a lot of sparse errors like the following : net/ipv4/tcp_fastopen.c:21:31: warning: incorrect type in argument 1 (different address spaces) net/ipv4/tcp_fastopen.c:21:31: expected struct crypto_cipher tfm net/ipv4/tcp_fastopen.c:21:31: got struct crypto_cipher [noderef] <asn:4>tfm Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 02:44:05 -07:00
Daniel Borkmann	e4fc408e0e	packet: nlmon: virtual netlink monitoring device for packet sockets Currently, there is no good possibility to debug netlink traffic that is being exchanged between kernel and user space. Therefore, this patch implements a netlink virtual device, so that netlink messages will be made visible to PF_PACKET sockets. Once there was an approach with a similar idea [1], but it got forgotten somehow. I think it makes most sense to accept the "overhead" of an extra netlink net device over implementing the same functionality from PF_PACKET sockets once again into netlink sockets. We have BPF filters that can already be easily applied which even have netlink extensions, we have RX_RING zero-copy between kernel- and user space that can be reused, and much more features. So instead of re-implementing all of this, we simply pass the skb to a given PF_PACKET socket for further analysis. Another nice benefit that comes from that is that no code needs to be changed in user space packet analyzers (maybe adding a dissector, but not more), thus out of the box, we can already capture pcap files of netlink traffic to debug/troubleshoot netlink problems. Also thanks goes to Thomas Graf, Flavio Leitner, Jesper Dangaard Brouer. [1] http://marc.info/?l=linux-netdev&m=113813401516110 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 16:39:05 -07:00
Daniel Borkmann	bcbde0d449	net: netlink: virtual tap device management Similarly to the networking receive path with ptype_all taps, we add the possibility to register netdevices that are for ARPHRD_NETLINK to the netlink subsystem, so that those can be used for netlink analyzers resp. debuggers. We do not offer a direct callback function as out-of-tree modules could do crap with it. Instead, a netdevice must be registered properly and only receives a clone, managed by the netlink layer. Symbols are exported as GPL-only. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 16:39:05 -07:00
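
Commit 87a7b84b58 in the list above describes bond_get_targets_ip(targets, ip) as a search that returns the position of the first match, returns the first free slot when ip == 0, and returns -1 otherwise. The lookup can be sketched in user-space C roughly as follows. This is an illustration only, not the code from the commit: the name sketch_get_targets_ip, the SKETCH_MAX_TARGETS size, the use of uint32_t for addresses, and the assumption that used slots are packed at the front of the array (so the scan can stop at the first empty slot) are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical array size for this sketch; the kernel uses its own
 * BOND_MAX_ARP_TARGETS constant. */
#define SKETCH_MAX_TARGETS 16

/*
 * Return the index of the first slot equal to ip.
 * A slot holding 0 is treated as free, so calling with ip == 0
 * yields the first free slot. Returns -1 if ip is not present
 * (or, for ip == 0, if the array is full).
 * Assumes used slots are packed at the front of the array.
 */
static int sketch_get_targets_ip(const uint32_t *targets, uint32_t ip)
{
    int i;

    for (i = 0; i < SKETCH_MAX_TARGETS; i++) {
        if (targets[i] == ip)
            return i;
        if (targets[i] == 0)
            break; /* reached the free area; ip cannot follow */
    }
    return -1;
}
```

With an array whose first two slots hold 10 and 20 and whose remaining slots are 0, looking up 20 gives index 1, looking up 0 gives index 2 (the first free slot), and looking up 30 gives -1.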

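
Commit d36f82b243 in the list above adds two helpers, ms_to_ktime() and ktime_add_ms(), which convert a millisecond value to a ktime and add a millisecond value to a ktime. A minimal user-space sketch of that arithmetic is shown here, assuming a ktime can be modelled as a plain signed 64-bit nanosecond count. The sketch_ names, the typedef, and the constant are hypothetical and are not the kernel's definitions.

```c
#include <stdint.h>

/* Assumed model for this sketch: a time value is a signed
 * 64-bit count of nanoseconds. */
typedef int64_t sketch_ktime_t;

#define SKETCH_NSEC_PER_MSEC 1000000LL

/* Convert a millisecond count to the nanosecond-based value. */
static sketch_ktime_t sketch_ms_to_ktime(uint64_t ms)
{
    return (sketch_ktime_t)ms * SKETCH_NSEC_PER_MSEC;
}

/* Add a millisecond count to an existing time value. */
static sketch_ktime_t sketch_ktime_add_ms(sketch_ktime_t kt, uint64_t ms)
{
    return kt + sketch_ms_to_ktime(ms);
}
```

Under this model, a cookie lifetime given in milliseconds (as in the sctp commit 52db882f3f) could be added to a start time with one call instead of separate seconds and microseconds arithmetic.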
378205 Commits

All Branches