linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Mugunthan V N	fc7a99fb71	drivers: net: davinci_cpdma: remove spinlock as SOFTIRQ-unsafe lock order detected remove spinlock in cpdma_desc_pool_destroy() as there is no active cpdma channel and iounmap should be called without auquiring lock. root@dra7xx-evm:~# modprobe -r ti_cpsw [ 50.539743] [ 50.541312] ====================================================== [ 50.547796] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] [ 50.554826] 3.14.19-02124-g95c5b7b #308 Not tainted [ 50.559939] ------------------------------------------------------ [ 50.566416] modprobe/1921 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 50.573347] (vmap_area_lock){+.+...}, at: [<c01127fc>] find_vmap_area+0x10/0x6c [ 50.581132] [ 50.581132] and this task is already holding: [ 50.587249] (&(&pool->lock)->rlock#2){..-...}, at: [<bf017c74>] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma] [ 50.597766] which would create a new lock dependency: [ 50.603048] (&(&pool->lock)->rlock#2){..-...} -> (vmap_area_lock){+.+...} [ 50.610296] [ 50.610296] but this new dependency connects a SOFTIRQ-irq-safe lock: [ 50.618601] (&(&pool->lock)->rlock#2){..-...} ... which became SOFTIRQ-irq-safe at: [ 50.626829] [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c [ 50.632677] [<bf01773c>] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma] [ 50.640437] [<bf0177e8>] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma] [ 50.647289] [<bf017908>] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma] [ 50.654512] [<bf017984>] cpdma_chan_process+0x3c/0x54 [davinci_cpdma] [ 50.661455] [<bf0277e8>] cpsw_poll+0x14/0xa8 [ti_cpsw] [ 50.667038] [<c05844f4>] net_rx_action+0xc0/0x1e8 [ 50.672150] [<c0048234>] __do_softirq+0xcc/0x304 [ 50.677183] [<c004873c>] irq_exit+0xa8/0xfc [ 50.681751] [<c000eeac>] handle_IRQ+0x50/0xb0 [ 50.686513] [<c0008638>] gic_handle_irq+0x28/0x5c [ 50.691628] [<c06590a4>] __irq_svc+0x44/0x5c [ 50.696289] [<c0658ab4>] _raw_spin_unlock_irqrestore+0x34/0x44 [ 50.702591] [<c065a9c4>] do_page_fault.part.9+0x144/0x3c4 [ 50.708433] [<c065acb8>] do_page_fault+0x74/0x84 [ 50.713453] [<c00083dc>] do_DataAbort+0x34/0x98 [ 50.718391] [<c065923c>] __dabt_usr+0x3c/0x40 [ 50.723148] [ 50.723148] to a SOFTIRQ-irq-unsafe lock: [ 50.728893] (vmap_area_lock){+.+...} ... which became SOFTIRQ-irq-unsafe at: [ 50.736476] ... [<c06584e8>] _raw_spin_lock+0x28/0x38 [ 50.741876] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300 [ 50.747908] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134 [ 50.754210] [<c011486c>] get_vm_area_caller+0x3c/0x48 [ 50.759692] [<c0114be0>] vmap+0x40/0x78 [ 50.763900] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0 [ 50.769835] [<c093eac0>] start_kernel+0x320/0x388 [ 50.774952] [<80008074>] 0x80008074 [ 50.778793] [ 50.778793] other info that might help us debug this: [ 50.778793] [ 50.787181] Possible interrupt unsafe locking scenario: [ 50.787181] [ 50.794295] CPU0 CPU1 [ 50.799042] ---- ---- [ 50.803785] lock(vmap_area_lock); [ 50.807446] local_irq_disable(); [ 50.813652] lock(&(&pool->lock)->rlock#2); [ 50.820782] lock(vmap_area_lock); [ 50.827086] <Interrupt> [ 50.829823] lock(&(&pool->lock)->rlock#2); [ 50.834490] [ 50.834490] * DEADLOCK * [ 50.834490] [ 50.840695] 4 locks held by modprobe/1921: [ 50.844981] #0: (&__lockdep_no_validate__){......}, at: [<c03e53e8>] driver_detach+0x44/0xb8 [ 50.854038] #1: (&__lockdep_no_validate__){......}, at: [<c03e53f4>] driver_detach+0x50/0xb8 [ 50.863102] #2: (&(&ctlr->lock)->rlock){......}, at: [<bf017c34>] cpdma_ctlr_destroy+0x1c/0x114 [davinci_cpdma] [ 50.873890] #3: (&(&pool->lock)->rlock#2){..-...}, at: [<bf017c74>] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma] [ 50.884871] the dependencies between SOFTIRQ-irq-safe lock and the holding lock: [ 50.892827] -> (&(&pool->lock)->rlock#2){..-...} ops: 167 { [ 50.898703] IN-SOFTIRQ-W at: [ 50.901995] [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c [ 50.909476] [<bf01773c>] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma] [ 50.918878] [<bf0177e8>] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma] [ 50.927366] [<bf017908>] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma] [ 50.936218] [<bf017984>] cpdma_chan_process+0x3c/0x54 [davinci_cpdma] [ 50.944794] [<bf0277e8>] cpsw_poll+0x14/0xa8 [ti_cpsw] [ 50.952009] [<c05844f4>] net_rx_action+0xc0/0x1e8 [ 50.958765] [<c0048234>] __do_softirq+0xcc/0x304 [ 50.965432] [<c004873c>] irq_exit+0xa8/0xfc [ 50.971635] [<c000eeac>] handle_IRQ+0x50/0xb0 [ 50.978035] [<c0008638>] gic_handle_irq+0x28/0x5c [ 50.984788] [<c06590a4>] __irq_svc+0x44/0x5c [ 50.991085] [<c0658ab4>] _raw_spin_unlock_irqrestore+0x34/0x44 [ 50.999023] [<c065a9c4>] do_page_fault.part.9+0x144/0x3c4 [ 51.006510] [<c065acb8>] do_page_fault+0x74/0x84 [ 51.013171] [<c00083dc>] do_DataAbort+0x34/0x98 [ 51.019738] [<c065923c>] __dabt_usr+0x3c/0x40 [ 51.026129] INITIAL USE at: [ 51.029335] [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c [ 51.036729] [<bf017d78>] cpdma_chan_submit+0x4c/0x2f0 [davinci_cpdma] [ 51.045225] [<bf02863c>] cpsw_ndo_open+0x378/0x6bc [ti_cpsw] [ 51.052897] [<c058747c>] __dev_open+0x9c/0x104 [ 51.059287] [<c05876ec>] __dev_change_flags+0x88/0x160 [ 51.066420] [<c05877e4>] dev_change_flags+0x18/0x48 [ 51.073270] [<c05ed51c>] devinet_ioctl+0x61c/0x6e0 [ 51.080029] [<c056ee54>] sock_ioctl+0x5c/0x298 [ 51.086418] [<c01350a4>] do_vfs_ioctl+0x78/0x61c [ 51.092993] [<c01356ac>] SyS_ioctl+0x64/0x74 [ 51.099200] [<c000e580>] ret_fast_syscall+0x0/0x48 [ 51.105956] } [ 51.107696] ... key at: [<bf019000>] __key.21312+0x0/0xfffff650 [davinci_cpdma] [ 51.115912] ... acquired at: [ 51.119019] [<c00899ac>] lock_acquire+0x9c/0x104 [ 51.124138] [<c06584e8>] _raw_spin_lock+0x28/0x38 [ 51.129341] [<c01127fc>] find_vmap_area+0x10/0x6c [ 51.134547] [<c0114960>] remove_vm_area+0x8/0x6c [ 51.139659] [<c0114a7c>] __vunmap+0x20/0xf8 [ 51.144318] [<c001c350>] __arm_iounmap+0x10/0x18 [ 51.149440] [<bf017d08>] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma] [ 51.156560] [<bf026294>] cpsw_remove+0x48/0x8c [ti_cpsw] [ 51.162407] [<c03e62c8>] platform_drv_remove+0x18/0x1c [ 51.168063] [<c03e4c44>] __device_release_driver+0x70/0xc8 [ 51.174094] [<c03e5458>] driver_detach+0xb4/0xb8 [ 51.179212] [<c03e4a6c>] bus_remove_driver+0x4c/0x90 [ 51.184693] [<c00b024c>] SyS_delete_module+0x10c/0x198 [ 51.190355] [<c000e580>] ret_fast_syscall+0x0/0x48 [ 51.195661] [ 51.197217] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock: [ 51.205986] -> (vmap_area_lock){+.+...} ops: 520 { [ 51.211032] HARDIRQ-ON-W at: [ 51.214321] [<c06584e8>] _raw_spin_lock+0x28/0x38 [ 51.221090] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300 [ 51.228750] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134 [ 51.236690] [<c011486c>] get_vm_area_caller+0x3c/0x48 [ 51.243811] [<c0114be0>] vmap+0x40/0x78 [ 51.249654] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0 [ 51.257239] [<c093eac0>] start_kernel+0x320/0x388 [ 51.263994] [<80008074>] 0x80008074 [ 51.269474] SOFTIRQ-ON-W at: [ 51.272769] [<c06584e8>] _raw_spin_lock+0x28/0x38 [ 51.279525] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300 [ 51.287190] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134 [ 51.295126] [<c011486c>] get_vm_area_caller+0x3c/0x48 [ 51.302245] [<c0114be0>] vmap+0x40/0x78 [ 51.308094] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0 [ 51.315669] [<c093eac0>] start_kernel+0x320/0x388 [ 51.322423] [<80008074>] 0x80008074 [ 51.327906] INITIAL USE at: [ 51.331112] [<c06584e8>] _raw_spin_lock+0x28/0x38 [ 51.337775] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300 [ 51.345352] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134 [ 51.353197] [<c011486c>] get_vm_area_caller+0x3c/0x48 [ 51.360224] [<c0114be0>] vmap+0x40/0x78 [ 51.365977] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0 [ 51.373464] [<c093eac0>] start_kernel+0x320/0x388 [ 51.380131] [<80008074>] 0x80008074 [ 51.385517] } [ 51.387260] ... key at: [<c0a66948>] vmap_area_lock+0x10/0x20 [ 51.393841] ... acquired at: [ 51.396945] [<c00899ac>] lock_acquire+0x9c/0x104 [ 51.402060] [<c06584e8>] _raw_spin_lock+0x28/0x38 [ 51.407266] [<c01127fc>] find_vmap_area+0x10/0x6c [ 51.412478] [<c0114960>] remove_vm_area+0x8/0x6c [ 51.417592] [<c0114a7c>] __vunmap+0x20/0xf8 [ 51.422252] [<c001c350>] __arm_iounmap+0x10/0x18 [ 51.427369] [<bf017d08>] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma] [ 51.434487] [<bf026294>] cpsw_remove+0x48/0x8c [ti_cpsw] [ 51.440336] [<c03e62c8>] platform_drv_remove+0x18/0x1c [ 51.446000] [<c03e4c44>] __device_release_driver+0x70/0xc8 [ 51.452031] [<c03e5458>] driver_detach+0xb4/0xb8 [ 51.457147] [<c03e4a6c>] bus_remove_driver+0x4c/0x90 [ 51.462628] [<c00b024c>] SyS_delete_module+0x10c/0x198 [ 51.468289] [<c000e580>] ret_fast_syscall+0x0/0x48 [ 51.473584] [ 51.475140] [ 51.475140] stack backtrace: [ 51.479703] CPU: 0 PID: 1921 Comm: modprobe Not tainted 3.14.19-02124-g95c5b7b #308 [ 51.487744] [<c0016090>] (unwind_backtrace) from [<c0012060>] (show_stack+0x10/0x14) [ 51.495865] [<c0012060>] (show_stack) from [<c0652a20>] (dump_stack+0x78/0x94) [ 51.503444] [<c0652a20>] (dump_stack) from [<c0086f18>] (check_usage+0x408/0x594) [ 51.511293] [<c0086f18>] (check_usage) from [<c00870f8>] (check_irq_usage+0x54/0xb0) [ 51.519416] [<c00870f8>] (check_irq_usage) from [<c0088724>] (__lock_acquire+0xe54/0x1b90) [ 51.528077] [<c0088724>] (__lock_acquire) from [<c00899ac>] (lock_acquire+0x9c/0x104) [ 51.536291] [<c00899ac>] (lock_acquire) from [<c06584e8>] (_raw_spin_lock+0x28/0x38) [ 51.544417] [<c06584e8>] (_raw_spin_lock) from [<c01127fc>] (find_vmap_area+0x10/0x6c) [ 51.552726] [<c01127fc>] (find_vmap_area) from [<c0114960>] (remove_vm_area+0x8/0x6c) [ 51.560935] [<c0114960>] (remove_vm_area) from [<c0114a7c>] (__vunmap+0x20/0xf8) [ 51.568693] [<c0114a7c>] (__vunmap) from [<c001c350>] (__arm_iounmap+0x10/0x18) [ 51.576362] [<c001c350>] (__arm_iounmap) from [<bf017d08>] (cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma]) [ 51.586494] [<bf017d08>] (cpdma_ctlr_destroy [davinci_cpdma]) from [<bf026294>] (cpsw_remove+0x48/0x8c [ti_cpsw]) [ 51.597261] [<bf026294>] (cpsw_remove [ti_cpsw]) from [<c03e62c8>] (platform_drv_remove+0x18/0x1c) [ 51.606659] [<c03e62c8>] (platform_drv_remove) from [<c03e4c44>] (__device_release_driver+0x70/0xc8) [ 51.616237] [<c03e4c44>] (__device_release_driver) from [<c03e5458>] (driver_detach+0xb4/0xb8) [ 51.625264] [<c03e5458>] (driver_detach) from [<c03e4a6c>] (bus_remove_driver+0x4c/0x90) [ 51.633749] [<c03e4a6c>] (bus_remove_driver) from [<c00b024c>] (SyS_delete_module+0x10c/0x198) [ 51.642781] [<c00b024c>] (SyS_delete_module) from [<c000e580>] (ret_fast_syscall+0x0/0x48) Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:38 -04:00
Mugunthan V N	ff9538b1fc	drivers: net: davinci_cpdma: remove kfree on objects allocated with devm_* apis memories allocated with devm_* apis must not be freed with kfree apis, so removing the kfree calls Fixes: `e194312854` ('drivers: net: davinci_cpdma: Convert kzalloc() to devm_kzalloc().') Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:37 -04:00
Prashant Sreedharan	2c7c9ea429	tg3: Add skb->xmit_more support Ring TX doorbell only if xmit_more is not set or the queue is stopped. Suggested-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:37 -04:00
Jiri Pirko	f76936d07c	ipv4: fix nexthop attlen check in fib_nh_match fib_nh_match does not match nexthops correctly. Example: ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \ nexthop via 192.168.122.13 dev eth0 ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \ nexthop via 192.168.122.15 dev eth0 Del command is successful and route is removed. After this patch applied, the route is correctly matched and result is: RTNETLINK answers: No such process Please consider this for stable trees as well. Fixes: `4e902c5741` ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:37 -04:00
Eric Dumazet	ad971f616a	tcp: fix tcp_ack() performance problem We worked hard to improve tcp_ack() performance, by not accessing skb_shinfo() in fast path (`cd7d8498c9` tcp: change tcp_skb_pcount() location) We still have one spurious access because of ACK timestamping, added in commit `e1c8a607b2` ("net-timestamp: ACK timestamp for bytestreams") By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set, we can avoid two cache line misses for the common case. While we are at it, add two prefetchw() : One in tcp_ack() to bring skb at the head of write queue. One in tcp_clean_rtx_queue() loop to bring following skb, as we will delete skb from the write queue and dirty skb->next->prev. Add a couple of [un]likely() clauses. After this patch, tcp_ack() is no longer the most consuming function in tcp stack. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Van Jacobson <vanj@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:37 -04:00
David S. Miller	14cee8e377	Merge branch 'isdn' Tilman Schmidt says: ==================== Coverity patches for drivers/isdn Here's a series of patches for the ISDN CAPI subsystem and the Gigaset ISDN driver. Patches 1 to 7 are specific fixes for Coverity warnings. Patches 8 to 11 fix related problems with the handling of invalid CAPI command codes I noticed while working on this. Patch 12 fixes an unrelated problem I noticed during the subsequent regression tests. It would be great if these could still be merged. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:39 -04:00
Tilman Schmidt	86f8ef2c48	isdn/gigaset: fix usb_gigaset write_cmd result race In usb_gigaset function gigaset_write_cmd(), the length field of the command buffer structure could be cleared by the transmit tasklet before it was used for the function's return value. Fix by copying to a local variable before scheduling the tasklet. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:35 -04:00
Tilman Schmidt	340184b35a	isdn/capi: don't return NULL from capi_cmd2str() capi_cmd2str() is used in many places to build log messages. None of them is prepared to handle NULL as a result. Change the function to return printable string "INVALID_COMMAND" instead. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:35 -04:00
Tilman Schmidt	2bf3a09ea5	isdn/capi: handle CAPI 2.0 message parser failures Have callers of capi_cmsg2message and capi_message2cmsg handle non-zero return values indicating failure. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:35 -04:00
Tilman Schmidt	5510ab1804	isdn/capi: prevent NULL pointer dereference on invalid CAPI command An invalid CAPI 2.0 command/subcommand combination may retrieve a NULL pointer from the cpars[] array which will later be dereferenced by the parser routines. Fix by adding NULL pointer checks in strategic places. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:34 -04:00
Tilman Schmidt	854d23b77a	isdn/capi: refactor command/subcommand table accesses Encapsulate accesses to the CAPI 2.0 command/subcommand name and parameter tables in a single place in preparation for redesign. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:34 -04:00
Tilman Schmidt	5362247a42	isdn/capi: prevent index overrun from command_2_index() The result of the function command_2_index() is used to index two arrays mnames[] and cpars[] with max. index 0x4e but in its current form that function can produce results up to 3*(0x9+0x9)+0x7f = 0xb5. Fix by clamping all result values potentially overrunning the arrays to zero which is already handled as an invalid value. Re-spotted with Coverity. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:34 -04:00
Tilman Schmidt	9ea8aa8d50	isdn/capi: correct capi20_manufacturer argument type mismatch Function capi20_manufacturer() is declared with unsigned int cmd argument but called with unsigned long. Fix by correcting the function prototype since the actual argument is part of the user visible API. Spotted with Coverity. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:34 -04:00
Tilman Schmidt	b8324f9420	isdn/gigaset: fix non-heap pointer deallocation at_state structures may be allocated individually or as part of a cardstate or bc_state structure. The disconnect() function handled both cases, creating a risk that it might try to deallocate an at_state structure that had not been allocated individually. Fix by splitting disconnect() into two variants handling cases with and without an associated B channel separately, and adding an explicit check. Spotted with Coverity. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:34 -04:00
Tilman Schmidt	846ac30135	isdn/gigaset: fix NULL pointer dereference In do_action, a NULL pointer might be passed to function start_dial which will dereference it. Fix by adding a check for NULL before the call. Spotted with Coverity. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:33 -04:00
Tilman Schmidt	097933ddcd	isdn/gigaset: limit raw CAPI message dump length In dump_rawmsg, the length field from a received data package was used unscrutinized, allowing an attacker to control the size of the allocated buffer and the number of times the output loop iterates. Fix by limiting to a reasonable value. Spotted with Coverity. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:33 -04:00
Tilman Schmidt	ee7ff5fed2	isdn/gigaset: make sure controller name is null terminated In gigaset_isdn_regdev, the name field may not have a null terminator if the source string's length is equal to the buffer size. Fix by zero filling the structure and excluding the last byte of the name field from the copy. Spotted with Coverity. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:33 -04:00
Tilman Schmidt	1bdc07ebab	isdn/gigaset: missing break in do_facility_req If we take the unsupported supplementary service notification mask path, we end up falling through and overwriting the error code. Insert a break statement to skip the remainder of the switch case and proceed to sending the reply message. Spotted with Coverity. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:05:33 -04:00
David S. Miller	f787d6c8dd	Merge branch 'fec-ptp' Luwei Zhou says: ==================== Enable FEC pps feather Change from v2 to v3: -Using the default channel 0 to be PPS channel not PTP_PIN_SET/GETFUNC interface. -Using the linux definition of NSEC_PER_SEC. Change from v1 to v2: - Fix the potential 32-bit multiplication overflow issue. - Optimize the hareware adjustment code to improve efficiency as Richard suggested - Use ptp PTP_PIN_SET/GETFUNC interface to set PPS channel not device tree and add PTP_PF_PPS enumeration - Modify comments style ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 14:45:17 -04:00
Luwei Zhou	278d240478	net: fec: ptp: Enable PPS output based on ptp clock FEC ptp timer has 4 channel compare/trigger function. It can be used to enable pps output. The pulse would be ouput high exactly on N second. The pulse ouput high on compare event mode is used to produce pulse per second. The pulse width would be one cycle based on ptp timer clock source.Since 31-bit ptp hardware timer is used, the timer will wrap more than 2 seconds. We need to reload the compare compare event about every 1 second. Signed-off-by: Luwei Zhou <b45643@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 14:45:08 -04:00
Luwei Zhou	89bddcda7e	net: fec: ptp: Use hardware algorithm to adjust PTP counter. The FEC IP supports hardware adjustment for ptp timer. Refer to the description of ENET_ATCOR and ENET_ATINC registers in the spec about the hardware adjustment. This patch uses hardware support to adjust the ptp offset and frequency on the slave side. Signed-off-by: Luwei Zhou <b45643@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Fugang Duan <b38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 14:45:08 -04:00
Luwei Zhou	f28460b229	net: fec: ptp: Use the 31-bit ptp timer. When ptp switches from software adjustment to hardware ajustment, linux ptp can't converge. It is caused by the IP limit. Hardware adjustment logcial have issue when ptp counter runs over 0x80000000(31 bit counter). The internal IP reference manual already remove 32bit free-running count support. This patch replace the 32-bit PTP timer with 31-bit. Signed-off-by: Luwei Zhou <b45643@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 14:45:08 -04:00
Li RongQing	02ea80741a	ipv6: remove aca_lock spinlock from struct ifacaddr6 no user uses this lock. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:15:15 -04:00
Alexei Starovoitov	e0ee9c1215	x86: bpf_jit: fix two bugs in eBPF JIT compiler 1. JIT compiler using multi-pass approach to converge to final image size, since x86 instructions are variable length. It starts with large gaps between instructions (so some jumps may use imm32 instead of imm8) and iterates until total program size is the same as in previous pass. This algorithm works only if program size is strictly decreasing. Programs that use LD_ABS insn need additional code in prologue, but it was not emitted during 1st pass, so there was a chance that 2nd pass would adjust imm32->imm8 jump offsets to the same number of bytes as increase in prologue, which may cause algorithm to erroneously decide that size converged. Fix it by always emitting largest prologue in the first pass which is detected by oldproglen==0 check. Also change error check condition 'proglen != oldproglen' to fail gracefully. 2. while staring at the code realized that 64-byte buffer may not be enough when 1st insn is large, so increase it to 128 to avoid buffer overflow (theoretical maximum size of prologue+div is 109) and add runtime check. Fixes: `622582786c` ("net: filter: x86: internal BPF JIT") Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:13:14 -04:00
Eric Dumazet	b2532eb9ab	tcp: fix ooo_okay setting vs Small Queues TCP Small Queues (tcp_tsq_handler()) can hold one reference on sk->sk_wmem_alloc, preventing skb->ooo_okay being set. We should relax test done to set skb->ooo_okay to take care of this extra reference. Minimal truesize of skb containing one byte of payload is SKB_TRUESIZE(1) Without this fix, we have more chance locking flows into the wrong transmit queue. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:12:00 -04:00
Alexander Aring	31eff81e94	skbuff: fix ftrace handling in skb_unshare If the skb is not dropped afterwards we should run consume_skb instead kfree_skb. Inside of function skb_unshare we do always a kfree_skb, doesn't depend if skb_copy failed or was successful. This patch switch this behaviour like skb_share_check, if allocation of sk_buff failed we use kfree_skb otherwise consume_skb. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:10:31 -04:00
Alexander Duyck	2c2b2f0cb9	fm10k: Add skb->xmit_more support This change adds support for skb->xmit_more based on the changes that were made to igb to support the feature. The main changes are moving up the check for maybe_stop_tx so that we can check netif_xmit_stopped to determine if we must write the tail because we can add no further buffers. Acked-by: Matthew Vick <matthew.vick@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:09:14 -04:00
Nimrod Andy	5bc26726ad	net: fec: Fix sparse warnings with different lock contexts for basic block reproduce: make ARCH=arm C=1 2>fec.txt drivers/net/ethernet/freescale/fec_main.o cat fec.txt sparse warnings: drivers/net/ethernet/freescale/fec_main.c:2916:12: warning: context imbalance in 'fec_set_features' - different lock contexts for basic block Christopher Li suggest to change as below: if (need_lock) { lock(); do_something_real(); unlock(); } else { do_something_real(); } Reported-by: Fabio Estevam <festevam@gmail.com> Suggested-by: Christopher Li <sparse@chrisli.org> Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:53:37 -04:00
Vince Bridgers	c53fed07a0	MAINTAINERS: Update contact information for Vince Bridgers Signed-off-by: Vince Bridgers <vbridger@opensource.altera.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:47:32 -04:00
David S. Miller	b27fa9939d	Merge branch 'sctp' Daniel Borkmann says: ==================== Here are some SCTP fixes. [ Note, immediate workaround would be to disable ASCONF (it is sysctl disabled by default). It is actually only used together with chunk authentication. ] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:29 -04:00
Daniel Borkmann	26b87c7881	net: sctp: fix remote memory pressure from excessive queueing This scenario is not limited to ASCONF, just taken as one example triggering the issue. When receiving ASCONF probes in the form of ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------> [...] ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------> ... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed ASCONFs and have increasing serial numbers, we process such ASCONF chunk(s) marked with !end_of_packet and !singleton, since we have not yet reached the SCTP packet end. SCTP does only do verification on a chunk by chunk basis, as an SCTP packet is nothing more than just a container of a stream of chunks which it eats up one by one. We could run into the case that we receive a packet with a malformed tail, above marked as trailing JUNK. All previous chunks are here goodformed, so the stack will eat up all previous chunks up to this point. In case JUNK does not fit into a chunk header and there are no more other chunks in the input queue, or in case JUNK contains a garbage chunk header, but the encoded chunk length would exceed the skb tail, or we came here from an entirely different scenario and the chunk has pdiscard=1 mark (without having had a flush point), it will happen, that we will excessively queue up the association's output queue (a correct final chunk may then turn it into a response flood when flushing the queue ;)): I ran a simple script with incremental ASCONF serial numbers and could see the server side consuming excessive amount of RAM [before/after: up to 2GB and more]. The issue at heart is that the chunk train basically ends with !end_of_packet and !singleton markers and since commit `2e3216cd54` ("sctp: Follow security requirement of responding with 1 packet") therefore preventing an output queue flush point in sctp_do_sm() -> sctp_cmd_interpreter() on the input chunk (chunk = event_arg) even though local_cork is set, but its precedence has changed since then. In the normal case, the last chunk with end_of_packet=1 would trigger the queue flush to accommodate possible outgoing bundling. In the input queue, sctp_inq_pop() seems to do the right thing in terms of discarding invalid chunks. So, above JUNK will not enter the state machine and instead be released and exit the sctp_assoc_bh_rcv() chunk processing loop. It's simply the flush point being missing at loop exit. Adding a try-flush approach on the output queue might not work as the underlying infrastructure might be long gone at this point due to the side-effect interpreter run. One possibility, albeit a bit of a kludge, would be to defer invalid chunk freeing into the state machine in order to possibly trigger packet discards and thus indirectly a queue flush on error. It would surely be better to discard chunks as in the current, perhaps better controlled environment, but going back and forth, it's simply architecturally not possible. I tried various trailing JUNK attack cases and it seems to look good now. Joint work with Vlad Yasevich. Fixes: `2e3216cd54` ("sctp: Follow security requirement of responding with 1 packet") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:22 -04:00
Daniel Borkmann	b69040d8e3	net: sctp: fix panic on duplicate ASCONF chunks When receiving a e.g. semi-good formed connection scan in the form of ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ---------------- ASCONF_a; ASCONF_b -----------------> ... where ASCONF_a equals ASCONF_b chunk (at least both serials need to be equal), we panic an SCTP server! The problem is that good-formed ASCONF chunks that we reply with ASCONF_ACK chunks are cached per serial. Thus, when we receive a same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do not need to process them again on the server side (that was the idea, also proposed in the RFC). Instead, we know it was cached and we just resend the cached chunk instead. So far, so good. Where things get nasty is in SCTP's side effect interpreter, that is, sctp_cmd_interpreter(): While incoming ASCONF_a (chunk = event_arg) is being marked !end_of_packet and !singleton, and we have an association context, we do not flush the outqueue the first time after processing the ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it queued up, although we set local_cork to 1. Commit `2e3216cd54` changed the precedence, so that as long as we get bundled, incoming chunks we try possible bundling on outgoing queue as well. Before this commit, we would just flush the output queue. Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we continue to process the same ASCONF_b chunk from the packet. As we have cached the previous ASCONF_ACK, we find it, grab it and do another SCTP_CMD_REPLY command on it. So, effectively, we rip the chunk->list pointers and requeue the same ASCONF_ACK chunk another time. Since we process ASCONF_b, it's correctly marked with end_of_packet and we enforce an uncork, and thus flush, thus crashing the kernel. Fix it by testing if the ASCONF_ACK is currently pending and if that is the case, do not requeue it. When flushing the output queue we may relink the chunk for preparing an outgoing packet, but eventually unlink it when it's copied into the skb right before transmission. Joint work with Vlad Yasevich. Fixes: `2e3216cd54` ("sctp: Follow security requirement of responding with 1 packet") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:22 -04:00
Daniel Borkmann	9de7922bc7	net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks Commit `6f4c618ddb` ("SCTP : Add paramters validity check for ASCONF chunk") added basic verification of ASCONF chunks, however, it is still possible to remotely crash a server by sending a special crafted ASCONF chunk, even up to pre 2.6.12 kernels: skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768 head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950 end:0x440 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:129! [...] Call Trace: <IRQ> [<ffffffff8144fb1c>] skb_put+0x5c/0x70 [<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp] [<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp] [<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20 [<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp] [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp] [<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0 [<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp] [<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp] [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp] [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp] [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter] [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0 [<ffffffff81497078>] ip_local_deliver+0x98/0xa0 [<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440 [<ffffffff81496ac5>] ip_rcv+0x275/0x350 [<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750 [<ffffffff81460588>] netif_receive_skb+0x58/0x60 This can be triggered e.g., through a simple scripted nmap connection scan injecting the chunk after the handshake, for example, ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ------------------ ASCONF; UNKNOWN ------------------> ... where ASCONF chunk of length 280 contains 2 parameters ... 1) Add IP address parameter (param length: 16) 2) Add/del IP address parameter (param length: 255) ... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the Address Parameter in the ASCONF chunk is even missing, too. This is just an example and similarly-crafted ASCONF chunks could be used just as well. The ASCONF chunk passes through sctp_verify_asconf() as all parameters passed sanity checks, and after walking, we ended up successfully at the chunk end boundary, and thus may invoke sctp_process_asconf(). Parameter walking is done with WORD_ROUND() to take padding into account. In sctp_process_asconf()'s TLV processing, we may fail in sctp_process_asconf_param() e.g., due to removal of the IP address that is also the source address of the packet containing the ASCONF chunk, and thus we need to add all TLVs after the failure to our ASCONF response to remote via helper function sctp_add_asconf_response(), which basically invokes a sctp_addto_chunk() adding the error parameters to the given skb. When walking to the next parameter this time, we proceed with ... length = ntohs(asconf_param->param_hdr.length); asconf_param = (void )asconf_param + length; ... instead of the WORD_ROUND()'ed length, thus resulting here in an off-by-one that leads to reading the follow-up garbage parameter length of 12336, and thus throwing an skb_over_panic for the reply when trying to sctp_addto_chunk() next time, which implicitly calls the skb_put() with that length. Fix it by using sctp_walk_params() [ which is also used in INIT parameter processing ] macro in the verification and* in ASCONF processing: it will make sure we don't spill over, that we walk parameters WORD_ROUND()'ed. Moreover, we're being more defensive and guard against unknown parameter types and missized addresses. Joint work with Vlad Yasevich. Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:22 -04:00
Bruno Thomsen	b838b4aced	phy/micrel: KSZ8031RNL RMII clock reconfiguration bug Bug: Unable to send and receive Ethernet packets with Micrel PHY. Affected devices: KSZ8031RNL (commercial temp) KSZ8031RNLI (industrial temp) Description: PHY device is correctly detected during probe. PHY power-up default is 25MHz crystal clock input and output 50MHz RMII clock to MAC. Reconfiguration of PHY to input 50MHz RMII clock from MAC causes PHY to become unresponsive if clock source is changed after Operation Mode Strap Override (OMSO) register setup. Cause: Long lead times on parts where clock setup match circuit design forces the usage of similar parts with wrong default setup. Solution: Swapped KSZ8031 register setup and added phy_write return code validation. Tested with Freescale i.MX28 Fast Ethernet Controler (fec). Signed-off-by: Bruno Thomsen <bth@kamstrup.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:41:03 -04:00
David S. Miller	01d2d484e4	Merge branch 'bcmgenet_systemport' Florian Fainelli says: ==================== net: bcmgenet & systemport fixes This patch series fixes an off-by-one error introduced during a previous change, and the two other fixes fix a wake depth imbalance situation for the Wake-on-LAN interrupt line. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:39:22 -04:00
Florian Fainelli	61b423a8a0	net: systemport: avoid unbalanced enable_irq_wake calls Multiple enable_irq_wake() calls will keep increasing the IRQ wake_depth, which ultimately leads to the following types of situation: 1) enable Wake-on-LAN interrupt w/o password 2) enable Wake-on-LAN interrupt w/ password 3) enable Wake-on-LAN interrupt w/o password 4) disable Wake-on-LAN interrupt After step 4), SYSTEMPORT would always wake-up the system no matter what wake-up device we use, which is not what we want. Fix this by making sure there are no unbalanced enable_irq_wake() calls. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:39:15 -04:00
Florian Fainelli	083731a8fb	net: bcmgenet: avoid unbalanced enable_irq_wake calls Multiple enable_irq_wake() calls will keep increasing the IRQ wake_depth, which ultimately leads to the following types of situation: 1) enable Wake-on-LAN interrupt w/o password 2) enable Wake-on-LAN interrupt w/ password 3) enable Wake-on-LAN interrupt w/o password 4) disable Wake-on-LAN interrupt After step 4), GENET would always wake-up the system no matter what wake-up device we use, which is not what we want. Fix this by making sure there are no unbalanced enable_irq_wake() calls. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:39:15 -04:00
Florian Fainelli	cf377d886f	net: bcmgenet: fix off-by-one in incrementing read pointer Commit `b629be5c83` ("net: bcmgenet: check harder for out of memory conditions") moved the increment of the local read pointer before reading from the hardware descriptor using dmadesc_get_length_status(), which creates an off-by-one situation. Fix this by moving again the read_ptr increment after we have read the hardware descriptor to get both the control block and the read pointer back in sync. Fixes: `b629be5c83` ("net: bcmgenet: check harder for out of memory conditions") Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Petri Gynther <pgynther@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:39:15 -04:00
David S. Miller	35b7a1915a	Merge branch 'net-drivers-pgcnt' Eric Dumazet says: ==================== net: fix races accessing page->_count This is illegal to use atomic_set(&page->_count, ...) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. The only case it is valid is when page->_count is 0, we can use this in __netdev_alloc_frag() Note that I never seen crashes caused by these races, the issue was reported by Andres Lagar-Cavilla and Hugh Dickins. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:36 -04:00
Eric Dumazet	4c450583d9	net: fix races in page->_count manipulation This is illegal to use atomic_set(&page->_count, ...) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. The only case it is valid is when page->_count is 0 Fixes: `540eb7bf0b` ("net: Update alloc frag to reduce get/put page usage and recycle pages") Signed-off-by: Eric Dumaze <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:29 -04:00
Eric Dumazet	98226208c8	mlx4: fix race accessing page->_count This is illegal to use atomic_set(&page->_count, ...) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:28 -04:00
Eric Dumazet	ec91698360	ixgbe: fix race accessing page->_count This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:28 -04:00
Eric Dumazet	00cd5adb03	igb: fix race accessing page->_count This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:28 -04:00
Eric Dumazet	42b0270b40	fm10k: fix race accessing page->_count This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:28 -04:00
Sascha Hauer	1fadee0c36	net/phy: micrel: Add clock support for KSZ8021/KSZ8031 The KSZ8021 and KSZ8031 support RMII reference input clocks of 25MHz and 50MHz. Both PHYs differ in the default frequency they expect after reset. If this differs from the actual input clock, then register 0x1f bit 7 must be changed. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:35:13 -04:00
Alexander Duyck	5af7fb6e3e	flow-dissector: Fix alignment issue in __skb_flow_get_ports This patch addresses a kernel unaligned access bug seen on a sparc64 system with an igb adapter. Specifically the __skb_flow_get_ports was returning a be32 pointer which was then having the value directly returned. In order to prevent this it is actually easier to simply not populate the ports or address values when an skb is not present. In this case the assumption is that the data isn't needed and rather than slow down the faster aligned accesses by making them have to assume the unaligned path on architectures that don't support efficent unaligned access it makes more sense to simply switch off the bits that were copying the source and destination address/port for the case where we only care about the protocol types and lengths which are normally 16 bit fields anyway. Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:33:47 -04:00
Li RongQing	8ea6e345a6	net: filter: fix the comments 1. sk_run_filter has been renamed, sk_filter() is using SK_RUN_FILTER. 2. Remove wrong comments about storing intermediate value. 3. replace sk_run_filter with __bpf_prog_run for check_load_and_stores's comments Cc: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:11:51 -04:00
Li RongQing	1a9525f68e	Documentation: replace __sk_run_filter with __bpf_prog_run __sk_run_filter has been renamed as __bpf_prog_run, so replace them in comments Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:10:50 -04:00
David S. Miller	3ab52c6928	Merge branch 'macvlan' Jason Baron says: ==================== macvlan: optimize receive path So after porting this optimization to net-next, I found that the netperf results of TCP_RR regress right at the maximum peak of transactions/sec. That is as I increase the number of threads via the first argument to super_netperf, the number of transactions/sec keep increasing, peak, and then start decreasing. It is right at the peak, that I see a small regression with this patch (see results in patch 2/2). Without the patch, the ksoftirqd threads are the top cpu consumers threads on the system, since the extra 'netif_rx()', is queuing more softirq work, whereas with the patch, the ksoftirqd threads are below all of the 'netserver' threads in terms of their cpu usage. So there appears to be some interaction between how softirqs are serviced at the peak here and this patch. I think the test results are still supportive of this approach, but I wanted to be clear on my findings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:09:51 -04:00
jbaron@akamai.com	d1dd911930	macvlan: optimize the receive path The netif_rx() call on the fast path of macvlan_handle_frame() appears to be there to ensure that we properly throttle incoming packets. However, it would appear as though the proper throttling is already in place for all possible ingress paths, and that the call is redundant. If packets are arriving from the physical NIC, we've already throttled them by this point. Otherwise, if they are coming via macvlan_queue_xmit(), it calls either 'dev_forward_skb()', which ends up calling netif_rx_internal(), or else in the broadcast case, we are throttling via macvlan_broadcast_enqueue(). The test results below are from off the box to an lxc instance running macvlan. Once the tranactions/sec stop increasing, the cpu idle time has gone to 0. Results are from a quad core Intel E3-1270 V2@3.50GHz box with bnx2x 10G card. for i in {10,100,200,300,400,500}; do super_netperf $i -H $ip -t TCP_RR; done Average of 5 runs. trans/sec trans/sec (3.17-rc7-net-next) (3.17-rc7-net-next + this patch) ---------- ---------- 208101 211534 (+1.6%) 839493 850162 (+1.3%) 845071 844053 (-.12%) 816330 819623 (+.4%) 778700 789938 (+1.4%) 735984 754408 (+2.5%) Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:09:47 -04:00

1 2 3 4 5 ...

475625 Commits All Branches Search

475625 Commits

All Branches