OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
David S. Miller	409f0a9014	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl3945-base.c	2009-02-07 02:52:44 -08:00
Jesse Brandeburg	593721833d	igb: remove dead code in transmit routine Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:18 -08:00
Alexander Duyck	86d5d38fa1	igb: update version number and copyright dates Update the version number to 1.3.16 and update copyright dates for 2009. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:17 -08:00
Alexander Duyck	265de40908	igb: fix two minor items found during code review This patch addresses two minor items I found while cleaning up the igb driver for our sourceforge version. The first clears the context index if we don't flag that we need it. The second item is that eims_other should be used instead of bit defines when setting all of the EICS bits prior to reset. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:16 -08:00
Alexander Duyck	04fe63583d	igb: update stats before doing reset in igb_down It was seen with repeated interface up/down testing that there was a large stray between the stats reported by the queues and the stats reported by the HW. It was found to be an issue in that hw stats were being reset without first being recorded. This change records the stats before wiping them from the system via the reset. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:16 -08:00
Alexander Duyck	450c87c8d2	igb: remove redundant count set and err_hw_init Remove the setting of ring->count variables from igb_probe as they are duplicating the same configuration that is done igb_alloc_queues. Remove the err_hw_init tag as it can be replaced by err_sw_init. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:15 -08:00
Alexander Duyck	8675737a9c	igb: remove disable_av variable from mac_info struct The disable_av variable is never used by the driver and provides no value as it is likely a leftover debugging variable. I have removed it and replaced the one spot that checked for it with a check for a valid address. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:15 -08:00
Alexander Duyck	fa4dfae0ce	igb: change pba size determination from if to switch statement As additional hardware is added to the igb driver it is easier to support the expansion via switch statements instead of using nested ifs. For this reason I am changing this to a switch statement. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:14 -08:00
Alexander Duyck	a8564f033e	igb: move get_hw_control within igb_resume. Move igb_get_hw_control up so that it is called just after the reset in igb_resume. This notifies the HW sooner that the driver is reassuming control of the device. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:14 -08:00
Alexander Duyck	4a3c6433e4	igb: don't read eicr when responding to legacy interrupts The interrupt handler was reading eicr and then doing nothing with the result. I have removed the variable and the register read since they provide no value to the legacy interrupt handler. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:13 -08:00
Alexander Duyck	28b0759c22	igb: remove unnecessary adapter->hw calls when just hw-> will do. There were several spots in the code making calls to adapter->hw when they could have just been accessing hw-> directly. I cleaned up the spots where this was visibly apparent. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:13 -08:00
Alexander Duyck	8a900862a2	igb: rename igb_update_mc_addr_list_82575 to not include the 82575 There isn't much point in having the _82575 hanging off the end of this function since there aren't any other version of this function running around within this driver. This also allows for a bit of whitespace cleanup due to a shorter function name. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:12 -08:00
Alexander Duyck	4b1a987736	igb: remove redundant timer updates and cleanup watchdog_task The igb watchdog task is modifying the watchdog timer twice duing a single run. It only needs to be called once to reschedule itself for 2 seconds from the last time it ran. In addition I removed the allocation of the mac_info structure since it is only called twice and is easier to access via the e1000_hw struct. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:10 -08:00
Alexander Duyck	eebbbdba5e	igb: cleanup igb_netpoll to be more friendly with napi & GRO This patch cleans up igb_netpoll so that it is more friendly with both the current napi and newly introduced GRO features. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:09 -08:00
Alexander Duyck	dda0e0834c	igb: add counter for dma out of sync errors Add a counter for dma out of sync errors reported via interrupt. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:08 -08:00
Alexander Duyck	2753f4cebf	igb: update testing done by ethtool Most of the code for the testing has pretty much become stale at this point and is need of update. This update just streamlines most of the code, widens the range of interrupt testing. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:08 -08:00
Alexander Duyck	7d8eb29e6e	igb: update feature flags supported in ethtool This driver is currently using HW_CSUM which is not correct. Update this to use the IP_CSUM and IPV6_CSUM flags. In addition consolidate the TSO flag setting. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:07 -08:00
Alexander Duyck	0fbe67af3e	igb: remove unused rx_hdr_split statistic This statistic is not used and so it is safe to remove Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:07 -08:00
Alexander Duyck	312c75aee7	igb: rename nvm ops All of the nvm ops have the tag _nvm added to the end which is redundant since all of the calls to the ops have to go through the nvm ops struct anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:06 -08:00
Alexander Duyck	a8d2a0c27f	igb: rename phy ops This patch renames write_phy_reg to write_reg and read_phy_reg to read_reg. It seems redundant to call out phy in an operation that is part of the phy_ops struct. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:05 -08:00
Alexander Duyck	40a70b3889	igb: read address from RAH/RAL instead of from EEPROM Instead of pulling the mac address from EEPROM it is easier to pull it from the RAL/RAH registers and then just copy it into the address structures. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:05 -08:00
Alexander Duyck	c1889bfe68	igb: make dev_spec a union and remove dynamic allocation This patch makes dev_spec a union and simplifies it so that it does not require dynamic allocation and freeing in the driver. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:04 -08:00
Alexander Duyck	4d6b725e4d	igb: add link check function Add a link check function to contain all activities related to verifying that the link is present. The current approach is a bit cludgy and needs to be cleaned up. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:04 -08:00
Alexander Duyck	aed5dec370	igb: remove check for needing an io port Since igb supports only pci-e nics and there is no plan to support any legacy pci parts in the driver there isn't really much need for checking to see if an io port is needed. In the unlikely event that we do begin supporting legacy pci parts then we can see about adding this code back to the driver. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:03 -08:00
Alexander Duyck	83b7180d0d	igb: move initialization of number of queues into set_interrupt_capability This patch moves the initialization of the number of queues into set_interrupt_capability. This allows the number of queues to increase in the unlikely event that the system initially fails to allocate enough msi-x interrupts, does a suspend/resume, and then can allocate enough interrupts on resume. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:02 -08:00
Alexander Duyck	db76176215	igb: move setting of buffsz out of repeated path in alloc_rx_buffers buffsz is being repeatedly set when allocaing buffers. Since this value should only need to be set once in the function I am moving it out of the looped portion of the path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:02 -08:00
Alexander Duyck	69d3ca5357	igb: optimize/refactor receive path While cleaning up the skb_over panic with small frames I found there was room for improvement in the ordering of operations within the rx receive flow. These changes will place the prefetch for the next descriptor to a point earlier in the rx path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:43:01 -08:00
David S. Miller	0b492fce3d	sunhme: Don't match PCI devices in SBUS probe. Unfortunately, the OF device tree nodes for SBUS and PCI hme devices have the same device node name on some systems. So if the name of the parent node isn't 'sbus', skip it. Based upon an excellent report and detective work by Meelis Roos and Eric Brower. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Meelis Roos <mroos@linux.ee>	2009-02-07 02:20:25 -08:00
Peter P Waskiewicz Jr	3e450669cc	ixgbe: Fix a set_num_queues() bug that can result in num_(r\|t)x_queues = 0 Now that our set_num_queues() routines for each feature are re-entrant, and can be called at any point, they shouldn't zero out the feature's indices or mask bits. Subsequent calls into those routines for those features can result in zero Rx and Tx queues being assigned, causing a panic later in driver reinitialization. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 02:16:59 -08:00
Ayaz Abdulla	2813ddd1bf	forcedeth: bump version to 63 This patch bumps the version up to 63 Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 00:25:18 -08:00
Ayaz Abdulla	daa91a9d24	forcedeth: recover error support This patch adds another type of recoverable error to the driver. It also modifies the sequence for recovery to include a mac reset and clearing of interrupts. Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 00:25:00 -08:00
Ayaz Abdulla	c1086cda7d	forcedeth: ethtool tx csum fix This patch fixes the ethtool tx csum "set" command. A recent patch was submitted to remove HW_CSUM and use IP_CSUM instead. Therefore, the corresponding ethtool command should also be modified. Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 00:24:39 -08:00
Ayaz Abdulla	b6e4405bf7	forcedeth: msi interrupt fix This patch fixes an issue with the suspend/resume cycle with msi interrupts. See bugzilla number 10487 for more details. The fix is to re-setup a private msi pci config offset field. Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 00:24:15 -08:00
Ayaz Abdulla	cac1c52c36	forcedeth: mgmt unit interface This patch updates the logic used to communicate with the mgmt unit. It also adds a version check for a newer mgmt unit firmware. * Fixed udelay to schedule_timeout_uninterruptible Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-07 00:23:57 -08:00
Ondrej Zary	152abd139c	3c509: Fix resume from hibernation for PnP mode. From: Ondrej Zary <linux@rainbow-software.org> last year, I posted a patch which fixed hibernation on 3c509 cards. That was back in 2.6.24. It worked fine in 2.6.25. But then I stopped using hibernation (as it did not work with my new IT8212 RAID controller). Now I fixed it and noticed that 3c509 does not wake up properly anymore (in 2.6.28) - neither in PnP nor in ISA modes. ifconfig down/up makes the card work again in PnP mode. However, in ISA mode, ifconfig up ends with "No such device" error. Comparing the 3c509 driver between 2.6.25 and 2.6.28, there's only some statistics-related change. So the cause of the problem must be somewhere else. This patch makes the resume work in PnP mode, but it's still not enough for ISA mode. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 22:04:08 -08:00
Ilkka Virta	71822faa3b	sungem: Soft lockup in sungem on Netra AC200 when switching interface up From: Ilkka Virta <itvirta@iki.fi> In the lockup situation the driver seems to go off in an eternal storm of interrupts right after calling request_irq(). It doesn't actually do anything interesting in the interrupt handler. Since connecting the link afterwards works, something later in initialization must fix this. Looking at gem_do_start() and gem_open(), it seems that the only thing done while opening the device after the request_irq(), is a call to napi_enable(). I don't know what the ordering requirements are for the initialization, but I boldly tried to move the napi_enable() call inside gem_do_start() before the link state is checked and interrupts subsequently enabled, and it seems to work for me. Doesn't even break anything too obvious... Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 22:00:36 -08:00
Ivan Vecera	355423d084	r8169: Don't update statistics counters when interface is down Some Realtek chips (RTL8169sb/8110sb in my case) are unable to retrieve ethtool statistics when the interface is down. The process stays in endless loop in rtl8169_get_ethtool_stats. This is because these chips need to have receiver enabled (CmdRxEnb bit in ChipCmd register) that is cleared when the interface is going down. It's better to update statistics only when the interface is up and otherwise return copy of statistics grabbed when the interface was up (in rtl8169_close). It is interesting that PCI-E NICs (like 8168b/8111b...) are not affected. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 21:49:57 -08:00
Peter P Waskiewicz Jr	12207e498b	ixgbe: Defeature Tx Head writeback Tx Head writeback is causing multi-microsecond stalls on PCIe chipsets, due to partial cacheline writebacks. Removing this feature removes these issues. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 21:47:24 -08:00
Peter P Waskiewicz Jr	0ecc061d19	ixgbe: Update flow control state machine in link setup The flow control handling is overly complicated and difficult to maintain. This patch cleans up the flow control handling and makes it much more explicit. It also adds 1G flow control autonegotiation, for 1G copper links, 1G KX links, and 1G fiber links. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 21:46:54 -08:00
Yinghai Lu	394827913e	forcedeth: enable msix to default Impact: change default msix and napic can work again Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 01:31:12 -08:00
Yinghai Lu	033e97b24a	forcedeth: ck804 and mcp55 doesn't need timerirq Impact: cleanup so get less irq. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 01:30:56 -08:00
Yinghai Lu	0335ef5d59	forcedeth: disable irq at first before schedule rx Impact: clean up schedule it later after disable it. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 01:30:36 -08:00
Yinghai Lu	79d30a581f	forcedeth: don't clear nic_poll_irq too early Impact: fix bug for msix, we still need that flag to enable irq respectively Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 01:30:01 -08:00
Yinghai Lu	ddb213f076	forcedeth: make msi-x different name for rx-tx Impact: make /proc/interrupts could show more info which irq is rx or other for msi-x add three name fields for rx, tx, other Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 01:29:23 -08:00
Pablo Neira Ayuso	ff491a7334	netlink: change return-value logic of netlink_broadcast() Currently, netlink_broadcast() reports errors to the caller if no messages at all were delivered: 1) If, at least, one message has been delivered correctly, returns 0. 2) Otherwise, if no messages at all were delivered due to skb_clone() failure, return -ENOBUFS. 3) Otherwise, if there are no listeners, return -ESRCH. With this patch, the caller knows if the delivery of any of the messages to the listeners have failed: 1) If it fails to deliver any message (for whatever reason), return -ENOBUFS. 2) Otherwise, if all messages were delivered OK, returns 0. 3) Otherwise, if no listeners, return -ESRCH. In the current ctnetlink code and in Netfilter in general, we can add reliable logging and connection tracking event delivery by dropping the packets whose events were not successfully delivered over Netlink. Of course, this option would be settable via /proc as this approach reduces performance (in terms of filtered connections per seconds by a stateful firewall) but providing reliable logging and event delivery (for conntrackd) in return. This patch also changes some clients of netlink_broadcast() that may report ENOBUFS errors via printk. This error handling is not of any help. Instead, the userspace daemons that are listening to those netlink messages should resync themselves with the kernel-side if they hit ENOBUFS. BTW, netlink_broadcast() clients include those that call cn_netlink_send(), nlmsg_multicast() and genlmsg_multicast() since they internally call netlink_broadcast() and return its error value. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 23:56:36 -08:00
Alex Chiang	612e244c12	e1000e: normalize usage of serdes_has_link Cosmetic change to use struct e1000_mac_info.serdes_has_link consistently as the 'bool' that it's declared as. No functional change. Signed-off-by: Alex Chiang <achiang@hp.com> Acked-by: Jeff Kirsher <Jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 23:55:45 -08:00
Peter P Waskiewicz Jr	34b0368c68	ixgbe: Display EEPROM version in ethtool -i queries Currently ixgbe does not display the EEPROM version in ethtool -i, where other drivers do. The EEPROM version is located at offset 0x29. This patch adds support to display it. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 23:54:42 -08:00
Peter P Waskiewicz Jr	3201d3130e	ixgbe: Update link setup code to better support autonegotiation of speed The current code has some flaws in it when performing autonegotiation, especially on KX/KX4 links. This patch updates the code to better handle the autonegotiation states on link setup. The patch also removes a redundant link configuration call on driver load, and moves link configuration to the ->open() path. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 23:54:21 -08:00
Peter P Waskiewicz Jr	bc97114d3f	ixgbe: Refactor set_num_queues() and cache_ring_register() The current code to determine the number of queues the device will want on driver initialization is ugly and difficult to maintain. It also doesn't allow for easy expansion for future features or future hardware. This patch refactors these routines, and make them easier to deal with. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 23:53:59 -08:00
Herbert Xu	33dccbb050	tun: Limit amount of queued packets per device Unlike a normal socket path, the tuntap device send path does not have any accounting. This means that the user-space sender may be able to pin down arbitrary amounts of kernel memory by continuing to send data to an end-point that is congested. Even when this isn't an issue because of limited queueing at most end points, this can also be a problem because its only response to congestion is packet loss. That is, when those local queues at the end-point fills up, the tuntap device will start wasting system time because it will continue to send data there which simply gets dropped straight away. Of course one could argue that everybody should do congestion control end-to-end, unfortunately there are people in this world still hooked on UDP, and they don't appear to be going away anywhere fast. In fact, we've always helped them by performing accounting in our UDP code, the sole purpose of which is to provide congestion feedback other than through packet loss. This patch attempts to apply the same bandaid to the tuntap device. It creates a pseudo-socket object which is used to account our packets just as a normal socket does for UDP. Of course things are a little complex because we're actually reinjecting traffic back into the stack rather than out of the stack. The stack complexities however should have been resolved by preceding patches. So this one can simply start using skb_set_owner_w. For now the accounting is essentially disabled by default for backwards compatibility. In particular, we set the cap to INT_MAX. This is so that existing applications don't get confused by the sudden arrival EAGAIN errors. In future we may wish (or be forced to) do this by default. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 21:25:32 -08:00

1 2 3 4 5 ...

57010 Commits