OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Tejun Heo	06b86245c0	[PATCH] 03/05 move last_merge handlin into generic elevator code Currently, both generic elevator code and specific ioscheds participate in the management and usage of last_merge. This and the following patches move last_merge handling into generic elevator code. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2005-10-28 08:45:20 +02:00
Jens Axboe	1b47f531e2	[PATCH] generic dispatch fixes - Split elv_dispatch_insert() into two functions - Rename rq_last_sector() to rq_end_sector() Signed-off-by: Jens Axboe <axboe@suse.de>	2005-10-28 08:44:37 +02:00
Tejun Heo	8922e16cf6	[PATCH] 01/05 Implement generic dispatch queue Implements generic dispatch queue which can replace all dispatch queues implemented by each iosched. This reduces code duplication, eases enforcing semantics over dispatch queue, and simplifies specific ioscheds. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2005-10-28 08:44:24 +02:00
Chen, Kenneth W	20e5c81fcf	[patch] remove gendisk->stamp_idle field struct gendisk has these two fields: stamp, stamp_idle. Update to stamp_idle is always in sync with stamp and they are always the same. Therefore, it does not add any value in having two fields tracking same timestamp. Suggest to remove it. Also, we should only update gendisk stats with non-zero value. Advantage is that we don't have to needlessly calculate memory address, and then add zero to the content. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2005-10-28 08:15:30 +02:00
Justin Chen	551f8f0e87	[SERIAL] new hp diva console port Add the new ID 0x132a and configure the new PCI Diva console port. This device supports only 1 single console UART. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2005-10-24 22:16:38 +01:00
Bjorn Helgaas	add7b58e75	[SERIAL] support the Exsys EX-4055 4S four-port card Tested by Wolfgang Denk with this device: 00:0f.0 Network controller: PLX Technology, Inc. PCI <-> IOBus Bridge (rev 01) Subsystem: Exsys EX-4055 4S(16C550) RS-232 Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- Interrupt: pin A routed to IRQ 10 Region 0: Memory at 80100000 (32-bit, non-prefetchable) [size=128] Region 1: I/O ports at 7080 [size=128] Region 2: I/O ports at 7400 [size=32] 00:0f.0 Class 0280: 10b5:9050 (rev 01) Subsystem: d84d:4055 Results with this patch: Serial: 8250/16550 driver $Revision: 1.90 $ 32 ports, IRQ sharing enabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A PCI: Found IRQ 10 for device 0000:00:0f.0 ttyS4 at I/O 0x7400 (irq = 10) is a 16550A ttyS5 at I/O 0x7408 (irq = 10) is a 16550A ttyS6 at I/O 0x7410 (irq = 10) is a 16550A ttyS7 at I/O 0x7418 (irq = 10) is a 16550A Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2005-10-24 22:11:57 +01:00
Andrew Morton	8d3b35914a	[PATCH] inotify/idr leak fix Fix a bug which was reported and diagnosed by Stefan Jones <stefan.jones@churchillrandoms.co.uk> IDR trees include a cache of idr_layer objects. There's no way to destroy this cache, so when we discard an overall idr tree we end up leaking some memory. Add and use idr_destroy() for this. v9fs and infiniband also need to use idr_destroy() to avoid leaks. Or, we make the cache global, like radix_tree_preload(). Which is probably better. Later. Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-23 16:38:39 -07:00
Hugh Dickins	ac9b9c667c	[PATCH] Fix handling spurious page fault for hugetlb region This reverts commit `3359b54c8c` and replaces it with a cleaner version that is purely based on page table operations, so that the synchronization between inode size and hugetlb mappings becomes moot. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-20 09:02:07 -07:00
Yasunori Goto	281dd25cdc	[PATCH] swiotlb: make sure initial DMA allocations really are in DMA memory This introduces a limit parameter to the core bootmem allocator; The new parameter indicates that physical memory allocated by the bootmem allocator should be within the requested limit. We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit, alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit is the only api used for swiotlb. The existing alloc_bootmem_low_pages() api could instead have been changed and made to pass right limit to the core allocator. But that would make the patch more intrusive for 2.6.14, as other arches use alloc_bootmem_low_pages(). We may be done that post 2.6.14 as a cleanup. With this, swiotlb gets memory within 4G for both x86_64 and ia64 arches. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Ravikiran G Thirumalai <kiran@scalex86.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-19 23:11:33 -07:00
Seth, Rohit	3359b54c8c	[PATCH] Handle spurious page fault for hugetlb region The hugetlb pages are currently pre-faulted. At the time of mmap of hugepages, we populate the new PTEs. It is possible that HW has already cached some of the unused PTEs internally. These stale entries never get a chance to be purged in existing control flow. This patch extends the check in page fault code for hugepages. Check if a faulted address falls with in size for the hugetlb file backing it. We return VM_FAULT_MINOR for these cases (assuming that the arch specific page-faulting code purges the stale entry for the archs that need it). Signed-off-by: Rohit Seth <rohit.seth@intel.com> [ This is apparently arguably an ia64 port bug. But the code won't hurt, and for now it fixes a real problem on some ia64 machines ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-19 13:56:27 -07:00
Zach Brown	4faa528528	[PATCH] aio: revert lock_kiocb() lock_kiocb() was introduced to serialize retrying and cancellation. In the process of doing so it tried to sleep waiting for KIF_LOCKED while holding the ctx_lock spinlock. Recent fixes have ensured that multiple concurrent retries won't be attempted for a given iocb. Cancel has other problems and has no significant in-tree users that have been complaining about it. So for the immediate future we'll revert sleeping with the lock held and will address proper cancellation and retry serialization in the future. Signed-off-by: Zach Brown <zach.brown@oracle.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-17 17:03:57 -07:00
Eric Dumazet	5ee832dbc6	[PATCH] rcu: keep rcu callback event counter This makes call_rcu() keep track of how many events there are on the RCU list, and cause a reschedule event when the list gets too long. This helps keep RCU event lists down. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-17 15:27:58 -07:00
Herbert Xu	b24d18aa74	[PATCH] list: add missing rcu_dereference on first element It seems that all the list_*_rcu primitives are missing a memory barrier on the very first dereference. For example, #define list_for_each_rcu(pos, head) \ for (pos = (head)->next; prefetch(pos->next), pos != (head); \ pos = rcu_dereference(pos->next)) It will go something like: pos = (head)->next prefetch(pos->next) pos != (head) do stuff We're missing a barrier here. pos = rcu_dereference(pos->next) fetch pos->next barrier given by rcu_dereference(pos->next) store pos Without the missing barrier, the pos->next value may turn out to be stale. In fact, if "do stuff" were also dereferencing pos and relying on list_for_each_rcu to provide the barrier then it may also break. So here is a patch to make sure that we have a barrier for the first element in the list. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-17 08:59:10 -07:00
Al Viro	688ce17b85	[PATCH]: highest_possible_processor_id() has to be a macro ... otherwise, things like alpha and sparc64 break and break badly. They define cpu_possible_map to something else in smp.h AFTER having included cpumask.h. If that puppy is a macro, expansion will happen at the actual caller, when we'd already seen #define cpu_possible_map ... and we will get the right thing used. As an inline helper it will be tokenized before we get to that define and that's it; no matter what we define later, it won't affect anything. We get modules with dependency on cpu_possible_map instead of the right symbol (phys_cpu_present_map in case of sparc64), or outright link errors if they are built-in. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-16 00:17:33 -07:00
Tim Schmielau	e26148d934	[PATCH] Fix copy-and-paste error in BSD accounting Fix copy and paste error in jiffies_to_AHZ conversion which leads to wrong BSD accounting information on alpha and ia64 when CONFIG_BSD_PROCESS_ACCT_V3 is turned on. Also update comment to match reorganised header files. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-14 17:10:12 -07:00
David S. Miller	c8923c6b85	[NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering. Original patch by Harald Welte, with feedback from Herbert Xu and testing by S�bastien Bernard. EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus are numbered linearly. That is not necessarily true. This patch fixes that up by calculating the largest possible cpu number, and allocating enough per-cpu structure space given that. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-13 14:41:23 -07:00
Ben Dooks	afb997c616	[NETPOLL]: wrong return for null netpoll_poll_lock() When netpoll is not being used, the macro that defines the removed routing netpoll_poll_lock defines the return as zero, but the real routine returns a `void *` Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-12 15:12:21 -07:00
Pablo Neira Ayuso	3392315375	[NETFILTER] ctnetlink: allow userspace to change TCP state This patch adds the ability of changing the state a TCP connection. I know that this must be used with care but it's required to provide a complete conntrack creation via conntrack_netlink. So I'll document this aspect on the upcoming docs. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:23:28 -07:00
Harald Welte	a051a8f730	[NETFILTER]: Use only 32bit counters for CONNTRACK_ACCT Initially we used 64bit counters for conntrack-based accounting, since we had no event mechanism to tell userspace that our counters are about to overflow. With nfnetlink_conntrack, we now have such a event mechanism and thus can save 16bytes per connection. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:21:10 -07:00
Pablo Neira Ayuso	e1c73b78e3	[NETFILTER] ctnetlink: add one nesting level for TCP state To keep consistency, the TCP private protocol information is nested attributes under CTA_PROTOINFO_TCP. This way the sequence of attributes to access the TCP state information looks like here below: CTA_PROTOINFO CTA_PROTOINFO_TCP CTA_PROTOINFO_TCP_STATE instead of: CTA_PROTOINFO CTA_PROTOINFO_TCP_STATE Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 20:55:49 -07:00
Harald Welte	5bbc243aaf	[NETFILTER]: Add missing include to ip_conntrack_tuple.h Without this #include, __be16 is not defined and userspace programs will break. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 20:54:01 -07:00
Harald Welte	b3a91d037a	[NETFILTER] nat: remove bogus structure member When 'rustynat' was merged in 2.6.12, the use of the "helper" pointer of struct ipt_nat_info was obsoleted, but the pointer not removed from the struct. This patch removes the pointer, thereby yet again shrinking struct ip_conntrack. Discovered-by: Rusty Russell <rusty@netfilter.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 20:52:36 -07:00
Harald Welte	ebe0bbf06c	[NETFILTER] nfnetlink: use highest bit of nfa_type to indicate nested TLV As Henrik Nordstrom pointed out, all our efforts with "split endian" (i.e. host byte order tags, net byte order values) are useless, unless a parser can determine whether an attribute is nested or not. This patch steals the highest bit of nfattr.nfa_type to indicate whether the data payload contains a nested nfattr (1) or not (0). This will break userspace compatibility, but luckily no kernel with nfnetlink was released so far. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 20:52:19 -07:00
Harald Welte	46113830a1	[PATCH] Fix signal sending in usbdevio on async URB completion If a process issues an URB from userspace and (starts to) terminate before the URB comes back, we run into the issue described above. This is because the urb saves a pointer to "current" when it is posted to the device, but there's no guarantee that this pointer is still valid afterwards. In fact, there are three separate issues: 1) the pointer to "current" can become invalid, since the task could be completely gone when the URB completion comes back from the device. 2) Even if the saved task pointer is still pointing to a valid task_struct, task_struct->sighand could have gone meanwhile. 3) Even if the process is perfectly fine, permissions may have changed, and we can no longer send it a signal. So what we do instead, is to save the PID and uid's of the process, and introduce a new kill_proc_info_as_uid() function. Signed-off-by: Harald Welte <laforge@gnumonks.org> [ Fixed up types and added symbol exports ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-10 16:16:33 -07:00
Rafael J. Wysocki	3dd083255d	[PATCH] x86_64: Set up safe page tables during resume The following patch makes swsusp avoid the possible temporary corruption of page translation tables during resume on x86-64. This is achieved by creating a copy of the relevant page tables that will not be modified by swsusp and can be safely used by it on resume. The problem is that during resume on x86-64 swsusp may temporarily corrupt the page tables used for the direct mapping of RAM. If that happens, a page fault occurs and cannot be handled properly, which leads to the solid hang of the affected system. This leads to the loss of the system's state from before suspend and may result in the loss of data or the corruption of filesystems, so it is a serious issue. Also, it appears to happen quite often (for me, as often as 50% of the time). The problem is related to the fact that (at least) one of the PMD entries used in the direct memory mapping (starting at PAGE_OFFSET) points to a page table the physical address of which is much greater than the physical address of the PMD entry itself. Moreover, unfortunately, the physical address of the page table before suspend (i.e. the one stored in the suspend image) happens to be different to the physical address of the corresponding page table used during resume (i.e. the one that is valid right before swsusp_arch_resume() in arch/x86_64/kernel/suspend_asm.S is executed). Thus while the image is restored, the "offending" PMD entry gets overwritten, so it does not point to the right physical address any more (i.e. there's no page table at the address pointed to by it, because it points to the address the page table has been at during suspend). Consequently, if the PMD entry is used later on, and it _is_ used in the process of copying the image pages, a page fault occurs, but it cannot be handled in the normal way and the system hangs. In principle we can call create_resume_mapping() from swsusp_arch_resume() (ie. from suspend_asm.S), but then the memory allocations in create_resume_mapping(), resume_pud_mapping(), and resume_pmd_mapping() must be made carefully so that we use _only_ NosaveFree pages in them (the other pages are overwritten by the loop in swsusp_arch_resume()). Additionally, we are in atomic context at that time, so we cannot use GFP_KERNEL. Moreover, if one of the allocations fails, we should free all of the allocated pages, so we need to trace them somehow. All of this is done in the appended patch, except that the functions populating the page tables are located in arch/x86_64/kernel/suspend.c rather than in init.c. It may be done in a more elegan way in the future, with the help of some swsusp patches that are in the works now. [AK: move some externs into headers, renamed a function] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-10 08:36:46 -07:00
Al Viro	dd0fc66fb3	[PATCH] gfp flags annotations - part 1 - added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-08 15:00:57 -07:00
Linus Torvalds	dcbd39a1f1	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2005-10-08 14:57:46 -07:00
David Howells	468ed2b0c8	[PATCH] Keys: Split key permissions checking into a .c file The attached patch splits key permissions checking out of key-ui.h and moves it into a .c file. It's quite large and called quite a lot, and it's about to get bigger with the addition of LSM support for keys... key_any_permission() is also discarded as it's no longer used. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-08 14:53:31 -07:00
Eric Kinzie	0f21ba7cc3	[ATM]: add support for LECS addresses learned from network From: Eric Kinzie <ekinzie@cmf.nrl.navy.mil> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-06 22:19:28 -07:00
Randy Dunlap	3d2aef6689	[TEXTSEARCH]: fix sparse gfp nocast warnings Fix nocast sparse warnings: include/linux/textsearch.h:165:57: warning: implicit cast to nocast type Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-04 22:45:14 -07:00
Randy Dunlap	17b6988563	[CONNECTOR]: fix sparse gfp nocast warnings Fix implicit nocast warnings in connector code: drivers/connector/connector.c:102:24: warning: implicit cast to nocast type drivers/connector/connector.c:114:45: warning: implicit cast to nocast type Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-04 22:41:16 -07:00
Randy Dunlap	7b5b3f3d82	[ATM]: fix sparse gfp nocast warnings Fix implicit nocast warnings in atm code: net/atm/atm_misc.c:35:44: warning: implicit cast to nocast type drivers/atm/fore200e.c:183:33: warning: implicit cast to nocast type Also use kzalloc() instead of kmalloc(). Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-04 22:38:44 -07:00
Alexey Dobriyan	ce0fe7e70a	[PATCH] bfs endianness annotations Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-04 13:22:01 -07:00
Herbert Xu	e5ed639913	[IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl The following patch renames __in_dev_get() to __in_dev_get_rtnl() and introduces __in_dev_get_rcu() to cover the second case. 1) RCU with refcnt should use in_dev_get(). 2) RCU without refcnt should use __in_dev_get_rcu(). 3) All others must hold RTNL and use __in_dev_get_rtnl(). There is one exception in net/ipv4/route.c which is in fact a pre-existing race condition. I've marked it as such so that we remember to fix it. This patch is based on suggestions and prior work by Suzanne Wood and Paul McKenney. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:35:55 -07:00
Eric Dumazet	81c3d5470e	[INET]: speedup inet (tcp/dccp) lookups Arnaldo and I agreed it could be applied now, because I have other pending patches depending on this one (Thank you Arnaldo) (The other important patch moves skc_refcnt in a separate cache line, so that the SMP/NUMA performance doesnt suffer from cache line ping pongs) 1) First some performance data : -------------------------------- tcp_v4_rcv() wastes a lot of time in __inet_lookup_established() The most time critical code is : sk_for_each(sk, node, &head->chain) { if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif)) goto hit; /* You sunk my battleship! / } The sk_for_each() does use prefetch() hints but only the begining of "struct sock" is prefetched. As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far away from the begining of "struct sock", it has to bring into CPU cache cold cache line. Each iteration has to use at least 2 cache lines. This can be problematic if some chains are very long. 2) The goal ----------- The idea I had is to change things so that INET_MATCH() may return FALSE in 99% of cases only using the data already in the CPU cache, using one cache line per iteration. 3) Description of the patch --------------------------- Adds a new 'unsigned int skc_hash' field in 'struct sock_common', filling a 32 bits hole on 64 bits platform. struct sock_common { unsigned short skc_family; volatile unsigned char skc_state; unsigned char skc_reuse; int skc_bound_dev_if; struct hlist_node skc_node; struct hlist_node skc_bind_node; atomic_t skc_refcnt; + unsigned int skc_hash; struct proto skc_prot; }; Store in this 32 bits field the full hash, not masked by (ehash_size - 1) Using this full hash as the first comparison done in INET_MATCH permits us immediatly skip the element without touching a second cache line in case of a miss. Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to sk_hash and tw_hash) already contains the slot number if we mask with (ehash_size - 1) File include/net/inet_hashtables.h 64 bits platforms : #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) ((((__u64 )&(inet_sk(__sk)->daddr)))== (__cookie)) && \ ((((__u32 )&(inet_sk(__sk)->dport))) == (__ports)) && \ (!((__sk)->sk_bound_dev_if) \|\| ((__sk)->sk_bound_dev_if == (__dif)))) 32bits platforms: #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) && \ (inet_sk(__sk)->daddr == (__saddr)) && \ (inet_sk(__sk)->rcv_saddr == (__daddr)) && \ (!((__sk)->sk_bound_dev_if) \|\| ((__sk)->sk_bound_dev_if == (__dif)))) - Adds a prefetch(head->chain.first) in __inet_lookup_established()/__tcp_v4_check_established() and __inet6_lookup_established()/__tcp_v6_check_established() and __dccp_v4_check_established() to bring into cache the first element of the list, before the {read\|write}_lock(&head->lock); Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:13:38 -07:00
Herbert Xu	325ed82393	[NET]: Fix packet timestamping. I've found the problem in general. It affects any 64-bit architecture. The problem occurs when you change the system time. Suppose that when you boot your system clock is forward by a day. This gets recorded down in skb_tv_base. You then wind the clock back by a day. From that point onwards the offset will be negative which essentially overflows the 32-bit variables they're stored in. In fact, why don't we just store the real time stamp in those 32-bit variables? After all, we're not going to overflow for quite a while yet. When we do overflow, we'll need a better solution of course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 13:57:23 -07:00
Linus Torvalds	7d6322b465	Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-for-linus-2.6	2005-10-03 08:07:10 -07:00
Diego Calleja	fd2e54b35b	[PATCH] trivial #if -> #ifdef Use '#ifdef' consistently on __KERNEL__. This was reported as bug #5340 (isn't easier to send a fix than report the bug?!) Signed-off-by: Diego Calleja <diegocg@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-01 10:54:47 -07:00
Zach Brown	897f15fb58	[PATCH] aio: remove unlocked task_list test and resulting race Only one of the run or kick path is supposed to put an iocb on the run list. If both of them do it than one of them can end up referencing a freed iocb. The kick path could delete the task_list item from the wait queue before getting the ctx_lock and putting the iocb on the run list. The run path was testing the task_list item outside the lock so that it could catch ki_retry methods that return -EIOCBRETRY without putting the iocb on a wait queue and promising to call kick_iocb. This unlocked check could then race with the kick path to cause both to try and put the iocb on the run list. The patch stops the run path from testing task_list by requring that any ki_retry that returns -EIOCBRETRY must guarantee that kick_iocb() will be called in the future. aio_p{read,write}, the only in-tree -EIOCBRETRY users, are updated. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-30 12:41:17 -07:00
Linus Torvalds	4a8342d233	Revert task flag re-ordering, add comments Roland points out that the flags end up having non-obvious dependencies elsewhere, so revert `aa55a08687` and add some comments about why things are as they are. We'll just have to fix up the broken comparisons. Roland has a patch. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-29 15:18:21 -07:00
Oleg Nesterov	aa55a08687	[PATCH] fix TASK_STOPPED vs TASK_NONINTERACTIVE interaction do_signal_stop: for_each_thread(t) { if (t->state < TASK_STOPPED) ++sig->group_stop_count; } However, TASK_NONINTERACTIVE > TASK_STOPPED, so this loop will not count TASK_INTERRUPTIBLE \| TASK_NONINTERACTIVE threads. See also wait_task_stopped(), which checks ->state > TASK_STOPPED. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> [ We really probably should always use the appropriate bitmasks to test task states, not do it like this. Using something like #define TASK_RUNNABLE (TASK_RUNNING \| TASK_INTERRUPTIBLE \| \ TASK_UNINTERRUPTIBLE \| TASK_NONINTERACTIVE) and then doing "if (task->state & TASK_RUNNABLE)" or similar. But the ordering of the task states is historical, and keeping the ordering does make sense regardless. ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-29 09:05:52 -07:00
Linus Torvalds	eb693d2994	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2005-09-29 08:56:47 -07:00
David Howells	664cceb009	[PATCH] Keys: Add possessor permissions to keys [try #3 ] The attached patch adds extra permission grants to keys for the possessor of a key in addition to the owner, group and other permissions bits. This makes SUID binaries easier to support without going as far as labelling keys and key targets using the LSM facilities. This patch adds a second "pointer type" to key structures (struct key_ref ) that can have the bottom bit of the address set to indicate the possession of a key. This is propagated through searches from the keyring to the discovered key. It has been made a separate type so that the compiler can spot attempts to dereference a potentially incorrect pointer. The "possession" attribute can't be attached to a key structure directly as it's not an intrinsic property of a key. Pointers to keys have been replaced with struct key_ref 's wherever possession information needs to be passed through. This does assume that the bottom bit of the pointer will always be zero on return from kmem_cache_alloc(). The key reference type has been made into a typedef so that at least it can be located in the sources, even though it's basically a pointer to an undefined type. I've also renamed the accessor functions to be more useful, and all reference variables should now end in "_ref". Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-28 09:10:47 -07:00
Ben Dooks	2fab35d78f	[NET]: Fix GCC4 compile error: sysctl in linux/if_ether.h The following is generated when compiling a recent (2.6.14-rc2-git5) kernel configured for ARM, with GCC4. CC init/main.o In file included from include/linux/netdevice.h:29, from include/net/sock.h:48, from init/main.c:50: include/linux/if_ether.h:114: error: array type has incomplete element type It seems that if CONFIG_SYSCTL is not set, then the compiler will throw an error due to the definition of the ether_table[] array Attached is a solution to the problem Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-27 15:59:43 -07:00
David S. Miller	1f26dac320	[NET]: Add Sun Cassini driver. Written by Adrian Sun (asun@darksunrising.com). Ported to 2.6.x by Tom 'spot' Callaway <tcallawa@redhat.com>. Further cleaned up and integrated by David S. Miller Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-27 15:24:13 -07:00
Eric Dumazet	9356b8fc07	[NET]: Reorder some hot fields of struct net_device Place them on separate cache lines in SMP to lower memory bouncing between multiple CPU accessing the device. - One part is mostly used on receive path (including eth_type_trans()) (poll_list, poll, quota, weight, last_rx, dev_addr, broadcast) - One part is mostly used on queue transmit path (qdisc) (queue_lock, qdisc, qdisc_sleeping, qdisc_list, tx_queue_len) - One part is mostly used on xmit path (device) (xmit_lock, xmit_lock_owner, priv, hard_start_xmit, trans_start) 'features' is placed outside of these hot points, in a location that may be shared by all cpus (because mostly read) name_hlist is moved close to name[IFNAMSIZ] to speedup __dev_get_by_name() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-27 15:23:16 -07:00
Linus Torvalds	5c1f4cac6f	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2005-09-26 18:33:26 -07:00
Bagalkote, Sreenivas	c4a3e0a529	[SCSI] MegaRAID SAS RAID: new driver Signed-off-by: Sreenivas Bagalkote <Sreenivas.Bagalkote@lsil.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-09-26 17:32:44 -05:00
David S. Miller	56e9b26324	Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/llc-2.6	2005-09-26 15:29:31 -07:00
Harald Welte	188bab3ae0	[NETFILTER]: Fix invalid module autoloading by splitting iptable_nat When you've enabled conntrack and NAT as a module (standard case in all distributions), and you've also enabled the new conntrack netlink interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko. This causes a huge performance penalty, since for every packet you iterate the nat code, even if you don't want it. This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the iptables frontend (iptable_nat.ko). Threfore, ip_conntrack_netlink.ko will only pull ip_nat.ko, but not the frontend. ip_nat.ko will "only" allocate some resources, but not affect runtime performance. This separation is also a nice step in anticipation of new packet filters (nf-hipac, ipset, pkttables) being able to use the NAT core. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-26 15:25:11 -07:00

1 2 3 4 5 ...

1090 Commits