Speed up high call-rate workloads by caching the struct ip_map for the peer on
the connected struct svc_sock instead of looking it up in the ip_map cache
hashtable on every call. This helps workloads using AUTH_SYS authentication
over TCP.
Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e. recursive directory
listing) workload reading from an i386 RH9 install image (161480 regular files
in 10841 directories) on the server. That tree is small enough to fill in the
server's RAM so no disk traffic was involved. This setup gives a sustained
call rate in excess of 60000 calls/sec before being CPU-bound on the server.
Profiling showed strcmp(), called from ip_map_match(), was taking 4.8% of each
CPU, and ip_map_lookup() was taking 2.9%. This patch drops both contribution
into the profile noise.
Note that the above result overstates this value of this patch for most
workloads. The synthetic clients are all using separate IP addresses, so
there are 64 entries in the ip_map cache hash. Because the kernel measured
contained the bug fixed in commit
commit 1f1e030bf7
and was running on 64bit little-endian machine, probably all of those 64
entries were on a single chain, thus increasing the cost of ip_map_lookup().
With a modern kernel you would need more clients to see the same amount of
performance improvement. This patch has helped to scale knfsd to handle a
deployment with 2000 NFS clients.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the nfsd read-ahead params cache more SMP-friendly by changing the single
global list and lock into a fixed 16-bucket hashtable with per-bucket locks.
This reduces spinlock contention in nfsd_read() on read-heavy workloads on
multiprocessor servers.
Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients each doing 1K
streaming reads at full line rate. The server had 128 nfsd threads, which
sizes the RA cache at 256 entries, of which only a handful were used. Flat
profiling shows nfsd_read(), including the inlined nfsd_get_raparms(), taking
10.4% of each CPU. This patch drops the contribution from nfsd() to 1.71% for
each CPU.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The max possible is the maximum RPC payload. The default depends on amount of
total memory.
The value can be set within reason as long as no nfsd threads are currently
running. The value can also be ready, allowing the default to be determined
after nfsd has started.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The limit over UDP remains at 32K. Also, make some of the apparently
arbitrary sizing constants clearer.
The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of
the rqstp. This allows it to be different for different protocols (udp/tcp)
and also allows it to depend on the servers declared sv_bufsiz.
Note that we don't actually increase sv_bufsz for nfs yet. That comes next.
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
.. by allocating the array of 'kvec' in 'struct svc_rqst'.
As we plan to increase RPCSVC_MAXPAGES from 8 upto 256, we can no longer
allocate an array of this size on the stack. So we allocate it in 'struct
svc_rqst'.
However svc_rqst contains (indirectly) an array of the same type and size
(actually several, but they are in a union). So rather than waste space, we
move those arrays out of the separately allocated union and into svc_rqst to
share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at
different times, so there is no conflict).
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256. This
means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES.
struct svc_rqst contains two such arrays. However the there are never more
that RPCSVC_MAXPAGES pages in the two arrays together, so only one array is
needed.
The two arrays are for the pages holding the request, and the pages holding
the reply. Instead of two arrays, we can simply keep an index into where the
first reply page is.
This patch also removes a number of small inline functions that probably
server to obscure what is going on rather than clarify it, and opencode the
needed functionality.
Also remove the 'rq_restailpage' variable as it is *always* 0. i.e. if the
response 'xdr' structure has a non-empty tail it is always in the same pages
as the head.
check counters are initilised and incr properly
check for consistant usage of ++ etc
maybe extra some inlines for common approach
general review
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Magnus Maatta <novell@kiruna.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Arrgg.. We cannot 'lockd_up' before 'svc_addsock' as we don't know the
protocol yet.... So switch it around again and save the name of the created
sockets so that it can be closed if lock_up fails.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The refcount that nfsd holds on lockd is based on the number of open sockets.
So when we close a socket, we should decrement the ref (with lockd_down).
Currently when a socket is closed via writing to the portlist file, that
doesn't happen.
So: make sure we get an error return if the socket that was requested does is
not found, and call lockd_down if it was.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
nfsv2 needs the I_MUTEX_PARENT on the directory when creating a file too.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the new multi block-write error reporting flag and properly tell the block
layer how much data was transferred before the error.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The MMC layer uses the standard work queue for doing card detection. As this
queue is shared with other crucial subsystems, the effects of a long (and
perhaps buggy) detection can cause the system to be unusable. E.g. the
keyboard stops working while the detection routine is running.
The solution is to add a specific mmc work queue to run the detection code in.
This is similar to how other subsystems handle detection (a full kernel
thread is the most common theme).
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some Ricoh controllers only respect a full reset when there is no card in the
slot. As we wait for the reset to complete, we must avoid even requesting
those resets on the buggy controllers.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sprinkle some mmiowb() where needed (writeX() before unlock()).
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Cc: Daniel Qarras <dqarras@yahoo.com>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Driver for TI Flash Media card reader. At present, only MMC/SD cards are
supported.
[akpm@osdl.org: cleanups, build fixes]
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Cc: Daniel Qarras <dqarras@yahoo.com>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix paren-placement / precedence bug breaking initialization for 1 MHz
clock mode.
Also fix comment spelling error, and fence-post (off-by-one) error on
symbol used in request_region.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=7242
Thanks alexander.krause@erazor-zone.de, dzpost@dedekind.net, for the
reports and patch test, and phelps@mantara.com for the independent patch
and verification.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: <alexander.krause@erazor-zone.de>
Cc: <dzpost@dedekind.net>
Cc: <phelps@mantara.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Roland Dreier wrote:
> The change "PCI: assign ioapic resource at hotplug" (commit
> 2318627965 in Linus's tree) makes
> networking stop working on my system (SuperMicro H8QC8 with four
> dual-core Opteron 885 CPUs). In particular, the on-board NIC stops
> working, probably because it gets assigned the wrong IRQ (225 in the
> non-working case, 217 in the working case)
>
> With that patch applied, e1000 doesn't work. Reverting just that
> patch (shown below) from Linus's latest tree fixes things for me.
>
The cause of this problem might be an wrong assumption that the 'start'
member of resource structure for ioapic device has non-zero value if the
resources are assigned by firmware. The 'start' member of ioapic device
seems not to be set even though the resources were actually assigned to
ioapic devices by firmware.
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Cc: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Not all architectures have a file named 'defconfig' (e.g. powerpc).
However the make TAGS and make tags targets search such files for tags,
causing an error message when they don't exist. This patch addresses the
problem by instructing xargs not to run the tags program if there are no
matching files.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since all callers dereference dir, we dont need this check. Coverity id
#337.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
pktcdvd: Rename a variable for better readability.
Signed-off-by: Thomas Maier <balagi@justmail.de>
Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
serial167, remove useless tty check
tty is dereferenced before it is checked to be non-NULL. Remove such
check.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
char, another tmp_buf cleanup
No need to allocate one page as a side buffer. It's no more used. Clean this
(de)allocs of this useless memory pages in char subtree.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- rename ____kmalloc to kmalloc_track_caller so that people have a chance
to guess what it does just from it's name. Add a comment describing it
for those who don't. Also move it after kmalloc in slab.h so people get
less confused when they are just looking for kmalloc - move things around
in slab.c a little to reduce the ifdef mess.
[penberg@cs.helsinki.fi: Fix up reversed #ifdef]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix kernel-doc and function declaration (missing "void") in
mm/page_alloc.c.
Add mm/page_alloc.c to kernel-api.tmpl in DocBook.
mm/page_alloc.c:2589:38: warning: non-ANSI function declaration of function 'remove_all_active_ranges'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Spotted by Hugh that hugetlb page is free'ed back to global pool before
performing any TLB flush in unmap_hugepage_range(). This potentially allow
threads to abuse free-alloc race condition.
The generic tlb gather code is unsuitable to use by hugetlb, I just open
coded a page gathering list and delayed put_page until tlb flush is
performed.
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Having min be a signed quantity means gcc can't turn high latency divides
into shifts. There happen to be two such divides for GFP_ATOMIC (ie.
networking, ie. important) allocations, one of which depends on the other.
Fixing this makes code smaller as a bonus.
Shame on somebody (probably me).
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While reading this I noticed that the contents of this document list
section "3.8 Command line dependency" but it doesn't exist in the document.
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Don't require that scripts/hdrcheck.sh be executable - shit happens...
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This allows numaq to properly align cpus to their given node during
boot. Pass logical apicid to apicid_to_node and allow the summit
sub-arch to use physical apicid (hard_smp_processor_id()).
Tested against numaq and summit based systems with no issues.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (39 commits)
Add missing maintainer countries in CREDITS
Fix bytes <-> kilobytes typo in Kconfig for ramdisk
fix a typo in Documentation/pi-futex.txt
BUG_ON conversion for fs/xfs/
BUG_ON() conversion in fs/nfsd/
BUG_ON conversion for fs/reiserfs
BUG_ON cleanups in arch/i386
BUG_ON cleanup in drivers/net/tokenring/
BUG_ON cleanup for drivers/md/
kerneldoc-typo in led-class.c
debugfs: spelling fix
rcutorture: Fix incorrect description of default for nreaders parameter
parport: Remove space in function calls
Michal Wronski: update contact info
Spelling fix: "control" instead of "cotrol"
reboot parameter in Documentation/kernel-parameters.txt
Fix copy&waste bug in comment in scripts/kernel-doc
remove duplicate "until" from kernel/workqueue.c
ite_gpio fix tabbage
fix file specification in comments
...
Fixed trivial path conflicts due to removed files:
arch/mips/dec/boot/decstation.c, drivers/char/ite_gpio.c
This is a small fix for a typo in Kconfig. The default value for the block
size is 1024 bytes not 1024 kilobytes.
Signed-off-by: Christian Borntraeger <borntrae@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch converts two if () BUG(); construct to BUG_ON();
which occupies less space, uses unlikely and is safer when
BUG() is disabled.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch converts an if () BUG(); construct to BUG_ON();
which occupies less space, uses unlikely and is safer when
BUG() is disabled.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch converts several if () BUG(); construct to BUG_ON();
which occupies less space, uses unlikely and is safer when
BUG() is disabled. S_ISREG() has no side effects, so the
conversion is safe.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This changes a couple of if() BUG(); constructs to
BUG_ON(); so it can be safely optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch converts one if() BUG(); to BUG_ON();
so it can be safely optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This changes two if() BUG(); usages to BUG_ON(); so people
can disable it safely.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Change debufs_create_file() to debugfs_create_file().
Signed-off-by: Komal Shah <komal_shah802003@yahoo.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
The comment for the nreaders parameter of rcutorture gives the default as
4*ncpus, but the value actually defaults to 2*ncpus; fix the comment.
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This removes the space in function calls in drivers/parport/daisy.c
Signed-off-by: Matthew Martin <lihnucks@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>