OpenCloudOS-Kernel/mm
NeilBrown 01408c4939 [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes
The problem is that when we write to a file, the copy from userspace to
pagecache is first done with preemption disabled, so if the source address is
not immediately available the copy fails *and* *zeros* *the* *destination*.

This is a problem because a concurrent read (which admittedly is an odd thing
to do) might see zeros rather that was there before the write, or what was
there after, or some mixture of the two (any of these being a reasonable thing
to see).

If the copy did fail, it will immediately be retried with preemption
re-enabled so any transient problem with accessing the source won't cause an
error.

The first copying does not need to zero any uncopied bytes, and doing so
causes the problem.  It uses copy_from_user_atomic rather than copy_from_user
so the simple expedient is to change copy_from_user_atomic to *not* zero out
bytes on failure.

The first of these two patches prepares for the change by fixing two places
which assume copy_from_user_atomic does zero the tail.  The two usages are
very similar pieces of code which copy from a userspace iovec into one or more
page-cache pages.  These are changed to remove the assumption.

The second patch changes __copy_from_user_inatomic* to not zero the tail.
Once these are accepted, I will look at similar patches of other architectures
where this is important (ppc, mips and sparc being the ones I can find).

This patch:

There is a problem with __copy_from_user_inatomic zeroing the tail of the
buffer in the case of an error.  As it is called in atomic context, the error
may be transient, so it results in zeros being written where maybe they
shouldn't be.

In the usage in filemap, this opens a window for a well timed read to see data
(zeros) which is not consistent with any ordering of reads and writes.

Most cases where __copy_from_user_inatomic is called, a failure results in
__copy_from_user being called immediately.  As long as the latter zeros the
tail, the former doesn't need to.  However in *copy_from_user_iovec
implementations (in both filemap and ntfs/file), it is assumed that
copy_from_user_inatomic will zero the tail.

This patch removes that assumption, so that after this patch it will
be safe for copy_from_user_inatomic to not zero the tail.

This patch also adds some commentary to filemap.h and asm-i386/uaccess.h.

After this patch, all architectures that might disable preempt when
kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed".
This includes
 - powerpc
 - i386
 - mips
 - sparc

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
..
Kconfig [PATCH] Swapless page migration: modify core logic 2006-06-23 07:42:50 -07:00
Makefile [PATCH] uninline zone helpers 2006-03-27 08:44:48 -08:00
bootmem.c [PATCH] x86_64: Handle empty PXMs that only contain hotplug memory 2006-04-09 11:53:16 -07:00
fadvise.c [PATCH] sys_sync_file_range() 2006-03-31 12:18:54 -08:00
filemap.c [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes 2006-06-25 10:01:09 -07:00
filemap.h [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes 2006-06-25 10:01:09 -07:00
filemap_xip.c [PATCH] replace inode_update_time with file_update_time 2006-01-10 08:01:30 -08:00
fremap.c [PATCH] fix update_mmu_cache in fremap.c 2006-06-23 07:42:52 -07:00
highmem.c BUG_ON() Conversion in mm/highmem.c 2006-04-02 13:47:35 +02:00
hugetlb.c [PATCH] tightening hugetlb strict accounting 2006-06-23 07:42:48 -07:00
internal.h [PATCH] remove set_page_count() outside mm/ 2006-03-22 07:54:02 -08:00
madvise.c [PATCH] Fix MADV_REMOVE protection checking 2006-04-17 18:22:18 -07:00
memory.c [PATCH] add page_mkwrite() vm_operations method 2006-06-23 07:42:51 -07:00
memory_hotplug.c [PATCH] update vm_total_pages at memory hotadd 2006-06-23 07:42:52 -07:00
mempolicy.c [PATCH] page migration: Support a vma migration function 2006-06-25 10:00:55 -07:00
mempool.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2006-03-26 09:41:18 -08:00
migrate.c [PATCH] Allow migration of mlocked pages 2006-06-25 10:00:55 -07:00
mincore.c [PATCH] freepgt: sys_mincore ignore FIRST_USER_PGD_NR 2005-04-19 13:29:20 -07:00
mlock.c [PATCH] move capable() to capability.h 2006-01-11 18:42:13 -08:00
mmap.c [PATCH] add page_mkwrite() vm_operations method 2006-06-23 07:42:51 -07:00
mmzone.c [PATCH] uninline zone helpers 2006-03-27 08:44:48 -08:00
mprotect.c [PATCH] add page_mkwrite() vm_operations method 2006-06-23 07:42:51 -07:00
mremap.c [PATCH] move capable() to capability.h 2006-01-11 18:42:13 -08:00
msync.c [PATCH] Kill PF_SYNCWRITE flag 2006-06-23 17:10:39 +02:00
nommu.c [PATCH] overcommit: use totalreserve_pages for nommu 2006-04-11 06:18:32 -07:00
oom_kill.c [PATCH] mm: fix typos in comments in mm/oom_kill.c 2006-06-23 07:42:47 -07:00
page-writeback.c [PATCH] writeback: fix range handling 2006-06-23 07:42:49 -07:00
page_alloc.c [PATCH] cpuset: remove extra cpuset_zone_allowed check in __alloc_pages 2006-06-25 10:01:08 -07:00
page_io.c [PATCH] mm: split page table lock 2005-10-29 21:40:42 -07:00
pdflush.c [PATCH] pdflush: handle resume wakeups 2006-06-25 10:01:06 -07:00
prio_tree.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
readahead.c [PATCH] AOP_TRUNCATED_PAGE victims in read_pages() belong in the LRU 2006-06-25 10:00:54 -07:00
rmap.c [PATCH] Allow migration of mlocked pages 2006-06-25 10:00:55 -07:00
shmem.c [PATCH] migration: remove unnecessary PageSwapCache checks 2006-06-23 07:42:46 -07:00
slab.c [PATCH] slab: kmalloc, kzalloc comments cleanup and fix 2006-06-23 07:42:52 -07:00
slob.c [PATCH] mm/slob.c: for_each_possible_cpu(), not NR_CPUS 2006-04-19 09:13:49 -07:00
sparse.c [PATCH] sparsemem: record nid during memory present 2006-06-23 07:42:51 -07:00
swap.c [PATCH] percpu_counters: create lib/percpu_counter.c 2006-06-23 07:43:06 -07:00
swap_state.c BUG_ON() Conversion in mm/swap_state.c 2006-04-01 01:25:12 +02:00
swapfile.c [PATCH] read_mapping_page for address space 2006-06-23 07:43:02 -07:00
thrash.c [PATCH] temporarily disable swap token on memory pressure 2005-11-28 14:42:25 -08:00
tiny-shmem.c [PATCH] do_truncate() call fix in tiny-shmem.c 2006-01-12 09:08:49 -08:00
truncate.c [PATCH] Remove semi-softlockup from invalidate_mapping_pages 2006-06-23 07:43:07 -07:00
util.c [PATCH] slab: optimize constant-size kzalloc calls 2006-03-25 08:22:49 -08:00
vmalloc.c [PATCH] mm: introduce remap_vmalloc_range() 2006-06-23 07:42:49 -07:00
vmscan.c [PATCH] initialise total_memory() earlier 2006-06-23 07:42:52 -07:00