To print some values, like in the annotation code with invalid jump
offsets.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1vk0g5twas2ioswn1mmvnvwq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
New features:
- Improve ARM support in the annotation code, affecting 'perf annotate', 'perf
report' and live annotation in 'perf top' (Kim Phillips)
- Initial support for PowerPC in the annotation code (Ravi Bangoria)
- Skip repetitive scheduler function on the top of the stack in
'perf sched timehist' (Namhyung Kim)
Fixes:
- Fix maps resolution in libbpf (Eric Leblond)
- Get the kernel signature via /proc/version_signature, available on
ubuntu systems, to make sure bpf proggies works, as the one provided
via 'uname -r' doesn't (Wang Nan)
- Fix segfault in 'perf record' when running with suid and kptr_restrict
is 1 (Wang Nan)
Infrastructure:
- Support per-arch instruction tables, kept via a static or dynamic table
(Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYOFHxAAoJENZQFvNTUqpAUXIQAIM1QJdA7zBvGpAoZPZdzOQM
avQZjYN+QnW8aCAEClVph+JM7apXlp2h5u+gFmoOgzSdQlljaUqkJ2DDK26XbaTR
Qo1qu/FBGc/QHLlh+5kUVly+yoSxb6N9aXa+pUZzakbt4uiPkBBz+BoPaBK0c5xt
flLhP4x7x33AK6UWc4XlTgSZoYUEHTvrDKI3yCKAMcFs5YoNDwDFeFSp7zUXrhKT
5jjIUsW4ZAM7yACXhQsfDDb7OaVPFm7KHk0DwAG2H0p7oESRCXOU/09ZGFW9g3eo
zQWngNB72Ed721MuqyzRfjjGuzFpNCbCQiTcB+S3Fkoqcbr9tpohZ45FM8ntBgUK
jY62V8XHDHPB4jqM15uxa7NIjXsXCgUBKj9ukpbHdesblW+Cs+DyU7BAETKSrGDX
gjcBK0gXlV5+AYvTGlo7abK8pZAJA2nzORFKZg4TDO2yjRmYGqAcl9cTjQOsAgcB
QGiEK3KxLEfjWwcL+ebgscJUvTXbhazGX0ijS/HgE5VTc8wzaMDQB0bsBp51OZ6h
tBpujHg+5zqrORHSLy48Klu1k2MrHKlGP5gQrO3NwzDA1xcVZm5fYJWEdMnckalk
2rcbJgxHGevPXjNAX8kYV/Api3yiYnijpTL46PB63WW4zZydaYuu+824sE5ME77B
bafFeJqf34716vIFom6d
=46x8
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20161125' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Improve ARM support in the annotation code, affecting 'perf annotate', 'perf
report' and live annotation in 'perf top' (Kim Phillips)
- Initial support for PowerPC in the annotation code (Ravi Bangoria)
- Skip repetitive scheduler function on the top of the stack in
'perf sched timehist' (Namhyung Kim)
Fixes:
- Fix maps resolution in libbpf (Eric Leblond)
- Get the kernel signature via /proc/version_signature, available on
Ubuntu systems, to make sure BPF proggies works, as the one provided
via 'uname -r' doesn't (Wang Nan)
- Fix segfault in 'perf record' when running with suid and kptr_restrict
is 1 (Wang Nan)
Infrastructure changes:
- Support per-arch instruction tables, kept via a static or dynamic table
(Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It is not correct to assimilate the elf data of the maps section to an
array of map definition. In fact the sizes differ. The offset provided
in the symbol section has to be used instead.
This patch fixes a bug causing a elf with two maps not to load
correctly.
Wang Nan added:
This patch requires a name for each BPF map, so array of BPF maps is not
allowed. This restriction is reasonable, because kernel verifier forbid
indexing BPF map from such array unless the index is a fixed value, but
if the index is fixed why not merging it into name?
For example:
Program like this:
...
unsigned long cpu = get_smp_processor_id();
int *pval = map_lookup_elem(&map_array[cpu], &key);
...
Generates bytecode like this:
0: (b7) r1 = 0
1: (63) *(u32 *)(r10 -4) = r1
2: (b7) r1 = 680997
3: (63) *(u32 *)(r10 -8) = r1
4: (85) call 8
5: (67) r0 <<= 4
6: (18) r1 = 0x112dd000
8: (0f) r0 += r1
9: (bf) r2 = r10
10: (07) r2 += -4
11: (bf) r1 = r0
12: (85) call 1
Where instruction 8 is the computation, 8 and 11 render r1 to an invalid
value for function map_lookup_elem, causes verifier report error.
Signed-off-by: Eric Leblond <eric@regit.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
[ Merge bpf_object__init_maps_name into bpf_object__init_maps.
Fix segfault for buggy BPF script Validate obj->maps ]
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-5-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 0b3c2264ae ("perf symbols: Fix kallsyms perf test on ppc64le")
refers struct symbol in probe_event.h, but forgets to include its
definition. Gcc will complain about it when that definition is not
added, by sheer luck, by some other header included before
probe_event.h.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch perf panics if kptr_restrict is set to 1 and perf is
owned by root with suid set:
$ whoami
wangnan
$ ls -l ./perf
-rwsr-xr-x 1 root root 19781908 Sep 21 19:29 /home/wangnan/perf
$ cat /proc/sys/kernel/kptr_restrict
1
$ cat /proc/sys/kernel/perf_event_paranoid
-1
$ ./perf record -a
Segmentation fault (core dumped)
$
The reason is that perf assumes it is allowed to read kptr from
/proc/kallsyms when euid is root, but in fact the kernel doesn't allow
reading kptr when euid and uid do not match with each other:
$ cp /bin/cat .
$ sudo chown root:root ./cat
$ sudo chmod u+s ./cat
$ cat /proc/kallsyms | grep do_fork
0000000000000000 T _do_fork <--- kptr is hidden even euid is root
$ sudo cat /proc/kallsyms | grep do_fork
ffffffff81080230 T _do_fork
See lib/vsprintf.c for kernel side code.
This patch fixes this problem by checking both uid and euid.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On ubuntu the internal kernel version code is different from what can
be retrived from uname:
$ uname -r
4.4.0-47-generic
$ cat /lib/modules/`uname -r`/build/include/generated/uapi/linux/version.h
#define LINUX_VERSION_CODE 263192
#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
$ cat /lib/modules/`uname -r`/build/include/generated/utsrelease.h
#define UTS_RELEASE "4.4.0-47-generic"
#define UTS_UBUNTU_RELEASE_ABI 47
$ cat /proc/version_signature
Ubuntu 4.4.0-47.68-generic 4.4.24
The macro LINUX_VERSION_CODE is set to 4.4.24 (263192 == 0x40418), but
`uname -r` reports 4.4.0.
This mismatch causes LINUX_VERSION_CODE macro passed to BPF script become
an incorrect value, results in magic failure in BPF loading:
$ sudo ./buildperf/perf record -e ./tools/perf/tests/bpf-script-example.c ls
event syntax error: './tools/perf/tests/bpf-script-example.c'
\___ Failed to load program for unknown reason
According to Ubuntu document (https://wiki.ubuntu.com/Kernel/FAQ), the
correct kernel version can be retrived through /proc/version_signature, which
is ubuntu specific.
This patch checks the existance of /proc/version_signature, and returns
version number through parsing this file instead of uname. Version string
is untouched (value returns from uname) because `uname -r` is required
to be consistence with path of kbuild directory in /lib/module.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For tracepoint events, callchains always contain certain functions.
Sometimes it'd be better to skip those functions as they have no value.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161124011114.7102-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By using arch->init() to set up some regular expressions to associate
ins_ops to ARM instructions, ditching that old table that has
instructions not present on ARM.
Take advantage of having an arch->init() to hide more arm specific stuff
from the common code, like the objdump details.
The regular expressions comes from a patch written by Kim Phillips.
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-77m7lufz9ajjimkrebtg5ead@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arches like ARM will want to use regular expressions when deciding what
instructions to associate with what ins_ops, provide infrastructure for
that.
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7dmnk9el2ipu3nxog092k9z5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some arches may want to dynamically populate the table using regular
expressions on the instruction names to associate them with a set of
parsing/formatting/etc functions (struct ins_ops), so provide a fallback
for when the ins__find() method fails.
That fall back will be able to resize the arch->instructions, setting
arch->nr_instructions appropriately, helper functions to associate an
ins_ops to an instruction name, growing the arch->instructions if needed
and resorting it are provided, all the arch specific callback needs to
do is to decide if the missing instruction should be added to
arch->instructions with a ins_ops association.
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-auu13yradxf7g5dgtpnzt97a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The disasm_line::name field is always equal to ins::name, being used
just to locate the instruction's ins_ops from the per-arch instructions
table.
Eliminate this duplication, nuking that field and instead make
ins__find() return an ins_ops, store it in disasm_line::ins.ops, and
keep just in disasm_line::ins.name what was in disasm_line::name, this
way we end up not keeping a reference to entries in the per-arch
instructions table.
This in turn will help supporting multiple ways to manage the per-arch
instructions table, allowing resorting that array, for instance, when
the entries will move after references to its addresses were made. The
same problem is avoided when one grows the array with realloc.
So architectures simply keeping a constant array will work as well as
architectures building the table using regular expressions or other
logic that involves resorting the table.
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vr899azvabnw9gtuepuqfd9t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is the second issue I noticed in reviewing the parisc TLB code.
The fic instruction may use either the instruction or data TLB in
flushing the instruction cache. Thus, on machines with a split TLB, we
should also flush the data TLB after setting up the temporary alias
registers.
Although this has no functional impact, I changed the pdtlb and pitlb
instructions to consistently use the index register %r0. These
instructions do not support integer displacements.
Tested on rp3440 and c8000.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Helge Deller <deller@gmx.de>
We are still troubled by occasional random segmentation faults and
memory memory corruption on SMP machines. The causes quite a few
package builds to fail on the Debian buildd machines for parisc. When
gcc-6 failed to build three times in a row, I looked again at the TLB
related code. I found a couple of issues. This is the first.
In general, we need to ensure page table updates and corresponding TLB
purges are atomic. The attached patch fixes an instance in pci-dma.c
where the page table update was not guarded by the TLB lock.
Tested on rp3440 and c8000. So far, no further random segmentation
faults have been observed.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Helge Deller <deller@gmx.de>
Drop the open-coded sched_clock() function and replace it by the provided
GENERIC_SCHED_CLOCK implementation. We have seen quite some hung tasks in the
past, which seem to be fixed by this patch.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Helge Deller <deller@gmx.de>
The kernel WARNs and then crashes today if wm8994_device_init() fails
after calling devm_regulator_bulk_get().
That happens because there are multiple devices involved here and the
order in which managed resources are freed isn't correct.
The regulators are added as children of wm8994->dev. Whereas,
devm_regulator_bulk_get() receives wm8994->dev as the device, though it
gets the same regulators which were added as children of wm8994->dev
earlier.
During failures, the children are removed first and the core eventually
calls regulator_unregister() for them. As regulator_put() was never done
for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at
WARN_ON(rdev->open_count);
And eventually it crashes from debugfs_remove_recursive().
--------x------------------x----------------
wm8994 3-001a: Device is not a WM8994, ID is 0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)
[<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)
[<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)
[<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)
[<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)
[<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)
[<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)
[<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)
[<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)
[<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)
[<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)
[<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)
[<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)
[<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)
[<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)
[<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)
[<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)
[<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)
[<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
[<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
[<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
[<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
[<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
[<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
[<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
[<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
[<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
---[ end trace 0919d3d0bc998260 ]---
[snip..]
Unable to handle kernel NULL pointer dereference at virtual address 00000078
pgd = c0004000
[00000078] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.8.0-rc6-00154-g54fe84cbd50b #41
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
task: ee874000 task.stack: ee878000
PC is at down_write+0x14/0x54
LR is at debugfs_remove_recursive+0x30/0x150
[snip..]
[<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)
[<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)
[<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)
[<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)
[<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)
[<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
[<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
[<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
[<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
[<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
[<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
[<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
[<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
[<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)
---[ end trace 0919d3d0bc998262 ]---
--------x------------------x----------------
Fix the kernel warnings and crashes by using regulator_bulk_get()
instead of devm_regulator_bulk_get() and explicitly freeing the supplies
in exit paths.
Tested on Exynos 5250, dual core ARM A15 machine.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
The order in which resources were freed in wm8994_device_exit() isn't
correct. The regulators are removed before they are disabled.
Fix it by reordering code a bit, which makes it exact opposite of
wm8994_device_init() as well.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Since commit 4bcc595ccd ("printk: reinstate KERN_CONT for printing
continuation lines") the output from __do_page_fault on MIPS has been
pretty unreadable due to the lack of KERN_CONT markers. Use pr_cont
to provide the appropriate markers & restore the expected output.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14544/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The regmap devicetree binding documentation states that a native-endian
property should be supported as well as big-endian & little-endian,
however syscon in its duplication of the parsing of these properties
omits support for native-endian. Fix this by setting
REGMAP_ENDIAN_NATIVE when a native-endian property is found.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
This branch include patches of fixing a typo, accurate dsi frame rate,
and fixing null pointer dereference.
* 'mediatek-drm-fixes-2016-11-24' of https://github.com/ckhu-mediatek/linux.git-tags:
drm/mediatek: fix null pointer dereference
drm/mediatek: fixed the calc method of data rate per lane
drm/mediatek: fix a typo of DISP_OD_CFG to OD_RELAYMODE
With commit e58e87adc8 ("powerpc/mm: Update _PAGE_KERNEL_RO") we
started using the ppp value 0b110 to map kernel readonly. But that
facility was only added as part of ISA 2.04. For earlier ISA version
only supported ppp bit value for readonly mapping is 0b011. (This
implies both user and kernel get mapped using the same ppp bit value for
readonly mapping.).
Update the code such that for earlier architecture version we use ppp
value 0b011 for readonly mapping. We don't differentiate between power5+
and power5 here and apply the new ppp bits only from power6 (ISA 2.05).
This keep the changes minimal.
This fixes issue with PS3 spu usage reported at
https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org
Fixes: e58e87adc8 ("powerpc/mm: Update _PAGE_KERNEL_RO")
Cc: stable@vger.kernel.org # v4.7+
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Free memory mapping, if hdmi_probe is not successful.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Johan Hedberg says:
====================
pull request: bluetooth 2016-11-23
Sorry about the late pull request for 4.9, but we have one more
important Bluetooth patch that should make it to the release. It fixes
connection creation for Bluetooth LE controllers that do not have a
public address (only a random one).
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Should pass valid filter handle, not the netlink flags.
Fixes: 30a391a13a ("net sched filters: pass netlink message flags in event notification")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case the link change and EEE is enabled or disabled, always try to
re-negotiate this with the link partner.
Fixes: 450b05c15f ("net: dsa: bcm_sf2: add support for controlling EEE")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When busy polling while a link is down (during a link-flap test), TX
timeouts were observed as well as the following messages in the ring
buffer:
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free tx failed. rc:-1
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free rx failed. rc:-1
These were resolved by checking for link status and returning if link
was not up.
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Rob Miller <rob.miller@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commits 93821778de ("udp: Fix rcv socket locking") and
f7ad74fef3 ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into
__udpv6_queue_rcv_skb") UDP backlog handlers were renamed, but UDPlite
was forgotten.
This leads to crashes if UDPlite header is pulled twice, which happens
starting from commit e6afc8ace6 ("udp: remove headers from UDP packets
before queueing")
Bug found by syzkaller team, thanks a lot guys !
Note that backlog use in UDP/UDPlite is scheduled to be removed starting
from linux-4.10, so this patch is only needed up to linux-4.9
Fixes: 93821778de ("udp: Fix rcv socket locking")
Fixes: f7ad74fef3 ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb")
Fixes: e6afc8ace6 ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are a few small USB fixes and new device ids for 4.9-rc7.
The majority of these fixes are in the musb driver, fixing a number of
regressions that have been reported but took a while to resolve. The
other fixes are all small ones, to resolve other reported minor issues.
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iFYEABECABYFAlg3ANIPHGdyZWdAa3JvYWguY29tAAoJEDFH1A3bLfspE9wAmwaC
5MUY7ZvafHycRWZWr7GzMCsJAKC0HljXocIpk0hJxYLwAhcE+D+QCQ==
=dbqZ
-----END PGP SIGNATURE-----
Merge tag 'usb-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a few small USB fixes and new device ids for 4.9-rc7.
The majority of these fixes are in the musb driver, fixing a number of
regressions that have been reported but took a while to resolve. The
other fixes are all small ones, to resolve other reported minor
issues.
All have been in linux-next for a while with no reported issues"
* tag 'usb-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: f_fs: fix wrong parenthesis in ffs_func_req_match()
phy: twl4030-usb: Fix for musb session bit based PM
usb: musb: Drop pointless PM runtime code for dsps glue
usb: musb: Add missing pm_runtime_disable and drop 2430 PM timeout
usb: musb: Fix PM for hub disconnect
usb: musb: Fix sleeping function called from invalid context for hdrc glue
usb: musb: Fix broken use of static variable for multiple instances
USB: serial: cp210x: add ID for the Zone DPMX
usb: chipidea: move the lock initialization to core file
Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
USB: serial: ftdi_sio: add support for TI CC3200 LaunchPad
Pull HID fixes from Jiri Kosina:
- DMA-on-stack fixes for a couple drivers, from Benjamin Tissoires
- small memory sanitization fix for sensor-hub driver, from Song
Hongyan
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hid-sensor-hub: clear memory to avoid random data
HID: rmi: make transfer buffers DMA capable
HID: magicmouse: make transfer buffers DMA capable
HID: lg: make transfer buffers DMA capable
HID: cp2112: make transfer buffers DMA capable
KVM was using arrays of size KVM_MAX_VCPUS with vcpu_id, but ID can be
bigger that the maximal number of VCPUs, resulting in out-of-bounds
access.
Found by syzkaller:
BUG: KASAN: slab-out-of-bounds in __apic_accept_irq+0xb33/0xb50 at addr [...]
Write of size 1 by task a.out/27101
CPU: 1 PID: 27101 Comm: a.out Not tainted 4.9.0-rc5+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[...]
Call Trace:
[...] __apic_accept_irq+0xb33/0xb50 arch/x86/kvm/lapic.c:905
[...] kvm_apic_set_irq+0x10e/0x180 arch/x86/kvm/lapic.c:495
[...] kvm_irq_delivery_to_apic+0x732/0xc10 arch/x86/kvm/irq_comm.c:86
[...] ioapic_service+0x41d/0x760 arch/x86/kvm/ioapic.c:360
[...] ioapic_set_irq+0x275/0x6c0 arch/x86/kvm/ioapic.c:222
[...] kvm_ioapic_inject_all arch/x86/kvm/ioapic.c:235
[...] kvm_set_ioapic+0x223/0x310 arch/x86/kvm/ioapic.c:670
[...] kvm_vm_ioctl_set_irqchip arch/x86/kvm/x86.c:3668
[...] kvm_arch_vm_ioctl+0x1a08/0x23c0 arch/x86/kvm/x86.c:3999
[...] kvm_vm_ioctl+0x1fa/0x1a70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: af1bae5497 ("KVM: x86: bump KVM_MAX_VCPU_ID to 1023")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
em_jmp_far and em_ret_far assumed that setting IP can only fail in 64
bit mode, but syzkaller proved otherwise (and SDM agrees).
Code segment was restored upon failure, but it was left uninitialized
outside of long mode, which could lead to a leak of host kernel stack.
We could have fixed that by always saving and restoring the CS, but we
take a simpler approach and just break any guest that manages to fail
as the error recovery is error-prone and modern CPUs don't need emulator
for this.
Found by syzkaller:
WARNING: CPU: 2 PID: 3668 at arch/x86/kvm/emulate.c:2217 em_ret_far+0x428/0x480
Kernel panic - not syncing: panic_on_warn set ...
CPU: 2 PID: 3668 Comm: syz-executor Not tainted 4.9.0-rc4+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[...]
Call Trace:
[...] __dump_stack lib/dump_stack.c:15
[...] dump_stack+0xb3/0x118 lib/dump_stack.c:51
[...] panic+0x1b7/0x3a3 kernel/panic.c:179
[...] __warn+0x1c4/0x1e0 kernel/panic.c:542
[...] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[...] em_ret_far+0x428/0x480 arch/x86/kvm/emulate.c:2217
[...] em_ret_far_imm+0x17/0x70 arch/x86/kvm/emulate.c:2227
[...] x86_emulate_insn+0x87a/0x3730 arch/x86/kvm/emulate.c:5294
[...] x86_emulate_instruction+0x520/0x1ba0 arch/x86/kvm/x86.c:5545
[...] emulate_instruction arch/x86/include/asm/kvm_host.h:1116
[...] complete_emulated_io arch/x86/kvm/x86.c:6870
[...] complete_emulated_mmio+0x4e9/0x710 arch/x86/kvm/x86.c:6934
[...] kvm_arch_vcpu_ioctl_run+0x3b7a/0x5a90 arch/x86/kvm/x86.c:6978
[...] kvm_vcpu_ioctl+0x61e/0xdd0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2557
[...] vfs_ioctl fs/ioctl.c:43
[...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679
[...] SYSC_ioctl fs/ioctl.c:694
[...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685
[...] entry_SYSCALL_64_fastpath+0x1f/0xc2
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: d1442d85cc ("KVM: x86: Handle errors when RIP is set during far jumps")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Otherwise each individual rotator char would be printed in a new line:
(...)
[ 0.642350] -
[ 0.644374] |
[ 0.646367] -
(...)
Signed-off-by: Nicolas Schichan <nicolas.schichan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When an ipv6 address has the tentative flag set, it can't be
used as source for egress traffic, while the associated route,
if any, can be looked up and even stored into some dst_cache.
In the latter scenario, the source ipv6 address selected and
stored in the cache is most probably wrong (e.g. with
link-local scope) and the entity using the dst_cache will
experience lack of ipv6 connectivity until said cache is
cleared or invalidated.
Overall this may cause lack of connectivity over most IPv6 tunnels
(comprising geneve and vxlan), if the first egress packet reaches
the tunnel before the DaD is completed for the used ipv6
address.
This patch bumps a new genid after that the IFA_F_TENTATIVE flag
is cleared, so that dst_cache will be invalidated on
next lookup and ipv6 connectivity restored.
Fixes: 0c1d70af92 ("net: use dst_cache for vxlan device")
Fixes: 468dfffcd7 ("geneve: add dst caching support")
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since MIPSr6 the Wired register is split into 2 fields, with the upper
16 bits of the register indicating a limit on the value that the wired
entry count in the bottom 16 bits of the register can take. This means
that simply reading the wired register doesn't get us a valid TLB entry
index any longer, and we instead need to retrieve only the lower 16 bits
of the register. Introduce a new num_wired_entries() function which does
this on MIPSr6 or higher and simply returns the value of the wired
register on older architecture revisions, and make use of it when
reading the number of wired entries.
Since commit e710d66683 ("MIPS: tlb-r4k: If there are wired entries,
don't use TLBINVF") we have been using a non-zero number of wired
entries to determine whether we should avoid use of the tlbinvf
instruction (which would invalidate wired entries) and instead loop over
TLB entries in local_flush_tlb_all(). This loop begins with the number
of wired entries, or before this patch some large bogus TLB index on
MIPSr6 systems. Thus since the aforementioned commit some MIPSr6 systems
with FTLBs have been prone to leaving stale address translations in the
FTLB & crashing in various weird & wonderful ways when we later observe
the wrong memory.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14557/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When loading the TX fifo to receive bytes on the I2C bus, we incorrectly
count the number of bytes:
rx_limit = dev->rx_fifo_depth - dw_readl(dev, DW_IC_RXFLR);
while (buf_len > 0 && tx_limit > 0 && rx_limit > 0) {
if (rx_limit - dev->rx_outstanding <= 0)
break;
rx_limit--;
dev->rx_outstanding++;
}
DW_IC_RXFLR indicates how many bytes are available to be read in the
FIFO, dev->rx_fifo_depth is the FIFO size, and dev->rx_outstanding is
the number of bytes that we've requested to be read so far, but which
have not been read.
Firstly, increasing dev->rx_outstanding and decreasing rx_limit and then
comparing them results in each byte consuming "two" bytes in this
tracking, so this is obviously wrong.
Secondly, the number of bytes that _could_ be received into the FIFO at
any time is the number of bytes we have so far requested but not yet
read from the FIFO - in other words dev->rx_outstanding.
So, in order to request enough bytes to fill the RX FIFO, we need to
request dev->rx_fifo_depth - dev->rx_outstanding bytes.
Modifying the code thusly results in us reaching the maximum number of
bytes outstanding each time we queue more "receive" operations, provided
the transfer allows that to happen.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Rather than reporting success for a short transfer due to interrupt
latency, report an error both to the caller, as well as to the kernel
log.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Stas Nichiporovich reports oops in nf_nat_bysource_cmp(), trying to
access nf_conn struct at address 0xffffffffffffff50.
This is the result of fetching a null rhash list (struct embedded at
offset 176; 0 - 176 gets us ...fff50).
The problem is that conntrack entries are allocated from a
SLAB_DESTROY_BY_RCU cache, i.e. entries can be free'd and reused
on another cpu while nf nat bysource hash access the same conntrack entry.
Freeing is fine (we hold rcu read lock); zeroing rhlist_head isn't.
-> Move the rhlist struct outside of the memset()-inited area.
Fixes: 7c96643519 ("netfilter: move nat hlist_head to nf_conn")
Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Otherwise, kernel panic will happen if the user does not specify
the related attributes.
Fixes: 0f3cd9b369 ("netfilter: nf_tables: add range expression")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
As Liping Zhang reports, after commit a8b1e36d0d ("netfilter: nft_dynset:
fix element timeout for HZ != 1000"), priv->timeout was stored in jiffies,
while set->timeout was stored in milliseconds. This is inconsistent and
incorrect.
Firstly, we already call msecs_to_jiffies in nft_set_elem_init, so
priv->timeout will be converted to jiffies twice.
Secondly, if the user did not specify the NFTA_DYNSET_TIMEOUT attr,
set->timeout will be used, but we forget to call msecs_to_jiffies
when do update elements.
Fix this by using jiffies internally for traditional sets and doing the
conversions to/from msec when interacting with userspace - as dynset
already does.
This is preferable to doing the conversions, when elements are inserted or
updated, because this can happen very frequently on busy dynsets.
Fixes: a8b1e36d0d ("netfilter: nft_dynset: fix element timeout for HZ != 1000")
Reported-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Anders K. Pedersen <akp@cohaesio.com>
Acked-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
I got offlist bug report about failing connections and high cpu usage.
This happens because we hit 'elasticity' checks in rhashtable that
refuses bucket list exceeding 16 entries.
The nat bysrc hash unfortunately needs to insert distinct objects that
share same key and are identical (have same source tuple), this cannot
be avoided.
Switch to the rhlist interface which is designed for this.
The nulls_base is removed here, I don't think its needed:
A (unlikely) false positive results in unneeded port clash resolution,
a false negative results in packet drop during conntrack confirmation,
when we try to insert the duplicate into main conntrack hash table.
Tested by adding multiple ip addresses to host, then adding
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
... and then creating multiple connections, from same source port but
different addresses:
for i in $(seq 2000 2032);do nc -p 1234 192.168.7.1 $i > /dev/null & done
(all of these then get hashed to same bysource slot)
Then, to test that nat conflict resultion is working:
nc -s 10.0.0.1 -p 1234 192.168.7.1 2000
nc -s 10.0.0.2 -p 1234 192.168.7.1 2000
tcp .. src=10.0.0.1 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1024 [ASSURED]
tcp .. src=10.0.0.2 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1025 [ASSURED]
tcp .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1234 [ASSURED]
tcp .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2001 src=192.168.7.1 dst=192.168.7.10 sport=2001 dport=1234 [ASSURED]
[..]
-> nat altered source ports to 1024 and 1025, respectively.
This can also be confirmed on destination host which shows
ESTAB 0 0 192.168.7.1:2000 192.168.7.10:1024
ESTAB 0 0 192.168.7.1:2000 192.168.7.10:1025
ESTAB 0 0 192.168.7.1:2000 192.168.7.10:1234
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 870190a9ec ("netfilter: nat: convert nat bysrc hash to rhashtable")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The comparator works like memcmp, i.e. 0 means objects are equal.
In other words, when objects are distinct they are treated as identical,
when they are distinct they are allegedly the same.
The first case is rare (distinct objects are unlikely to get hashed to
same bucket).
The second case results in unneeded port conflict resolutions attempts.
Fixes: 870190a9ec ("netfilter: nat: convert nat bysrc hash to rhashtable")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>