Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
gcc-7 warns:
In file included from arch/x86/tools/relocs_64.c:17:0:
arch/x86/tools/relocs.c: In function ‘process_64’:
arch/x86/tools/relocs.c:953:2: warning: argument 1 null where non-null expected [-Wnonnull]
qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from arch/x86/tools/relocs.h:6:0,
from arch/x86/tools/relocs_64.c:1:
/usr/include/stdlib.h:741:13: note: in a call to function ‘qsort’ declared here
extern void qsort
This happens because relocs16 is not used for ELF_BITS == 64,
so there is no point in trying to sort it.
Make the sort_relocs(&relocs16) call 32bit only.
Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Link: http://lkml.kernel.org/r/20161215124513.GA289@x4
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
per_cpu_load_addr is only used for 64-bit relocations, but is
declared in both configurations of relocs.c - with different
types. This has undefined behaviour in general. GNU ld is
documented to use the larger size in this case, but other tools
may differ and some warn about this.
References: https://bugs.debian.org/748577
Reported-by: Michael Tautschnig <mt@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: 748577@bugs.debian.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1411561812.3659.23.camel@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch move the vsyscall_gtod_data handling out of vsyscall_64.c
into an additonal file vsyscall_gtod.c to make the functionality
available for x86 32 bit kernel.
It also adds a new vsyscall_32.c which setup the VVAR page.
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Link: http://lkml.kernel.org/r/1395094933-14252-2-git-send-email-stefani@seibold.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Improve the debuggability of relocations output. When trying to compare
the output between different linkers, it's handy to be able to see the
section names in output.
Signed-off-by: Michael Davidson <md@google.com>
Link: http://lkml.kernel.org/r/20140121203223.GA12649@www.outflux.net
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The GNU linker tries to put __per_cpu_load into the percpu area,
resulting in a lack of its relocation. Force this symbol to be
relocated. Seen starting with GNU ld 2.23 and later.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Cc: Cong Ding <dinggnu@gmail.com>
Link: http://lkml.kernel.org/r/20131016064314.GA2739@www.outflux.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The "gold" linker doesn't seem to put some additional per-cpu cases in
the right place. Add these to the per-cpu check. Without this, the kASLR
patch series fails to correctly apply relocations, and fails to boot.
Signed-off-by: Michael Davidson <md@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20131011013954.GA28902@www.outflux.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The __vvar_page relocation should actually be listed in S_REL instead
of S_ABS. Oddly, this didn't always cause things to break, presumably
because there are no users for relocation information on 64 bits yet.
[ hpa: Not for stable - new code in 3.10 ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130611185652.GA23674@www.outflux.net
Reported-by: Michael Davidson <md@google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This adds the ability to process relocations from the 64-bit kernel ELF,
if built with ELF_BITS=64 defined. The special case for the percpu area is
handled, along with some other symbols specific to the 64-bit kernel.
Based on work by Neill Clift and Michael Davidson.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-4-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Instead of counting and then processing relocations, do it in a single
pass. This splits the processing logic into separate functions for
realmode and 32-bit (and paves the way for 64-bit). Also extracts helper
functions when emitting relocations.
Based on work by Neill Clift and Michael Davidson.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-3-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
In preparation for making the reloc tool operate on 64-bit relocations,
generalize the structure names for easy recompilation via #defines.
Based on work by Neill Clift and Michael Davidson.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-2-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Pull x86 trampoline rework from H. Peter Anvin:
"This code reworks all the "trampoline"/"realmode" code (various bits
that need to live in the first megabyte of memory, most but not all of
which runs in real mode at some point) in the kernel into a single
object. The main reason for doing this is that it eliminates the last
place in the kernel where we needed pages to be mapped RWX. This code
separates all that code into proper R/RW/RX pages."
Fix up conflicts in arch/x86/kernel/Makefile (mca removed next to reboot
code), and arch/x86/kernel/reboot.c (reboot code moved around in one
branch, modified in this one), and arch/x86/tools/relocs.c (mostly same
code came in earlier due to working around the ld bugs just before the
3.4 release).
Also remove stale x86-relocs entry from scripts/.gitignore as per Peter
Anvin.
* commit '61f5446169046c217a5479517edac3a890c3bee7': (36 commits)
x86, realmode: Move end signature into header.S
x86, relocs: When printing an error, say relative or absolute
x86, relocs: More relocations which may end up as absolute
x86, relocs: Workaround for binutils 2.22.52.0.1 section bug
xen-acpi-processor: Add missing #include <xen/xen.h>
acpi, bgrd: Add missing <linux/io.h> to drivers/acpi/bgrt.c
x86, realmode: Change EFER to a single u64 field
x86, realmode: Move kernel/realmode.c to realmode/init.c
x86, realmode: Move not-common bits out of trampoline_common.S
x86, realmode: Mask out EFER.LMA when saving trampoline EFER
x86, realmode: Fix no cache bits test in reboot_32.S
x86, realmode: Make sure all generated files are listed in targets
x86, realmode: build fix: remove duplicate build
x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline
x86, realmode: fixes compilation issue in tboot.c
x86, realmode: move relocs from scripts/ to arch/x86/tools
x86, realmode: header for trampoline code
x86, realmode: flattened rm hierachy
x86, realmode: don't copy real_mode_header
x86, realmode: fix 64-bit wakeup sequence
...
The symbol jiffies is created in the linker script as an alias to
jiffies_64. Unfortunately this is done outside any section, and
apparently GNU ld 2.21 doesn't carry the section with it, so we end up
with an absolute symbol and therefore a broken kernel.
Add jiffies and jiffies_64 to the whitelist.
The most disturbing bit with this discovery is that it shows that we
have had multiple linker bugs in this area crossing multiple
generations, and have been silently building bad kernels for some time.
Link: http://lkml.kernel.org/r/20120524171604.0d98284f3affc643e9714470@canb.auug.org.au
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org> v3.4
As noted in checkin:
a3e854d95 x86, relocs: Workaround for binutils 2.22.52.0.1 section bug
ld version 2.22.52.0.[12] can incorrectly promote relative symbols to
absolute, if the output section they appear in is otherwise empty.
Since checkin:
6520fe55 x86, realmode: 16-bit real-mode code support for relocs tool
we actually check for this and error out rather than silently creating
a kernel which will malfunction if relocated.
Ingo found a configuration in which __start_builtin_fw triggered the
warning.
Go through the linker script sources and look for more symbols that
could plausibly get bogusly promoted to absolute, and add them to the
whitelist.
In general, if the following error triggers:
Invalid absolute R_386_32 relocation: <symbol>
... then we should verify that <symbol> is really meant to be
relocated, and add it and any related symbols manually to the S_REL
regexp.
Please note that 6520fe55 does not introduce the error, only the check
for the error -- without 6520fe55 this version of ld will simply
produce a corrupt kernel if CONFIG_RELOCATABLE is set on x86-32.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org> v3.4
When the relocs tool throws an error, let the error message say if it
is an absolute or relative symbol. This should make it a lot more
clear what action the programmer needs to take and should help us find
the reason if additional symbol bugs show up.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from
section-relative to absolute if they are in a section of zero length.
This turns the symbols __init_begin and __init_end into absolute
symbols. Let the relocs program know that those should be treated as
relative symbols.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
A new option is added to the relocs tool called '--realmode'.
This option causes the generation of 16-bit segment relocations
and 32-bit linear relocations for the real-mode code. When
the real-mode code is moved to the low-memory during kernel
initialization, these relocation entries can be used to
relocate the code properly.
In the assembly code 16-bit segment relocations must be relative
to the 'real_mode_seg' absolute symbol. Linear relocations must be
relative to a symbol prefixed with 'pa_'.
16-bit segment relocation is used to load cs:ip in 16-bit code.
Linear relocations are used in the 32-bit code for relocatable
data references. They are declared in the linker script of the
real-mode code.
The relocs tool is moved to arch/x86/tools/relocs.c, and added new
target archscripts that can be used to build scripts needed building
an architecture. be compiled before building the arch/x86 tree.
[ hpa: accelerating this because it detects invalid absolute
relocations, a serious bug in binutils 2.22.52.0.x which currently
produces bad kernels. ]
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-2-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
When the relocs tool throws an error, let the error message say if it
is an absolute or relative symbol. This should make it a lot more
clear what action the programmer needs to take.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from
section-relative to absolute if they are in a section of zero length.
This turns the symbols __init_begin and __init_end into absolute
symbols. Let the relocs program know that those should be treated as
relative symbols.
This bug is exposed by checkin
433de739bb x86, realmode: 16-bit real-mode code support for relocs tool
only in the sense that that checkin changes the relocs tool to report
an error instead of silently generating a kernel which is broken if
relocated.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from
section-relative to absolute if they are in a section of zero length.
This turns the symbols __init_begin and __init_end into absolute
symbols. Let the relocs program know that those should be treated as
relative symbols.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Moved relocs tool from scripts/ to arch/x86/tools because
it is architecture specific script. Added new target archscripts
that can be used to build scripts needed building an architecture.
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-22-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Michal Marek <mmarek@suse.cz>