The analyzer reports a shift by a negative value in the constructor. The bug can
be easily triggered by calling std::random_shuffle on a vector
(<rdar://problem/19658126>).
(The shift by a negative value is reported because __w0_ gets constrained to
63 by the conditions along the path:__w0_ < _WDt && __w0_ >= _WDt-1,
where _WDt is 64. In normal execution, __w0_ is not 63, it is 1 and there is
no overflow. The path is infeasible, but the analyzer does not know about that.)
llvm-svn: 256886
Summary:
This patch adds support for the clang multi-stage bootstrapping to support PGO profdata generation, and can build a 2 or 3 stage compiler.
With this patch applied you can configure your build directory with the following invocation of CMake:
cmake -G <generator> -C <path_to_clang>/cmake/caches/PGO.cmake <source dir>
After configuration the following additional targets will be generated:
stage2-instrumented:
Builds a stage1 x86 compiler, runtime, and required tools (llvm-config, llvm-profdata) then uses that compiler to build an instrumented stage2 compiler.
stage2-instrumented-generate-profdata:
Depends on "stage2-instrumented" and will use the instrumented compiler to generate profdata based on the training files in <clang>/utils/perf-training
stage2:
Depends on "stage2-instrumented-generate-profdata" and will use the stage1 compiler with the stage2 profdata to build a PGO-optimized compiler.
stage2-check-llvm:
Depends on stage2 and runs check-llvm using the stage3 compiler.
stage2-check-clang:
Depends on stage2 and runs check-clang using the stage3 compiler.
stage2-check-all:
Depends on stage2 and runs check-all using the stage3 compiler.
stage2-test-suite:
Depends on stage2 and runs the test-suite using the stage3 compiler (requires in-tree test-suite).
Reviewers: bogner, silvas, chandlerc
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15584
llvm-svn: 256873
Due to the SGPR init bug, every program claims to use the same number
of SGPRs anyway, so there's no point in trying to shift those registers
down from their initial spot of reservation.
Add a test that uses VGPR spilling and blocks most SGPRs from being used for
the scratch resource register. Previously, this would run into an assertion.
Differential Revision: http://reviews.llvm.org/D15724
llvm-svn: 256870
Summary:
Most of the tool chain is able to optimize scalar and memcpy like operation effisciently while it isn't that good with aggregates. In order to improve the support of aggregate, we try to change aggregate manipulation into either scalar or memcpy like ones whenever possible without loosing informations.
This is one such opportunity.
Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15894
llvm-svn: 256868
Summary:
Hi Rafael,
Would you be able to review this patch, please?
(Clang part of the patch is D15832).
When clang runs an external tool, e.g. a linker, it may create a command line that exceeds the length limit.
Clang uses the llvm::sys::argumentsFitWithinSystemLimits function to check if command line length fits the OS
limitation. There are two problems in this function that may cause exceeding of the limit:
1. It ignores the length of the program path in its calculations. On the other hand, clang adds the program
path to the command line when it runs the program.
2. It assumes no space character is inserted after the last argument, which is not true for Windows. The flattenArgs function adds the trailing space for *each* argument. The result of this is that the terminating NULL character is not counted and may be placed beyond the length limit if the command line is exactly 32768 characters long. The WinAPI's CreateProcess does not find the NULL character and fails.
Reviewers: rafael, ygao, probinson
Subscribers: asl, llvm-commits
Differential Revision: http://reviews.llvm.org/D15831
llvm-svn: 256866
Summary:
LLVM part of the patch is D15831.
When clang runs an external tool such as a linker it may create a command line that exceeds the length limit.
Clang uses the llvm::sys::argumentsFitWithinSystemLimits function to check if command line length fits the OS
limitation. There are two problems in this function that may cause exceeding of the limit:
1. It ignores the length of the program path in its calculations. On the other hand, clang adds the program
path to the command line when it runs the program.
2. It assumes no space character is inserted after the last argument, which is not true for Windows. The flattenArgs function adds the trailing space for *each* argument. The result of this is that the terminating NULL character is not counted and may be placed beyond the length limit if the command line is exactly 32768 characters long. The WinAPI's CreateProcess does not find the NULL character and fails.
Reviewers: rafael, asl
Subscribers: asl, llvm-commits
Differential Revision: http://reviews.llvm.org/D15832
llvm-svn: 256865
This patch adds support the command 'source info' as follows:
(lldb) help source info
Display source line information (as specified) based on the current executable's
debug info.
Syntax: source info <cmd-options>
Command Options Usage:
source info [-c <count>] [-s <shlib-name>] [-f <filename>] [-l <linenum>] [-e <linenum>]
source info [-c <count>] [-s <shlib-name>] [-n <symbol>]
source info [-c <count>] [-a <address-expression>]
-a <address-expression> ( --address <address-expression> )
Lookup the address and display the source information for the corresponding
file and line.
-c <count> ( --count <count> )
The number of line entries to display.
-e <linenum> ( --end-line <linenum> )
The line number at which to stop displaying lines.
-f <filename> ( --file <filename> )
The file from which to display source.
-l <linenum> ( --line <linenum> )
The line number at which to start the displaying lines.
-n <symbol> ( --name <symbol> )
The name of a function whose source to display.
-s <shlib-name> ( --shlib <shlib-name> )
Look up the source in the given module or shared library (can be specified
more than once).
For example:
(lldb) source info --file x.h
Lines for file x.h in compilation unit x.cpp in `x
[0x0000000100000d00-0x0000000100000d10): /Users/dawn/tmp/./x.h:10
[0x0000000100000d10-0x0000000100000d1b): /Users/dawn/tmp/./x.h:10
The new options are used to fix the MI command:
-symbol-list-lines <file>
which didn't work for header files because it called:
target modules dump line-table <file>
which only dumps line tables for a compilation unit.
The patch also fixes a bug in the error reporting when no files were supplied to the command. Previously you'd get:
(lldb) target modules dump line-table
error:
Syntax:
error: no source filenames matched any command arguments
Now you get:
error: file option must be specified.
Reviewed by: clayborg, jingham, ki.stfu
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D15593
llvm-svn: 256863
Summary: This change enables clang to automatically link binaries built with the -fprofile-instr-generate against the clang_rt.profile-i386.lib library.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15833
llvm-svn: 256855
We queried hasFP before we hit ExpandISelPseudos. ExpandISelPseudos
manipulated state that hasFP relied on, potentially changing the result
after it has been queried elsewhere.
While I am not aware of any particular bug due to this state of affairs,
it seems best to avoid it entirely by changing the state during DAG
construction.
llvm-svn: 256849
Summary: This change configures Windows builds to build the complier-rt profile support library (clang_rt.profile-i386.lib). Windows API incompatibilities in the compiler-rt profile lib are also fixed.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15830
llvm-svn: 256848
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.
This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.
About offloading descriptor:
The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
void *addr;
char *name;
int64_t size;
};
```
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.
The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.
The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.
About target codegen:
The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives.
Entry 3) a unique ID of the file where the input source file that contain the target region lives.
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.
Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )
This patch replaces http://reviews.llvm.org/D12306.
Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao
Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits
Differential Revision: http://reviews.llvm.org/D12614
llvm-svn: 256842
When not all instructions have a scheduling class,
the error message now provides a possible solution.
Differential Revision: http://reviews.llvm.org/D15854
llvm-svn: 256839
An undecorated function designator implies taking the address of a function,
which is illegal in OpenCL. Implementing a check for this earlier to allow
the error to be reported even in the presence of other more obvious errors.
Patch by Neil Hickey!
http://reviews.llvm.org/D15691
llvm-svn: 256838