Summary:
This is a revised version of D13974, and the following quoted summary are from D13974
"This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep its initial value passing from loop preheader. We can therefore rewrite the exit value with
its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch."
D13974 was committed but failed one lnt test. The bug was that we only checked the condition from loop exit's incoming block was a loop invariant. But there could be another condition from loop header to that incoming block not being a loop invariant. This would produce miscompiled code.
This patch fixes the issue by checking if the incoming block is loop header, and if not, don't perform the rewrite. The could be further improved by recursively checking all conditions leading to loop exit block, but I'd like to check in this simple version first and improve it with future patches.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16570
llvm-svn: 258912
Most of the time we only hit the small case, so it is beneficial to pull
it out of the insert_imp() implementation. This improves compile time
at least for non-LTO builds.
Differential Revision: http://reviews.llvm.org/D16619
llvm-svn: 258908
SimplifyCFG tries to turn complex branch conditions into a switch.
Some of it's logic attempts to reason about bitwise arithmetic produced
by InstCombine. InstCombine can turn things like (X == 2) || (X == 3)
into (X & 1) == 2 and so SimplifyCFG tries to detect when this occurs so
that it can produce a switch instruction.
However, the legality checking was not sufficient to determine whether
or not this had occured. Correctly check this case by requiring that
the right-hand side of the comparison be a power of two.
This fixes PR26323.
llvm-svn: 258904
The test tries to produce a large preprocessed output to the console, and checks
that we do not see any unexpected fatal errors.
The test is not enabled unless a lit parameter "--param enable_console=1" is
passed on the command line to lit.py.
llvm-svn: 258902
When no device name is specified, default to kaveri
for HSA since SI is not supported and it woud fail.
Default to "tahiti" instead of "SI" since these are
effectively the same, and tahiti is an actual device.
Move default device handling to the TargetMachine
rather than the AMDGPUSubtarget. The module ISA version
is computed from the device name provided with the target
machine, so the attributes printed by the AsmPrinter were
inconsistent with those computed in the subtarget.
Also remove DevName field from subtarget since it's redundant
with getCPU() in the superclass.
llvm-svn: 258901
Per discussion with Eric and Joerg, this commit removes -Wpadded to
silence the warning about the padding inserted at the tail of struct
_Rep_base.
rdar://problem/23932550
llvm-svn: 258900
If a lit test has a RUN line that includes a redirection to "/dev/tty", the
redirection goes to the special device file corresponding to the console. It
is /dev/tty on UNIX-like systems and "CON" on Windows.
This patch is needed to implement a test like PR25717 (caused by the size limit
of the Windows system call WriteConsole() prior to Windows 8) where the test
only breaks when outputing to the console and won't fail if using a pipe.
llvm-svn: 258898
This brings the compile time of Function.cpp from ~40s down to ~4s for
me locally. It also shaves off about 400KB of object file size in a
release+asserts build.
I also realized that the AMDGPU backend does not have any GCC builtin
names to match, so the extra lookup was a no-op. I removed it to silence
a zero-length string table array warning. There should be no functional
change here.
This change really ends the story of PR11951.
llvm-svn: 258897
Previously the ObjC Dealloc Checker only checked classes with ivars, not
retained properties, which caused three bugs:
- False positive warnings about a missing -dealloc method in classes with only
ivars.
- Missing warnings about a missing -dealloc method on classes with only
properties.
- Missing warnings about an over-released or under-released ivar associated with
a retained property in classes with only properties.
The fix is to check only classes with at least one retained synthesized
property.
This also exposed a bug when reporting an over-released or under-released
property that did not contain a synthesize statement. The checker tried to
associate the warning with an @synthesize statement that did not exist, which
caused an assertion failure in debug builds. The fix is to fall back to the
@property statement in this case.
A patch by David Kilzer!
Part of rdar://problem/6927496
Differential Revision: http://reviews.llvm.org/D5023
llvm-svn: 258896
This reverts commit r258504.
This commit breaks (at least) sparc-rtems -- the OS (RTEMS) used to
override UserLabelPrefix to "", despite the arch (SPARC) having set it
to "_". Now, the OS doesn't override anymore, but the arch sets it to
"_", resulting in the wrong value. I expect this probably breaks other
OSes that overrode to "" before, as well. (Clearly we have some missing
test cases, here...)
llvm-svn: 258894
Here, sed is used to prepare object files for comparison via cmp. On my Darwin
15.4.0 machine, LC_CTYPE is set to UTF-8 (by default, I believe). Under these
circumstances, anything sed is made to read will be treated as UTF-8, prompting
it to signal an error if it is not, like so:
% sed s/a/b/ <(head -n1 /dev/random) >/dev/null; echo $?
sed: RE error: illegal byte sequence
1
%
To make sed work as expected, I need to set LC_CTYPE to C:
% env LC_CTYPE=C sed s/a/b/ <(head -n1 /dev/random) >/dev/null; echo $?
0
%
Without this change, sed will exit with an error for every single file that it
compares between phase 2 and phase 3, thereby making it look as if the
differences were far larger than they are.
Patch by Elias Pipping!
Differential Revision: http://reviews.llvm.org/D16548
llvm-svn: 258891
Summary:
This patch is similar to the <list> fix but it has a few differences. This patch doesn't use a `__link_pointer` typedef because we don't need to change the linked list pointers because `forward_list` never stores a `__forward_begin_node` in the linked list itself.
The issue with `forward_list` is that the iterators store pointers to `__forward_list_node` and not `__forward_begin_node`. This is incorrect because `before_begin()` and `cbefore_begin()` return iterators that point to a `__forward_begin_node`. This means we incorrectly downcast the `__forward_begin_node` pointer to a `__node_pointer`. This downcast itself is sometimes UB but it cannot be safely removed until ABI v2. The more common cause of UB is when we deference the downcast pointer. (for example `__ptr_->__next_`). This can be fixed without an ABI break by upcasting `__ptr_` before accessing it.
The fix is as follows:
1. Introduce a `__iter_node_pointer` typedef that works similar to `__link_pointer` in the last patch. In ABI v2 it is always a typedef for `__begin_node_pointer`.
2. Change the `__before_begin()` method to return the correct pointer type (`__begin_node_pointer`),
Previously it incorrectly downcasted the `__forward_begin_node` to a `__node_pointer` so it could be used to constructor the iterator types.
3. Change `__forward_list_iterator` and `__forward_list_const_iterator` in the following way:
1. Change `__node_pointer __ptr_;` member to have the `__iter_node_pointer` type instead.
2. Add additional private constructors that accept `__begin_node_pointer` in addition to `__node_pointer` and then correctly cast them to the stored `__iter_node_pointer` type.
3. Add `__get_begin()` and `__get_node_unchecked()` accessor methods that correctly cast `__ptr_` to the expected pointer type. `__get_begin()` is always safe to use and should be
preferred. `__get_node_unchecked()` can only be used on a deferencible iterator.
4. Replace direct access to `__forward_list_iterator::__ptr_` with the safe accessor methods.
Reviewers: mclow.lists, EricWF
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15836
llvm-svn: 258888
After r251874, readonly properties that are shadowed by a readwrite property
in a class extension no longer have an instance variable, which caused the body
farm to not synthesize getters. Now, if a readonly property does not have an
instance variable look for a shadowing property and try to get the instance
variable from there.
rdar://problem/24060091
llvm-svn: 258886
Summary:
NVVM doesn't have a standard library, as currently implemented, so this
just isn't going to work. I'd like to revisit this, since it's hiding
opportunities for optimization, but correctness comes first.
Thank you to hfinkel for pointing me in the right direction here.
Reviewers: tra
Subscribers: echristo, jhen, llvm-commits, hfinkel
Differential Revision: http://reviews.llvm.org/D16604
llvm-svn: 258884
at least as big as the mach header to be identified as a Mach-O file and
make sure smaller files are not identified as a Mach-O files but as
unknown files. Also fix identify_magic() so it looks at all 4 bytes of
the filetype field when determining the type of the Mach-O file.
Then fix the macho-invalid-header test case to check that it is an
unknown file and make sure it does not get the error for
object_error::parse_failed. And also update the unit tests.
llvm-svn: 258883
AvailableValue is the part that represents the potential rematerialization. AvailableValueInBlock is simply a pair of an AvailableValue and a BB which we might materialize it in.
This is motivated by http://reviews.llvm.org/D16608. The intent is that we'll have a single function which handles the local case which both local and non-local will use to identify available values. Once that's done, the local case can rematerialize at the use site and the non-local case can do the SSA construction as it does currently.
llvm-svn: 258882
CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In few
cases we use clang builtins.
Tested out-of tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing
Differential Revision: http://reviews.llvm.org/D16593
llvm-svn: 258880
This change enables diagnostics when the target address for a CFI
check is out of bounds of any known library, or even not in the
limits of the address space. This happens when casting pointers to
uninitialized memory.
Ubsan code does not yet handle some of these situations correctly,
so it is still possible to see a segmentation fault instead of a
proper diagnostic message once in a while.
llvm-svn: 258879
Clang's CodeGen has several paths which end up invoking or calling a
function. The one that we used for calls to __RTtypeid did not
appropriately annotate the call with a funclet bundle.
This fixes PR26329.
llvm-svn: 258877
The AMDGPU backend was the last user of the old StringMatcher
recognition code. Move it over to the new lookupLLVMIntrinsicName
funciton, which is now improved to handle all of the interesting edge
cases exposed by AMDGPU intrinsic names.
llvm-svn: 258875
I broke the documentation builds when I deleted the MakefileGuide as part of the autoconf removal. At some point I'll need to do a more in-depth pass updating the documentation to remove references to the old build system.
llvm-svn: 258873
For implcit barriers in simple parallel for loops, the order of the OMPT events
was wrong. The barrier_{begin,end} events came after the implcit_task_end
event for the implcit barrier at the end of the parallel region. This is wrong
because the implicit task executes the barrier before ending. This patch fixes
the order of the event: It will be triggerd now just before
__kmp_pop_current_task_from_thread() is called.
Patch by Tim Cramer
Differential Revision: http://reviews.llvm.org/D16347
llvm-svn: 258866