Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl
how to query TLI in order to get a more accurate cost for truncates and
zero-extends.
Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase
would have conservatively returned a 'default' TCC_Basic for all zero-extends,
and TCC_Free for truncates on native types.
This patch improves the heuristic so that we query TLI (if available) to get
more accurate answers. If TLI is available, then methods 'isZExtFree' and
'isTruncateFree' can be used to check if a zext/trunc is free for the target.
Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll.
With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz
immediately followed by a free zext/trunc.
Differential Revision: http://reviews.llvm.org/D7585
llvm-svn: 228923
Otherwise we will always select the generic version for e.g. unsigned
long if uint64_t is typedef'd to 'unsigned long long'. Also remove
enable_if hacks in favor of static_assert.
llvm-svn: 228921
The changes in r223113 (ARM modified-immediate syntax) have broken
instructions like:
mov r0, #~0xffffff00
The problem is that I've added a spurious range check on the immediate
operand to ensure that it lies between INT32_MIN and UINT32_MAX. While
this range check is correct in theory, it causes problems because the
operand is stored in an int64_t (by MC). So valid 32-bit constants like
\#~0xffffff00 become out of range. The solution is to simply remove this
range check. It is not possible to validate the range of the immediate
operand with the current setup because: 1) The operand is stored in an
int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm
or even #((~imm)) etc. So we just chop the value to 32 bits and use it.
Also noted that the original range check was note tested by any of the
unit tests. I've added a new test to cover #~imm kind of operands.
Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599
llvm-svn: 228920
Previously the test did not have a RUN: prefix for the clang command.
In addition it was leaving behind a tmp file with no permissions causing issues when
deleting the build directory on Windows.
Differential Revision: http://reviews.llvm.org/D7534
llvm-svn: 228919
Partially revert r223927 because LLVM gained support for 128-bit integers
in r227089. Modify and keep the tests that verify the definition of the
macro __SIZEOF_INT128__ for MIPS64 BE & LE in the preprocessor.
llvm-svn: 228918
I've built some tests in WebRTC with and without this change. With this change number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example:
$ ls -l old/modules_unittests new/modules_unittests
-rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests
-rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests
$ objdump -d old/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
239871
$ objdump -d new/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
148365
http://reviews.llvm.org/D7069
llvm-svn: 228917
Using KORTESTW for comparison i1 value with zero was wrong since the instruction tests 16 bits.
KORTESTW may be used with KSHIFTL+KSHIFTR that clean the 15 upper bits.
I removed (X86cmp i1, 0) pattern and zero-extend i1 to i8 and then use TESTB.
There are some cases where i1 is in the mask register and the upper bits are already zeroed.
Then KORTESTW is the better solution, but it is subject for optimization.
Meanwhile, I'm fixing the correctness issue.
llvm-svn: 228916
This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size.
We go over all calls in the MachineFunction and compute:
a) For each callsite that can not use pushes, the penalty of not having a reserved call frame.
b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack).
Differential Revision: http://reviews.llvm.org/D7561
llvm-svn: 228915
Without this change we get linker errors such as:
undefined reference to `llvm::dbgs()'
We only conditionally link in these libraries, as in BUILD_SHARED_LIBS=OFF mode,
linking in these libraries causes such functions (and especially global options)
to be defined twice. The "solution" I choose is most likely not ideal, but seems
to work. If any cmake specialist can suggest a better approach, this would be
appreciated.
We also drop a .c file that is not needed as it caused linker errors as well.
llvm-svn: 228914
We used to do this DAG combine, but it's not always correct:
If the first fp_round isn't a value preserving truncation, it might
introduce a tie in the second fp_round, that wouldn't occur in the
single-step fp_round we want to fold to.
In other words, double rounding isn't the same as rounding.
Differential Revision: http://reviews.llvm.org/D7571
llvm-svn: 228911
both a user process dyld and for a kernel binary -- we
will decide which to prefer after one or both have been
located.
It would be faster to stop the search thorugh the core
segments one we've found a dyld/kernel binary - but that
may trick us into missing the one we would prefer.
<rdar://problem/19806413>
llvm-svn: 228910
On PowerPC, and maybe some other architectures, 'char' is unsigned. Comparing
an unsigned char with a signed int (-1) is always false. To fix this, down-cast
EOF to a char.
llvm-svn: 228909
Summary: Coverity warns that unsigned >= 0 is always true, and k_first_gpr_powerpc happens to be 0. Quiet Coverity by changing that comparison instead to a static_assert(), in case things change in the future.
Reviewers: emaste
Reviewed By: emaste
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D7576
llvm-svn: 228908
Use a wrapper function for symbol. Any undefined reference to symbol will be
resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be
resolved to symbol.
This can be used to provide a wrapper for a system function. The wrapper
function should be called "__wrap_symbol". If it wishes to call the system
function, it should call "__real_symbol".
Here is a trivial example:
void * __wrap_malloc (size_t c)
{
printf ("malloc called with %zu\n", c);
return __real_malloc (c);
}
If you link other code with this file using --wrap malloc, then all calls
to "malloc" will call the function "__wrap_malloc" instead. The call to
"__real_malloc" in "__wrap_malloc" will call the real "malloc" function.
llvm-svn: 228906
This adds the LinkingContext parameter to the ELFReader. Previously the flags in
that were needed in the Context was passed to the ELFReader, this made it very
hard to access data structures in the LinkingContext when reading an ELF file.
This change makes the ELFReader more flexible so that required parameters can be
grabbed directly from the LinkingContext.
Future patches make use of the changes.
There is no change in functionality though.
llvm-svn: 228905
Convert the register saving code to use an explicit memcpy rather than the
implicit memcpy from the assignment. This avoids warnings from -Wcast-qual on
GCC and makes the code more explicit. Furthermore, use sizeof to calculate the
offsets rather than adding magic numbers, improving legibility of the code.
NFC.
llvm-svn: 228904
Ideally, we would do something like inline __declspec(dllexport) to ensure that
the symbol was inlined within libunwind as well as emitted into the final DSO.
This simply moves the definition out of the header to ensure that the *public*
interfaces are defined and exported into the final DSO.
This change also has "gratuitous" code movement so that the EHABI and generic
implementations are co-located making it easier to find them.
The movement from the header has one minor change introduced into the code:
additional tracing to mirror the behaviour of the non-EHABI interfaces.
llvm-svn: 228903
If the linker is gcc (the default for Generic_ELF toolchains), we end up
passing most of the arguments to the linker.
Some tests were failing to account for this in their usage of *-NOT: lines
and would fail if compiled with
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-unknown-unknown
llvm-svn: 228902
We would crash if we couldn't locate a Function that either Location's
Value belonged to. Now we just print out a debug message and return
conservatively.
llvm-svn: 228901
The dumpbin tool in the MSVC toolchain cannot handle an executable created
by LLD if the executable contains a long section name.
In PE/COFF, a section name is stored to a section table entry. Because the
section name field in the table is only 8 byte long, a name longer than
that is stored to the string table and the offset in the string table is
stored to the section table entry instead.
In order to look up a string from the string table, tools need to handle
the symbol table, because the string table is defined as it immediately
follows the symbol table.
And seems the dumpbin doesn't like zero-length symbol table.
This patch teaches LLD how to emit a dummy symbol table. The dummy table
has one dummy entry in it.
llvm-svn: 228900
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.
The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.
Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.
I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.
I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.
llvm-svn: 228899
Summary:
This patch installs an InlineAsmDiagnosticsHandler to avoid the crash
report when the input is bitcode and the bitcode contains invalid inline
assembly. The handler will simply print the same error message that will
print from the backend.
Add CHECK in test-case
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, cfe-commits
Differential Revision: http://reviews.llvm.org/D7568
llvm-svn: 228898
always use the normal copy-initialization rules. Remove a special case that
tries to stay within the list initialization checker here; that makes us do the
wrong thing when list-initialization of an aggregate would not perform
aggregate initialization.
llvm-svn: 228897
based on whether "redundant" braces are ever reasonable as part of the
initialization of the entity, rather than whether the initialization is
"top-level". In passing, add a warning flag for it.
llvm-svn: 228896
On PowerPC, which has a full set of logical operations on (its multiple sets
of) condition-register bits, it is not profitable to break of complex
conditions feeding a jump into multiple jumps. We can turn off this feature of
CGP/SDAGBuilder by marking jumps as "expensive".
P7 test-suite speedups (no regressions):
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
-0.626647% +/- 0.323583%
MultiSource/Benchmarks/Olden/power/power
-18.2821% +/- 8.06481%
llvm-svn: 228895
There was a test in the test suite that was triggering the backtrace logging output that requested that the client pass an execution context. Sometimes we need the process for Objective C types because our static notion of the type might not align with the reality when being run in a live runtime.
Switched from an "ExecutionContext *" to an "ExecutionContextScope *" for greater ease of use.
llvm-svn: 228892
This reverts commit 228874. For some reason users reported
seeing Clang taking up 25+GB of memory and bringing down
machines with this change. Reverting until we figure it out.
llvm-svn: 228890
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.
llvm-svn: 228885
Summary:
google-global-names-in-headers flags global namespace pollution in header files.
Right now it only triggers on using declarations and directives.
Reviewers: alexfh
Subscribers: curdeius
Differential Revision: http://reviews.llvm.org/D7563
llvm-svn: 228875
For Windows, filename_pos() tries to find the filename by
searching for separators after the last :. Instead, it should
really check for the only location that a : is valid, which is
in the second character, and search for separators after that.
llvm-svn: 228874
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.
Reviewers: mcrosier
Reviewed By: mcrosier
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7286
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 228872
and even before that, it was never implemented. Just define it to zero
instead, so compiler-rt can compile on FreeBSD 11 and later.
Differential Revision: http://reviews.llvm.org/D7485
llvm-svn: 228871