Some conditional branch instructions generated by this pass are checking
the wrong condition code. The instructions TBZ and TBNZ are transformed
into B.GE and B.LT instead of B.PL and B.MI respectively. They should
only be checking the Negative bit.
Differential Revision: https://reviews.llvm.org/D34743
llvm-svn: 306550
Summary: It used to always call into the RealFileSystem before.
Reviewers: bkramer, krasimir, klimek, bruno
Reviewed By: klimek
Subscribers: bruno, cfe-commits
Differential Revision: https://reviews.llvm.org/D34469
llvm-svn: 306549
The current heuristic in isProfitableToIfCvt assumes we have a branch predictor,
and so gives the wrong answer in some cases when we don't. This patch adds a
subtarget feature to indicate that a subtarget has no branch predictor, and
changes the heuristic in isProfitableToiIfCvt when it's present. This gives a
slight overall improvement in a set of embedded benchmarks on Cortex-M4 and
Cortex-M33.
Differential Revision: https://reviews.llvm.org/D34398
llvm-svn: 306547
Summary:
I was testing using this expansion logic in other cases besides
NVPTX, and found some runtime failures due to the lack of a check
for a zero length memcpy/memset before the loop. There is already
such a check in the memmove expansion code though.
Reviewers: hfinkel
Subscribers: jholewinski, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D34707
llvm-svn: 306541
This patch aims to implement the option of allocating new arrays created
by polly on heap instead of stack. To enable this option, a key named
'allocation' must be written in the imported json file with the value
'heap'.
We need such a feature because in a next iteration, we will implement a
mechanism of maximal static expansion which will need a way to allocate
arrays on heap. Indeed, the expansion is very costly in terms of memory
and doing the allocation on stack is not worth considering.
The malloc and the free are added respectively at polly.start and
polly.exiting such that there is no use-after-free (for instance in case
of Scop in a loop) and such that all memory cells allocated with a
malloc are free'd when we don't need them anymore.
We also add :
- In the class ScopArrayInfo, we add a boolean as member called IsOnHeap
which represents the fact that the array in allocated on heap or not.
- A new branch in the method allocateNewArrays in the ISLNodeBuilder for
the case of heap allocation. allocateNewArrays now takes a BBPair
containing polly.start and polly.exiting. allocateNewArrays takes this
two blocks and add the malloc and free calls respectively to
polly.start and polly.exiting.
- As IntPtrTy for the malloc call, we use the DataLayout one.
To do that, we have modified :
- createScopArrayInfo and getOrCreateScopArrayInfo such that it returns
a non-const SAI, in order to be able to call setIsOnHeap in the
JSONImporter.
- executeScopConditionnaly such that it return both start block and end
block of the scop, because we need this two blocs to be able to add
the malloc and the free calls at the right position.
Differential Revision: https://reviews.llvm.org/D33688
llvm-svn: 306540
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.
Majority of the changes in this patch:
1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.
These changes are target independent and described below.
Changed CFI instructions so that they:
1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal
Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.
Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D18046
llvm-svn: 306529
The original patch was an improvement to IR ValueTracking on non-negative
integers. It has been checked in to trunk (D18777, r284022). But was disabled by
default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
Reviewers: reames
Differential Revision: https://reviews.llvm.org/D34101
Patch by: Olga Chupina <olga.chupina@intel.com>
llvm-svn: 306528
This is PR33596. Previously LLD would crash
because BYTE command synthesized output section,
but it was not assigned to Sec member of OutputSectionCommand.
Behaviour of -script and -r combination is not well defined,
but it seems after this change LLD naturally inherits behavior of
GNU linkers - creates output section requested in script and does not
crash anymore.
Differential revision: https://reviews.llvm.org/D34676
llvm-svn: 306527
This fixes PR33598.
Size field for undefined symbols is not significant.
Setting it to fixed value, like zero, may be useful though.
For example when we have 2 DSO's, like in this PR, if lower level DSO may
change slightly (in part of some symbol's st_size) and higher-level DSO is
rebuilt, then tools that monitoring checksum of high level DSO file can notice
it and trigger cascade of some other unnecessary actions.
If we set st_size to zero, that can be avoided.
Differential revision: https://reviews.llvm.org/D34673
llvm-svn: 306526
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.
This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.
Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D33186
llvm-svn: 306525
Summary:
This commit adds the tests for clamp pattern as a prerequisite of
D33186 to make the impact of that fix more clear and also to document
current behavior.
Reviewers: spatel, jmolloy
Reviewed By: spatel
Subscribers: n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D34350
llvm-svn: 306524
When -ffunction-sections and ARM C++ exceptions are used each .text.suffix
section will have at least one .ARM.exidx.suffix section and may have an
additional .ARM.extab.suffix section if the unwinding instructions are too
large to inline into the .ARM.exidx table entry. For a large program without
a linker script this can lead to a large number of section header table
entries that can increase the size of the ELF file.
This change introduces a default rule for .ARM.extab.* to be placed in
a single output section called .ARM.extab . This follows the behavior of
ld.gold and ld.bfd.
fixes pr33407
Differential Revision: https://reviews.llvm.org/D34678
llvm-svn: 306522
Summary:
instead of using a boolean to differentiate between the two section
types, use an enum to make the intent clearer.
I also remove the RegisterKind argument from the constructor, as this
can be deduced from the Type argument.
Reviewers: clayborg, jasonmolenda
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D34681
llvm-svn: 306521
Fix crash in clang when an array of unknown bounds of an incomplete type is passed to __has_trivial_destructor.
Patch by Puneetha
https://reviews.llvm.org/D34198
llvm-svn: 306519
With fix in include folder character case:
#include "llvm/Codegen/AsmPrinter.h" -> #include "llvm/CodeGen/AsmPrinter.h"
Original commit message:
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.
That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.
Differential revision: https://reviews.llvm.org/D34328
llvm-svn: 306517
Summary:
This patch implements support for Intel(R) Processor Trace
in lldb server. The changes have support for
starting/stopping and reading the trace data. The code
is only available on Linux versions where the perf
attributes for aux buffers are available.
The patch also consists of Unit tests for testing the
core buffer reading function.
Reviewers: lldb-commits, labath, clayborg, zturner, tberghammer
Reviewed By: labath, clayborg
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D33674
llvm-svn: 306516
The benchmarking summarized in
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113525.html showed
this is beneficial for a wide range of cores.
As is to be expected, quite a few small adaptations are needed to the
regressions tests, as the difference in scheduling results in:
- Quite a few small instruction schedule differences.
- A few changes in register allocation decisions caused by different
instruction schedules.
- A few changes in IfConversion decisions, due to a difference in
instruction schedule and/or the estimated cost of a branch mispredict.
llvm-svn: 306514
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.
That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.
Differential revision: https://reviews.llvm.org/D34328
llvm-svn: 306512
error message
CMakeFiles/llvm-lto2.dir/llvm-lto2.cpp.o: In function `dumpSymtab(int, char**)':
llvm-lto2.cpp:(.text._ZL10dumpSymtabiPPc+0x238): undefined reference to `llvm::getBitcodeFileContents(llvm::MemoryBufferRef)'
collect2: error: ld returned 1 exit status
llvm-svn: 306507
A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive
invocations, and we don't create a ConstantInt if 'true' branch is taken.
Differential Revision: https://reviews.llvm.org/D34672
llvm-svn: 306503
Summary:
This change introduces two files that show exaples of the
always/never instrument files that can be provided to clang. We don't
add these as defaults yet in clang, which we can do later on (in a
separate change).
We also add a test that makes sure that these apply in the compiler-rt
project tests, and that changes in clang don't break the expectations in
compiler-rt.
Reviewers: pelikan, kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34669
llvm-svn: 306502
That is pretty common for clang to produce code like
(shl %x, (and %amt, 31)). In this situation we can still perform
trunc (shl) into shl (trunc) conversion given the known value
range of shift amount.
Differential Revision: https://reviews.llvm.org/D34723
llvm-svn: 306499