Patch fixes bug with codegen for lastprivate loop counters. Also it may
improve performance for lastprivates calculations in some cases.
llvm-svn: 261209
This avoids a operator precedence warning for mixing + and | in an
expression. I checked that this matches the definition in the Split
DWARF proposal.
Patch by Cong Liu!
Differential Revision: http://reviews.llvm.org/D17375
llvm-svn: 261207
SUMMARY:
This patch implements ArchSpec::GetClangTargetCPU() that provides string representing current architecture as a target CPU.
This string is then passed to tools like clang so that they generate correct code for that target.
Reviewers: clayborg, zturner
Subscribers: mohit.bhakkad, sagar, jaydeep, lldb-commits
Differential Revision: http://reviews.llvm.org/D17022
llvm-svn: 261206
* Generate artificial symbol names from eh_fame during symbol parsing
so these symbols are already present when we calcualte the size of
the symbols where 0 is specified.
* Fix symbol size calculation for the last symbol in the file where
it have to last until the end of the parent section.
This is the re-commit of the original change after fixing some test
failures on OSX.
Differential revision: http://reviews.llvm.org/D16996
llvm-svn: 261205
to force debug build and hopefully enable more precise warnings.
Static Analyzer is much more efficient when built in debug mode
(-UNDEBUG) so we advice users to enable it manually. This may be
inconvenient in case of large complex projects (think about Linux
distros e.g. Android or Tizen). This patch adds a flag to scan-build
which inserts -UNDEBUG automatically.
Differential Revision: http://reviews.llvm.org/D16200
llvm-svn: 261204
convert one test to use this.
This is a particularly significant milestone because it required
a working per-function AA framework which can be queried over each
function from within a CGSCC transform pass (and additionally a module
analysis to be accessible). This is essentially *the* point of the
entire pass manager rewrite. A CGSCC transform is able to query for
multiple different function's analysis results. It works. The whole
thing appears to actually work and accomplish the original goal. While
we were able to hack function attrs and basic-aa to "work" in the old
pass manager, this port doesn't use any of that, it directly leverages
the new fundamental functionality.
For this to work, the CGSCC framework also has to support SCC-based
behavior analysis, etc. The only part of the CGSCC pass infrastructure
not sorted out at this point are the updates in the face of inlining and
running function passes that mutate the call graph.
The changes are pretty boring and boiler-plate. Most of the work was
factored into more focused preperatory patches. But this is what wires
it all together.
llvm-svn: 261203
In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool.
The test case in question makes use of this to recognise the shuffle mask is an unary UNPCKL pattern and simplifies accordingly.
llvm-svn: 261201
analysis passes, support pre-registering analyses, and use that to
implement parsing and pre-registering a custom alias analysis pipeline.
With this its possible to configure the particular alias analysis
pipeline used by the AAManager from the commandline of opt. I've updated
the test to show this effectively in use to build a pipeline including
basic-aa as part of it.
My big question for reviewers are around the APIs that are used to
expose this functionality. Are folks happy with pass-by-lambda to do
pass registration? Are folks happy with pre-registering analyses as
a way to inject customized instances of an analysis while still using
the registry for the general case?
Other thoughts of course welcome. The next round of patches will be to
add the rest of the alias analyses into the new pass manager and wire
them up here so that they can be used from opt. This will require
extending the (somewhate limited) functionality of AAManager w.r.t.
module passes.
Differential Revision: http://reviews.llvm.org/D17259
llvm-svn: 261197
Our support for C++ EH is sufficiently good that it makes sense to
enable support for it out of the box.
While we are here, update the MSVCCompatibility doc.
llvm-svn: 261195
There seems to be a difference between 2.12.1 and 2.12.2 in 64-bit build.
Tested on Scientific Linux 6.6, based on RHEL.
Differential Revision: http://reviews.llvm.org/D17190
llvm-svn: 261193
Clang implements an enable_if attribute as an extension. Hook up `-Wpedantic`
to issue an extension usage warning when __enable_if__ is used.
llvm-svn: 261192
While we still do want reducible control flow, the RequiresStructuredCFG
flag imposes more strict structure constraints than WebAssembly wants.
Unsetting this flag enables critical edge splitting and tail merging.
Also, disable TailDuplication explicitly, as it doesn't support virtual
registers, and was previously only disabled by the RequiresStructuredCFG
flag.
llvm-svn: 261190
The commit breaks stage2 compilation on PowerPC. Reverting for now while
this is analyzed. I also have to revert the LiveIntervalTest for now as
that depends on this commit.
Revert "LiveIntervalAnalysis: Remove LiveVariables requirement"
This reverts commit r260806.
Revert "Remove an unnecessary std::move to fix -Wpessimizing-move warning."
This reverts commit r260931.
Revert "Fix typo in LiveIntervalTest"
This reverts commit r260907.
Revert "Add unittest for LiveIntervalAnalysis::handleMove()"
This reverts commit r260905.
llvm-svn: 261189
Changes:
- Added disassembler project
- Fixed all decoding conflicts in .td files
- Added DecoderMethod=“NONE” option to Target.td that allows to
disable decoder generation for an instruction.
- Created decoding functions for VS_32 and VReg_32 register classes.
- Added stubs for decoding all register classes.
- Added several tests for disassembler
Disassembler only supports:
- VI subtarget
- VOP1 instruction encoding
- 32-bit register operands and inline constants
[Valery]
One of the point that requires to pay attention to is how decoder
conflicts were resolved:
- Groups of target instructions were separated by using different
DecoderNamespace (SICI, VI, CI) using similar to AssemblerPredicate
approach.
- There were conflicts in IMAGE_<> instructions caused by two
different reasons:
1. dmask wasn’t specified for the output (fixed)
2. There are image instructions that differ only by the number of
the address components but have the same encoding by the HW spec. The
actual number of address components is determined by the HW at runtime
using image resource descriptor starting from the VGPR encoded in an
IMAGE instruction. This means that we should choose only one instruction
from conflicting group to be the rule for decoder. I didn’t find the way
to disable decoder generation for an arbitrary instruction and therefore
made a onelinear fix to tablegen generator that would suppress decoder
generation when DecoderMethod is set to “NONE”. This is a change that
should be reviewed and submitted first. Otherwise I would need to
specify different DecoderNamespace for every instruction in the
conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not
used in other targets.
3. IMAGE_GATHER decoder generation is for now disabled and to be
done later.
[/Valery]
Patch By: Sam Kolton
Differential Revision: http://reviews.llvm.org/D16723
llvm-svn: 261185
These passes are optimizations, and should be disabled when not
optimizing.
Also create an MCCodeGenInfo so the opt level is correctly plumbed to
the backend pass manager.
Also remove the command line flag for disabling register coloring;
running llc with -O0 should now be useful for debugging, so it's not
necessary.
Differential Revision: http://reviews.llvm.org/D17327
llvm-svn: 261176
After r261154, we were only clearing flags if the known-zero register was
originally live-in to the basic block, but we have to do it even if not when
more than one COPY has been eliminated, otherwise the user of the first COPY
may still have <kill> marked.
E.g.
BB#N:
%X0 = COPY %XZR
STRXui %X0<kill>, <fi#0>
%X0 = COPY %XZR
STRXui %X0<kill>, <fi#1>
We can eliminate both copies, X0 is not live-in, but we must clear the kill on
the first store.
Unfortunately, I've been unable to come up with a non-fragile test for this.
I've only seen it in the wild with regalloc-created spills, and attempts to
reproduce that in a reasonable way run afoul of COPY coalescing. Even volatile
asm clobbers were moved around. Should fix the aarch64 bot though.
llvm-svn: 261175
Also implements the PDBSymbolCompilandEnv::getValue() method,
which until now had been unimplemented specifically because
variant did not support string values.
llvm-svn: 261173
We prematurely ended the line at the null byte which caused us to crash
down stream because we tried to reason about columns beyond the end of
the line.
llvm-svn: 261171
described by an immediate.
Found via http://reviews.llvm.org/D16867
Thanks to Paul Robinson for pointing this out.
<rdar://problem/24456528>
llvm-svn: 261168
This code was doing the right thing for the iOS simulator, but not other simulator platforms
Fix it by making the warning not happen for all platforms whose name ends in "-simulator"
Since this code lives in AppleObjCRuntimeV2.cpp, this already only applies to Apple platforms by definition, so I am not too worried about conflicts with other vendors
llvm-svn: 261165
An optional nopartial can be placed after the platform name.
int bar() __attribute__((availability(macosx,nopartial,introduced=10.12))
When deploying back to a platform version prior to when the declaration was
introduced, with 'nopartial', Clang emits an error specifying that the function
is not introduced yet; without 'nopartial', the behavior stays the same: the
declaration is `weakly linked`.
A member is added to the end of AttributeList to save the location of the
'nopartial' keyword. A bool member is added to AvailabilityAttr.
The diagnostics for 'nopartial' not-yet-introduced is handled in the same way as
we handle unavailable cases.
Reviewed by Doug Gregor and Jordan Rose.
rdar://23791325
llvm-svn: 261163
thousands of modules, each of which declares the same namespace, linearly
scanning the redecl chain looking for a visible declaration (once for each leaf
module, for each use) performs very poorly. Namespace visibility can only
decrease when we leave a module during a module build step, and we never care
*which* visible declaration of a namespace we find, so we can cache this very
effectively.
This results in a 35x speedup on one of our internal build steps (2m -> 3.5s),
but is hard to unit test because it requires a very large number of modules.
Ideas for a test appreciated! No functionality change intended other than the
speedup.
llvm-svn: 261161