Commit Graph

138883 Commits

Author SHA1 Message Date
Fangrui Song d6fadc49e3 [gcov] Process .gcda immediately after the accompanying .gcno instead of doing all .gcda after all .gcno
i.e. change the work flow from

* .gcno for function A
* .gcno for function B
* .gcno for function C
* .gcda for function A
* .gcda for function B
* .gcda for function C

to

* .gcno for function A
* .gcda for function A
* .gcno for function B
* .gcda for function B
* .gcno for function C
* .gcda for function C

Currently there is duplicate logic in .gcno & .gcda processing: how functions
are filtered, which edges are instrumented, etc. This refactor enables simplification.

Since we always process .gcno, in -fprofile-arcs -fno-test-coverage mode,
__llvm_internal_gcov_emit_function_args.0 will have non-zero checksums.
2020-09-12 13:53:03 -07:00
Fangrui Song 7d3825ed95 Revert "[gcov] emitProfileArcs: iterate over GCOVFunction's instead of Function's to avoid duplicated filtering"
This reverts commit 412c9c0bf2.
2020-09-12 12:34:43 -07:00
Fangrui Song 412c9c0bf2 [gcov] emitProfileArcs: iterate over GCOVFunction's instead of Function's to avoid duplicated filtering 2020-09-12 12:21:32 -07:00
Fangrui Song c55c14837e [gcov] Clean up by getting llvm.dbg.cu earlier 2020-09-12 12:21:32 -07:00
Craig Topper ad3d6f993d [SelectionDAG][X86][ARM][AArch64] Add ISD opcode for __builtin_parity. Expand it to shifts and xors.
Clang emits (and (ctpop X), 1) for __builtin_parity. If ctpop
isn't natively supported by the target, this leads to poor codegen
due to the expansion of ctpop being more complex than what is needed
for parity.

This adds a DAG combine to convert the pattern to ISD::PARITY
before operation legalization. Type legalization is updated
to handled Expanding and Promoting this operation. If after type
legalization, CTPOP is supported for this type, LegalizeDAG will
turn it back into CTPOP+AND. Otherwise LegalizeDAG will emit a
series of shifts and xors followed by an AND with 1.

I've avoided vectors in this patch to avoid more legalization
complexity for this patch.

X86 previously had a custom DAG combiner for this. This is now
moved to Custom lowering for the new opcode. There is a minor
regression in vector-reduce-xor-bool.ll, but a follow up patch
can easily fix that.

Fixes PR47433

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87209
2020-09-12 11:42:18 -07:00
Florian Hahn e082dee2b5 [DSE] Bail out on MemoryPhis when deleting stores at end of function.
When deleting stores at the end of a function, we have to do PHI
translation, otherwise we might miss reads in different iterations of a
loop. See multiblock-loop-carried-dependence.ll for details.

This fixes a mis-compile and surprisingly also increases the number of
eliminated stores from 26047 to 26572 for MultiSource/SPEC2000/SPEC2006
on X86 with -O3 -flto. This is most likely because we save budget by not
exploring through MemoryPhis, which are less likely to result in valid
candidates for elimination.

The issue was reported post-commit for fb109c42d9.
2020-09-12 19:05:59 +01:00
David Green 74760bb00f [LV][ARM] Add preferInloopReduction target hook.
This allows the backend to tell the vectorizer to produce inloop
reductions through a TTI hook.

For the moment on ARM under MVE this means allowing integer add
reductions of the correct size. In the future this can include integer
min/max too, under -Os.

Differential Revision: https://reviews.llvm.org/D75512
2020-09-12 17:47:04 +01:00
Paul C. Anagnostopoulos 8ce75e2778 TableGen: change a couple of member names to clarify their use. 2020-09-12 12:21:36 -04:00
Simon Pilgrim 3170d54842 [InstCombine][X86] Covert masked load/stores with (sign extended) bool vector masks to generic intrinsics.
As detailed on PR11210, if the mask is known to come from a (sign extended) bool vector (e.g. comparisons) then we can represent with a generic masked load/store without losing anything.

We already do something similar for BLENDV -> SELECT conversion.
2020-09-12 15:09:28 +01:00
Evgeny Leviant 2e61cd1295 [MachineScheduler] Fix operand scheduling for pre/post-increment loads
Differential revision: https://reviews.llvm.org/D87557
2020-09-12 16:53:12 +03:00
Tyker 78de7297ab Reland [AssumeBundles] Use operand bundles to encode alignment assumptions
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complemantary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignemnt
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as a "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.
2020-09-12 15:36:06 +02:00
David Green 6cfd38d03d [ARM] Fixup single source mla reductions.
This fixes a complication on top of D87276. If we are sign extending
around a mul with the two operands that are the same, instcombine will
helpfully convert one of the sext to a zext. Reverse that so that we
again generate a reduction.

Differnetial Revision: https://reviews.llvm.org/D87287
2020-09-12 14:31:26 +01:00
Sanjay Patel 3a8ea8609b [Intrinsics] define semantics for experimental fmax/fmin vector reductions
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

This is hopefully the final remaining showstopper before we can remove
the 'experimental' from the reduction intrinsics.

No behavior was specified for the FP min/max reductions, so we have a
mess of different interpretations.

There are a few potential options for the semantics of these max/min ops.
I think this is the simplest based on current behavior/implementation:
make the reductions inherit from the existing llvm.maxnum/minnum intrinsics.
These correspond to libm fmax/fmin, and those are similar to the (now
deprecated?) IEEE-754 maxNum/minNum functions (NaNs are treated as missing
data). So the default expansion creates calls to libm functions.

Another option would be to inherit from llvm.maximum/minimum (NaNs propagate),
but most targets just crash in codegen when given those nodes because no
default expansion was ever implemented AFAICT.

We could also just assume 'nnan' semantics by default (we are already
assuming 'nsz' semantics in the maxnum/minnum intrinsics), but some targets
(AArch64, PowerPC) support the more defined behavior, so it doesn't make much
sense to not allow a tighter spec. Fast-math-flags (nnan) can be used to
loosen the semantics.

(Note that D67507 was proposed to update the LangRef to acknowledge the more
recent IEEE-754 2019 standard, but that patch seems to have stalled. If we do
update based on the new standard, the reduction instructions can seamlessly
inherit from whatever updates are made to the max/min intrinsics.)

x86 sees a regression here on 'nnan' tests because we have underlying,
longstanding bugs in FMF creation/propagation. Those need to be fixed apart
from this change (for example: https://llvm.org/PR35538). The expansion
sequence before this patch may not have been correct.

Differential Revision: https://reviews.llvm.org/D87391
2020-09-12 09:10:28 -04:00
Simon Pilgrim 50ee0b99ec [InstCombine][X86] getNegativeIsTrueBoolVec - use ConstantExpr evaluators. NFCI.
Don't do this manually, we can just use the ConstantExpr evaluators to do it more tidily for us.
2020-09-12 13:58:58 +01:00
David Green c437446d90 [ARM] Recognize "double extend" reduction patterns
We can sometimes get code that does:
  xe = zext i16 x to i32
  ye = zext i16 y to i32
  m = mul i32 xe, ye
  me = zext i32 m to i64
  r = vecreduce.add(me)
This "double extend" can trip up the reduction identification, but
should give identical results.

This extends the pattern matching to handle them.

Differential Revision: https://reviews.llvm.org/D87276
2020-09-12 13:51:42 +01:00
Nikita Popov 36e2e2e12e [InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322)
This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.

The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:

 * Export the SimplifyWithOpReplaced API from InstSimplify, with an
   added AllowRefinement flag. For replacements inside the TrueVal
   we don't actually care whether refinement occurs or not, the
   replacement is always legal. This part of the transform is now
   done in InstSimplify only. (It should be noted that the current
   AllowRefinement check is not sufficient -- that's an issue we
   need to address separately.)
 * Change the InstCombine fold to work by temporarily dropping
   poison generating flags, running the fold and then restoring the
   flags if it didn't work out. This will ensure that the InstCombine
   fold is correct as long as the InstSimplify fold is correct.

Differential Revision: https://reviews.llvm.org/D87445
2020-09-12 14:45:06 +02:00
Simon Pilgrim 35dc91aee2 [X86][SSE] lowerShuffleAsDecomposedShuffleBlend - support decomposed unpacks for some vXi8/vXi16 cases
Follow up to D86429 to handle the remaining regressions.

This patch generalizes lowerShuffleAsDecomposedShuffleBlend to lowerShuffleAsDecomposedShuffleMerge, and attempts to use an UNPCKL shuffle mask instead of a blend for the cases where the inputs are coming from alternating vXi8/vXi16 sources. Technically they don't have to be alternating (just as long as they can fit into a lower lane half for the unpack) but I didn't find as many general cases and it needed a lot more of the function to be altered.

For vXi32/vXi64 cases this could still be beneficial but in most cases the existing permute+blend approach was better.

Differential Revision: https://reviews.llvm.org/D87405
2020-09-12 13:39:33 +01:00
Jianzhou Zhao 0ece51c60c Add raw_fd_stream that supports reading/seeking/writing
This is used by https://reviews.llvm.org/D86905 to support bitcode
writer's incremental flush.
2020-09-12 07:34:19 +00:00
QingShan Zhang 0680a3d56d [Power10] Enable the heuristic for Power10 and switch the sched model
with P9 Model

Enable the pre-ra and post-ra scheduler strategy for Power10 as we want
to customize the heuristic later. And switch the scheduler model with P9
model before P10 Model is available. The NoSchedModel is modelled as
in-order cpu and the pre-ra scheduler is not bi-directional which will
have big impact on the scheduler.

Reviewed By: jji

Differential Revision: https://reviews.llvm.org/D86865
2020-09-12 02:49:47 +00:00
QingShan Zhang 528554c39b [PowerPC] Set the mayRaiseFPException for FCMPUS/FCMPUD
From ISA, fcmpu will raise the Floating-Point Invalid Operation
Exception (SNaN) if either of the operands is a Signaling NaN by setting
the bit VXSNAN. But the instruction description didn't set the
mayRaiseFPException which might have impact on the scheduling or some
backend optimization.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D83937
2020-09-12 02:42:22 +00:00
Yuanfang Chen ad99e34c59 Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline"
This reverts commit 31ecf8d29d.
This reverts commit 3fdaa8602a.

There is laying violation for Target->CodeGen.
2020-09-11 18:52:32 -07:00
Eli Friedman d751f86189 [ConstantFold] Make areGlobalsPotentiallyEqual less aggressive.
In particular, we shouldn't make assumptions about globals which are
unnamed_addr: we can fold them together with other globals.

Also while I'm here, use isInterposable() instead of trying to
explicitly name all the different kinds of weak linkage.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47090

Differential Revision: https://reviews.llvm.org/D87123
2020-09-11 17:23:08 -07:00
Eli Friedman 37f2776d1a [ConstantFold] Fold binary arithmetic on scalable vector splats.
It's a nice simplification, and it confuses instcombine if we don't do
it.

Differential Revision: https://reviews.llvm.org/D87422
2020-09-11 16:41:58 -07:00
Yuanfang Chen 31ecf8d29d [NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline
Following up on D67687.
Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html

`CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences.
- Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`)
- `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design.
- `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D83608
2020-09-11 16:41:17 -07:00
Fangrui Song 45d0343900 [MC] Allow .org directives in SHT_NOBITS sections
This is used by kvm-unit-tests and can be trivially supported.
2020-09-11 15:12:42 -07:00
Matt Arsenault 382b2b1b51 RegAllocFast: Fix typo in comment 2020-09-11 18:06:14 -04:00
Matt Arsenault e21bb31eb6 CodeGen: Require SSA to run PeepholeOptimizer 2020-09-11 18:03:04 -04:00
Lang Hames 7dcd0042e8 Re-apply "[ORC] Make MaterializationResponsibility immovable..." with fixes.
Re-applies c74900ca67 with fixes for the ThinLtoJIT example.
2020-09-11 14:09:05 -07:00
Mircea Trofin 9a2bab5ea2 [ThinLTO] Make -lto-embed-bitcode an enum
The current behavior of -lto-embed-bitcode is not quite the same as that
of -fembed-bitcode. While both populate .llvmbc with bitcode, the latter
populates it with pre-optimized bitcode(*), while the former with
post-optimized. The scenarios driving them are different - the latter's
goal is to allow re-compilation, while the former, IIUC, is execution.

I plan to add a third mode for thinlto cases, closely-related to
-fembed-bitcode's scenario: adding the bitcode pre-optimization, but
post-merging. This would allow re-compilation without requiring the
other .bc files that were merged (akin to how -fembed-bitcode allows
recompilation without all the .h files)

The third mode can't co-exist with the current -lto-embed-bitcode mode,
because the latter would overwrite it. For clarity, we change
-lto-embed-bitcode to be an enum.

(*) That's the compiler semantics. The driver splits compilation in 2
phases, so if -fembed-bitcode is given to the driver, the .llvmbc is
optimized bitcode; if the option is passed to the compiler (after -cc1),
the section is pre-optimized.

Differential Revision: https://reviews.llvm.org/D87477
2020-09-11 13:24:54 -07:00
Sam Clegg fa2a8acc71 [WebAssembly] Add assembly syntax for mutable globals
This adds and optional ", immutable" to the end of a `.globaltype`
declaration.  I would have prefered to match the `.wat` syntax
where immutable is the default and `mut` is the signifier for
mutable globals.  Sadly changing the default would break backwards
compat with existing assembly in the wild so I think its best
to stick with this approach.

Differential Revision: https://reviews.llvm.org/D87515
2020-09-11 11:11:02 -07:00
Sanjay Patel 40f12ef621 [SLP] further limit bailout for load combine candidate (PR47450)
The test example based on PR47450 shows that we can
match non-byte-sized shifts, but those won't ever be
bswap opportunities. This isn't a full fix (we'd still
match if the shifts were by 8-bits for example), but
this should be enough until there's evidence that we
need to do more (this is a borderline case for
vectorization in the first place).
2020-09-11 11:56:11 -04:00
Krzysztof Parzyszek f92908cc74 [DSE] Make sure that DSE+MSSA can handle masked stores
Differential Revision: https://reviews.llvm.org/D87414
2020-09-11 10:00:21 -05:00
Sanjay Patel 6aa3fc4a5b Revert "[InstCombine] propagate 'nsw' on pointer difference of 'inbounds' geps (PR47430)"
This reverts commit 324a53205a.

On closer examination of at least one of the test diffs,
this does not appear to be correct in all cases. Even the
existing 'nsw' creation may be wrong based on this example:
https://alive2.llvm.org/ce/z/uL4Hw9
https://alive2.llvm.org/ce/z/fJMKQS
2020-09-11 10:54:48 -04:00
Sanjay Patel 324a53205a [InstCombine] propagate 'nsw' on pointer difference of 'inbounds' geps (PR47430)
There's no signed wrap if both geps have 'inbounds':
https://alive2.llvm.org/ce/z/nZkQTg
https://alive2.llvm.org/ce/z/7qFauh
2020-09-11 10:39:09 -04:00
Simon Pilgrim 48b510c4bc [NFC] Fix compiler warnings due to integer comparison of different signedness
Fix by directly using INT_MAX and INT32_MAX.

Patch by: @nullptr.cpp (Yang Fan)

Differential Revision: https://reviews.llvm.org/D87347
2020-09-11 15:32:03 +01:00
Florian Hahn 8da6ae4ce1 Revert "[ConstraintSystem] Add helpers to deal with linear constraints."
This reverts commit 3eb141e507.

This uses __builtin_mul_overflow which is not available everywhere.
2020-09-11 14:49:04 +01:00
Florian Hahn 3eb141e507 [ConstraintSystem] Add helpers to deal with linear constraints.
This patch introduces a new ConstraintSystem class, that maintains a set
of linear constraints and uses Fourier–Motzkin elimination to eliminate
constraints to check if there are solutions for the system.

It also adds a convert-constraint-log-to-z3.py script, which can parse
the debug output of the constraint system and convert it to a python
script that feeds the constraints into Z3 and checks if it produces the
same result as the LLVM implementation. This is for verification
purposes.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D84544
2020-09-11 14:43:22 +01:00
Andrzej Warzynski 4eed800b18 [NFC] Fix the signature and definition of findByPrefix
In https://reviews.llvm.org/rG257b29715bb27b7d9f6c3c40c481b6a4af0b37e5,
the definition of OptTable::Info::Flags was changed from `unsigned
short` to `unsigned int`, but the definition/declaration of
OptTable::findByPrefix wasn't updated to reflect that.

This patch updates findByPrefix accordingly.
2020-09-11 12:38:28 +01:00
Jeremy Morse 0caeaff123 [LiveDebugValues][NFC] Re-land 60db26a66d, add instr-ref tests
This was landed but reverted in 5b9c2b1bea due to asan picking up a memory
leak. This is fixed in the change to InstrRefBasedImpl.cpp. Original
commit message follows:

[LiveDebugValues][NFC] Add instr-ref tests, adapt old tests

This patch adds a few tests in DebugInfo/MIR/InstrRef/ of interesting
behaviour that the instruction referencing implementation of
LiveDebugValues has. Mostly, these tests exist to ensure that if you
give the "-experimental-debug-variable-locations" command line switch,
the right implementation runs; and to ensure it behaves the same way as
the VarLoc LiveDebugValues implementation.

I've also touched roughly 30 other tests, purely to make the tests less
rigid about what output to accept. DBG_VALUE instructions are usually
printed with a trailing !debug-location indicating its scope:

  !debug-location !1234

However InstrRefBasedLDV produces new DebugLoc instances on the fly,
meaning there sometimes isn't a numbered node when they're printed,
making the output:

  !debug-location !DILocation(line: 0, blah blah)

Which causes a ton of these tests to fail. This patch removes checks for
that final part of each DBG_VALUE instruction. None of them appear to
be actually checking the scope is correct, just that it's present, so
I don't believe there's any loss in coverage here.

Differential Revision: https://reviews.llvm.org/D83054
2020-09-11 12:14:44 +01:00
Simon Pilgrim 70a05ee288 [X86] Keep variables from getDataLayout/getDebugLoc calls as const reference. NFCI.
These are only ever used as references in the called functions, so just pass the original reference instead of copying it.
2020-09-11 10:44:42 +01:00
Benjamin Kramer 5405ee553a [CodeGenPrepare] Simplify code. NFCI. 2020-09-11 11:24:08 +02:00
Florian Hahn c0825fa5fc Revert "[ORC] Make MaterializationResponsibility immovable, pass by unique_ptr."
This reverts commit c74900ca67.

This appears to be breaking some builds on macOS and has been causing
build failures on Green Dragon (see below). I am reverting this for now,
to unblock testing on Green Dragon.

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/18144/console

[65/187] /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iexamples/ThinLtoJIT -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT -Iinclude -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -O3  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.9    -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -MF examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o.d -o examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -c /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoDiscoveryThread.cpp
FAILED: examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iexamples/ThinLtoJIT -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT -Iinclude -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -O3  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.9    -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -MF examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o.d -o examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -c /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoDiscoveryThread.cpp
In file included from /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoDiscoveryThread.cpp:7:
/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoInstrumentationLayer.h:37:68: error: non-virtual member function marked 'override' hides virtual member function
  void emit(MaterializationResponsibility R, ThreadSafeModule TSM) override;
                                                                   ^
/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/include/llvm/ExecutionEngine/Orc/Layer.h:103:16: note: hidden overloaded virtual function 'llvm::orc::IRLayer::emit' declared here: type mismatch at 1st parameter ('std::unique_ptr<MaterializationResponsibility>' vs 'llvm::orc::MaterializationResponsibility')
  virtual void emit(std::unique_ptr<MaterializationResponsibility> R,
               ^
1 error generated.
2020-09-11 09:35:20 +01:00
Martin Storsjö e6419d320d [MC] [Win64EH] Fix builds with expensive checks enabled
This fixes a failed assert if expensive checks are enabled,
since 1308bb99e0.
2020-09-11 11:17:01 +03:00
David Sherwood 1e1770a07e [SVE][CodeGen] Fix InlineFunction for scalable vectors
When inlining functions containing allocas of scalable vectors we
cannot specify the size in the lifetime markers, since we don't
know this at compile time.

Added new test here:

  test/Transforms/Inline/AArch64/sve-alloca-merge.ll

Differential Revision: https://reviews.llvm.org/D87139
2020-09-11 08:34:51 +01:00
Yevgeny Rouban 28012e00d8 [NewPM] Introduce PreserveCFG check
Check that all passes, which report they preserve CFG,
are really preserving CFG.
A new standard instrumentation is introduced. It can be
switched on/off by the flag verify-cfg-preserved, which
is on by default for debug builds.

Reviewers: kuhar, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81558
2020-09-11 14:32:21 +07:00
Martin Storsjö 1308bb99e0 [MC] [Win64EH] Write packed ARM64 epilogues if possible
This gives a pretty substantial size reduction; for a 6.5 MB
DLL with 300 KB .xdata, the .xdata shrinks by 66 KB.

Differential Revision: https://reviews.llvm.org/D87369
2020-09-11 10:31:04 +03:00
Martin Storsjö 700fbe591a [MC] [Win64EH] Canonicalize ARM64 unwind opcodes
Convert 2-byte opcodes to equivalent 1-byte ones.

Adjust the existing exhaustive testcase to avoid being altered by
the simplification rules (to keep that test exercising all individual
opcodes).

Fix the assembler parser limits for register pairs; for .seh_save_regp
and .seh_save_regp_x, we can allow up to x29, for a x29+x30 pair
(which gets remapped to the UOP_SaveFPLR(X) opcodes), for .seh_save_fregp
and .seh_save_fregpx, allow up to d14+d15.

Not creating .seh_save_next for float register pairs, as the
actual unwinder implementation in current versions of Windows is buggy
for that case.

This gives a minimal but measurable size reduction. (For a 6.5 MB
DLL with 300 KB .xdata, the .xdata shrinks by 48 bytes. The opcode
sequences are padded to a 4 byte boundary, so very small improvements
might not end up mattering directly.)

Differential Revision: https://reviews.llvm.org/D87367
2020-09-11 10:31:04 +03:00
Martin Storsjö 46416f0803 [CodeGen] [WinException] Remove a redundant explicit section switch for aarch64
The following EmitWinEHHandlerData() implicitly switches to .xdata, just
like on x86_64.

This became orphaned from the original code requiring it in
0b61d220c9 / https://reviews.llvm.org/D61095.

Differential Revision: https://reviews.llvm.org/D87447
2020-09-11 10:31:04 +03:00
Michael Liao f787fe15d8 [EarlyCSE] Remove unnecessary operand swap.
- As min/max are commutative operators, there is no need to swap
  operands. That breaks the convention calculating the hash value.
2020-09-11 02:14:04 -04:00
Alok Kumar Sharma e45b0708ae [DebugInfo] Fixing CodeView assert related to lowerBound field of DISubrange.
This is to fix CodeView build failure https://bugs.llvm.org/show_bug.cgi?id=47287
    after DIsSubrange upgrade D80197

    Assert condition is now removed and Count is calculated in case LowerBound
    is absent or zero and Count or UpperBound is constant. If Count is unknown
    it is later handled as VLA (currently Count is set to zero).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D87406
2020-09-11 11:12:49 +05:30