InstCombine would propagate shufflevector insts that had wider output vectors onto
predecessors, which would sometimes push undef's onto the divisor of a div/rem and
result in bad codegen.
I've fixed this by just banning propagating shufflevector back if the result of
the shufflevector is wider than the input vectors.
Patch by: @sheredom (Neil Henning)
Differential Revision: https://reviews.llvm.org/D52548
llvm-svn: 343329
These are the updated baseline tests for D52548 -
I'm putting the tests next to the tests where the transform
functions as expected, so we can see the intended/unintended
consequences.
Patch by: @sheredom (Neil Henning)
llvm-svn: 343328
https://reviews.llvm.org/D51147
Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.
reviewed by : dsanders.
llvm-svn: 343325
This patch also introduces testing for libomptarget-nvptx
which has been missing until now. I propose to add tests for
all bugs that are fixed in the future.
The target check-libomptarget-nvptx is not run by default because
- we can't determine if there is a GPU plugged into the system.
- it will require the latest Clang compiler. Keeping compatibility
with older releases would prevent testing newer code generation
developed in trunk.
Differential Revision: https://reviews.llvm.org/D51687
llvm-svn: 343324
(1) Print debugging output under a session lock to avoid garbled messages when
compiling on multiple threads.
(2) Name MaterializationUnits, add an ostream operator for them, and so they can
be easily referenced in debugging output, and have that ostream operator
optionally print code/data/hidden symbols provided by that materialization unit
based on command line options.
llvm-svn: 343323
This seems to cause the thread's exit code to be clobbered, breaking
Chromium tests.
Also revert follow-up r342654.
> In long-running builds we've seen some ASan complaints during thread creation that we suspect are due to leftover poisoning from previous threads whose stacks occupied that memory. This patch adds a hook that unpoisons the stack just before the NtTerminateThread syscall.
>
> Differential Revision: https://reviews.llvm.org/D52091
llvm-svn: 343322
- Add fix so that all code paths that create DWARFContext
with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method
llvm-svn: 343317
Lower integer arguments larger then 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D52409
llvm-svn: 343315
On GreenDragon `CodeGen/X86/cpus.ll` is timing out on the bot with Asan
and UBSan enabled. With the same configuration on my machine, the test
passes but takes more than 3 minutes to do so. I could increase the
timeout, but I believe it makes more sense to split up the test because
it allows for more parallelism.
Differential revision: https://reviews.llvm.org/D52603
llvm-svn: 343313
My previous change (rL340911) set the two features for architectures
>= 6, which wrongly includes v6m. Now set to >= 6 and not Cortex-M.
Differential Revision: https://reviews.llvm.org/D52644
llvm-svn: 343309
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.
The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.
It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.
Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D51702
llvm-svn: 343308
This change is in preparation for a future work on improving support for
optimizable register moves. We already know if a write is from a zero-idiom, so
we can propagate that bit of information to the PRF. We use an APInt mask to
identify registers that are set to zero.
llvm-svn: 343307
Review D52594 will change the default in llvm for armv6k from the
non-existent cpu arm1176jf-s to mpcore. The tests in arm-cortex-cpus.c
need to be updated to account for this change.
Differential Revision: https://reviews.llvm.org/D52595
llvm-svn: 343304
The ARMTargetParser.def contains an entry for arm1176j-s which is the
default for the ArmV6K architecture. This cpu does not exist, there are
only arm1176jz-s and arm1176jzf-s and they are both architecture ArmV6KZ.
The only CPUs that are actually ArmV6K are the mpcore, mpcore_nofpu and
later revisions of the arm1136 family r1px (which we don't have a table
entry for).
This patch removes the arm1176j-s and makes mpcore the default for armv6k.
Differential Revision: https://reviews.llvm.org/D52594
llvm-svn: 343303
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.
Differential Revision: https://reviews.llvm.org/D52551
llvm-svn: 343302
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52484
llvm-svn: 343300
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.
Differential Revision: https://reviews.llvm.org/D52605
llvm-svn: 343297
Summary:
Currently,
cd test/xray/TestCases/Posix
$build/bin/llvm-lit fdr-thread-order.cc
fails because `rm fdr-thread-order.*` deletes the .cc file.
This patch uses:
* %t as temporary directory name containing log files
* %t.exe as executable name
It does not delete %t after the test finishes for debugging convenience.
This matches the behavior of tests of various other LLVM components.
Log files will not clog up because the temporary directory (unique among
test files but the same among multiple invocations of a test) is cleaned
at the beginning of the test.
Reviewers: dberris, mboerger, eizan
Reviewed By: dberris
Subscribers: delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D52638
llvm-svn: 343295
Summary: A RichManglingContext constructed with an invalid demangled name or with a demangled function name without any context will have an empty context. This triggers an assertion in RichManglingContext::GetBufferRef() when debugging a native Windows process on x86 when it shouldn't. Remove the assertion.
Reviewers: aleksandr.urakov, zturner, lldb-commits
Reviewed By: zturner
Subscribers: erik.pilkington
Differential Revision: https://reviews.llvm.org/D52626
llvm-svn: 343292
Summary:
This is for coding standard conformance, and for fixing an ODR violation
issue: __xray::ThreadLocalData is defined twice and differently in
xray_fdr_logging.cc and xray_basic_logging.cc
Reviewers: dberris, mboerger, eizan
Reviewed By: dberris
Subscribers: delcypher, jfb, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D52639
llvm-svn: 343289
move constructor.
This is basically the same fix as r343261, but applied to the move constructor:
Failure to lock the context during module destruction can lead to data races if
other threads are operating on the context.
llvm-svn: 343286
render the function deleted instead of rendering the program ill-formed.
This change also adds an enabled-by-default warning for the case where
an explicitly-defaulted special member function of a non-template class
is implicitly deleted by the type checking rules. (This fires either due
to this language change or due to pre-C++20 reasons for the member being
implicitly deleted). I've tested this on a large codebase and found only
bugs (where the program means something that's clearly different from
what the programmer intended), so this is enabled by default, but we
should revisit this if there are problems with this being enabled by
default.
llvm-svn: 343285
Summary:
This change allows us to use the library path from which the LLVM
libraries are installed, in case the LLVM installation generates shared
libraries.
This should address llvm.org/PR39070.
Reviewers: mboerger, eizan
Subscribers: mgorny, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52597
llvm-svn: 343280