A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.
Differential Revision: https://reviews.llvm.org/D52257
llvm-svn: 342898
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.
- PrintIR routines implement printing callbacks.
- StandardInstrumentations class provides a central place to manage all
the "standard" in-tree pass instrumentations. Currently it registers
PrintIR callbacks.
Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923
llvm-svn: 342896
- When return address signing is enabled, the LR may be signed on function entry
- When an exception is thrown the return address is inspected used to unwind the call stack
- Before this happens, the return address must be correctly authenticated to avoid causing an abort by dereferencing the signed pointer
Differential Revision: https://reviews.llvm.org/D51432
llvm-svn: 342895
Implement final argument precedence if multiple /debug arguments are passed on the command-line to match expected link.exe behavior.
Support /debug:none and emit warning for /debug:fastlink with automatic fallback to /debug:full.
Emit error if last /debug:option is unknown.
Emit warning if last /debugtype:option is unknown.
https://reviews.llvm.org/D50404
llvm-svn: 342894
Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.
This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.
llvm-svn: 342892
- The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified
- The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers
- This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389
Differential Revision: https://reviews.llvm.org/D52082
llvm-svn: 342891
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for
use in the later select.
As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q
But we're not using 'uaddo' in those cases via DAG transforms. This happens in
CGP after D8889 without checking target lowering to see if the op is supported.
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned
saturated add and converts to uaddo without checking target capabilities.
This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title
(unlike x86 which sees improvements for all sizes because all sizes are 'custom').
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases.
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO,
so this patch will fire on those tests.
Another possibility given the existing behavior: we could remove the legal-or-custom
check altogether because we're assuming that a UADDO sequence is canonical/optimal
before we ever reach here. But that seems like a bug to me. If the target doesn't
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining
using a UADDO node. This is similar justification for why we don't canonicalize IR to
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first
place.
Differential Revision: https://reviews.llvm.org/D51929
llvm-svn: 342886
Discussed on cfe-commits (Week-of-Mon-20180820), this change leads to
the generation of invalid IR for OpenCL without giving an error.
Therefore, the conclusion was to revert.
llvm-svn: 342885
The r337288 tried to fix result of icmp i1 when its input is not sanitized
by falling back to DagISel. While it now produces the correct result for
bit 0, the other bits can still hold arbitrary value which is not supported
by MipsFastISel branch lowering. This patch fixes the issue by falling back
to DagISel in this case.
Patch by Dragan Mladjenovic.
Differential Revision: https://reviews.llvm.org/D52045
llvm-svn: 342884
[Clang][CodeGen][ObjC]: Fix non-bridged CoreFoundation builds on ELF targets
that use `-fconstant-cfstrings`. The original changes from differential
for a similar patch to PE/COFF (https://reviews.llvm.org/D44491) did not
check for an edge case where the global could be a constant which surfaced
as an issue when building for ELF because of different linkage semantics.
This patch addresses several issues with crashes related to CF builds on ELF
as well as improves data layout by ensuring string literals that back
the actual CFConstStrings end up in .rodata in line with Mach-O.
Change itself tested with CoreFoundation on Linux x86_64 but should be valid
for BSD-like systems as well that use ELF as the native object format.
Differential Revision: https://reviews.llvm.org/D52344
llvm-svn: 342883
gcc uses operand modifier 'x' in inline asm for VSX registers.
Without this modifier, instructions which use VSX numbering for their
operands are printed as VMX registers. This patch adds support for the
operand modifier 'x'.
Differential Revision: https://reviews.llvm.org/D52244
llvm-svn: 342882
LSan can be enabled by itself or as part of the address sanitizer.
Rather than checking the enabled sanitizers for both, just set the LSan
env options whenever a sanitizer is enabled.
llvm-svn: 342881
It would be best to introduce ISD::BitFieldExtract,
because clearly more than one backend faces the same problem.
But for now let's solve this in the x86-specific DAG combine.
https://bugs.llvm.org/show_bug.cgi?id=38938
llvm-svn: 342880
If the alignment is at least 4, this should report true.
Something still seems off with how < 4-byte types are
handled here though.
Fixing this seems to change how some combines get
to where they get, but somehow isn't changing the net
result.
llvm-svn: 342879
Summary:
NativeProcessProtocol is an abstract class, but it still contains a
significant amount of code. Some of that code is tested via tests of
specific derived classes, but these tests don't run everywhere, as they
are OS and arch-specific. They are also relatively high-level, which
means some functionalities (particularly the failure cases) are
hard/impossible to test.
In this approach, I replace the abstract methods with mocks, which
allows me to inject failures into the lowest levels of breakpoint
setting code and test the class behavior in this situation.
Reviewers: zturner, teemperor
Subscribers: mgorny, lldb-commits
Differential Revision: https://reviews.llvm.org/D52152
llvm-svn: 342875
A sequence of VMUL and VADD instructions always give the same or better
performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33.
Executing the VMUL and VADD back-to-back requires the same cycles, but
having separate instructions allows scheduling to avoid the hazard between
these 2 instructions.
Differential Revision: https://reviews.llvm.org/D52289
llvm-svn: 342874
This caused miscompilation of WebRTC for Android: PR39060.
> We've had the pass enabled downstream for a couple of weeks and it
> seems to be okay, so enable it by default.
>
> Differential Revision: https://reviews.llvm.org/D51920
llvm-svn: 342873
- The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers
- This is an UNPREDICTABLE instruction and shouldn't be done
- It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch
- This fixes https://bugs.llvm.org/show_bug.cgi?id=38389
Differential Revision: https://reviews.llvm.org/D52085
llvm-svn: 342872
DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to it. As a consequence the debug info
metadata in the final executable was wrong.
Patch by Djordje Todorovic.
Differential Revision: https://reviews.llvm.org/D51968
llvm-svn: 342871
Originally committed in rL342210 but was reverted in rL342260 because
it was causing issues in vectorized code, because I had forgotten to
ensure that we're operating on scalar values.
Original commit message:
On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls which we
can then use smultb, smulbt, smultt with a wide load.
Differential Revision: https://reviews.llvm.org/D51983
llvm-svn: 342870
Summary:
Previously we'd just show the exception and not the output from the
executed script. This is unhelpful in the case that the script actually
reports some useful information on the failure.
Now we print the output and re-raise the exception.
Reviewers: kubamracek, george.karpenkov
Subscribers: #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D52350
llvm-svn: 342869
changing the value of `SANITIZER_MMAP_RANGE_SIZE` to something more
sensible. The available VMA is at most 64GiB and not 256TiB that
was previously being used.
This change gives us several wins:
* Drastically improves LeakSanitizer performance on
Darwin ARM64 devices. On a simple synthentic benchmark
this took leak detection time from ~30 seconds to 0.5 seconds
due to the `ForEachChunk(...)` method enumerating a much smaller
number of regions. Previously we would pointlessly iterate
over a large portion of the SizeClassAllocator32's ByteMap
that would could never be set due it being configured for a much
larger VM space than is actually availble.
* Decreases the memory required for the Primary allocator.
Previously the ByteMap inside the the allocator used
an array of pointers that took 512KiB of space. Now the required
space for the array is 128 bytes.
rdar://problem/43509428
Differential Revision: https://reviews.llvm.org/D51173
llvm-svn: 342868
`Dex` should utilize `FuzzyFindRequest.RestrictForCodeCompletion` flags
and omit symbols not meant for code completion when asked for it.
The measurements below were conducted with setting
`FuzzyFindRequest.RestrictForCodeCompletion` to `true` (so that it's
more realistic). Sadly, the average latency goes down, I suspect that is
mostly because of the empty queries where the number of posting lists is
critical.
| Metrics | Before | After | Relative difference
| ----- | ----- | ----- | -----
| Cumulative query latency (7000 `FuzzyFindRequest`s over LLVM static index) | 6182735043 ns | 7202442053 ns | +16%
| Whole Index size | 81.24 MB | 81.79 MB | +0.6%
Out of 292252 symbols collected from LLVM codebase 136926 appear to be
restricted for code completion.
Reviewers: ioeric
Differential Revision: https://reviews.llvm.org/D52357
llvm-svn: 342866
Summary:
The `set` statements was incorrectly reading the value of the local variable and
setting the value of the parent variable.
Reviewers: tycho, gchatelet, john.brawn
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52343
llvm-svn: 342865
Armv8.4-A adds a few FP16 instructions that can optionally be implemented
in CPUs of Armv8.2-A and above.
This patch adds a feature to clang to permit selection of these
instructions. This interacts with the +fp16 option as follows:
Prior to Armv8.4-A:
*) +fp16fml implies +fp16
*) +nofp16 implies +nofp16fml
From Armv8.4-A:
*) The above conditions apply, additionally: +fp16 implies +fp16fml
Patch by Bernard Ogden.
Differential Revision: https://reviews.llvm.org/D50229
llvm-svn: 342862
Currently, attributes from previous declarations ('inherited attributes')
are added to the end of a declaration's list of attributes. Before
r338800, the attribute list was in reverse. r338800 changed the order
of non-inherited (parsed from the current declaration) attributes, but
inherited attributes are still appended to the end of the list.
This patch appends inherited attributes after other inherited
attributes, but before any non-inherited attribute. This is to make the
order of attributes in the AST correspond to the order in the source
code.
Differential Revision: https://reviews.llvm.org/D50214
llvm-svn: 342861
Summary:
This change spans both LLVM and compiler-rt, where we do the following:
- Add XRay to the LLVMBuild system, to allow for distributing the XRay
trace loading library along with the LLVM distributions.
- Use `llvm-config` better in the compiler-rt XRay implementation, to
depend on the potentially already-distributed LLVM XRay library.
While this is tested with the standalone compiler-rt build, it does
require that the LLVMXRay library (and LLVMSupport as well) are
available during the build. In case the static libraries are available,
the unit tests will build and work fine. We're still having issues with
attempting to use a shared library version of the LLVMXRay library since
the shared library might not be accessible from the standard shared
library lookup paths.
The larger change here is the inclusion of the LLVMXRay library in the
distribution, which allows for building tools around the XRay traces and
profiles that the XRay runtime already generates.
Reviewers: echristo, beanz
Subscribers: mgorny, hiraditya, mboerger, llvm-commits
Differential Revision: https://reviews.llvm.org/D52349
llvm-svn: 342859
This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015.
llvm-svn: 342856
Debian build bots are running Clang 4, which apparently does not support
the "deprecated" attribute properly. Clang pretends to support the attribute,
but the attribute doesn't do anything.
(live example: https://wandbox.org/permlink/0De69aXns0t1D59r)
On a separate note, I'm not sure I understand why we're even running the
libc++ tests under Clang-4. Is this a configuration we support? I can
understand that libc++ should _build_ with Clang 4, but it's not clear
to me that new libc++ headers should be usable under older compilers
like that.
llvm-svn: 342854
Variable Shifts/Rotates using the CL register have different behaviours to the immediate instructions - split accordingly to help remove yet more repeated overrides from the schedule models.
llvm-svn: 342852
This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed.
llvm-svn: 342851