This reverts commit 3f572c7b84.
The MSan buildbot was broken by rL367901. This commit
(rL368079) depends on the broken commit, which needs to be reverted,
and thus is itself being reverted.
See https://reviews.llvm.org/rL367901 for more information.
llvm-svn: 368106
Globals are instrumented by adding a pointer tag to their symbol values
and emitting metadata into a special section that allows the runtime to tag
their memory when the library is loaded.
Due to order-of-initialization issues explained in more detail in the comments,
shadow initialization cannot happen during regular global initialization.
Instead, the location of the global section is marked using an ELF note,
and we require libc support for calling a function provided by the HWASAN
runtime when libraries are loaded and unloaded.
Based on ideas discussed with @evgeny777 in D56672.
Differential Revision: https://reviews.llvm.org/D65770
llvm-svn: 368102
This check is only meaningful for COFF, and it is perfectly valid to create
such a GlobalValue in ELF.
Differential Revision: https://reviews.llvm.org/D65686
llvm-svn: 368094
If we're after type legalization, we should only be trying to turn
v2i64 into v2i32. So bitcast to v4i32 and shuffle the even elements
together, then use X86ISD::CVTSI2P. The alternative is to leave
the v2i64 type alone and let it be scalarized. Hopefully keeping
it packed is better.
Fixes PR42905.
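For illustration, a minimal IR sketch of the kind of pattern involved (assumed from the description; the actual reproducer in PR42905 may differ):
```
; Hypothetical example: a v2i64 -> v2f32 conversion. After type
; legalization, rather than scalarizing, we can bitcast the source to
; v4i32, shuffle the even elements together, and emit X86ISD::CVTSI2P.
define <2 x float> @cvt(<2 x i64> %x) {
  %r = sitofp <2 x i64> %x to <2 x float>
  ret <2 x float> %r
}
```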
llvm-svn: 368091
This reverts r368059 (git commit 0f95710976).
This caused Clang to assert while self-hosting and compiling
SystemZInstrInfo.cpp. Reduction is running.
llvm-svn: 368084
This flag is now the default behavior, so we no longer need to
set it in tests.
Some redundant tests have been removed after verifying we have
an equivalent test that didn't use the flag.
llvm-svn: 368079
https://reviews.llvm.org/D65698
This adds a KnownBits analysis pass for GISel. This was done as a
pass (as opposed to static functions) so that we can add other features
such as caching queries (within a pass and across passes) in the future.
This patch only adds the basic pass boilerplate and implements a lazy,
non-caching known-bits implementation (ported from SelectionDAG). I've
also hooked up the AArch64PreLegalizerCombiner pass to use this - there
should be no compile-time regression as the analysis is lazy.
llvm-svn: 368065
Summary:
Currently `reassociateShiftAmtsOfTwoSameDirectionShifts()` only handles
two shifts one after another. If the shifts are `shl`, we can still
easily perform the fold, with no extra legality checks:
https://rise4fun.com/Alive/OQbM
If we have a right-shift, however, we won't be able to make it
any simpler than it already is.
After this, the only thing missing here is constant folding for the
case `NewShAmt >= bitwidth(X)`:
* If it's a logical shift, then constant-fold to `0` (not `undef`)
* If it's an `ashr`, then a splat of the original signbit
https://rise4fun.com/Alive/E1K
https://rise4fun.com/Alive/i0V
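As a minimal IR sketch of the same-direction `shl` case (illustrative, not taken from the patch):
```
; Two left-shifts in a row can be reassociated into a single shift by
; the sum of the amounts (given the sum stays within the bit width):
define i32 @two_shl(i32 %x, i32 %q, i32 %k) {
  %s1 = shl i32 %x, %q
  %s2 = shl i32 %s1, %k
  ; -> %sum = add i32 %q, %k
  ;    %s2  = shl i32 %x, %sum
  ret i32 %s2
}
```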
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65380
llvm-svn: 368059
D62198 introduced an option to relax the checks for
hasOnlyUniformBranches. This commit turns the option on by default, for
better code generation in some cases in AMDGPU.
Differential Revision: https://reviews.llvm.org/D63198
Change-Id: I9cbff002a1e74d3b7eb96b4192dc8129936d537d
llvm-svn: 368042
Remove redundant `yaml2obj-elf-file-headers-with-e_flags.yaml` test
case. The same functionality is checked by `Mips/elf-flags.yaml`.
llvm-svn: 368023
This patch changes the DAG legalizer to respect the operation actions
set by the target for strict floating-point operations. (Currently, the
legalizer will usually fall back to mutate to the non-strict action
(which is assumed to be legal), and only skip mutation if the strict
operation is marked legal.)
With this patch, whenever a strict operation is marked as Legal or
Custom, it is passed to the target as usual. Only if it is marked as
Expand will the legalizer attempt to mutate to the non-strict operation.
Note that this will now fail if the non-strict operation is itself
marked as Custom -- the target will have to provide a Custom definition
for the strict operation then as well.
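For reference, strict floating-point operations correspond to the constrained intrinsics in IR; a minimal sketch:
```
; A strict FP addition. Whether the target marks the corresponding
; strict operation Legal, Custom, or Expand now determines how the
; legalizer treats it.
declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

define double @strict_add(double %a, double %b) {
  %r = call double @llvm.experimental.constrained.fadd.f64(double %a, double %b, metadata !"round.dynamic", metadata !"fpexcept.strict")
  ret double %r
}
```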
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D65226
llvm-svn: 368012
There are multiple yaml2obj-* tests in the llvm/test/Object
folder. This is not the correct place to have them, and my intention
was to move them out to the test\tools\yaml2obj folder. I reviewed
them, made some changes, and my comments are below.
For all tests I:
Added comments when needed.
Moved them from llvm/test/Object to the yaml2obj tests.
Other changes performed:
1) yaml2obj-invalid.yaml. It was a test for an invalid YAML input.
I just moved it.
2) yaml2obj-coff-multi-doc.test/yaml2obj-elf-multi-doc.test:
these were tests for the --docnum=x functionality,
one for COFF and one for ELF. I merged them into one.
3) yaml2obj-elf-bits-endian.test:
I removed its 4 YAML inputs (merged into the main test).
4) yaml2obj-readobj.test:
This file has a long history. It was initially added to check the
parsing of header characteristics. Then it was used to test
how yaml2obj writes the relocations. Then it was upgraded to check how
yaml2obj handles the "-o" option. I think it should be heavily split up
and refactored in a separate patch. For now I left it as is, but restyled it
to reduce the changes in follow-ups.
5) yaml2obj-elf-alignment.yaml: its intention was to check that we
can set the sh_addralign field. I moved, renamed (to elf-sh-addralign.yaml)
and updated this test.
6) yaml2obj-elf-file-headers.yaml: I removed it.
Its intention was to check that
yaml2obj handles OS/ABI and ELF type (e.g. Relocatable).
We are testing this already, for example in D64800. We might want
to add a better (more complete) test, but keeping the existing test
does not make much sense, I think.
7) yaml2obj-elf-file-headers-with-e_flags.yaml: I would describe its intention
as "testing MIPS e_flags". It is far from complete and tests only
a few flags. I left it alone for now.
8) yaml2obj-elf-rel.yaml: its intention is to check the MIPS32 relocations.
We have a version for MIPS64 here: test\Object\Mips\elf-mips64-rel.yaml
It seems they are both incomplete. I left them alone for now.
9) yaml2obj-elf-rel-noref.yaml: was introduced to check the support of the arm32
R_ARM_V4BX relocation. I left it alone for now.
10) yaml2obj-elf-section-basic.yaml: it just checked that we are able to recognise
trivial fields like the section 'Name', 'Type', 'Flags' and others. All of our yaml2obj
tests already exercise this heavily. I just removed this test.
11) yaml2obj-elf-section-invalid-size.yaml: its intention was to check the
"Section size must be greater than or equal to the content size" error.
I moved this test to `tools\yaml2obj\section-size-content.yaml`.
12) yaml2obj-elf-symbol-basic.yaml: its intention seems to have been to test declarations
of symbols in yaml2obj. I removed it. We use this in almost every test we already have.
13) yaml2obj-elf-symbol-LocalGlobalWeak.yaml: its intention was to check that we can
declare different symbol bindings. I moved it to tools\yaml2obj\elf-symbol-binding.yaml.
14) yaml2obj-coff-invalid-alignment.test: checks that an error is reported for a too-large COFF
section alignment. Moved it to tools\yaml2obj\coff-invalid-alignment.test
15) yaml2obj-elf-symbol-visibility.yaml: tests ELF symbol visibility. I improved it and
moved it to tools\yaml2obj\elf-symbol-visibility.yaml and tools\obj2yaml\elf-symbol-visibility.yaml
Differential revision: https://reviews.llvm.org/D65652
llvm-svn: 367988
A function is "no-return" if we never reach a return instruction, either
because there are none or the ones that exist are dead.
Tests have been adjusted:
- either noreturn was added, or
- noreturn was avoided by modifying the code.
The new noreturn_{sync,async} tests make sure we handle invoke
instructions with a noreturn (and potentially nounwind) callee
correctly, even in the presence of potential asynchronous exceptions.
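A minimal IR sketch (illustrative, not taken from the tests) of a function the deduction can handle:
```
declare void @abort() noreturn

; No return instruction is reachable here, so @wrapper itself can be
; deduced to be noreturn.
define void @wrapper() {
entry:
  call void @abort()
  unreachable
}
```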
llvm-svn: 367948
Summary:
When the WebAssembly backend encounters a return type that doesn't
fit within i32, SelectionDAG performs sret demotion, adding an
additional argument to the start of the function that contains
a pointer to an sret buffer to use instead. However, this conflicts
with the emscripten sjlj lowering pass. There we translate calls like:
```
call {i32, i32} @foo()
```
into (in pseudo-llvm)
```
%addr = @foo
call {i32, i32} @__invoke_{i32,i32}(%addr)
```
i.e. we perform an indirect call through an extra function.
However, the sret transform now transforms this into
the equivalent of
```
%addr = @foo
%sret = alloca {i32, i32}
call {i32, i32} @__invoke_{i32,i32}(%sret, %addr)
```
(while simultaneously translating the implementation of @foo as well).
Unfortunately, this doesn't work out. The __invoke_ ABI expects
the function address to be the first argument, causing crashes.
There are several possible ways to fix this:
1. Implementing the sret rewrite at the IR level as well and performing
it as part of lowering to __invoke
2. Fixing the wasm backend to recognize that __invoke has a special ABI
3. A change to the binaryen/emscripten ABI to recognize this situation
This revision implements the middle option, teaching the backend to
treat __invoke_ functions specially in sret lowering. This is achieved
by
1) Introducing a new CallingConv ID for invoke functions
2) When this CallingConv ID is seen in the backend and the first argument
is marked as sret (a function pointer would never be marked as sret),
swapping the first two arguments.
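With the swap applied, the earlier example effectively becomes (pseudo-llvm, same notation as above):
```
%addr = @foo
%sret = alloca {i32, i32}
call {i32, i32} @__invoke_{i32,i32}(%addr, %sret)
```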
Reviewed By: tlively, aheejin
Differential Revision: https://reviews.llvm.org/D65463
llvm-svn: 367935
When we remove instructions, cached references could still be live. This
patch avoids removing invoke instructions that are replaced by calls and
instead keeps them around but in a dead block.
llvm-svn: 367933
Any addresses that we pass to llvm-symbolizer are going to be untagged,
while any HWASAN-instrumented globals are going to be tagged in the
symbol table. Therefore we need to untag the addresses before using them.
Differential Revision: https://reviews.llvm.org/D65769
llvm-svn: 367926
FastISel has already done this since the initial arm64 port was upstreamed, so
it seems there are no issues with doing this at -O0 for very small memcpys.
Gives a 0.2% geomean code size improvement on CTMark.
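A hedged sketch of the kind of IR affected (the size here is an assumption for illustration; the exact inlining threshold is not specified):
```
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

; A very small fixed-size memcpy like this can be selected as a
; load/store pair at -O0 instead of a libcall.
define void @copy8(i8* %dst, i8* %src) {
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 8, i1 false)
  ret void
}
```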
Differential Revision: https://reviews.llvm.org/D65758
llvm-svn: 367919
Sets the section alignments of the specified architecture slices to the
given alignment values.
Alignment values are hexadecimal numbers that are powers of 2.
Differential Revision: https://reviews.llvm.org/D65420
llvm-svn: 367908
This patch changes our default legalization behavior for 16-, 32-, and
64-bit vectors with i8/i16/i32/i64 scalar types from promotion to
widening. For example, v8i8 will now be widened to v16i8 instead of
promoted to v8i16. This keeps the element widths the same and pads
with undef elements. We believe this is a better legalization strategy.
But it carries some issues due to the fragmented vector ISA. For
example, i8 shifts and multiplies get widened and then later have
to be promoted/split into vXi16 vectors.
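An illustrative sketch of the change for a 64-bit vector type:
```
; v8i8 is a 64-bit vector; under the new default its operations are
; widened to v16i8 (padding the extra lanes with undef) instead of
; being promoted to v8i16.
define <8 x i8> @add_v8i8(<8 x i8> %a, <8 x i8> %b) {
  %r = add <8 x i8> %a, %b
  ret <8 x i8> %r
}
```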
This has the potential to cause regressions so we wanted to get
it in early in the 10.0 cycle so we have plenty of time to
address them.
Next steps will be to merge tests that explicitly test the command
line option. And then we can remove the option and its associated
code.
llvm-svn: 367901
Patch D56593 by @courbet results in calls to `bcmp()` in some cases, should
the target support it, unless `TTI::MemCmpExpansionOptions()`
is overridden by the target.
In a proprietary benchmark we see a performance drop of about 12% on PNG
compression before this patch, though it passes all tests.
This patch mirrors X86 for AArch64 and initializes
`TTI::MemCmpExpansionOptions()` to then expand calls to `bcmp()` when
appropriate. No tuning of the parameters was performed, but, at this point,
it's enough to recover the performance drop above.
This problem also exists on ARM. Once a consensus is reached for AArch64, we
can work to fix ARM as well.
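As a hedged sketch (illustrative IR, not taken from the patch) of the pattern affected:
```
declare i32 @bcmp(i8*, i8*, i64)

; Only equality of the result is tested, so with expansion enabled the
; call can be replaced by inline loads and an integer compare.
define i1 @eq8(i8* %p, i8* %q) {
  %res = call i32 @bcmp(i8* %p, i8* %q, i64 8)
  %eq = icmp eq i32 %res, 0
  ret i1 %eq
}
```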
Authors:
- Evandro Menezes (@evandro) <e.menezes@samsung.com>
- Brian Rzycki (@brzycki) <b.rzycki@samsung.com>
Differential revision: https://reviews.llvm.org/D64805
llvm-svn: 367898
Summary:
The Arm Neoverse N1 Software Optimization Guide [1], Section "4.8 Branch
instruction alignment" states:
"Consider aligning subroutine entry points and branch targets to 32B
boundaries, within the bounds of the code-density requirements of the
program."
This patch sets the preferred function alignment on Neoverse N1 to 2^4=16B.
This was already the case in some of the latest Cortex-A CPUs. Benchmarking
in previous Cortex-A CPUs suggested that 16B alignment is already better
than the default. See commit d04ee305.
The reason we don't set it to 32B right now (as the optimisation guide
suggests) is that this will impact code size and perhaps the instruction
cache performance. Therefore we need benchmark numbers first.
I have also added testing for A75 and A76 that we were missing.
[1] https://developer.arm.com/docs/swog309707/latest
Reviewers: fhahn, greened, samparker, dmgreen
Reviewed By: dmgreen
Subscribers: dmgreen, javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65654
llvm-svn: 367894
This appears to slightly help patterns similar to what's
shown in PR42874:
https://bugs.llvm.org/show_bug.cgi?id=42874
...but not in the way requested.
That fix will require some later IR and/or backend pass to
decompose multiply/shifts into something more optimal per
target. Those transforms already exist in some basic forms,
but probably need enhancing to catch more cases.
https://rise4fun.com/Alive/Qzv2
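As an illustrative sketch of the kind of decomposition meant (not part of this patch):
```
; A target might prefer to decompose a multiply by a constant such as
; 40 = 32 + 8 into shifts and an add:
define i32 @mul40(i32 %x) {
  %s5 = shl i32 %x, 5   ; x * 32
  %s3 = shl i32 %x, 3   ; x * 8
  %m  = add i32 %s5, %s3
  ret i32 %m
}
```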
llvm-svn: 367891