Summary:
This patch splits the Attributor::run() function into multiple functions.
Simple Logic changes to make this possible:
# Moved iteration count verification earlier.
# NumFinalAAs get set a little bit later.
Reviewers: jdoerfert, sstefan1, uenoku
Reviewed By: jdoerfert
Subscribers: hiraditya, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81022
Summary:
This patch adds optional field into function summary,
implements asm and bitcode serialization. YAML
serialization is omitted and can be added later if
needed.
This patch includes this information into summary only
if module contains at least one sanitize_memtag function.
In a near future MTE is the user of the analysis.
Later if needed we can provede more direct control
on when information is included into summary.
Reviewers: eugenis
Subscribers: hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80908
This patch extends numerical expressions to allow calls to
predefined functions. These calls can be combined with the
existing numerical operators, which includes nesting calls.
The call syntax is:
<func>(<args>)
Where <func> is a predefined string literal, currently limited to
one of add, max, min and sub. <arg> is a comma seperated list of
numerical expressions.
Subscribers: arichardson, hiraditya, thopre, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79936
This patch relaxes the post-dominance requirement for accesses to
objects visible after the function returns.
Instead of requiring the killing def to post-dominate the access to
eliminate, the set of 'killing blocks' (= blocks that completely
overwrite the original access) is collected.
If all paths from the access to eliminate and an exit block go through a
killing block, the access can be removed.
To check this property, we first get the common post-dominator block for
the killing blocks. If this block does not post-dominate the access
block, there may be a path from DomAccess to an exit block not involving
any killing block.
Otherwise we have to check if there is a path from the DomAccess to the
common post-dominator, that does not contain a killing block. If there
is no such path, we can remove DomAccess. For this check, we start at
the common post-dominator and then traverse the CFG backwards. Paths are
terminated when we hit a killing block or a block that is not executed
between DomAccess and a killing block according to the post-order
numbering (if the post order number of a block is greater than the one
of DomAccess, the block cannot be in in a path starting at DomAccess).
This gives the following improvements on the total number of stores
after DSE for MultiSource, SPEC2K, SPEC2006:
Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores
Program base new100 diff
test-suite...CFP2000/188.ammp/188.ammp.test 3624.00 3544.00 -2.2%
test-suite...ch/g721/g721encode/encode.test 128.00 126.00 -1.6%
test-suite.../Benchmarks/Olden/mst/mst.test 73.00 72.00 -1.4%
test-suite...CFP2006/433.milc/433.milc.test 3202.00 3163.00 -1.2%
test-suite...000/186.crafty/186.crafty.test 5062.00 5010.00 -1.0%
test-suite...-typeset/consumer-typeset.test 40460.00 40248.00 -0.5%
test-suite...Source/Benchmarks/sim/sim.test 642.00 639.00 -0.5%
test-suite...nchmarks/McCat/09-vor/vor.test 642.00 644.00 0.3%
test-suite...lications/sqlite3/sqlite3.test 35664.00 35563.00 -0.3%
test-suite...T2000/300.twolf/300.twolf.test 7202.00 7184.00 -0.2%
test-suite...lications/ClamAV/clamscan.test 19475.00 19444.00 -0.2%
test-suite...INT2000/164.gzip/164.gzip.test 2199.00 2196.00 -0.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test 2380.00 2378.00 -0.1%
test-suite.../Benchmarks/Bullet/bullet.test 39335.00 39309.00 -0.1%
test-suite...:: External/Povray/povray.test 36951.00 36927.00 -0.1%
test-suite...marks/7zip/7zip-benchmark.test 67396.00 67356.00 -0.1%
test-suite...6/464.h264ref/464.h264ref.test 31497.00 31481.00 -0.1%
test-suite...006/453.povray/453.povray.test 51441.00 51416.00 -0.0%
test-suite...T2006/401.bzip2/401.bzip2.test 4450.00 4448.00 -0.0%
test-suite...Applications/kimwitu++/kc.test 23481.00 23471.00 -0.0%
test-suite...chmarks/MallocBench/gs/gs.test 6286.00 6284.00 -0.0%
test-suite.../CINT2000/254.gap/254.gap.test 13719.00 13715.00 -0.0%
test-suite.../Applications/SPASS/SPASS.test 30345.00 30338.00 -0.0%
test-suite...006/450.soplex/450.soplex.test 15018.00 15016.00 -0.0%
test-suite...ications/JM/lencod/lencod.test 27780.00 27777.00 -0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test 105285.00 105276.00 -0.0%
There might be potential to pre-compute some of the information of which
blocks are on the path to an exit for each block, but the overall
benefit might be comparatively small.
On the set of benchmarks, 15738 times out of 20322 we reach the
CFG check, the CFG check is successful. The total number of iterations
in the CFG check is 187810, so on average we need less than 10 steps in
the check loop. Bumping the threshold in the loop from 50 to 150 gives a
few small improvements, but I don't think they warrant such a big bump
at the moment. This is all pending further tuning in the future.
Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv
Reviewed By: george.burgess.iv
Differential Revision: https://reviews.llvm.org/D78932
Currently, some fairly arbitrary subset of overriden methods in
RISCVISelLowering are private rather than public (which is the
visibility they have in TargetLowering). I suspect this is a holdover
from too closely copying another backend.
D78545 pointed out this can be difficult for some downstream patches,
and nobody has come forward to suggest a reason for keeping the
visibility as-is.
This commit simply makes all overridden methods match the public
visiblity of the parent.
Differential Revision: https://reviews.llvm.org/D79928
Extract the existing code from getInstructionThroughput into
TTImpl::getUserCost. The duplicated code in the AMDGPU backend has
also been removed.
Differential Revision: https://reviews.llvm.org/D81448
Add the remaining arithmetic opcodes into the generic implementation
of getUserCost and then call this from getInstructionThroughput. Most
of the backends have been modified to return the base implementation
for cost kinds other RecipThroughput. The outlier here is AMDGPU
which already uses getArithmeticInstrCost for all the cost kinds.
This change means that most of the opcodes can be removed from that
backends implementation of getUserCost.
Differential Revision: https://reviews.llvm.org/D80992
Summary:
Add LHM/SHM instructions. Add regression tests for them of asmparser,
mccodeemitter, and disassembler. In order to add those instructions,
add new decode functions to disassembler, and add new print functions
to instprinter.
Differential Revision: https://reviews.llvm.org/D81535
The memory folding raplaced the old instruction without copying the symbols assigned. Which will resulted in built fail due to the lost symbols.
Reviewed by craig.topper
Differential Revision: https://reviews.llvm.org/D78471
The fp16 ops are legalized by extending/chopping them as needed.
The tests are shamelessly stolen from the RISC-V backend.
Differential Revision: https://reviews.llvm.org/D77569
This ensures that we match SelectionDAG behaviour by waiting until the expand
pseudos pass to generate ADRP + ADD pairs. Doing this at selection time for the
G_ADD_LOW is fine because by the time we get to selecting the G_ADD_LOW,
previous attempts to fold it into loads/stores must have failed.
Differential Revision: https://reviews.llvm.org/D81512
blocks.
Summary: The current LoopFusion forget to update the incoming block of
the phis in second loop guard non loop successor from second loop guard
block to first loop guard block. A test case is provided to better
understand the problem.
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D81421
Summary:
The natural alignments for extending and splatting loads had not
previously been tested. It is good to have them tested because they
are non-obvious details in the SIMD spec proposal.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81303
The test case has been reviewed in the patch D75866
Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson
Differential Revision: https://reviews.llvm.org/D75866
SUMMARY:
in the aix assembly , it do not have .hidden and .protected directive.
in current llvm. if a function or a variable which has visibility attribute, it will generate something like the .hidden or .protected , it can not recognize by aix as.
in aix assembly, the visibility attribute are support in the pseudo-op like
.extern Name [ , Visibility ]
.globl Name [, Visibility ]
.weak Name [, Visibility ]
in this patch, we implement the visibility attribute for the global variable, function or extern function .
for example.
extern __attribute__ ((visibility ("hidden"))) int
bar(int* ip);
__attribute__ ((visibility ("hidden"))) int b = 0;
__attribute__ ((visibility ("hidden"))) int
foo(int* ip){
return (*ip)++;
}
the visibility of .comm linkage do not support , we will have a separate patch for it.
we have the unsupported cases ("default" and "internal") , we will implement them in a a separate patch for it.
Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson
Differential Revision: https://reviews.llvm.org/D75866
Summary: Refactor the current global header iteration to be callback-based, and add a feature that reports the size of the global variable during reporting. This allows binaries without symbols to still report the size of the global variable, which is always available in the HWASan globals PT_NOTE metadata.
Reviewers: eugenis, pcc
Reviewed By: pcc
Subscribers: mgorny, llvm-commits, #sanitizers
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D80599
Summary:
Fixes up two small issues with the gn build.
1 - Ensures that the correct ldflag `-pthread` is provided, not just linking the library.
2 - Ensures that libraries are linked in the same group as the dependencies. This fixes a problem where system libraries (libc) are involved in a link-order dependency that's not being fulfilled.
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80591
Similar to what some other targets have done. This information
could be reused by other frontends so doesn't make sense to live
in clang.
-Rename CK_Generic to CK_None to better reflect its illegalness.
-Move function for translating from string to enum into llvm.
-Call checkCPUKind directly from the string to enum translation
and update CPU kind to CK_None accordinly. Caller will use CK_None
as sentinel for bad CPU.
I'm planning to move all the CPU to feature mapping out next. As
part of that I want to devise a better way to express CPUs inheriting
features from an earlier CPU. Allowing this to be expressed in a
less rigid way than just falling through a switch. Or using gotos
as we've had to do lately.
Differential Revision: https://reviews.llvm.org/D81439
As shown in PR46237:
https://bugs.llvm.org/show_bug.cgi?id=46237
The size-savings win for hoisting an 8-bit ALU immediate (intentionally
excluding store constants) requires extreme conditions; it may not even
be possible when including REX prefix bytes on x86-64.
I did draft a version of this patch that included use counts after the
loop, but I suspect that accounting is not working as expected. I think
that is because the number of constant uses are changing as we select
instructions (for example as we transform shl/add into LEA).
Differential Revision: https://reviews.llvm.org/D81468
It was annoying enough that every custom lowering needed to set the
insert point, but this was made worse since now these all needed to be
updated to setInstrAndDebugLoc. Consolidate these so every
legalization action has the right insert position by default.
This should fix dropping debug info in every custom AMDGPU
legalization.
The current relationship between LegalizerHelper and MachineIRBuilder
confuses me, because the LegalizerHelper modifies the MachineIRBuilder
which it does not own. Constructing a LegalizerHelper destroys the
insert point, since the constructor calls setMF, which clears all the
fields. Try to separate these functions, so it's possible to construct
a LegalizerHelper from an existing MachineIRBuilder without losing the
insert point/debug loc.
The construction APIs for MachineIRBuilder don't make much sense, and
it's been annoying to sort through it with these trivial functions
separate from the declaration.
New instructions were getting printed both in createdInstr, and in the
final printNewInstrs, so it made it look like the same instructions
were created twice. This overall made reading the debug output
harder. Stop printing the initial construction and only print new
instructions in the summary at the end. This avoids printing the less
useful case where instructions are sometimes initially created with no
operands.
I'm not sure this is the correct instance to remove; now the visible
ordering is different. Now you will typically see the one erased
instruction message before all the new instructions in order. I think
this is the more logical view of typical legalization changes,
although it's mechanically backwards from the normal
insert-new-erase-old pattern.
Having the input dumped on failure seems like a better
default: I debugged FileCheck tests for a while without knowing
about this option, which really helps to understand failures.
Remove `-dump-input-on-failure` and the environment variable
FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete.
Differential Revision: https://reviews.llvm.org/D81422
If a resource can be held for multiple cycles in the schedule model
then an instruction can be placed into the available queue, another
instruction can be scheduled, but the first will not be taken back out if
the two instructions hazard. To fix this make sure that we update the
available queue even on the first MOp of a cycle, pushing available
instructions back into the pending queue if they now conflict.
This happens with some downstream schedules we have around MVE
instruction scheduling where we use ResourceCycles=[2] to show the
instruction executing over two beats. Apparently the test changes here
are OK too.
Differential Revision: https://reviews.llvm.org/D76909
scalarizeBinop currently folds
vec_bo((inselt VecC0, V0, Index), (inselt VecC1, V1, Index))
->
inselt(vec_bo(VecC0, VecC1), scl_bo(V0,V1), Index)
This patch extends this to account for cases where one of the vec_bo operands is already all-constant and performs similar cost checks to determine if the scalar binop with a constant still makes sense:
vec_bo((inselt VecC0, V0, Index), VecC1)
->
inselt(vec_bo(VecC0, VecC1), scl_bo(V0,extractelt(V1,Index)), Index)
Fixes PR42174
Differential Revision: https://reviews.llvm.org/D80885
Summary:
It is important to emit HINT instructions instead of BTI ones when
BTI is disabled. This allows compatibility with other assemblers
(e.g. GAS).
Still, developers of assembly code will want to write code that is
compatible with both pre- and post-BTI CPUs. They could use HINT
mnemonics, but the new mnemonics are a lot more readable (e.g.
bti c instead of hint #34), and they will result in the same
encodings. So, while LLVM should not *emit* the new mnemonics when
BTI is disabled, this patch will at least make LLVM *accept*
assembly code that uses them.
Reviewers: pbarrio, tamas.petz, ostannard
Reviewed By: pbarrio, ostannard
Subscribers: ostannard, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81257
Same idea as for zip, uzp, etc. Teach the post-legalizer combiner to recognize
G_SHUFFLE_VECTORs that are trn1/trn2 instructions.
- Add G_TRN1 and G_TRN2
- Port mask matching code from AArch64ISelLowering
- Produce G_TRN1 and G_TRN2 in the post-legalizer combiner
- Select via importer
Add select-trn.mir to test selection.
Add postlegalizer-combiner-trn.mir to test the combine. This is similar to the
existing arm64-trn test.
Note that both of these tests contain things we currently don't legalize.
I figured it would be easier to test these now rather than later, since once
we legalize the G_SHUFFLE_VECTORs, it's not guaranteed that someone will update
the tests.
Differential Revision: https://reviews.llvm.org/D81182
Summary:
As specified in https://github.com/WebAssembly/simd/pull/232. These
instructions are implemented as LLVM intrinsics for now rather than
normal ISel patterns to make these instructions opt-in. Once the
instructions are merged to the spec proposal, the intrinsics will be
replaced with proper ISel patterns.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D81222