Summary: Some constants can be handled with less instructions than our current results. And it seems our original approach is not very easy to extend. Therefore this patch proposes to materialize all 64-bit constants by enumerated patterns.
I traversed almost all constants to verified the functionality of these pattens. A traversed comparison of the number of instructions used by the original method and the new method has also been completed, where no degradation was caused by this patch. This patch also passed Bootstrap test and SPEC test.
Improvements of this patch are shown in llvm/test/CodeGen/PowerPC/constants-i64.ll
Reviewed By: steven.zhang, stefanp
Differential Revision: https://reviews.llvm.org/D92089
This relands D64327 with a more specific workaround for R_386_GOTOFF
(gold<2.34 bug https://sourceware.org/bugzilla/show_bug.cgi?id=16794)
.debug_info has quite a few .debug_str relocations (R_386_32/R_ARM_ABS32).
The original workaround was too general and introduced too many .L symbols
used just as relocation targets.
From the original review:
... it reduced the size of a big ARM-32 debug image by 33%. It contained ~68M
of relocations symbols out of total ~71M symbols (96% of symbols table was
generated for relocations with symbol).
This patch does two things:
1: fix the typo that intrinsic mfvscr should be with no readmem property
2: since VSCR is not modeled yet, add has side effect for SAT bit clobber
intrinsics/instructions.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D90807
Currently when building OpenMP along with LLVM, CMake variables for OpenMP (prefix with `LIBOMP` and `LIBOMPTARGET`) will not be passed through because by default it uses the prefix of the runtime name, aka `OPENMP` in this case. This patch fixed this issue.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D93603
This is consistent with the resolution to power-of-2 alignments.
Otherwise, emitCodeAlignment and emitValueToAlignment cannot handle alignments
larger than 2**32 and will trigger assertion failure (PR35218).
Note: GNU as as of 2.35 will use 1 for such a large byte `.align`
Currently there is an issue where the legacy pass manager uses a different OptBisect counter than the new pass manager.
This fix makes the npm OptBisectInstrumentation use the global OptBisect.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D92897
Also remove iteration over ArchiveFile symbols in buildInputSectionPriorities --
that was rendered unnecessary after D92539, which included ObjFiles from
ArchiveFiles inside the `inputFiles` vector.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D93569
Obj-C symbols may have spaces and colons, which our previous order file
parser would be confused by. The order file format has made the very unfortunate
choice of using colons for its delimiters, which means that we have to use
heuristics to determine if a given colon is part of a symbol or not...
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D93567
This patch removes InstrumentFuncEntry as it is dead.
The constructor of FuncPGOInstrumentation passes InstrumentFuncEntry
to MST, but it doesn't make a local copy as a member variable.
Modify the cppcoreguidelines-pro-type-member-init checker to ignore warnings from the move and copy-constructors when they are compiler defined with `= default` outside of the type declaration.
Reported as [LLVM bug 36819](https://bugs.llvm.org/show_bug.cgi?id=36819)
Reviewed By: malcolm.parsons
Differential Revision: https://reviews.llvm.org/D93333
Define vector vfwmul intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93584
Define vector vfwadd/vfwsub intrinsics and lower them to V
instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93583
Define vector vfsgnj/vfsgnjn/vfsgnjx intrinsics and lower them to V
instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93581
Define vector vfmul/vfdiv/vfrdiv intrinsics and lower them to V instructions.
We work with @rogfer01 from BSC to come out this patch.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Differential Revision: https://reviews.llvm.org/D93580
Extracted regions can have both inputs and outputs. In addition, the
CodeExtractor removes inputs that are only used in llvm.assumes, and
sunken allocas (values are used entirely in the extracted region as
denoted by lifetime intrinsics). We also cannot combine sections that
have different constants in the same structural location, and these
constants will have to elevated to argument. This patch deduplicates
extracted functions that only have inputs and non of the special cases.
We test that correctly deduplicate in:
test/Transforms/IROutliner/outlining-same-globals.ll
test/Transforms/IROutliner/outlining-same-constants.ll
test/Transforms/IROutliner/outlining-different-structure.ll
Reviewers: jroelofs, paquette
Differential Revision: https://reviews.llvm.org/D86978
Extracted regions can have both inputs and outputs. In addition, the
CodeExtractor removes inputs that are only used in llvm.assumes, and
sunken allocas (values are used entirely in the extracted region as
denoted by lifetime intrinsics). We also cannot combine sections that
have different constants in the same structural location, and these
constants will have to elevated to argument. This patch deduplicates
extracted functions that only have inputs and non of the special cases.
We test that correctly deduplicate in:
test/Transforms/IROutliner/outlining-same-globals.ll
test/Transforms/IROutliner/outlining-same-constants.ll
test/Transforms/IROutliner/outlining-different-structure.ll
These properties aren't additive. They are closer to ReadOnly and
WriteOnly. The default is ReadWrite. ReadMem cancels the write property and
WriteMem cancels the read property. Combining them leaves neither.
This patch checks that when we process WriteMem, the Mod flag is
still set. And for ReadMem we check that the Ref flag set still set.
I've updated 2 target intrinsics that were combining these properties.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D93571
The common encodings table holds only 127 entries. The encodings index for compact entries is 8 bits wide, and indexes 127..255 are stored locally to each second-level page. Prior to this diff, lld would `fatal()` if encodings overflowed the 127 limit.
This diff populates a per-second-level-page encodings table as needed. When the per-page encodings table hits its limit, we must terminate the page. If such early termination would consume fewer entries than a regular (non-compact) encoding page, then we prefer the regular format.
Caveat: one reason the common-encoding table might overflow is because of DWARF debug-info references, which are not yet implemented and will come with a later diff.
Differential Revision: https://reviews.llvm.org/D93267
... so just ensure that we pass DomTreeUpdater it into it.
Apparently, there were no dedicated tests just for that functionality,
so i'm adding one here.