llvm-project

Commit Graph

Author	SHA1	Message	Date
Jessica Paquette	1f9bc2854f	[GlobalISel][AArch64][NFC] Fix incorrect comment in selectUnmergeValues s/scalar/vector/ llvm-svn: 352243	2019-01-25 21:28:27 +00:00
Alina Sbirlea	a34bcbf335	Revert rL352238. llvm-svn: 352241	2019-01-25 21:12:08 +00:00
Alina Sbirlea	890a8e575f	[WarnMissedTransforms] Set default to 1. Summary: Set default value for retrieved attributes to 1, since the check is against 1. Eliminates the warning noise generated when the attributes are not present. Reviewers: sanjoy Subscribers: jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D57253 llvm-svn: 352238	2019-01-25 20:51:55 +00:00
Ana Pazos	05a6064385	Reapply: [RISCV] Set isAsCheapAsAMove for ADDI, ORI, XORI, LUI This reapplies commit r352010 with RISC-V test fixes. llvm-svn: 352237	2019-01-25 20:22:49 +00:00
Guozhi Wei	81f3fd4bf8	[MBP] Don't move bottom block before header if it can't reduce taken branches If bottom of block BB has only one successor OldTop, in most cases it is profitable to move it before OldTop, except in the following case: -->OldTop<- \| . \| \| . \| \| . \| ---Pred \| \| \| BB----- Moving BB before OldTop can't reduce the number of taken branches; this patch detects this case and prevents the move. Differential Revision: https://reviews.llvm.org/D57067 llvm-svn: 352236	2019-01-25 19:45:13 +00:00
Craig Topper	4cf28bad5b	[X86] Combine masked store and truncate into masked truncating stores. We also need to combine to masked truncating with saturation stores, but I'm leaving that for a future patch. This does regress some tests that used truncate with saturation followed by a masked store. Those now use a truncating store and use min/max to saturate. Differential Revision: https://reviews.llvm.org/D57218 llvm-svn: 352230	2019-01-25 18:37:36 +00:00
Vedant Kumar	db3f9774ee	[HotColdSplit] Introduce a cost model to control splitting behavior The main goal of the model is to avoid increasing function size, as that would eradicate any memory locality benefits from splitting. This happens when: - There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost. - The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller. - The code size cost of the split code is less than the cost of a set-up call. A secondary goal is to prevent excessive overall binary size growth. With the cost model in place, I experimented to find a splitting threshold that works well in practice. To make warm & cold code easily separable for analysis purposes, I moved split functions to a "cold" section. I experimented with thresholds between [0, 4] and set the default to the threshold which minimized geomean __text size. Experiment data from building LNT+externals for X86 (N = 639 programs, all sizes in bytes): \| Configuration \| __text geom size \| __cold geom size \| TEXT geom size \| \| -Os \| 1736.3 \| 0, n=0 \| 10961.6 \| \| -Os, thresh=0 \| 1740.53 \| 124.482, n=134 \| 11014 \| \| -Os, thresh=1 \| 1734.79 \| 57.8781, n=90 \| 10978.6 \| \| -Os, thresh=2 \| 1733.85 \| 65.6604, n=61 \| 10977.6 \| \| -Os, thresh=3 \| 1733.85 \| 65.3071, n=61 \| 10977.6 \| \| -Os, thresh=4 \| 1735.08 \| 67.5156, n=54 \| 10965.7 \| \| -Oz \| 1554.4 \| 0, n=0 \| 10153 \| \| -Oz, thresh=2 \| 1552.2 \| 65.633, n=61 \| 10176 \| \| -O3 \| 2563.37 \| 0, n=0 \| 13105.4 \| \| -O3, thresh=2 \| 2559.49 \| 71.1072, n=61 \| 13162.4 \| Picking thresh=2 reduces the geomean __text section size by 0.14% at -Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that TEXT size is page-aligned, whereas section sizes are byte-aligned. 
Experiment data from building LNT+externals for ARM64 (N = 558 programs, all sizes in bytes): \| Configuration \| __text geom size \| __cold geom size \| TEXT geom size \| \| -Os \| 1763.96 \| 0, n=0 \| 42934.9 \| \| -Os, thresh=2 \| 1760.9 \| 76.6755, n=61 \| 42934.9 \| Picking thresh=2 reduces the geomean __text section size by 0.17% at -Os and causes no growth in the TEXT segment. Measurements were done with D57082 (r352080) applied. Differential Revision: https://reviews.llvm.org/D57125 llvm-svn: 352228	2019-01-25 18:30:37 +00:00
Vedant Kumar	13ef84fced	[MC] Teach the MachO object writer about N_FUNC_COLD N_FUNC_COLD is a new MachO symbol attribute. It's a hint to the linker to order a symbol towards the end of its section, to improve locality. Example: ``` void a1() {} __attribute__((cold)) void a2() {} void a3() {} int main() { a1(); a2(); a3(); return 0; } ``` A linker that supports N_FUNC_COLD will order _a2 to the end of the text section. From `nm -njU` output, we see: ``` _a1 _a3 _main _a2 ``` Differential Revision: https://reviews.llvm.org/D57190 llvm-svn: 352227	2019-01-25 18:30:22 +00:00
Sanjay Patel	0020f8bb23	[x86] simplify logic in lowerShuffleWithUndefHalf(); NFCI This seems unnecessarily complicated because we gave names to opposite polarity bools and have code comments that don't really line up with the logic. Step 1: remove UndefUpper and assert that it is the opposite of UndefLower after the initial early exit. llvm-svn: 352217	2019-01-25 17:00:41 +00:00
Florian Hahn	ca95ee5e11	[DiagnosticInfo] Add support for preserving newlines in remark arguments. This patch adds a new type StringBlockVal which can be used to emit a YAML block scalar, which preserves newlines in a multiline string. It also updates MappingTraits<DiagnosticInfoOptimizationBase::Argument> to use it for argument values with more than a single newline. This is helpful for remarks that want to display more in-depth information in a more structured way. Reviewers: thegameg, anemet Reviewed By: anemet Subscribers: hfinkel, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57159 llvm-svn: 352216	2019-01-25 16:59:06 +00:00
Tom Weaver	4db70d9695	[TEST][COMMIT] - fix comment typo in AsmPrinter/DwarfDebug.cpp - NFC llvm-svn: 352214	2019-01-25 16:29:35 +00:00
Simon Pilgrim	f56298f4b9	[X86] Simplify X86ISD::ADD/SUB if we don't use the result flag Simplify to the generic ISD::ADD/SUB if we don't make use of the result flag. This mainly helps with ADDCARRY/SUBBORROW intrinsics which get expanded to X86ISD::ADD/SUB but could be simplified further. Noticed in some of the test cases in PR31754 Differential Revision: https://reviews.llvm.org/D57234 llvm-svn: 352210	2019-01-25 15:58:28 +00:00
Sanjay Patel	21aa6ddc14	[x86] narrow a shuffle that doesn't use or set any high elements This isn't the final fix for our reduction/horizontal codegen, but it takes care of a lot of the problems. After we narrow the shuffle, existing combines for insert/extract and binops kick in, and we end up with cheaper 128-bit ops. The avg and mul reduction tests show an existing shuffle lowering hole for AVX2/AVX512. I think in its most minimal form this is: https://bugs.llvm.org/show_bug.cgi?id=40434 ...but we might need multiple fixes to get it right. Differential Revision: https://reviews.llvm.org/D57156 llvm-svn: 352209	2019-01-25 15:37:42 +00:00
Sam McCall	1e7491ea9c	[JSON] Work around excess-precision issue when comparing T_Integer numbers. Reviewers: bkramer Subscribers: kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D57237 llvm-svn: 352204	2019-01-25 15:05:33 +00:00
Simon Pilgrim	dea6174b0b	Fix gcc -Wparentheses warning. NFCI. llvm-svn: 352193	2019-01-25 11:38:40 +00:00
Simon Pilgrim	cdf58092e4	Fix gcc -Wparentheses warning. NFCI. llvm-svn: 352191	2019-01-25 11:34:58 +00:00
Diana Picus	8976ad12a9	[ARM GlobalISel] Support shifts for Thumb2 Same as ARM. On this occasion we split some of the instruction select tests for more complicated instructions into their own files, so we can reuse them for ARM and Thumb mode. Likewise for the legalizer tests. llvm-svn: 352188	2019-01-25 10:48:42 +00:00
Diana Picus	23628c7b05	[ARM GlobalISel] Remove rebase artifact from r351882. NFC r351882 introduced some superfluous calls to mark G_INTTOPTR and G_PTRTOINT as legal (looks like a rebase mishap). Remove them. llvm-svn: 352187	2019-01-25 10:48:35 +00:00
Javed Absar	a3e3d85286	[TblGen] Extend !if semantics through new feature !cond This patch extends the TableGen language with a !cond operator. Instead of embedding !if inside !if, which can get cumbersome, one can now use !cond. Below is an example to convert an integer 'x' into a string: !cond(!lt(x,0) : "Negative", !eq(x,0) : "Zero", !eq(x,1) : "One", 1 : "MoreThanOne") Reviewed By: hfinkel, simon_tatham, greened Differential Revision: https://reviews.llvm.org/D55758 llvm-svn: 352185	2019-01-25 10:25:25 +00:00
Anton Korobeynikov	509d5c4a7d	[MSP430] Fix absolute addressing mode printing in AsmPrinter Align checks for absolute addressing mode with its current implementation (SR is used as a base register). This fixes https://bugs.llvm.org/show_bug.cgi?id=39993 Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56785 llvm-svn: 352178	2019-01-25 09:14:05 +00:00
Zi Xuan Wu	308a609c6e	[PowerPC] Enhance the fast selection of cmp instruction and clean up related asserts Fast selection of llvm icmp and fcmp instructions does not handle VSX instruction support well. We'd use the VSX float comparison instruction instead of the non-VSX float comparison instruction if the operand register class is VSSRC or VSFRC, because i32 and i64 are mapped to VSSRC and VSFRC respectively if the VSX feature is enabled. If the target does not have a corresponding VSX comparison instruction for some type, just copy the VSX-related register to the common float register class and use the non-VSX comparison instruction. Differential Revision: https://reviews.llvm.org/D57078 llvm-svn: 352174	2019-01-25 07:24:59 +00:00
Craig Topper	6fd9af587a	[X86] Add non-masked versions of vpconflict intrinsics so we can use a select in the header file in clang. I'll remove and autoupgrade the old intrinsics in a future commit. llvm-svn: 352172	2019-01-25 07:08:07 +00:00
Alex Bradbury	456d3798d6	[RISCV] Custom-legalise i32 SDIV/UDIV/UREM on RV64M Follow the same custom legalisation strategy as used in D57085 for variable-length shifts (see that patch summary for more discussion). Although we may lose out on some late-stage DAG combines, I think this custom legalisation strategy is ultimately easier to reason about. There are some codegen changes in rv64m-exhaustive-w-insts.ll but they are all neutral in terms of the number of instructions. Differential Revision: https://reviews.llvm.org/D57096 llvm-svn: 352171	2019-01-25 05:11:34 +00:00
Max Kazantsev	38cd9acbb9	[LoopSimplifyCFG] Fix inconsistency in blocks in loop markup 2nd part of D57095 with the same reason, just in another place. We never fold branches that are not immediately in the current loop, but this check is missing in `IsEdgeLive`. As a result, it may think that the edge in subloop is dead while it's live. It's a pessimization in the current stance. Differential Revision: https://reviews.llvm.org/D57147 Reviewed By: rupprecht llvm-svn: 352170	2019-01-25 05:05:02 +00:00
Alex Bradbury	299d690a50	[RISCV] Custom-legalise 32-bit variable shifts on RV64 The previous DAG combiner-based approach had an issue with infinite loops between the target-dependent and target-independent combiner logic (see PR40333). Although this was worked around in rL351806, the combiner-based approach is still potentially brittle and can fail to select the 32-bit shift variant when profitable to do so, as demonstrated in the pr40333.ll test case. This patch instead introduces target-specific SelectionDAG nodes for SHLW/SRLW/SRAW and custom-lowers variable i32 shifts to them. pr40333.ll is a good example of how this approach can improve codegen. This adds DAG combine that does SimplifyDemandedBits on the operands (only lower 32-bits of first operand and lower 5 bits of second operand are read). This seems better than implementing SimplifyDemandedBitsForTargetNode as there is no guarantee that would be called (and it's not for e.g. the anyext return test cases). Also implements ComputeNumSignBitsForTargetNode. There are codegen changes in atomic-rmw.ll and atomic-cmpxchg.ll but the new instruction sequences are semantically equivalent. Differential Revision: https://reviews.llvm.org/D57085 llvm-svn: 352169	2019-01-25 05:04:00 +00:00
Matt Arsenault	3b9a82ff2c	AMDGPU/GlobalISel: Remove leftover setAction Also move G_GEP actions together. llvm-svn: 352168	2019-01-25 04:54:00 +00:00
Matt Arsenault	3e08b772b3	AMDGPU/GlobalISel: Scalarize add/sub llvm-svn: 352167	2019-01-25 04:53:57 +00:00
Matt Arsenault	e6cebd0d69	GlobalISel: fewerElementsVector for more cast types llvm-svn: 352166	2019-01-25 04:37:33 +00:00
Matt Arsenault	95fd95cfe0	GlobalISel: fewerElementsVector for a few more trivial ops llvm-svn: 352165	2019-01-25 04:03:38 +00:00
Matt Arsenault	5d622fbcc1	AMDGPU/GlobalISel: Legalize smulh/umulh and scalarize mul llvm-svn: 352162	2019-01-25 03:23:04 +00:00
Vedant Kumar	9d70f2b939	[HotColdSplit] Describe the pass in more detail, NFC llvm-svn: 352161	2019-01-25 03:22:38 +00:00
Vedant Kumar	65de025d64	[HotColdSplit] Split more aggressively before/after cold invokes While a cold invoke itself and its unwind destination can't be extracted, code which unconditionally executes before/after the invoke may still be profitable to extract. With cost model changes from D57125 applied, this gives a 3.5% increase in split text across LNT+externals on arm64 at -Os. llvm-svn: 352160	2019-01-25 03:22:23 +00:00
Matt Arsenault	1b1e685f10	GlobalISel: Support fewerElementsVector for icmp/fcmp Also legalize 64-bit compares for AMDGPU llvm-svn: 352157	2019-01-25 02:59:34 +00:00
Matt Arsenault	ca676343a9	GlobalISel: Implement fewerElementsVector for extensions llvm-svn: 352155	2019-01-25 02:36:32 +00:00
Peter Collingbourne	1a8acfb768	hwasan: If we split the entry block, move static allocas back into the entry block. Otherwise they are treated as dynamic allocas, which ends up increasing code size significantly. This reduces size of Chromium base_unittests by 2MB (6.7%). Differential Revision: https://reviews.llvm.org/D57205 llvm-svn: 352152	2019-01-25 02:08:46 +00:00
Matt Arsenault	990f507704	GlobalISel: Add convenience mutations to scalarize llvm-svn: 352143	2019-01-25 00:51:00 +00:00
Benjamin Kramer	653020d3cc	[GlobalISel][AArch64] Avoid unused variable warning for variable only used in assert llvm-svn: 352133	2019-01-24 23:45:07 +00:00
Nemanja Ivanovic	b9b75de0ae	[PowerPC] Exploit store instructions that store a single vector element This patch exploits the instructions that store a single element from a vector to perform a (store (extract_elt)). We already have code that does this with ISA 3.0 instructions that were added to handle i8/i16 types. However, we had never exploited the existing ones that handle f32/f64/i32/i64 types. Differential revision: https://reviews.llvm.org/D56175 llvm-svn: 352131	2019-01-24 23:44:28 +00:00
Matt Arsenault	6bab7ab11e	RegBankSelect: Fix use after free in r352123 llvm-svn: 352130	2019-01-24 23:42:01 +00:00
Benjamin Kramer	1411ecf08b	[GlobalISel][AArch64] Avoid unused function warnings in Release builds llvm-svn: 352129	2019-01-24 23:39:47 +00:00
Sanjay Patel	4c304b2923	[x86] move half-size shuffle mask creation to helper; NFC As noted in D57156, we want to check at least part of this pattern earlier (in combining), so this will allow the code to be shared instead of duplicated. llvm-svn: 352127	2019-01-24 23:12:36 +00:00
Aditya Nandakumar	3ba0d94bce	[GISel]: Change how CSE is enabled by default for each pass https://reviews.llvm.org/D57178 Now add a hook in TargetPassConfig to query if CSE needs to be enabled. By default this hook returns false only for O0 opt level but this can be overridden by the target. As a consequence of the default of enabled for non O0, a few tests needed to be updated to not use CSE (by passing in -O0) to the run line. reviewed by: arsenm llvm-svn: 352126	2019-01-24 23:11:25 +00:00
Jessica Paquette	76c40f827d	Suppress unused capture warning in CheckCopy Werror bots didn't like the lambda + assert thing in my previous commit. Capture everything to suppress the error. Example failure here: http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/29393 llvm-svn: 352124	2019-01-24 22:51:31 +00:00
Matt Arsenault	baa5d2e69c	RegBankSelect: Support some more complex part mappings llvm-svn: 352123	2019-01-24 22:47:04 +00:00
Zachary Turner	8371da385a	[PDB] Increase TPI hash bucket count. PDBs contain several serialized hash tables. In the microsoft-pdb repo published to support LLVM implementing PDB support, the provided code initializes the bucket count for the TPI and IPI streams to the maximum size. This occurs in tpi.cpp L33 and tpi.cpp L398. In the LLVM code for generating PDBs, these streams are created with the minimum number of buckets. This difference makes LLVM-generated PDBs slower when used for debugging. Patch by C.J. Hebert Differential Revision: https://reviews.llvm.org/D56942 llvm-svn: 352117	2019-01-24 22:25:55 +00:00
Jessica Paquette	245047dfe8	[GlobalISel][AArch64] Add isel support for FP16 vector @llvm.ceil This patch adds support for vector @llvm.ceil intrinsics when full 16 bit floating point support isn't available. To do this, this patch... - Implements basic isel for G_UNMERGE_VALUES - Teaches the legalizer about 16 bit floats - Teaches AArch64RegisterBankInfo to respect floating point registers on G_BUILD_VECTOR and G_UNMERGE_VALUES - Teaches selectCopy about 16-bit floating point vectors It also adds - A legalizer test for the 16-bit vector ceil which verifies that we create a G_UNMERGE_VALUES and G_BUILD_VECTOR when full fp16 isn't supported - An instruction selection test which makes sure we lower to G_FCEIL when full fp16 is supported - A test for selecting G_UNMERGE_VALUES And also updates arm64-vfloatintrinsics.ll to show that the new ceiling types work as expected. https://reviews.llvm.org/D56682 llvm-svn: 352113	2019-01-24 22:00:41 +00:00
Bob Haarman	38ebaf7d5d	allow COFF .def directive in module assembly when using ThinLTO Summary: Using COFF's .def directive in module assembly used to crash ThinLTO with "this directive only supported on COFF targets" when getting symbol information in ModuleSymbolTable. This change allows ModuleSymbolTable to process such code and adds a test to verify that the .def directive has the desired effect on the native object file, with and without ThinLTO. Fixes https://bugs.llvm.org/show_bug.cgi?id=36789 Reviewers: rnk, pcc, vlad.tsyrklevich Subscribers: mehdi_amini, eraman, hiraditya, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D57073 llvm-svn: 352112	2019-01-24 21:41:03 +00:00
Eli Friedman	525ef0159d	[Analysis] Fix isSafeToLoadUnconditionally handling of volatile. A volatile operation cannot be used to prove an address points to normal memory. (LangRef was recently updated to state it explicitly.) Differential Revision: https://reviews.llvm.org/D57040 llvm-svn: 352109	2019-01-24 21:31:13 +00:00
Michael Trent	f4c902bd77	Limit dyld image suffixes guessed by guessLibraryShortName() Summary: guessLibraryShortName() separates a full Mach-O dylib install name path into a short name and a dyld image suffix. The short name is the name of the dylib without its path or extension. The dyld image suffix is a string used by dyld to load variants of dylibs if available at runtime; for example, "when binding this process, load 'debug' variants of all required dylibs." dyld knows exactly what the image suffix is, but by convention diagnostic tools such as llvm-nm attempt to guess suffix names by looking at the install name path. These dyld image suffixes are separated from the short name by a '_' character. Because the '_' character is commonly used to separate words in filenames, guessLibraryShortName() cannot reliably separate a dylib's short name from an arbitrary image suffix; imagine if both the short name and the suffix contain an '_' character! To better deal with this ambiguity, guessLibraryShortName() will recognize only "_debug" and "_profile" as valid Suffix values. Calling code needs to be tolerant of guessLibraryShortName() guessing incorrectly. The previous implementation of guessLibraryShortName() did not allow '_' characters to appear in short names. When present, the short name would be truncated, e.g., "libcompiler_rt" => "libcompiler". This change allows "libcompiler_rt" and "libcompiler_rt_debug" to both be recognized as "libcompiler_rt". rdar://47412244 Reviewers: kledzik, lhames, pete Reviewed By: pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56978 llvm-svn: 352104	2019-01-24 20:59:44 +00:00
Haojian Wu	b9613a39b8	Fix a compiler error introduced in r352093. llvm-svn: 352098	2019-01-24 20:30:48 +00:00

119932 Commits