llvm-project

Commit Graph

Author	SHA1	Message	Date
Alina Sbirlea	890a8e575f	[WarnMissedTransforms] Set default to 1. Summary: Set default value for retrieved attributes to 1, since the check is against 1. Eliminates the warning noise generated when the attributes are not present. Reviewers: sanjoy Subscribers: jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D57253 llvm-svn: 352238	2019-01-25 20:51:55 +00:00
Ana Pazos	05a6064385	Reapply: [RISCV] Set isAsCheapAsAMove for ADDI, ORI, XORI, LUI This reapplies commit r352010 with RISC-V test fixes. llvm-svn: 352237	2019-01-25 20:22:49 +00:00
Guozhi Wei	81f3fd4bf8	[MBP] Don't move bottom block before header if it can't reduce taken branches If bottom of block BB has only one successor OldTop, in most cases it is profitable to move it before OldTop, except the following case:
    -->OldTop<-
    |    .    |
    |    .    |
    |    .    |
    ---Pred   |
         |    |
         BB-----
Moving BB before OldTop can't reduce the number of taken branches; this patch detects this case and prevents the move. Differential Revision: https://reviews.llvm.org/D57067 llvm-svn: 352236	2019-01-25 19:45:13 +00:00
Craig Topper	4cf28bad5b	[X86] Combine masked store and truncate into masked truncating stores. We also need to combine to masked truncating with saturation stores, but I'm leaving that for a future patch. This does regress some tests that used truncate with saturation followed by a masked store. Those now use a truncating store and use min/max to saturate. Differential Revision: https://reviews.llvm.org/D57218 llvm-svn: 352230	2019-01-25 18:37:36 +00:00
Vedant Kumar	db3f9774ee	[HotColdSplit] Introduce a cost model to control splitting behavior

The main goal of the model is to avoid increasing function size, as that would eradicate any memory locality benefits from splitting. This happens when:

- There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost.
- The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller.
- The code size cost of the split code is less than the cost of a set-up call.

A secondary goal is to prevent excessive overall binary size growth.

With the cost model in place, I experimented to find a splitting threshold that works well in practice. To make warm & cold code easily separable for analysis purposes, I moved split functions to a "cold" section. I experimented with thresholds between [0, 4] and set the default to the threshold which minimized geomean __text size.

Experiment data from building LNT+externals for X86 (N = 639 programs, all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| -Os | 1736.3 | 0, n=0 | 10961.6 |
| -Os, thresh=0 | 1740.53 | 124.482, n=134 | 11014 |
| -Os, thresh=1 | 1734.79 | 57.8781, n=90 | 10978.6 |
| -Os, thresh=2 | 1733.85 | 65.6604, n=61 | 10977.6 |
| -Os, thresh=3 | 1733.85 | 65.3071, n=61 | 10977.6 |
| -Os, thresh=4 | 1735.08 | 67.5156, n=54 | 10965.7 |
| -Oz | 1554.4 | 0, n=0 | 10153 |
| -Oz, thresh=2 | 1552.2 | 65.633, n=61 | 10176 |
| -O3 | 2563.37 | 0, n=0 | 13105.4 |
| -O3, thresh=2 | 2559.49 | 71.1072, n=61 | 13162.4 |

Picking thresh=2 reduces the geomean __text section size by 0.14% at -Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that TEXT size is page-aligned, whereas section sizes are byte-aligned.

Experiment data from building LNT+externals for ARM64 (N = 558 programs, all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| -Os | 1763.96 | 0, n=0 | 42934.9 |
| -Os, thresh=2 | 1760.9 | 76.6755, n=61 | 42934.9 |

Picking thresh=2 reduces the geomean __text section size by 0.17% at -Os and causes no growth in the TEXT segment.

Measurements were done with D57082 (r352080) applied. Differential Revision: https://reviews.llvm.org/D57125 llvm-svn: 352228	2019-01-25 18:30:37 +00:00
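The -Os percentages quoted in the commit above can be rechecked from its table values. A minimal sketch in Python; the helper name `pct_reduction` is hypothetical, and only the -Os baseline and thresh=2 rows are used:

```python
def pct_reduction(baseline: float, split: float) -> float:
    """Return the percentage by which `split` is smaller than `baseline`."""
    return (baseline - split) / baseline * 100.0


# Geomean __text sizes at -Os, copied from the tables in the commit message.
x86_os = pct_reduction(1736.3, 1733.85)    # X86, thresh=2
arm64_os = pct_reduction(1763.96, 1760.9)  # ARM64, thresh=2

print(f"X86 -Os:   {x86_os:.2f}%")
print(f"ARM64 -Os: {arm64_os:.2f}%")
```

Rounded to two decimals, these agree with the 0.14% and 0.17% figures stated in the message.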
Vedant Kumar	13ef84fced	[MC] Teach the MachO object writer about N_FUNC_COLD N_FUNC_COLD is a new MachO symbol attribute. It's a hint to the linker to order a symbol towards the end of its section, to improve locality. Example:
```
void a1() {}
__attribute__((cold)) void a2() {}
void a3() {}
int main() {
  a1();
  a2();
  a3();
  return 0;
}
```
A linker that supports N_FUNC_COLD will order _a2 to the end of the text section. From `nm -njU` output, we see:
```
_a1
_a3
_main
_a2
```
Differential Revision: https://reviews.llvm.org/D57190 llvm-svn: 352227	2019-01-25 18:30:22 +00:00
Florian Hahn	fd7ee47940	[opt-viewer] Add javascript to expand/hide full message for multiline remarks. This patch adds support for displaying remarks with multiple lines. For such remarks, it creates a hidden div containing the message's lines except the first one in a <pre> tag. It also prepends a link (with '+' as text) to the regular remark line. This link can be used to show/hide the div containing the full remark. In combination with D57159, this allows for better displaying of multiline remarks in the html pages generated by opt-viewer. The Javascript is very simple and should be supported by any recent major browser. Reviewers: hfinkel, anemet, thegameg, serge-sans-paille Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D57167 llvm-svn: 352223	2019-01-25 17:48:31 +00:00
Sanjay Patel	0020f8bb23	[x86] simplify logic in lowerShuffleWithUndefHalf(); NFCI This seems unnecessarily complicated because we gave names to opposite polarity bools and have code comments that don't really line up with the logic. Step 1: remove UndefUpper and assert that it is the opposite of UndefLower after the initial early exit. llvm-svn: 352217	2019-01-25 17:00:41 +00:00
Florian Hahn	ca95ee5e11	[DiagnosticInfo] Add support for preserving newlines in remark arguments. This patch adds a new type StringBlockVal which can be used to emit a YAML block scalar, which preserves newlines in a multiline string. It also updates MappingTraits<DiagnosticInfoOptimizationBase::Argument> to use it for argument values with more than a single newline. This is helpful for remarks that want to display more in-depth information in a more structured way. Reviewers: thegameg, anemet Reviewed By: anemet Subscribers: hfinkel, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57159 llvm-svn: 352216	2019-01-25 16:59:06 +00:00
Tom Weaver	4db70d9695	[TEST][COMMIT] - fix comment typo in AsmPrinter/DwarfDebug.cpp - NFC llvm-svn: 352214	2019-01-25 16:29:35 +00:00
Javed Absar	2ee81933d0	[TblGen][NFC] Fix documentation formatting llvm-svn: 352212	2019-01-25 16:17:57 +00:00
Alex Bradbury	c67515d542	[RISCV][NFC] s/f32/f64 in double-arith.ll The intrinsic names erroneously used the .f32 variant. As the return and argument types were still double the intrinsics calls worked properly. llvm-svn: 352211	2019-01-25 16:04:04 +00:00
Simon Pilgrim	f56298f4b9	[X86] Simplify X86ISD::ADD/SUB if we don't use the result flag Simplify to the generic ISD::ADD/SUB if we don't make use of the result flag. This mainly helps with ADDCARRY/SUBBORROW intrinsics which get expanded to X86ISD::ADD/SUB but could be simplified further. Noticed in some of the test cases in PR31754 Differential Revision: https://reviews.llvm.org/D57234 llvm-svn: 352210	2019-01-25 15:58:28 +00:00
Sanjay Patel	21aa6ddc14	[x86] narrow a shuffle that doesn't use or set any high elements This isn't the final fix for our reduction/horizontal codegen, but it takes care of a lot of the problems. After we narrow the shuffle, existing combines for insert/extract and binops kick in, and we end up with cheaper 128-bit ops. The avg and mul reduction tests show an existing shuffle lowering hole for AVX2/AVX512. I think in its most minimal form this is: https://bugs.llvm.org/show_bug.cgi?id=40434 ...but we might need multiple fixes to get it right. Differential Revision: https://reviews.llvm.org/D57156 llvm-svn: 352209	2019-01-25 15:37:42 +00:00
Clement Courbet	b120127001	Revert r351954 "Add a value_type to ArrayRef." This breaks arm self-hosted buildbots. llvm-svn: 352206	2019-01-25 15:25:52 +00:00
Sam McCall	1e7491ea9c	[JSON] Work around excess-precision issue when comparing T_Integer numbers. Reviewers: bkramer Subscribers: kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D57237 llvm-svn: 352204	2019-01-25 15:05:33 +00:00
Nico Weber	e4ed82d674	gn build: Merge r352149 llvm-svn: 352202	2019-01-25 14:53:30 +00:00
Nico Weber	0c828ccc67	gn build: Revert r352200, commit message was wrong llvm-svn: 352201	2019-01-25 14:52:50 +00:00
Nico Weber	74bb231b90	gn build: Merge r352148 llvm-svn: 352200	2019-01-25 14:50:14 +00:00
Alex Bradbury	38c4ec31cb	[RISCV] Add tests to demonstrate bitcasted fneg/fabs dagcombines This target-independent code won't trigger for cases such as RV32FD where custom SelectionDAG nodes are generated. These new tests demonstrate such cases. Additionally, float-arith.ll was updated so that fneg.s, fsgnjn.s, and fabs.s selection patterns are actually exercised. llvm-svn: 352199	2019-01-25 14:33:08 +00:00
Simon Pilgrim	d6e1e3569c	Fix line endings and trim trailing whitespace. NFCI. llvm-svn: 352198	2019-01-25 14:29:57 +00:00
Haojian Wu	7852b7106a	gitignore: ignore clangd index files. Reviewers: kadircet Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D57227 llvm-svn: 352197	2019-01-25 14:05:18 +00:00
Simon Pilgrim	d41ccddda9	[X86] Add addcarry/subborrow combine tests Show failure to simplify cases with zero op/flags llvm-svn: 352196	2019-01-25 12:26:27 +00:00
James Henderson	759d5e6783	[llvm-symbolizer] Add switch to adjust addresses by fixed offset If a stack trace or similar has a list of addresses from an executable or DSO loaded at a variable address (e.g. due to ASLR), the addresses will not directly correspond to the addresses stored in the object file. If a user wishes to use llvm-symbolizer, they have to subtract the load address from every address. This is somewhat inconvenient, especially as the output of --print-address will result in the adjusted address being listed, rather than the address coming from the stack trace, making it harder to map results between the two. This change adds a new switch to llvm-symbolizer --adjust-vma which takes an offset, which is then used to automatically do this calculation. The printed address remains the input address (allowing for easy mapping), whilst the specified offset is applied to the addresses when performing the lookup. The switch is conceptually similar to llvm-objdump's new switch of the same name (see D57051), which in turn mirrors a GNU switch. There is no equivalent switch in addr2line. Reviewed by: grimar Differential Revision: https://reviews.llvm.org/D57151 llvm-svn: 352195	2019-01-25 11:49:21 +00:00
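The address handling described for `--adjust-vma` can be modelled in a few lines. This is a sketch under stated assumptions, not the tool's code: the function name `adjust_for_lookup` and the sample addresses are hypothetical, and the offset is assumed to be subtracted from each input address, mirroring the manual "subtract the load address" step the message describes.

```python
def adjust_for_lookup(addresses, adjust_vma):
    """Pair each input address with the address used for the lookup.

    Assumption: the offset is subtracted from the input address, mirroring
    the manual "subtract the load address" step described in the commit.
    The first element of each pair is what would be printed back unchanged.
    """
    return [(addr, addr - adjust_vma) for addr in addresses]


# Hypothetical stack-trace addresses from a module loaded at 0x400000.
for printed, lookup in adjust_for_lookup([0x401000, 0x401234], 0x400000):
    print(f"printed {printed:#x}, looked up {lookup:#x}")
```

Keeping the printed address equal to the input is what makes it easy to map results back to the original stack trace.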
Max Kazantsev	7822d25de3	[NFC] One more crashing test on LoopSimplifyCFG llvm-svn: 352194	2019-01-25 11:47:16 +00:00
Simon Pilgrim	dea6174b0b	Fix gcc -Wparentheses warning. NFCI. llvm-svn: 352193	2019-01-25 11:38:40 +00:00
Simon Pilgrim	cdf58092e4	Fix gcc -Wparentheses warning. NFCI. llvm-svn: 352191	2019-01-25 11:34:58 +00:00
Max Kazantsev	e5116e9b4a	[NFC] Add failing test on LCSSA forming llvm-svn: 352190	2019-01-25 11:32:21 +00:00
Diana Picus	8976ad12a9	[ARM GlobalISel] Support shifts for Thumb2 Same as ARM. On this occasion we split some of the instruction select tests for more complicated instructions into their own files, so we can reuse them for ARM and Thumb mode. Likewise for the legalizer tests. llvm-svn: 352188	2019-01-25 10:48:42 +00:00
Diana Picus	23628c7b05	[ARM GlobalISel] Remove rebase artifact from r351882. NFC r351882 introduced some superfluous calls to mark G_INTTOPTR and G_PTRTOINT as legal (looks like a rebase mishap). Remove them. llvm-svn: 352187	2019-01-25 10:48:35 +00:00
Javed Absar	a3e3d85286	[TblGen] Extend !if semantics through new feature !cond This patch extends the TableGen language with the !cond operator. Instead of embedding !if inside !if, which can get cumbersome, one can now use !cond. Below is an example to convert an integer 'x' into a string: !cond(!lt(x,0) : "Negative", !eq(x,0) : "Zero", !eq(x,1) : "One", 1 : "MoreThanOne") Reviewed By: hfinkel, simon_tatham, greened Differential Revision: https://reviews.llvm.org/D55758 llvm-svn: 352185	2019-01-25 10:25:25 +00:00
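The first-match behaviour of the `!cond` example above can be mimicked in Python. This is only a model of the semantics: `cond` and `describe` are hypothetical names, Python evaluates every condition eagerly, and raising an error when no clause matches is an assumption made for the sketch.

```python
def cond(*clauses):
    """Return the value of the first (condition, value) clause that is truthy."""
    for condition, value in clauses:
        if condition:
            return value
    # Assumption for this sketch: no matching clause is treated as an error.
    raise ValueError("no clause matched")


def describe(x: int) -> str:
    """Mirror the integer-to-string example from the commit message."""
    return cond(
        (x < 0, "Negative"),
        (x == 0, "Zero"),
        (x == 1, "One"),
        (1, "MoreThanOne"),  # a constant-true final clause acts as the default
    )


for value in (-3, 0, 1, 7):
    print(value, describe(value))
```

Compared with nested conditionals, the flat clause list keeps each condition next to its result.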
Douglas Yung	914e838e63	[llvm-objcopy] Add support for -g as an alias for --strip-debug This change adds an option -g to llvm-objcopy which is an alias for the existing option --strip-debug. This fixes PR40003. Reviewed by: alexshap Differential Revision: https://reviews.llvm.org/D57217 llvm-svn: 352182	2019-01-25 09:57:20 +00:00
Simon Pilgrim	d36f7730cd	[llvm-mca][X86] Add missing shuffle tests Match the coverage of test\CodeGen\X86\avx512-shuffle-schedule.ll so we can get rid of -print-schedule (and fix PR37160) without losing schedule tests llvm-svn: 352179	2019-01-25 09:17:30 +00:00
Anton Korobeynikov	509d5c4a7d	[MSP430] Fix absolute addressing mode printing in AsmPrinter Align checks for absolute addressing mode with its current implementation (SR is used as a base register). This fixes https://bugs.llvm.org/show_bug.cgi?id=39993 Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56785 llvm-svn: 352178	2019-01-25 09:14:05 +00:00
Max Kazantsev	6f2a0c6827	[NFC] Add test with multiple loops llvm-svn: 352176	2019-01-25 08:46:00 +00:00
Zi Xuan Wu	308a609c6e	[PowerPC] Enhance the fast selection of cmp instruction and clean up related asserts Fast selection of llvm icmp and fcmp instructions does not handle VSX instruction support well. We'd use the VSX float comparison instruction instead of the non-VSX float comparison instruction if the operand register class is VSSRC or VSFRC, because i32 and i64 are mapped to VSSRC and VSFRC respectively when the VSX feature is enabled. If the target does not have a corresponding VSX comparison instruction for some type, just copy the VSX-related register to the common float register class and use the non-VSX comparison instruction. Differential Revision: https://reviews.llvm.org/D57078 llvm-svn: 352174	2019-01-25 07:24:59 +00:00
Craig Topper	6fd9af587a	[X86] Add non-masked versions of vpconflict intrinsics so we can use a select in the header file in clang. I'll remove and autoupgrade the old intrinsics in a future commit. llvm-svn: 352172	2019-01-25 07:08:07 +00:00
Alex Bradbury	456d3798d6	[RISCV] Custom-legalise i32 SDIV/UDIV/UREM on RV64M Follow the same custom legalisation strategy as used in D57085 for variable-length shifts (see that patch summary for more discussion). Although we may lose out on some late-stage DAG combines, I think this custom legalisation strategy is ultimately easier to reason about. There are some codegen changes in rv64m-exhaustive-w-insts.ll but they are all neutral in terms of the number of instructions. Differential Revision: https://reviews.llvm.org/D57096 llvm-svn: 352171	2019-01-25 05:11:34 +00:00
Max Kazantsev	38cd9acbb9	[LoopSimplifyCFG] Fix inconsistency in blocks in loop markup 2nd part of D57095 with the same reason, just in another place. We never fold branches that are not immediately in the current loop, but this check is missing in `IsEdgeLive`. As a result, it may think that the edge in a subloop is dead while it's live. It's a pessimization in the current stance. Differential Revision: https://reviews.llvm.org/D57147 Reviewed By: rupprecht llvm-svn: 352170	2019-01-25 05:05:02 +00:00
Alex Bradbury	299d690a50	[RISCV] Custom-legalise 32-bit variable shifts on RV64 The previous DAG combiner-based approach had an issue with infinite loops between the target-dependent and target-independent combiner logic (see PR40333). Although this was worked around in rL351806, the combiner-based approach is still potentially brittle and can fail to select the 32-bit shift variant when profitable to do so, as demonstrated in the pr40333.ll test case. This patch instead introduces target-specific SelectionDAG nodes for SHLW/SRLW/SRAW and custom-lowers variable i32 shifts to them. pr40333.ll is a good example of how this approach can improve codegen. This adds DAG combine that does SimplifyDemandedBits on the operands (only lower 32-bits of first operand and lower 5 bits of second operand are read). This seems better than implementing SimplifyDemandedBitsForTargetNode as there is no guarantee that would be called (and it's not for e.g. the anyext return test cases). Also implements ComputeNumSignBitsForTargetNode. There are codegen changes in atomic-rmw.ll and atomic-cmpxchg.ll but the new instruction sequences are semantically equivalent. Differential Revision: https://reviews.llvm.org/D57085 llvm-svn: 352169	2019-01-25 05:04:00 +00:00
Matt Arsenault	3b9a82ff2c	AMDGPU/GlobalISel: Remove leftover setAction Also move G_GEP actions together. llvm-svn: 352168	2019-01-25 04:54:00 +00:00
Matt Arsenault	3e08b772b3	AMDGPU/GlobalISel: Scalarize add/sub llvm-svn: 352167	2019-01-25 04:53:57 +00:00
Matt Arsenault	e6cebd0d69	GlobalISel: fewerElementsVector for more cast types llvm-svn: 352166	2019-01-25 04:37:33 +00:00
Matt Arsenault	95fd95cfe0	GlobalISel: fewerElementsVector for a few more trivial ops llvm-svn: 352165	2019-01-25 04:03:38 +00:00
Matt Arsenault	5d622fbcc1	AMDGPU/GlobalISel: Legalize smulh/umulh and scalarize mul llvm-svn: 352162	2019-01-25 03:23:04 +00:00
Vedant Kumar	9d70f2b939	[HotColdSplit] Describe the pass in more detail, NFC llvm-svn: 352161	2019-01-25 03:22:38 +00:00
Vedant Kumar	65de025d64	[HotColdSplit] Split more aggressively before/after cold invokes While a cold invoke itself and its unwind destination can't be extracted, code which unconditionally executes before/after the invoke may still be profitable to extract. With cost model changes from D57125 applied, this gives a 3.5% increase in split text across LNT+externals on arm64 at -Os. llvm-svn: 352160	2019-01-25 03:22:23 +00:00
Matt Arsenault	1b1e685f10	GlobalISel: Support fewerElementsVector for icmp/fcmp Also legalize 64-bit compares for AMDGPU llvm-svn: 352157	2019-01-25 02:59:34 +00:00
Matt Arsenault	ca676343a9	GlobalISel: Implement fewerElementsVector for extensions llvm-svn: 352155	2019-01-25 02:36:32 +00:00
Peter Collingbourne	1a8acfb768	hwasan: If we split the entry block, move static allocas back into the entry block. Otherwise they are treated as dynamic allocas, which ends up increasing code size significantly. This reduces size of Chromium base_unittests by 2MB (6.7%). Differential Revision: https://reviews.llvm.org/D57205 llvm-svn: 352152	2019-01-25 02:08:46 +00:00
Peter Collingbourne	0b247d1865	gn build: Set is_clang to true in stage2 toolchains. Differential Revision: https://reviews.llvm.org/D57202 llvm-svn: 352146	2019-01-25 01:18:55 +00:00
Matt Arsenault	990f507704	GlobalISel: Add convenience mutations to scalarize llvm-svn: 352143	2019-01-25 00:51:00 +00:00
Bob Haarman	6710cc7db5	simplify COFF module assembly test and move it to Object Reviewers: pcc, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57192 llvm-svn: 352142	2019-01-25 00:33:05 +00:00
Nico Weber	0e7ba668db	gn build: Build clang with -fno-strict-aliasing, make building with gcc much quieter - gcc doesn't understand -Wstring-conversion, so pass that only to clang - disable a few gcc warnings that are noisy and also disabled in the cmake build - -Wstrict-aliasing pointed out that the cmake build builds clang with -fno-strict-aliasing, so do that too Differential Revision: https://reviews.llvm.org/D57191 llvm-svn: 352141	2019-01-25 00:29:17 +00:00
Vedant Kumar	a48cd9aedd	Try to address Windows bot failure after r352080 See the bot error message reported in https://reviews.llvm.org/D57082. Avoid trying to match full class names in -debug-pass-manager output, because they aren't portable. llvm-svn: 352138	2019-01-25 00:15:16 +00:00
Matt Arsenault	7ba2d82c34	GlobalISel: Add helper to LLT to get a scalar or vector llvm-svn: 352136	2019-01-25 00:10:49 +00:00
Benjamin Kramer	653020d3cc	[GlobalISel][AArch64] Avoid unused variable warning for variable only used in assert llvm-svn: 352133	2019-01-24 23:45:07 +00:00
Nemanja Ivanovic	b9b75de0ae	[PowerPC] Exploit store instructions that store a single vector element This patch exploits the instructions that store a single element from a vector to perform a (store (extract_elt)). We already have code that does this with ISA 3.0 instructions that were added to handle i8/i16 types. However, we had never exploited the existing ones that handle f32/f64/i32/i64 types. Differential revision: https://reviews.llvm.org/D56175 llvm-svn: 352131	2019-01-24 23:44:28 +00:00
Matt Arsenault	6bab7ab11e	RegBankSelect: Fix use after free in r352123 llvm-svn: 352130	2019-01-24 23:42:01 +00:00
Benjamin Kramer	1411ecf08b	[GlobalISel][AArch64] Avoid unused function warnings in Release builds llvm-svn: 352129	2019-01-24 23:39:47 +00:00
David Blaikie	dcc963108a	pdbutil: Remove unused variables llvm-svn: 352128	2019-01-24 23:13:20 +00:00
Sanjay Patel	4c304b2923	[x86] move half-size shuffle mask creation to helper; NFC As noted in D57156, we want to check at least part of this pattern earlier (in combining), so this will allow the code to be shared instead of duplicated. llvm-svn: 352127	2019-01-24 23:12:36 +00:00
Aditya Nandakumar	3ba0d94bce	[GISel]: Change how CSE is enabled by default for each pass https://reviews.llvm.org/D57178 Now add a hook in TargetPassConfig to query if CSE needs to be enabled. By default this hook returns false only for O0 opt level but this can be overridden by the target. As a consequence of the default of enabled for non O0, a few tests needed to be updated to not use CSE (by passing in -O0) to the run line. reviewed by: arsenm llvm-svn: 352126	2019-01-24 23:11:25 +00:00
Jessica Paquette	76c40f827d	Suppress unused capture warning in CheckCopy Werror bots didn't like the lambda + assert thing in my previous commit. Capture everything to suppress the error. Example failure here: http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/29393 llvm-svn: 352124	2019-01-24 22:51:31 +00:00
Matt Arsenault	baa5d2e69c	RegBankSelect: Support some more complex part mappings llvm-svn: 352123	2019-01-24 22:47:04 +00:00
Armando Montanez	8367b0750f	[elfabi] Add support for reading dynamic symbols from binaries This patch adds initial support for reading dynamic symbols from ELF binaries. Currently, STT_NOTYPE, STT_OBJECT, STT_FUNC, and STT_TLS are explicitly supported. Other symbol types are mapped to ELFSymbolType::Unknown to improve signal/noise ratio. Symbols must meet two criteria to be read into in an ELFStub: - The symbol's binding must be STB_GLOBAL or STB_WEAK. - The symbol's visibility must be STV_DEFAULT or STV_PROTECTED. This filters out symbols that aren't of interest during compile-time linking against a shared object. This change uses DT_HASH and DT_GNU_HASH to determine the size of .dynsym. Using hash tables to determine the number of symbols in .dynsym allows llvm-elfabi to work on binaries without relying on section headers. Differential Revision: https://reviews.llvm.org/D56031 llvm-svn: 352121	2019-01-24 22:39:21 +00:00
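The two criteria for reading a symbol can be expressed as a small predicate. A minimal sketch, assuming the standard ELF encoding of the binding (upper four bits of `st_info`) and visibility (lower two bits of `st_other`); the function name `is_exported` is hypothetical and this is not the llvm-elfabi implementation.

```python
# Constants from the ELF specification.
STB_GLOBAL, STB_WEAK = 1, 2
STV_DEFAULT, STV_PROTECTED = 0, 3


def is_exported(st_info: int, st_other: int) -> bool:
    """Apply the binding and visibility criteria from the commit message."""
    binding = st_info >> 4       # upper four bits of st_info
    visibility = st_other & 0x3  # lower two bits of st_other
    return (binding in (STB_GLOBAL, STB_WEAK)
            and visibility in (STV_DEFAULT, STV_PROTECTED))


print(is_exported(0x12, 0))  # global function, default visibility
print(is_exported(0x02, 0))  # local function
print(is_exported(0x22, 2))  # weak function, hidden visibility
```

Both conditions must hold, so a weak symbol with hidden visibility is rejected just like a local one.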
Zachary Turner	8371da385a	[PDB] Increase TPI hash bucket count. PDBs contain several serialized hash tables. In the microsoft-pdb repo published to support LLVM implementing PDB support, the provided code initializes the bucket count for the TPI and IPI streams to the maximum size. This occurs in tpi.cpp L33 and tpi.cpp L398. In the LLVM code for generating PDBs, these streams are created with the minimum number of buckets. This difference makes LLVM-generated PDBs slower when used for debugging. Patch by C.J. Hebert Differential Revision: https://reviews.llvm.org/D56942 llvm-svn: 352117	2019-01-24 22:25:55 +00:00
Jessica Paquette	245047dfe8	[GlobalISel][AArch64] Add isel support for FP16 vector @llvm.ceil This patch adds support for vector @llvm.ceil intrinsics when full 16 bit floating point support isn't available. To do this, this patch... - Implements basic isel for G_UNMERGE_VALUES - Teaches the legalizer about 16 bit floats - Teaches AArch64RegisterBankInfo to respect floating point registers on G_BUILD_VECTOR and G_UNMERGE_VALUES - Teaches selectCopy about 16-bit floating point vectors It also adds - A legalizer test for the 16-bit vector ceil which verifies that we create a G_UNMERGE_VALUES and G_BUILD_VECTOR when full fp16 isn't supported - An instruction selection test which makes sure we lower to G_FCEIL when full fp16 is supported - A test for selecting G_UNMERGE_VALUES And also updates arm64-vfloatintrinsics.ll to show that the new ceiling types work as expected. https://reviews.llvm.org/D56682 llvm-svn: 352113	2019-01-24 22:00:41 +00:00
Bob Haarman	38ebaf7d5d	allow COFF .def directive in module assembly when using ThinLTO Summary: Using COFF's .def directive in module assembly used to crash ThinLTO with "this directive only supported on COFF targets" when getting symbol information in ModuleSymbolTable. This change allows ModuleSymbolTable to process such code and adds a test to verify that the .def directive has the desired effect on the native object file, with and without ThinLTO. Fixes https://bugs.llvm.org/show_bug.cgi?id=36789 Reviewers: rnk, pcc, vlad.tsyrklevich Subscribers: mehdi_amini, eraman, hiraditya, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D57073 llvm-svn: 352112	2019-01-24 21:41:03 +00:00
Eli Friedman	525ef0159d	[Analysis] Fix isSafeToLoadUnconditionally handling of volatile. A volatile operation cannot be used to prove an address points to normal memory. (LangRef was recently updated to state it explicitly.) Differential Revision: https://reviews.llvm.org/D57040 llvm-svn: 352109	2019-01-24 21:31:13 +00:00
Michael Trent	f4c902bd77	Limit dyld image suffixes guessed by guessLibraryShortName() Summary: guessLibraryShortName() separates a full Mach-O dylib install name path into a short name and a dyld image suffix. The short name is the name of the dylib without its path or extension. The dyld image suffix is a string used by dyld to load variants of dylibs if available at runtime; for example, "when binding this process, load 'debug' variants of all required dylibs." dyld knows exactly what the image suffix is, but by convention diagnostic tools such as llvm-nm attempt to guess suffix names by looking at the install name path. These dyld image suffixes are separated from the short name by a '_' character. Because the '_' character is commonly used to separate words in filenames guessLibraryShortName() cannot reliably separate a dylib's short name from an arbitrary image suffix; imagine if both the short name and the suffix contains an '_' character! To better deal with this ambiguity, guessLibraryShortName() will recognize only "_debug" and "_profile" as valid Suffix values. Calling code needs to be tolerant of guessLibraryShortName() guessing incorrectly. The previous implementation of guessLibraryShortName() did not allow '_' characters to appear in short names. When present, the short name would be truncated, e.g., "libcompiler_rt" => "libcompiler". This change allows "libcompiler_rt" and "libcompiler_rt_debug" to both be recognized as "libcompiler_rt". rdar://47412244 Reviewers: kledzik, lhames, pete Reviewed By: pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56978 llvm-svn: 352104	2019-01-24 20:59:44 +00:00
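The restricted suffix rule can be illustrated with a simplified stand-in. This is a sketch, not the real `guessLibraryShortName()`: the helper `split_short_name` is hypothetical and handles only plain `.dylib` install names, which is an assumption made to keep the example short.

```python
import posixpath

# Only these two suffixes are recognized, per the commit message.
KNOWN_SUFFIXES = ("_debug", "_profile")


def split_short_name(install_name: str):
    """Return (short_name, suffix) for a plain .dylib install name."""
    base = posixpath.basename(install_name)
    if base.endswith(".dylib"):
        base = base[: -len(".dylib")]
    for suffix in KNOWN_SUFFIXES:
        # Require a non-empty short name in front of the suffix.
        if base.endswith(suffix) and len(base) > len(suffix):
            return base[: -len(suffix)], suffix
    return base, ""


print(split_short_name("/usr/lib/libcompiler_rt.dylib"))
print(split_short_name("/usr/lib/libcompiler_rt_debug.dylib"))
```

Because `_rt` is not in the recognized list, the underscore in `libcompiler_rt` no longer truncates the short name.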
Haojian Wu	b9613a39b8	Fix a compiler error introduced in r352093. llvm-svn: 352098	2019-01-24 20:30:48 +00:00
Nico Weber	24298a4404	gn build: Merge r351990 llvm-svn: 352096	2019-01-24 20:19:18 +00:00
Alina Sbirlea	0a4367209c	[LICM] Cleanup duplicated code. [NFCI] llvm-svn: 352093	2019-01-24 19:57:30 +00:00
Alina Sbirlea	52f6e2a173	[MemorySSA +LICM CFHoist] Solve PR40317. Summary: MemorySSA needs updating each time an instruction is moved. LICM and control flow hoisting re-hoists instructions, thus needing another update when re-moving those instructions. Pending cleanup: the MSSA update is duplicated, should be moved inside moveInstructionBefore. Reviewers: jnspaulsson Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D57176 llvm-svn: 352092	2019-01-24 19:48:35 +00:00
Philip Reames	4775721dda	Test cases for demanded elements on vector GEPs This is the first part of splitting apart https://reviews.llvm.org/D57140 into usable pieces. Landing the tests in advance of posting a review specifically for the demanded elements part. llvm-svn: 352091	2019-01-24 19:35:28 +00:00
Roman Lebedev	a95a7105ef	[IRBuilder] Remove positivity check from CreateAlignmentAssumption() Summary: An alignment should be non-zero positive power-of-two, anything and everything else is UB. We should not have that check for all these prerequisites here, it's just UB. Also, that was likely confusing middle-end passes. While there, `CreateIntCast()` should be called with `/*isSigned*/ false`. Think about it, there are two explanations: "An alignment should be positive", therefore the sign bit is unset, so `zext` and `sext` are equivalent. Or a second one: you have `i2 0b10` - a valid alignment, now you `sext` it: `i3 0b110` - no longer valid alignment. Reviewers: craig.topper, jyknight, hfinkel, erichkeane, rjmccall Reviewed By: hfinkel, rjmccall Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D54653 llvm-svn: 352089	2019-01-24 19:32:48 +00:00
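The second explanation in the message above can be checked with plain integer arithmetic. A minimal sketch, treating values as unsigned bit patterns and widening the 2-bit pattern `0b10` by one bit; the helper names are hypothetical.

```python
def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low `from_bits` bits, new high bits are 0."""
    return value & ((1 << from_bits) - 1)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend: copy the top bit of the narrow value into the new bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


alignment = 0b10  # a valid 2-bit alignment value
print(bin(zext(alignment, 2)), is_power_of_two(zext(alignment, 2)))
print(bin(sext(alignment, 2, 3)), is_power_of_two(sext(alignment, 2, 3)))
```

Zero extension keeps the value a power of two, while sign extension sets an extra high bit and produces a value that is no longer a valid alignment.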
Simon Pilgrim	c12a634326	[X86] Regenerate SBB test to fix buildbots. Some local WIP code unexpectedly managed to get in the way. llvm-svn: 352081	2019-01-24 18:57:48 +00:00
Vedant Kumar	ef1ebed1c6	[HotColdSplit] Move splitting earlier in the pipeline Performing splitting early has several advantages: - Inhibiting inlining of cold code early improves code size. Compared to scheduling splitting at the end of the pipeline, this cuts code size growth in half within the iOS shared cache (0.69% to 0.34%). - Inhibiting inlining of cold code improves compile time. There's no need to inline split cold functions, or to inline as much within those split functions as they are marked `minsize`. - During LTO, extra work is only done in the pre-link step. Less code must be inlined during cross-module inlining. An additional motivation here is that the most common cold regions identified by the static/conservative splitting heuristic can (a) be found before inlining and (b) do not grow after inlining. E.g. __assert_fail, os_log_error. The disadvantages are: - Some opportunities for splitting out cold code may be missed. This gap can potentially be narrowed by adding a worklist algorithm to the splitting pass. - Some opportunities to reduce code size may be lost (e.g. store sinking, when one side of the CFG diamond is split). This does not outweigh the code size benefits of splitting earlier. On net, splitting early in the pipeline has substantial code size benefits, and no major effects on memory locality or performance. We measured memory locality using ktrace data, and consistently found that 10% fewer pages were needed to capture 95% of text page faults in key iOS benchmarks. We measured performance on frequency-stabilized iOS devices using LNT+externals. This reverses course on the decision made to schedule splitting late in r344869 (D53437). Differential Revision: https://reviews.llvm.org/D57082 llvm-svn: 352080	2019-01-24 18:55:49 +00:00
Sanjay Patel	e524639d72	[x86] rename VectorShuffle -> Shuffle; NFC This wasn't consistent within the file, so made it harder to search. Standardize on the shorter name to save some typing. llvm-svn: 352077	2019-01-24 18:52:12 +00:00
James Y Knight	2c36240a82	Fix emission of _fltused for MSVC. It should be emitted when any floating-point operations (including calls) are present in the object, not just when calls to printf/scanf with floating point args are made. The difference caused by this is very subtle: in static (/MT) builds, on x86-32, in a program that uses floating point but doesn't print it, the default x87 rounding mode may not be set properly upon initialization. This commit also removes the walk of the types pointed to by pointer arguments in calls. (To assist in opaque pointer types migration -- eventually the pointee type won't be available.) That latter implies that it will no longer consider a call like `scanf("%f", &floatvar)` as sufficient to emit _fltused on its own. And without _fltused, `scanf("%f")` will abort with error R6002. This new behavior is unlikely to bite anyone in practice (you'd have to read a float, and do nothing with it!), and also, is consistent with MSVC. Differential Revision: https://reviews.llvm.org/D56548 llvm-svn: 352076	2019-01-24 18:34:00 +00:00
Simon Pilgrim	f4a1b54097	[X86] Add PR25858 test cases llvm-svn: 352075	2019-01-24 18:30:45 +00:00
Julian Lettner	b62e9dc46b	Revert "[Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls" This reverts commit `cea84ab93a`. llvm-svn: 352069	2019-01-24 18:04:21 +00:00
Nirav Dave	58e9833e98	[SelectionDAGBuilder] Simplify HasSideEffect calculation. NFC. llvm-svn: 352067	2019-01-24 17:56:03 +00:00
Nirav Dave	b41a198472	[InlineAsm] Don't calculate registers for inline asm memory operands. NFCI. llvm-svn: 352066	2019-01-24 17:47:18 +00:00
Sanjay Patel	e5a0bcf7b8	[x86] add low/high undef half shuffle mask helpers; NFC This is the most common usage for isUndefInRange, so make the code slightly less duplicated and more readable. llvm-svn: 352063	2019-01-24 17:05:02 +00:00
Philip Reames	86bbf7ccee	[RS4GC] Expand/standardize tests introduced in rL352059 Write a couple of variations on vector geps w/both scalars and vectors live over safepoints. Use update_test_checks to show all the IR. llvm-svn: 352062	2019-01-24 16:45:23 +00:00
Philip Reames	4d683ee7e3	[RS4GC] Be slightly less conservative for gep vector_base, scalar_idx After submitting https://reviews.llvm.org/D57138, I realized it was slightly more conservative than needed. The scalar indices don't appear to be a problem on a vector gep, we even had a test for that. Differential Revision: https://reviews.llvm.org/D57161 llvm-svn: 352061	2019-01-24 16:34:00 +00:00
Philip Reames	a657510eb7	[RS4GC] Avoid crashing on gep scalar_base, vector_idx This is an alternative to https://reviews.llvm.org/D57103. After discussion, we decided to check this in as a temporary workaround, and pursue a true fix under the original thread. The issue at hand is that the base rewriting algorithm doesn't consider the fact that GEPs can turn a scalar input into a vector of outputs. We had handling for scalar GEPs and fully vector GEPs (i.e. all vector operands), but not the scalar-base + vector-index forms. A true fix here requires treating GEP analogously to extractelement or shufflevector. This patch is merely a workaround. It simply hides the crash at the cost of some ugly code gen for this presumably very rare pattern. Differential Revision: https://reviews.llvm.org/D57138 llvm-svn: 352059	2019-01-24 16:08:18 +00:00
Simon Pilgrim	2f018de6a3	[TargetLowering] Rename getExpandedFixedPointMultiplication to expandFixedPointMul. NFCI. Match the (much shorter) name used in various legalization methods. llvm-svn: 352056	2019-01-24 15:46:54 +00:00
Nirav Dave	bd069f424f	[SelectionDAGBuilder] Fuse inline asm input operand loops passes. NFCI. llvm-svn: 352053	2019-01-24 15:15:32 +00:00
Michael Platings	7e552761f3	[Docs] Add information about unit tests to the testing guide Differential Revision: https://reviews.llvm.org/D57088 llvm-svn: 352052	2019-01-24 15:11:26 +00:00
Nirav Dave	c5cb2bed58	[X86] Add missing isReg() guards in FixupSetCCs pass. llvm-svn: 352051	2019-01-24 15:04:17 +00:00
Sanjay Patel	55787a7e77	[x86] add tests for unpack shuffle lowering; NFC https://bugs.llvm.org/show_bug.cgi?id=40434 llvm-svn: 352048	2019-01-24 14:12:34 +00:00
Simon Pilgrim	30b206b5da	[CostModel][X86] Add SMUL fixed point cost tests llvm-svn: 352046	2019-01-24 13:48:20 +00:00
Simon Pilgrim	47ca8606ba	[TTI] Add generic SADDO/SSUBO costs Added x86 scalar sadd_with_overflow/ssub_with_overflow costs. llvm-svn: 352045	2019-01-24 13:36:45 +00:00
Simon Pilgrim	a131e4e296	[TTI] Add generic UADDSAT/USUBSAT costs Add generic cost calculation for UADDSAT/USUBSAT intrinsics; this falls back to using generic costs for uadd_with_overflow/usub_with_overflow + a select. Differential Revision: https://reviews.llvm.org/D56907 llvm-svn: 352044	2019-01-24 12:27:10 +00:00
Simon Pilgrim	2d1964b90f	[TTI] Add generic UADDO/USUBO costs Added x86 scalar uadd_with_overflow/usub_with_overflow costs. Differential Revision: https://reviews.llvm.org/D56907 llvm-svn: 352043	2019-01-24 12:10:20 +00:00
Florian Hahn	bed7f9eab2	Revert "[HotColdSplitting] Get DT and PDT from the pass manager." This reverts commit `a6982414ed` (llvm-svn: 352036), because it causes a memory leak in the pass manager. Failing bot http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/10351/steps/check-llvm%20asan/logs/stdio llvm-svn: 352041	2019-01-24 11:22:08 +00:00
Petar Avramovic	79df859685	[MIPS GlobalISel] Select zero extending and sign extending load Select zero extending and sign extending load for MIPS32. Use size from MachineMemOperand to determine number of bytes to load. Differential Revision: https://reviews.llvm.org/D57099 llvm-svn: 352038	2019-01-24 10:27:21 +00:00
Petar Avramovic	b5a939d246	[MIPS GlobalISel] Combine extending loads Use CombinerHelper to combine extending load instructions. G_LOAD combined with G_ZEXT, G_SEXT or G_ANYEXT gives G_ZEXTLOAD, G_SEXTLOAD or G_LOAD with same type as def of extending instruction respectively. Similarly G_ZEXTLOAD combined with G_ZEXT gives G_ZEXTLOAD and G_SEXTLOAD combined with G_SEXT gives G_SEXTLOAD with same type as def of extending instruction. Differential Revision: https://reviews.llvm.org/D56914 llvm-svn: 352037	2019-01-24 10:09:52 +00:00
Florian Hahn	a6982414ed	[HotColdSplitting] Get DT and PDT from the pass manager. Instead of manually computing DT and PDT, we can get them from the pass manager, which ideally has them already cached. With the new pass manager, we could even preserve DT/PDT on a per function basis in a module pass. I think this also addresses the TODO about re-using the computed DTs for BFI. IIUC, GetBFI will fetch the DT from the pass manager and we will then fetch the cached version later. Reviewers: vsk, hiraditya, tejohnson, thegameg, sebpop Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D57092 llvm-svn: 352036	2019-01-24 09:44:52 +00:00
Simon Atanasyan	b6d3c50a36	Reapply: [mips] Handle MipsMCExpr sub-expression for the MEK_DTPREL tag This reapplies commit r351987 with a failed test fix. Now the test accepts both DW_OP_GNU_push_tls_address and DW_OP_form_tls_address opcode. Original commit message: ``` This is a fix for a regression introduced by the rL348194 commit. In that change new type (MEK_DTPREL) of MipsMCExpr expression was added, but in some places of the code this type of expression considered as unexpected. This change fixes the bug. The MEK_DTPREL type of expression is used for marking TLS DIEExpr only and contains a regular sub-expression. Where we need to handle the expression, we retrieve the sub-expression and handle it in a common way. ``` llvm-svn: 352034	2019-01-24 09:13:14 +00:00
Jonas Paulsson	5916dea338	[SystemZ] Remember to reset the NoPHIs property on MF in createPHIsForSelects() After creating new PHI instructions during isel pseudo expansion, the NoPHIs property of MF should be reset in case it was previously set. Review: Ulrich Weigand llvm-svn: 352030	2019-01-24 07:54:41 +00:00
Craig Topper	1e718429c1	[X86] Update SelectionDAGDumper to print the extension type and expanding flag for masked loads. Add truncating and compressing for masked stores. llvm-svn: 352029	2019-01-24 07:51:34 +00:00
Craig Topper	e79b779fbb	[X86] Add test cases for opportunities to fold a truncate and a masked store into a truncating masked store. llvm-svn: 352027	2019-01-24 06:15:03 +00:00
Max Kazantsev	66f92df761	[NFC] Add another failing test on LoopSimplifyCFG llvm-svn: 352026	2019-01-24 05:43:19 +00:00
Max Kazantsev	56515a2c76	[LoopSimplifyCFG] Fix inconsistency in live blocks markup When we choose whether or not we should mark block as dead, we have an inconsistent logic in markup of live blocks. - We take candidate IF its terminator branches on constant AND it is immediately in current loop; - We mark successor live IF its terminator doesn't branch by constant OR it branches by constant and the successor is its always taken block. What we are missing here is that when the terminator branches on a constant but is not taken as a candidate because it is not immediately in the current loop, we will mark only one (always taken) successor as live. Therefore, we do NOT do the actual folding but may NOT mark one of the successors as live. So the result of markup is wrong in this case, and we may then hit various asserts. Thanks Jordan Rupprecht for reporting this! Differential Revision: https://reviews.llvm.org/D57095 Reviewed By: rupprecht llvm-svn: 352024	2019-01-24 05:20:29 +00:00
Max Kazantsev	11d3314241	[NFC] Add a failing test on live block markup in term folding llvm-svn: 352023	2019-01-24 05:05:55 +00:00
David Blaikie	7b585673d1	DebugInfo: Use assembly label arithmetic for address pool size for easier reading/editing Recommits 350048, 350050 That broke buildbots because of some typos in the test case. llvm-svn: 352019	2019-01-24 03:27:57 +00:00
Ana Pazos	5c0521ac52	Revert "[RISCV] Set isAsCheapAsAMove for ADDI, ORI, XORI, LUI" This reverts commit ccfb060ecb5d7e18ea729455660484d576bde2cc. Some tests need to be fixed before reapplying this commit. llvm-svn: 352014	2019-01-24 03:00:26 +00:00
Ana Pazos	c54abc520c	[RISCV] Set isAsCheapAsAMove for ADDI, ORI, XORI, LUI Summary: Affected instructions: PseudoLI simplest form (ADDI with X0) ALU operations with immediate (they do not set status flag - ADDI, ORI, XORI) Reviewers: asb Reviewed By: asb Subscribers: shiva0217, rkruppe, kito-cheng, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei Differential Revision: https://reviews.llvm.org/D56526 llvm-svn: 352010	2019-01-24 02:41:40 +00:00
Ana Pazos	29ace0e62c	[RISCV] Set isReMaterializable for ORI, XORI Reviewers: asb Reviewed By: asb Subscribers: asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei Differential Revision: https://reviews.llvm.org/D57069 llvm-svn: 352008	2019-01-24 02:31:23 +00:00
Douglas Yung	abfa98c9de	[docs] Remove extra character from git URL in Getting Started guide. llvm-svn: 352005	2019-01-24 01:22:32 +00:00
David Blaikie	79c3d8b127	llvm-symbolizer: Extract individual test cases now that it's easier to use directly (without a piped input file) Pulling out the split-dwarf tests by way of example of how I think llvm-symbolizer should be tested going forward. Open to debate/discussion, though. llvm-svn: 352004	2019-01-24 01:19:17 +00:00
Julian Lettner	cea84ab93a	[Sanitizers] UBSan unreachable incompatible with ASan in the presence of `noreturn` calls Summary: UBSan wants to detect when unreachable code is actually reached, so it adds instrumentation before every `unreachable` instruction. However, the optimizer will remove code after calls to functions marked with `noreturn`. To avoid this UBSan removes `noreturn` from both the call instruction as well as from the function itself. Unfortunately, ASan relies on this annotation to unpoison the stack by inserting calls to `__asan_handle_no_return` before `noreturn` functions. This is important for functions that do not return but access the stack memory, e.g., unwinder functions like `longjmp` (`longjmp` itself is actually "double-proofed" via its interceptor). The result is that when ASan and UBSan are combined, the `noreturn` attributes are missing and ASan cannot unpoison the stack, so it has false positives when stack unwinding is used. Changes: # UBSan now adds the `expect_noreturn` attribute whenever it removes the `noreturn` attribute from a function # ASan additionally checks for the presence of this attribute Generated code: ``` call void @__asan_handle_no_return // Additionally inserted to avoid false positives call void @longjmp call void @__asan_handle_no_return call void @__ubsan_handle_builtin_unreachable unreachable ``` The second call to `__asan_handle_no_return` is redundant. This will be cleaned up in a follow-up patch. rdar://problem/40723397 Reviewers: delcypher, eugenis Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D56624 llvm-svn: 352003	2019-01-24 01:06:19 +00:00
Nico Weber	970d9d9acc	gn build: Merge r351320 (the 9.0.0 version bump) llvm-svn: 352002	2019-01-24 01:00:52 +00:00
David Callahan	d2eeb2516d	Update entry count for cold calls Summary: Profile sample files include the number of times each entry or inlined call site is sampled. This is translated into the entry count metadata on functions. When sample data is being read, if a call site that was inlined in the sample program is considered cold and not inlined, then the entry count of the out-of-line functions does not reflect the current compilation. In this patch, we note call sites where the function was not inlined and as a last action of the sample profile loading, we update the called function's entry count to reflect the calls from these call sites which are not included in the profile file. Reviewers: danielcdh, wmi, Kader, modocache Reviewed By: wmi Subscribers: davidxl, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D52845 llvm-svn: 352001	2019-01-24 00:55:23 +00:00
Douglas Yung	7876c0ecf2	[llvm-symbolizer] Add support for -i and -inlines as aliases for -inlining This change adds two options, -i and -inlines as aliases for the -inlining option to llvm-symbolizer to improve compatibility with the GNU addr2line utility which accepts these options. It also modifies existing tests that use -inlining to exercise these new aliases as well. This fixes PR40073. Reviewed by: jhenderson, Quolyk, ruiu Differential Revision: https://reviews.llvm.org/D57083 llvm-svn: 351999	2019-01-24 00:34:09 +00:00
Amara Emerson	addb7ab2ae	Revert "[mips] Handle MipsMCExpr sub-expression for the MEK_DTPREL tag" This reverts commit r351987 as it broke some bots. llvm-svn: 351998	2019-01-24 00:24:59 +00:00
Mircea Trofin	ec02630278	[llvm] Clarify responsibility of some of DILocation discriminator APIs Summary: Renamed setBaseDiscriminator to cloneWithBaseDiscriminator, to match similar APIs. Also changed its behavior to copy over the other discriminator components, instead of eliding them. Renamed cloneWithDuplicationFactor to cloneByMultiplyingDuplicationFactor, which more closely matches what this API does. Reviewers: dblaikie, wmi Reviewed By: dblaikie Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D56220 llvm-svn: 351996	2019-01-24 00:10:25 +00:00
Reid Kleckner	e80799e6af	[ADT] Notify ilist traits about in-list transfers Summary: Previously no client of ilist traits has needed to know about transfers of nodes within the same list, so as an optimization, ilist doesn't call transferNodesFromList in that case. However, now there are clients that want to use ilist traits to cache instruction ordering information to optimize dominance queries of instructions in the same basic block. This change updates the existing ilist traits users to detect in-list transfers and do nothing in that case. After this change, we can start caching instruction ordering information in LLVM IR data structures. There are two main ways to do that: - by putting an order integer into the Instruction class - by maintaining order integers in a hash table on BasicBlock I plan to implement and measure both, but I wanted to commit this change first to enable other out of tree ilist clients to implement this optimization as well. Reviewers: lattner, hfinkel, chandlerc Subscribers: hiraditya, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D57120 llvm-svn: 351992	2019-01-23 22:59:52 +00:00
Hideki Saito	4e4ecae028	[LV][VPlan] Change to implement VPlan based predication for VPlan-native path Context: Patch Series #2 for outer loop vectorization support in LV using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). Patch series #2 checks that inner loops are still trivially lock-step among all vector elements. Non-loop branches are blindly assumed as divergent. Changes here implement VPlan based predication algorithm to compute predicates for blocks that need predication. Predicates are computed for the VPLoop region in reverse post order. A block's predicate is computed as OR of the masks of all incoming edges. The mask for an incoming edge is computed as AND of predecessor block's predicate and either predecessor's Condition bit or NOT(Condition bit) depending on whether the edge from predecessor block to the current block is true or false edge. Reviewers: fhahn, rengolin, hsaito, dcaballe Reviewed By: fhahn Patch by Satish Guggilla, thanks! Differential Revision: https://reviews.llvm.org/D53349 llvm-svn: 351990	2019-01-23 22:43:12 +00:00
Peter Collingbourne	020ce3f026	hwasan: Read shadow address from ifunc if we don't need a frame record. This saves a cbz+cold call in the interceptor ABI, as well as a realign in both ABIs, trading off a dcache entry against some branch predictor entries and some code size. Unfortunately the functionality is hidden behind a flag because ifunc is known to be broken on static binaries on Android. Differential Revision: https://reviews.llvm.org/D57084 llvm-svn: 351989	2019-01-23 22:39:11 +00:00
Simon Atanasyan	812f1c55b1	[mips] Handle MipsMCExpr sub-expression for the MEK_DTPREL tag This is a fix for a regression introduced by the rL348194 commit. In that change new type (MEK_DTPREL) of MipsMCExpr expression was added, but in some places of the code this type of expression considered as unexpected. This change fixes the bug. The MEK_DTPREL type of expression is used for marking TLS DIEExpr only and contains a regular sub-expression. Where we need to handle the expression, we retrieve the sub-expression and handle it in a common way. llvm-svn: 351987	2019-01-23 22:02:53 +00:00
Reid Kleckner	f9ebacfd29	Revert r351938 "[ARM] Alter the register allocation order for minsize on Thumb2" This change caused fatal backend errors when compiling a file in libvpx for Android. llvm-svn: 351979	2019-01-23 21:10:48 +00:00
Alexey Bataev	897129dc3f	[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target. Enable full support for the debug info. Differential revision: https://reviews.llvm.org/D46189 llvm-svn: 351974	2019-01-23 18:59:54 +00:00
Alexey Bataev	25624e2e5b	Revert "[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target." This reverts commit r351972. Some pieces of the patch were not applied correctly. llvm-svn: 351973	2019-01-23 18:48:36 +00:00
Alexey Bataev	fe0b356063	[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target. Enable full support for the debug info. Recommit to fix the emission of the not required closing brace. Differential revision: https://reviews.llvm.org/D46189 llvm-svn: 351972	2019-01-23 18:28:59 +00:00
Craig Topper	aa0e74c1fc	[X86] Autogenerate complete checks. NFC llvm-svn: 351970	2019-01-23 18:25:49 +00:00
James Henderson	25ce596cd1	[llvm-symbolizer] Improve compatibility of --functions with GNU addr2line This fixes https://bugs.llvm.org/show_bug.cgi?id=40072. GNU addr2line's --functions switch is off by default, has a short alias of -f, and does not take an argument. This patch changes llvm-symbolizer to allow the second and third point (changing the default behaviour may have negative impacts on users). If the option is missing a value, it now treats it as "linkage". This change does cause one previously valid command-line to behave differently. Before --functions <value> was accepted, but now only --functions=<value> is allowed (as well as --functions). The old behaviour will result in the value being treated as a positional argument. The previous testing for --functions=short has been pulled out into a new test that also tests the other accepted values and option formats. Reviewed by: ruiu Differential Revision: https://reviews.llvm.org/D57049 llvm-svn: 351968	2019-01-23 17:27:48 +00:00
Haojian Wu	15a77418a9	Revert "[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target." This reverts commit r351846. This patch may generate illegal assembly code, see ``` $ ./bin/clang -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-grtev4-linux-gnu -S -disable-free -disable-llvm-verifier -discard-value-names -main-file-name new.cc -mrelocation-model pic -pic-level 2 -mthread-model posix -fmerge-all-constants -mdisable-fp-elim -relaxed-aliasing -no-integrated-as -mpie-copy-relocations -munwind-tables -fcuda-is-device -target-feature +ptx60 -target-cpu sm_35 -dwarf-column-info -debug-info-kind=line-directives-only -dwarf-version=2 -debugger-tuning=gdb -o empty.s -x cuda empty.cc $ cat empty.s // // Generated by LLVM NVPTX Back-End // .version 6.0 .target sm_35 .address_size 64 } ``` llvm-svn: 351966	2019-01-23 16:39:57 +00:00
Andrea Di Biagio	d768d35515	[MC][X86] Correctly model additional operand latency caused by transfer delays from the integer to the floating point unit. This patch adds a new ReadAdvance definition named ReadInt2Fpu. ReadInt2Fpu allows x86 scheduling models to accurately describe delays caused by data transfers from the integer unit to the floating point unit. ReadInt2Fpu currently defaults to a delay of zero cycles (i.e. no delay) for all x86 models excluding BtVer2. That means this patch is a functional change for the Jaguar cpu model only. Tablegen definitions for instructions (V)PINSR* have been updated to account for the new ReadInt2Fpu. That read is mapped to the GPR input operand. On Jaguar, int-to-fpu transfers are modeled as a +6cy delay. Before this patch, that extra delay was added to the opcode latency. In practice, the insert opcode only executes for 1cy. Most of the actual latency is contributed by the so-called operand-latency. According to the AMD SOG for family 16h, (V)PINSR* latency is defined by expression f+1, where f is defined as a forwarding delay from the integer unit to the fpu. When printing instruction latency from MCA (see InstructionInfoView.cpp) and LLC (only when flag -print-schedule is specified), we now need to account for any extra forwarding delays. We do this by checking if scheduling classes declare any negative ReadAdvance entries. Quoting a code comment in TargetSchedule.td: "A negative advance effectively increases latency, which may be used for cross-domain stalls". When computing the instruction latency for the purpose of our scheduling tests, we now add any extra delay to the formula. This avoids regressing existing codegen and mca schedule tests. It comes with the cost of an extra (but very simple) hook in MCSchedModel. Differential Revision: https://reviews.llvm.org/D57056 llvm-svn: 351965	2019-01-23 16:35:07 +00:00
James Henderson	21ed868390	[llvm-readelf] Don't suppress static symbol table with --dyn-symbols + --symbols In r287786, a bug was introduced into llvm-readelf where it didn't print the static symbol table if both --symbols and --dyn-symbols were specified, even if there was no dynamic symbol table. This is obviously incorrect. This patch fixes this issue, by delegating the decision of which symbol tables should be printed to the final dumper, rather than trying to decide in the command-line option handling layer. The decision was made to follow the approach taken in this patch because the LLVM style dumper uses a different order to the original GNU style behaviour (and GNU readelf) for ELF output. Other approaches resulted in behaviour changes for other dumpers which felt wrong. In particular, I wanted to avoid changing the order of the output for --symbols --dyn-symbols for LLVM style, keep what is emitted by --symbols unchanged for all dumpers, and avoid having different orders of .dynsym and .symtab dumping for GNU "--symbols" and "--symbols --dyn-symbols". Reviewed by: grimar, rupprecht Differential Revision: https://reviews.llvm.org/D57016 llvm-svn: 351960	2019-01-23 16:15:39 +00:00
Simon Pilgrim	ac5b775522	Fix indentation. NFCI. llvm-svn: 351958	2019-01-23 16:01:19 +00:00
Simon Pilgrim	f87226eb70	[IR] Match intrinsic parameter by scalar/vectorwidth This patch replaces the existing LLVMVectorSameWidth matcher with LLVMScalarOrSameVectorWidth. The matching args must be either scalars or vectors with the same number of elements, but in either case the scalar/element type can differ, specified by LLVMScalarOrSameVectorWidth. I've updated the _overflow intrinsics to demonstrate this - allowing it to return a i1 or <N x i1> overflow result, matching the scalar/vectorwidth of the other (add/sub/mul) result type. The masked load/store/gather/scatter intrinsics have also been updated to use this, although as we specify the reference type to be llvm_anyvector_ty we guarantee the mask will be <N x i1> so no change in behaviour Differential Revision: https://reviews.llvm.org/D57090 llvm-svn: 351957	2019-01-23 16:00:22 +00:00
Krzysztof Parzyszek	036715408a	[Hexagon] Remove incorrect bit negation llvm-svn: 351956	2019-01-23 15:36:33 +00:00
Benjamin Kramer	4ebed81fc4	[AArch64] Fix out of bounds strlen CFIInst is not zero-terminated. This is one of more annoying functional differences between StringRef and ArrayRef. Found by asan. llvm-svn: 351955	2019-01-23 14:51:21 +00:00
Clement Courbet	c7956346da	Re-land rL322538 "Add a value_type to ArrayRef." llvm-svn: 351954	2019-01-23 14:20:59 +00:00
Simon Pilgrim	0e08b6f017	Move saturated arithmetic intrinsics to other integer intrinsics. NFCI. They were in the floating point group. llvm-svn: 351953	2019-01-23 13:49:10 +00:00
George Rimar	617adef933	[llvm-objdump] - Move common code to a new printRelocation() helper. NFC. This extracts the common code for printing relocations into a new helper function. llvm-svn: 351951	2019-01-23 13:39:12 +00:00
Tim Renouf	f64f8efe13	[AMDGPU] With XNACK, cannot clause a load with result coalesced with operand Summary: With XNACK, an smem load whose result is coalesced with an operand (thus it overwrites its own operand) cannot appear in a clause, because some other instruction might XNACK and restart the whole clause. The clause breaker already realized that an smem that overwrites an operand cannot appear in a clause, and broke the clause. The problem that this commit fixes is that the SIFormMemoryClauses optimization formed a bundle with early clobber, which caused the earlier code that set up the coalesced operand to be removed as dead. Differential Revision: https://reviews.llvm.org/D57008 Change-Id: I703c4d5b0bf7d6060222bec491f45c18bb3c0016 llvm-svn: 351950	2019-01-23 13:38:06 +00:00
Martin Storsjo	0d19a399a3	[llvm-objcopy] [COFF] Error out on use of unhandled options Prefer erroring out than silently not doing what was requested. Differential Revision: https://reviews.llvm.org/D57045 llvm-svn: 351948	2019-01-23 11:54:55 +00:00
Martin Storsjo	1be91958b3	[llvm-objcopy] [COFF] Fix handling of aux symbols for big objects The aux symbols were stored in an opaque std::vector<uint8_t>, with contents interpreted according to the rest of the symbol. All aux symbol types but one fit in 18 bytes (sizeof(coff_symbol16)), and if written to a bigobj, two extra padding bytes are written (as sizeof(coff_symbol32) is 20). In the storage agnostic intermediate representation, store the aux symbols as a series of coff_symbol16 sized opaque blobs. (In practice, all such aux symbols only consist of one aux symbol, so this is more flexible than what reality needs.) The special case is the file aux symbols, which are written in potentially more than one aux symbol slot, without any padding, as one single long string. This can't be stored in the same opaque vector of fixed sized aux symbol entries. The file aux symbols will occupy a different number of aux symbol slots depending on the type of output object file. As nothing in the intermediate process needs to have accurate raw symbol indices, updating that is moved into the writer class. Differential Revision: https://reviews.llvm.org/D57009 llvm-svn: 351947	2019-01-23 11:54:51 +00:00
Martin Storsjo	481334056f	[llvm-objcopy] [COFF] Remove testcase debugging lines. NFC. These are no longer necessary as the testcase now seems to run fine on the buildbots that previously failed on this case, after SVN r351934. llvm-svn: 351946	2019-01-23 11:54:36 +00:00
Florian Hahn	68cea130df	[HotColdSplitting] Remove unused SSAUpdater.h include (NFC). llvm-svn: 351945	2019-01-23 11:51:38 +00:00
George Rimar	fd383e7e22	[llvm-objdump] - Move variable. NFC. It was too far from the place where it is used. llvm-svn: 351942	2019-01-23 10:52:38 +00:00
George Rimar	bcbe98bcb9	[llvm-objdump] - Split disassembleObject() into two methods. NFCI. Currently, disassembleObject() is a ~550-line function. This patch splits it into two: the first initializes all the helper objects and calls the second, which does the rest of the work. This is a straightforward split. Differential revision: https://reviews.llvm.org/D57020 llvm-svn: 351940	2019-01-23 10:33:26 +00:00
Jonas Paulsson	6046d087c5	[SystemZ] Fix test case for buildbot. llvm-clang-x86_64-expensive-checks-win triggered this assert: "llvm.dbg.value intrinsic requires a !dbg attachment" Hopefully, adding reasonable !dbg operands solves this. llvm-svn: 351939	2019-01-23 10:29:12 +00:00
David Green	6a858a9425	[ARM] Alter the register allocation order for minsize on Thumb2 Currently in Arm code, we allocate LR first, under the assumption that it needs to be saved anyway. Unfortunately this has the disadvantage that it will require any instructions using it to be the longer thumb2 instructions, not the shorter thumb1 ones. This switches the order when we are optimising for minsize, returning to the default order so that more lower registers can be used. It can end up requiring more pushed registers, but on average produces smaller code. Differential Revision: https://reviews.llvm.org/D56008 llvm-svn: 351938	2019-01-23 10:18:30 +00:00

174335 Commits