llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	95840866b7	[X86] Improve v2i64->v2f32 and v4i64->v4f32 uint_to_fp on avx and avx2 targets. Summary: Based on Simon's D52965, but improved to handle strict fp and improve some of the shuffling. Rather than use v2i1/v4i1 and let type legalization continue, just generate all the code with legal types and use an explicit shuffle. I also added an explicit setcc to the v4i64 code to match the semantics of vselect which doesn't just use the sign bit. I'm also using a v4i64->v4i32 truncate instead of the shuffle in Simon's original code. With the setcc this will become a pack. Future work can look into using X86ISD::BLENDV and a different shuffle that only moves the sign bit. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71956	2020-01-05 17:44:08 -08:00
Liu, Chen3	ca3bf289a7	[NFC] Modify the format: Drop the else since we already returned in the if.	2020-01-06 09:35:19 +08:00
Brian Gesiak	83a9321f60	[Coroutines] Remove corresponding phi values when apply simplifyTerminatorLeadingToRet Summary: In addMustTailToCoroResumes, we set musttail on those resume instructions that are followed by a ret instruction. This is done by simplifyTerminatorLeadingToRet which replace a sequence of branches leading to a ret with a clone of the ret. However it forgets to remove corresponding PHI values that come from basic block of replaced branch, and may cause jumpthreading pass hangs (https://bugs.llvm.org/show_bug.cgi?id=43720) This patch fix this issue Test Plan: cppcoro library with O3+flto check-llvm Reviewers: modocache, GorNishanov, lewissbaker Reviewed By: modocache Subscribers: mehdi_amini, EricWF, hiraditya, dexonsmith, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71826 Patch by junparser (JunMa)!	2020-01-05 18:26:30 -05:00
Stephen Kelly	445f4d2310	Clang-format previous commit	2020-01-05 22:58:32 +00:00
Fangrui Song	5511861e6d	[MC][ARM] Delete MCSection::HasData and move SHF_ARM_PURECODE logic to ARMELFObjectWriter::addTargetSectionFlags This simplifies the generic interface and also makes SHF_ARM_PURECODE more robust (fixes a TODO). Inspecting MCDataFragment contents covers more cases than MCObjectStreamer::EmitBytes.	2020-01-05 14:20:34 -08:00
Stephen Kelly	35efef5351	Add missing test	2020-01-05 21:55:52 +00:00
Kristina Brooks	b18cb9c471	[Gnu toolchain] Look at standard GCC paths for libstdcxx by default Linux' current addLibCxxIncludePaths and addLibStdCxxIncludePaths are actually almost non-Linux-specific at all, and can be reused almost as such for all gcc toolchains. Only keep Android/Freescale/Cray hacks in Linux's version. Patch by sthibaul (Samuel Thibault) Differential Revision: https://reviews.llvm.org/D69758	2020-01-05 21:43:18 +00:00
Fangrui Song	586acd8490	[MC] Delete MCSection::{rbegin,rend}	2020-01-05 12:51:15 -08:00
Stephen Kelly	ad0a45833b	Allow using traverse() with bindings	2020-01-05 20:48:56 +00:00
Stephen Kelly	4711512384	Fix oversight in AST traversal helper	2020-01-05 20:27:37 +00:00
Fangrui Song	124b918bd3	[MC] Merge MCSymbol::getSectionPtr into getSection and simplify	2020-01-05 12:03:40 -08:00
Fangrui Song	c764304adc	[MC] Drop an unused rule about absolute temporary symbols	2020-01-05 11:39:52 -08:00
Simon Pilgrim	e3bd011890	[X86][SSE] Combine combineLogicBlendIntoConditionalNegate for VSELECT nodes (PR43660) Attempt to use combineLogicBlendIntoConditionalNegate for (select M, (sub 0, X), X) -> (sub (xor X, M), M) We limit this to cases that can't easily replace the VSELECT with a shuffle (non-constant masks) or where a BLENDV is likely to occur (which tends to result in slower codegen).	2020-01-05 18:50:44 +00:00
Simon Pilgrim	6a6e6f04ec	[X86] Move combineLogicBlendIntoConditionalNegate before combineSelect. NFCI. Updates function order in preparation of future fix for PR43660	2020-01-05 17:17:41 +00:00
Simon Pilgrim	3db84f142a	[X86] Merge (identical) LowerGC_TRANSITION_START and LowerGC_TRANSITION_END (NFC) Silences a copy+paste analyzer warning - all they are doing are inserting NOOPs in exactly the same way.	2020-01-05 15:24:57 +00:00
David Green	fb8c9a339a	[ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors This adds extra scalar handling to isFMAFasterThanFMulAndFAdd, allowing the target independent code to handle more folds in more situations (for example if the fast math flags are present, but the global AllowFPOpFusion option isn't). It also splits apart the HasSlowFPVMLx into HasSlowFPVFMx, to allow VFMA and VMLA to be controlled separately if needed. Differential Revision: https://reviews.llvm.org/D72139	2020-01-05 11:24:04 +00:00
David Green	c15a56f61a	[ARM] Fill in FP16 FMA patterns This adds fp16 variants of all the fma patterns in the ARM backend. Differential Revision: https://reviews.llvm.org/D72138	2020-01-05 11:24:04 +00:00
David Green	5a25399221	[ARM] Add and update FMA tests. NFC	2020-01-05 11:24:04 +00:00
David Green	170de3de2e	[ParserTest] Move raw string literal out of macro Some combinations of gcc and ccache do not deal well with raw strings in macros. Moving the string out to attempt to fix the bots.	2020-01-05 11:24:04 +00:00
Craig Topper	4e37d60f2a	[LegalizeVectorOps][X86] Enable expansion of vector fp_to_uint in LegalizeVectorOps to avoid scalarization. The code here isn't great in all cases. Particularly v4f64->v4i32 on 64-bit AVX targets. But there is some improvement in some configurations. There's definitely some issues with computeNumSignBits with X86ISD::STRICT_FCMP. As well as not being able to propagate sign bits through merge_values nodes that get created during custom legalization.	2020-01-04 19:18:54 -08:00
Craig Topper	16a67d252c	[TargetLowering] In expandFP_TO_UINT, add proper extend or truncate for the condition to feed the DstVT select. Previously, for vectors we created a vselect with a condition that didn't match what the target wanted according to getSetCCResultType. To make up for this, X86 had a special DAG combine to detect if the condition was all sign bits and then insert its own truncate or extend. By adding the extend/truncate here explicitly we can avoid that.	2020-01-04 18:15:20 -08:00
Craig Topper	285d5e6b8b	[LegalizeVectorOps] Split most of ExpandStrictFPOp into a separate UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT. ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT returned SDValue() if it wasn't able to handle and needed to unroll. Then ExpandStrictFPOp would detect this SDValue() and do the unroll. After this change, ExpandUINT_TO_FLOAT will directly call UnrollStrictFPOp and return the unrolled result.	2020-01-04 17:03:50 -08:00
Fangrui Song	085898d469	[ELF] Drop const qualifier to fix -Wrange-loop-analysis. NFC ``` lld/ELF/Relocations.cpp:1622:56: warning: loop variable 'ts' of type 'const std::pair<ThunkSection *, uint32_t>' (aka 'const pair<lld::elf::ThunkSection *, unsigned int>') creates a copy from type 'const std::pair<ThunkSection *, uint32_t>' [-Wrange-loop-analysis] for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections) ``` Drop const qualifier to fix -Wrange-loop-analysis. We can make -Wrange-loop-analysis warnings (DiagnoseForRangeConstVariableCopies) on `const A` more permissive on more types (e.g. POD -> trivially copyable), unfortunately it will not make std::pair good, because `constexpr pair& operator=(const pair& p);` is unfortunately user-defined. Reviewed By: Mordante Differential Revision: https://reviews.llvm.org/D72211	2020-01-04 12:24:39 -08:00
Matt Arsenault	d12f2a2998	GlobalISel: Scalarize all division operations This only handled G_SDIV, but they all are trivially scalarizable. Also define placeholder AMDGPU division legalizer rules.	2020-01-04 13:47:10 -05:00
Florian Hahn	b8a3c34eee	Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)." This reverts commit `51ef53f3bd`, as it breaks some bots.	2020-01-04 18:44:38 +00:00
Florian Hahn	51ef53f3bd	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-01-04 18:29:35 +00:00
Florian Hahn	99f74a64a2	[SCEV] Remove unused ScalarEvolutionExpander.h includes (NFC).	2020-01-04 18:29:35 +00:00
Matt Arsenault	1f950ced50	GlobalISel: Define G_READCYCLECOUNTER	2020-01-04 13:10:19 -05:00
Matt Arsenault	4e972224c4	AMDGPU/GlobalISel: Refine SMRD selection rules Fix selecting these for volatile global loads, and ensure the loads are constant enough.	2020-01-04 12:40:35 -05:00
Matt Arsenault	d9b5063b25	AMDGPU/GlobalISel: Legalize more odd sized loads The attempts to widen sufficiently aligned, odd sized loads wasn't consistently applied.	2020-01-04 12:38:39 -05:00
Matt Arsenault	5fb59f16e2	AMDGPU/GlobalISel: Assume vcc phis for any vcc input This produces more intelligible looking results, more comparable to the DAG output in the simplest cases. This is probably wrong in complex control flow, but RegBankSelect doesn't attempt analyzing if this is on a masked path for selecting the bank yet.	2020-01-04 12:38:11 -05:00
Florian Hahn	db82fc5dd8	[Pass Registration] XFAIL load_extension.ll test on macOS. This test fails on macOS, causing the following bots to fail http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/7438/ http://green.lab.llvm.org/green/job/clang-stage1-RA/5034/ Error: Error opening 'build/./lib/libBye.dylib': dlopen(build/./lib/libBye.dylib, 9): image not found -load request ignored.	2020-01-04 17:37:08 +00:00
Matt Arsenault	5eed4e2664	AMDGPU/GlobalISel: Implement applyMappingImpl less incorrectly We're checking the current register bank of the registers in the instruction, but the mapping may have inserted cross bank copies and is expecting to replace the registers. We mostly get away with this currently, because VGPR->SGPR copies are illegal, and we assume this won't happen. In a future change, we'll start relying on more cross register bank copies being inserted, and this starts to break down.	2020-01-04 12:36:05 -05:00
Florian Hahn	4c6c4e2fce	[cmake] Remove install from add_llvm_example_library. This should fix http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/30086	2020-01-04 17:12:24 +00:00
Florian Hahn	0bb22b91ea	Re-apply "[Examples] Add IRTransformations directory to examples." This reverts commit `19fd8925a4`. Should include a fix for PR44197.	2020-01-04 15:47:23 +00:00
Kazuaki Ishizaki	b7ecf1c1c3	NFC: Fix trivial typos in comments	2020-01-04 10:28:41 -05:00
alex-t	ca8b20ca3b	[AMDGPU] need to insert wait between the scalar load and vector store to the same address to avoid WAR conflict. Reviewers: rampitec, vpykhtin, nhaehnle Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D71934	2020-01-04 18:23:14 +03:00
Roman Lebedev	6d05bc2e3a	[NFCI][InstCombine] Refactor 'sink negation into select if that folds one hand of select to 0' fold I would think it's better than having two practically identical folds next to each other, but then generalization isn't all that pretty due to the fact that we need to produce different `sub` each time.. This change is no-functional-changes-intended refactoring.	2020-01-04 17:30:51 +03:00
Roman Lebedev	772ede3d5d	[InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 (PR44426) This decreases use count of %Op0, makes one hand of select to be 0, and possibly exposes further folding potential. Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal) %Op0 = %TrueVal %o = select i1 %Cond, i8 %Op0, i8 %FalseVal %r = sub i8 %Op0, %o => %n = sub i8 %Op0, %FalseVal %r = select i1 %Cond, i8 0, i8 %n Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0 %Op0 = %FalseVal %o = select i1 %Cond, i8 %TrueVal, i8 %Op0 %r = sub i8 %Op0, %o => %n = sub i8 %Op0, %TrueVal %r = select i1 %Cond, i8 %n, i8 0 https://rise4fun.com/Alive/aHRt https://bugs.llvm.org/show_bug.cgi?id=44426	2020-01-04 17:30:51 +03:00
Roman Lebedev	d2b79c76be	[NFC][InstCombine] 'subtract from one hands of select' pattern tests (PR44426) https://bugs.llvm.org/show_bug.cgi?id=44426	2020-01-04 17:30:51 +03:00
Roman Lebedev	4d8e47ca18	[InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426) This decreases use count of %Op1, makes one hand of select to be 0, and possibly exposes further folding potential. Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1) %Op1 = %TrueVal %o = select i1 %Cond, i8 %Op1, i8 %FalseVal %r = sub i8 %o, %Op1 => %n = sub i8 %FalseVal, %Op1 %r = select i1 %Cond, i8 0, i8 %n Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0 %Op1 = %FalseVal %o = select i1 %Cond, i8 %TrueVal, i8 %Op1 %r = sub i8 %o, %Op1 => %n = sub i8 %TrueVal, %Op1 %r = select i1 %Cond, i8 %n, i8 0 https://rise4fun.com/Alive/avL https://bugs.llvm.org/show_bug.cgi?id=44426	2020-01-04 17:30:51 +03:00
Roman Lebedev	83aa0b6734	[NFC][InstCombine] 'subtract of one hands of select' pattern tests (PR44426) https://bugs.llvm.org/show_bug.cgi?id=44426	2020-01-04 17:30:51 +03:00
Alexey Lapshin	831bfcea47	[Transforms][GlobalSRA] huge array causes long compilation time and huge memory usage. Summary: For artificial cases (huge array, few usages), Global SRA optimization creates a lot of redundant data. It creates an instance of GlobalVariable for each array element. For huge array, that means huge compilation time and huge memory usage. Following example compiles for 10 minutes and requires 40GB of memory. namespace { char LargeBuffer[64 * 1024 * 1024]; } int main ( void ) { LargeBuffer[0] = 0; printf("\n "); return LargeBuffer[0] == 0; } The fix is to avoid Global SRA for large arrays. Reviewers: craig.topper, rnk, efriedma, fhahn Reviewed By: rnk Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71993	2020-01-04 16:42:38 +03:00
Simon Pilgrim	eb0e1978df	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT (REAPPLIED) This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0. Differential Revision: https://reviews.llvm.org/D65887	2020-01-04 13:15:50 +00:00
Martin Storsjö	1737cc750c	[LLD] [COFF] Don't error out on duplicate absolute symbols with the same value Both MS link.exe and GNU ld.bfd handle it this way; one can have multiple object files defining the same absolute symbols, as long as it defines it to the same value. But if there are multiple absolute symbols with differing values, it is treated as an error. Differential Revision: https://reviews.llvm.org/D71981	2020-01-04 12:29:33 +02:00
Craig Topper	2306f43ccb	[X86] Update MaxIndex test in x86-cmov-converter.ll to return the index and not use the index to look up the array after the loop. This represents a more realistic version of the code being tested. The cmov converter doesn't look at the code after the loop so it doesn't matter for what's being tested. But as noted in this twitter thread https://twitter.com/trav_downs/status/1213311159413161987 gcc can turn the previous MaxIndex code into the MaxValue code. So returning the index makes it a distinct case.	2020-01-03 23:59:54 -08:00
Kelvin Li	ed5fe64581	[OpenMP] NFC: Fix trivial typos in comments Submitted by: kiszk Differential Revision: https://reviews.llvm.org/D72171	2020-01-03 22:03:42 -05:00
LLVM GN Syncbot	0f1e7993e9	[gn build] Port `5d304d68dd`	2020-01-04 02:17:36 +00:00
Daniel Sanders	5d304d68dd	Revert "[gicombiner] Add GIMatchTree and use it for the code generation" All the windows bots are failing match-tree.td and there's no obvious cause that I can see. It's not just the %p formatting problem. My best guess is that there's an ordering issue too but I'll need further information to figure that out. Revert while I'm investigating. This reverts commit `64f1bb5cd2` and `77d4b5f5fe`	2020-01-03 18:17:00 -08:00
Med Ismail Bennani	df71f92fbb	[lldb/Command] Add --force option for `watchpoint delete` command Currently, there is no option to delete all the watchpoint without LLDB asking for a confirmation. Besides making the watchpoint delete command homogeneous with the breakpoint delete command, this option could also become handy to trigger automated watchpoint deletion i.e. using breakpoint actions. rdar://42560586 Differential Revision: https://reviews.llvm.org/D72096 Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2020-01-04 03:11:15 +01:00


338581 Commits
