This is one of those wonderful "in theory X doesn't matter, but in practice it does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp the iteration count of the loops* from decrementing to incrementing. (A C-level sketch of the two IV shapes follows the notes below.)
Why does this matter? A couple of reasons:
* SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B, and any flags invalidated by that rewrite are dropped. As a result, SCEV is slightly worse at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.)
* Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source code. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this by looking at nearby phis in the test cases.)
Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.
* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simply use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that would be a larger and more invasive change.
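For concreteness, here is a C-level sketch of the two clamp-IV shapes (illustrative only; the unroller emits IR, and the names here are invented):
```
// Remainder-loop clamp for an unroll factor of 4 (illustrative C++, not the
// unroller's actual output).

// Decrementing form: SCEV models the rem - 1 step as rem + (-1) * 1 and may
// drop flags, so edge-case reasoning is weaker.
void prologDecrementing(unsigned count, void (*body)()) {
  for (unsigned rem = count % 4; rem != 0; --rem)
    body();
}

// Incrementing form: matches the IVs the rest of the optimizer produces, so
// nearby phis coalesce sooner.
void prologIncrementing(unsigned count, void (*body)()) {
  for (unsigned i = 0; i != count % 4; ++i)
    body();
}
```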
The division by constant optimization often produces constants that
are uimm32, but not simm32. These constants require 3 or 4 instructions
to materialize without Zba.
Since these constants are often used by a multiply whose LHS needs to be
zero extended with an AND, we can switch the MUL to a MULHU by shifting
both inputs left by 32. Once we shift the constant left, its upper 32 bits
no longer need to be zero, so constant materialization is free to use
LUI+ADDIW. This reduces the constant materialization from 4 instructions
to 3 in some cases, while also reducing the zero extension of the LHS from
2 shifts to 1.
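A minimal sketch of the arithmetic identity involved, assuming GCC/Clang's `__int128` extension (the constant below is an illustrative divide-by-5 magic number, not taken from the patch):
```
#include <cassert>
#include <cstdint>

// For 32-bit x and a uimm32 constant c, the high 64 bits of the 128-bit
// product (x << 32) * (c << 32) equal x * c, so a zext+MUL sequence can
// become SLLI+MULHU, and the shifted constant is cheaper to materialize.
static uint64_t mulhu(uint64_t a, uint64_t b) {
  return static_cast<uint64_t>((static_cast<unsigned __int128>(a) * b) >> 64);
}

int main() {
  uint64_t x = 0x12345678;  // value whose upper 32 bits would need clearing
  uint64_t c = 0xCCCCCCCD;  // uimm32 but not simm32 (divide-by-5 magic)
  assert(mulhu(x << 32, c << 32) == x * c);
}
```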
Differential Revision: https://reviews.llvm.org/D113805
Fix some confusion between the two types of `const` a pointer/iterator
can have. Users of a SwitchInst::CaseIterator should not (and do not!)
manually mutate the SwitchInst::CaseHandle that tracks its internal
state. Change operator*() to return `const CaseHandle&`, remove the
non-const-qualified operator*(), and const-qualify
CaseHandle::setValue() and CaseHandle::setSuccessor().
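A minimal analog of the resulting shape, using hypothetical stand-in types (not LLVM's actual classes):
```
#include <cassert>

struct Handle { int Value = 0; }; // stand-in for SwitchInst::CaseHandle

struct CaseIt {                   // stand-in for SwitchInst::CaseIterator
  Handle H;
  // Single const-qualified overload returning reference-to-const: callers
  // can no longer mutate the handle that tracks the iterator's state.
  const Handle &operator*() const { return H; }
};

int main() {
  CaseIt It;
  assert((*It).Value == 0);
  // (*It).Value = 1; // now ill-formed, as intended
}
```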
Differential Revision: https://reviews.llvm.org/D113788
Take advantage of class name injection to avoid redundantly specifying
template parameters of iterator adaptor/facade base classes.
No functionality change, although the private typedefs changed in a
couple of cases.
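A minimal sketch of the injected-class-name trick, with a hypothetical stand-in for the real base class template:
```
#include <type_traits>

template <typename DerivedT, typename WrappedT> struct adaptor_base {};

struct my_iterator : adaptor_base<my_iterator, int *> {
  // The base's injected-class-name is visible in the derived class, so the
  // base can be named without repeating its template parameters:
  using BaseT = adaptor_base;
};

static_assert(std::is_same_v<my_iterator::BaseT,
                             adaptor_base<my_iterator, int *>>);
```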
- Added a private typedef HashTableIterator::BaseT, following the
  pattern from r207084 / 3478d4b164, to pre-emptively appease MSVC
  (maybe it's not necessary anymore, but it looks like we do this
  pretty consistently).
- Otherwise, removed the private typedefs: filter_iterator_impl::BaseT
  and FilterIteratorTest::InputIterator::BaseT each had only one use,
  and the definition was no longer interesting.
* The format_arg attribute tells the compiler that the attributed function
returns a format string that is compatible with a format string being
passed as a specific argument. (A plain-C/C++ sketch follows this list.)
* Several NSString methods return copies of their input, so they would ideally
have the format_arg attribute. A previous differential (D112670) added
support for instancetype methods having the format_arg attribute when used
in the context of NSString method declarations.
* D112670 failed to account for the fact that instancetype can be sugared
in certain narrow (but critical) scenarios, e.g. by nullability specifiers.
This patch resolves that problem.
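For reference, here is what format_arg means in plain C/C++ (a sketch of the attribute's general behavior; the patch itself concerns the Objective-C instancetype case):
```
#include <cstdio>

// format_arg(1): the returned string is a format string derived from
// argument 1, so -Wformat diagnostics flow through the call below.
__attribute__((format_arg(1)))
static const char *translate(const char *Fmt) { return Fmt; }

int main() {
  int N = 42;
  std::printf(translate("%d\n"), N); // "%d" is still checked against N
}
```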
Differential Revision: https://reviews.llvm.org/D113636
Reviewed By: ahatanak
Radar-Id: rdar://85278860
We missed these tests in the earlier XFAIL-ing because the locale.fr_FR.UTF-8
feature wasn't available, but since an upgrade they are now showing up
on the CI.
Differential Revision: https://reviews.llvm.org/D113791
Fortran defines an ENTRY point name as being pure if its enclosing
subprogram scope defines a pure procedure.
Differential Revision: https://reviews.llvm.org/D113711
Similar to D113702, but for the LSDAs. Clang seems to emit all LSDA
relocs as section relocs, but ld -r can turn those relocs into symbol
ones.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D113721
Dedup'ing unwind info is tricky because each CUE (compact unwind entry)
contains a different function address: if ICF operated naively and compared
the entire contents of each CUE, entries with identical unwind info but
belonging to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on `relocateCompactUnwind()` to correctly handle
these truncated input sections.
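A hypothetical sketch of the slicing idea (plain C++, not the linker's actual code):
```
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compare two CUEs while ignoring the leading function-address field, so
// entries with identical unwind info for different functions can be folded.
// Assumes both entries are at least Skip bytes long.
bool sameUnwindInfo(const std::vector<uint8_t> &A,
                    const std::vector<uint8_t> &B) {
  const std::size_t Skip = sizeof(uint64_t); // assumed address-field size
  return std::equal(A.begin() + Skip, A.end(), B.begin() + Skip, B.end());
}
```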
Here are the numbers before and after D109944, D109945, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:
Without any optimizations:
base diff difference (95% CI)
sys_time 0.849 ± 0.015 0.896 ± 0.012 [ +4.8% .. +6.2%]
user_time 3.357 ± 0.030 3.512 ± 0.023 [ +4.3% .. +5.0%]
wall_time 3.944 ± 0.039 4.032 ± 0.031 [ +1.8% .. +2.6%]
samples 40 38
With `-dead_strip`:
base diff difference (95% CI)
sys_time 0.847 ± 0.010 0.896 ± 0.012 [ +5.2% .. +6.5%]
user_time 3.377 ± 0.014 3.532 ± 0.015 [ +4.4% .. +4.8%]
wall_time 3.962 ± 0.024 4.060 ± 0.030 [ +2.1% .. +2.8%]
samples 47 30
With `-dead_strip` and `--icf=all`:
base diff difference (95% CI)
sys_time 0.935 ± 0.013 0.957 ± 0.018 [ +1.5% .. +3.2%]
user_time 3.472 ± 0.022 6.531 ± 0.046 [ +87.6% .. +88.7%]
wall_time 4.080 ± 0.040 5.329 ± 0.060 [ +30.0% .. +31.2%]
samples 37 30
Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.
Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for a CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D109946
These patterns were noted in the recent D113212 and follow-ups.
I did not bother to duplicate every test because it should be
clear if we recognize the swaps from a smaller sample. We have
complete coverage for the original patterns.
Previously we set the `isFuncEntry` flag to true when the funcName from DWARF was equal to the name in the symbol table, and we used this flag to avoid reporting callsite samples that come from intra-function branches. However, in HHVM, the symbol table name appears to be inconsistent with the DWARF function name, likely due to `OptimizeGlobalAliases`.
This change is a workaround on the llvm-profgen side to mark only one range as the function entry and to warn about the remaining inconsistencies.
It also fixes a missing `getCanonicalFnName` for the symbol name, which caused some of the mismatches.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D113492
Instead of pretending that function pointer type aliases or variables
are functions, and thereby losing the information that they are type
aliases or variables, respectively, we use the existence of a return
type in the DeclInfo to signify a "function-like" object.
That seems pretty natural, since it's also the return type (or parameter
list) from the DeclInfo that we compare the documentation with.
Addresses a concern voiced in D111264#3115104.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D113691
Then we don't have to look into the declaration again. Also it's only
natural to collect this information alongside parameters and return
type, as it's also just a parameter in some sense.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D113690
We noticed that the library structure causes link ordering problems in Google's internal build. However, we don't think the problem is specific to Google's build; it can probably be reproduced anywhere with the right library structure.
In general, splitting the Python bindings from their dependencies (the C API targets) creates the possibility that the two libraries end up in the wrong order on the linker command line. We can avoid this problem by reverting the structure of MLIRBindingsPythonCore to represent its dependencies in the usual way, rather than composing an incomplete `MLIRBindingsPythonCoreNoCAPI` target with its CAPI dependencies. It was probably a mistake to rewrite this particular `cc_library()` rule in terms of the two, since nothing guarantees that they will be correctly ordered by the linker when both are linked into the same binary; it was only an incidental "cleanup" done in passing.
Otherwise the previous PR (D113565) is fine, since that was about the case where the two are built into separate shared libraries. It just shouldn't have made this (unrelated) change.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D113773
Clang seems to emit all functionAddress relocs as section relocs, but
`ld -r` can turn those relocs into symbol ones. It turns out that we
weren't handling that case correctly when the symbol was a weak def
whose definition did not prevail.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D113702
This op was part of the very first discussion here, but wasn't upstreamed
before because we didn't use it yet. Fix that for pending updates. This
patch just adds the op; a follow-up will add the lowering to codegen.
`const`-qualify ParsedAttr::iterator::operator*(), clearing up confusion
about the two meanings of const for pointers/iterators. Helps unblock
removal of (non-const) iterator_facade_base::operator->().
The unrolling code previously inserted new cloned blocks at the end of the function. With typical loop structures, the result is that the new iterations are placed far from the initial iteration.
With unrolling, the general assumption is that a) the loop is reasonably hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting. As such, placing Count-1 copies out of line is a fairly poor code placement choice; we'd much rather fall through into the hot (non-exiting) path. For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.
However, the real motivation for this change isn't performance; it's readability and human understanding. Having to jump around long distances in an IR file to trace an unrolled loop structure is error prone and tedious.
Profiling a basic internal real input read benchmark shows some
hot spots in the code used to prepare input for decimal-to-binary
conversion, which is of course where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but not the code in the Fortran runtime that calls it,
and there are some obvious light changes worth making here.
Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.
Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes -- so that input routines can acquire more than one
input character at a time and amortize the overhead.
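A hedged sketch of the batched-input pattern; the names and signatures here are illustrative, not the runtime's exact API:
```
#include <cstddef>

struct FakeInputState { // illustrative stand-in for the I/O state
  const char *Buf;
  std::size_t Avail;
  // Return a pointer to the next run of input bytes and how many are there,
  // so callers can scan several characters per call instead of one.
  std::size_t GetNextInputBytes(const char *&P) {
    P = Buf;
    return Avail;
  }
};

static std::size_t CountLeadingDigits(FakeInputState &IO) {
  const char *P = nullptr;
  std::size_t N = IO.GetNextInputBytes(P);
  std::size_t Digits = 0;
  while (N > 0 && *P >= '0' && *P <= '9') {
    ++P; --N; ++Digits;
  }
  return Digits;
}

int main() {
  FakeInputState IO{"123.45", 6};
  return CountLeadingDigits(IO) == 3 ? 0 : 1;
}
```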
These changes speed up the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which is on par with Intel Fortran and much faster than
GNU Fortran.
Differential Revision: https://reviews.llvm.org/D113697
Previously if you passed `-oso_prefix path/to/foo/` with a trailing
slash at the end, using `real_path` would remove that slash, but that
slash is necessary to make sure OSO prefix paths end up as valid
relative paths instead of starting with `/`.
Differential Revision: https://reviews.llvm.org/D113541
This fixes const-correctness of iterator adaptors, dropping non-`const`
overloads for `operator*()`.
Iterators, like the pointers that they generalize, have two types of
`const`.
The `const` qualifier on members indicates whether the iterator itself
can be changed. This is analogous to `int *const`.
The `const` qualifier on return values of `operator*()`, `operator[]()`,
and `operator->()` controls whether the pointed-to value can be
changed. This is analogous to `const int *`.
Since `operator*()` does not (in principle) change the iterator, there
should be only one definition, which is `const`-qualified. E.g.,
iterators wrapping `int*` should look like:
```
int *operator*() const; // always const-qualified, no overloads
```
ba7a6b314f changed `iterator_adaptor_base`
away from this to work around bugs in other iterator adaptors. That was
already reverted. This patch adds back its test, which combined
llvm::enumerate() and llvm::make_filter_range(), adds a test for
iterator_adaptor_base itself, and cleans up the `const`-ness of the
other iterator adaptors.
This also updates the documented requirements for
`iterator_facade_base`:
```
/// OLD:
/// - const T &operator*() const;
/// - T &operator*();
/// NEW:
/// - T &operator*() const;
```
In a future commit we might also clean up `iterator_facade_base`'s
overloads of `operator->()` and `operator[]()`. These already (correctly)
return
non-`const` proxies regardless of the iterator's `const` qualifier.
Differential Revision: https://reviews.llvm.org/D113158
This case was complicated because someone had added new non-autogened tests to an autogened file. In particular, those new tests used two variables (%J and %j) which differed only in capitalization. The auto-updater doesn't distinguish case, so the auto-gened versions of the new tests failed with non-obvious errors.
There are two key lessons here:
1) Please don't use two values whose names differ only in case. This is problematic for automatic tooling, and it's also hard for a human to follow.
2) Please DO NOT add new tests to an autogened file without running autogen again. If autogen doesn't pass on your new tests, put them in a separate file.
When an Fw.d output edit descriptor has a "d" value exactly
equal to the number of zeroes after the decimal point for a value
(e.g., 0.07 with F5.1), the Fw.d output editing code needs to
do the rounding itself to either 0.0 or 0.1 after performing
a conversion without rounding (to avoid 0.04999 rounding up twice).
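A quick illustration of the double-rounding hazard, using the host printf rather than the Fortran runtime:
```
#include <cstdio>

int main() {
  // Rounding once to one digit: 0.04999 -> 0.0 (correct).
  std::printf("%.1f\n", 0.04999);
  // Rounding in two steps: 0.04999 -> 0.05 first, then 0.05 -> 0.1 (wrong).
  std::printf("%.1f\n", 0.05);
}
```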
Differential Revision: https://reviews.llvm.org/D113698
We don't use Python 2 anymore, so let us do the recommended fix instead
of using the workaround made for Python 2.
Differential Revision: https://reviews.llvm.org/D107715
When the environment variable NO_STOP_MESSAGE=1 is set,
assume that STOP statements with a successful stop code
have QUIET=.TRUE.
Differential Revision: https://reviews.llvm.org/D113701
The check was failing because it was matching against the end of the range, not
the start.
This bug wasn't causing the ORC-RT MachO TLV regression test to fail because
we were only logging deallocation errors (including TLV deregistration errors)
and not actually returning a failure code. This commit updates llvm-jitlink to
report the errors properly.
We need to skip the length field when generating error strings.
No test case: This hand-hacked deserializer should be removed in the near future
once JITLink can use generic ORC APIs (including SPS and WrapperFunction).
The ORDER= argument to the transformational intrinsic function RESHAPE
was being misinterpreted in an inverted way that could be detected only
with arrays of rank 3 or higher. Fix this in both folding and the runtime,
and extend the tests.
Differential Revision: https://reviews.llvm.org/D113699
[NFC] As part of using inclusive language within the llvm project, this patch
replaces `m_master` in `ASTImporterDelegate` with `m_main`.
Reviewed By: teemperor, clayborg
Differential Revision: https://reviews.llvm.org/D113720
This brings back the original version of D81359.
I have found several use cases now.
* Unlike GNU ld, LLD's relocation processing is one pass. If we decide to
optimize (relax) R_X86_64_{,REX_}GOTPCRELX, we suppress GOT generation and
cannot undo the decision later. Optimizing R_X86_64_REX_GOTPCRELX can make
it easy to hit `relocation R_X86_64_REX_GOTPCRELX out of range` because
the distance to the GOT is usually shorter. Without --no-relax, the user
has to recompile with `-Wa,-mrelax-relocations=no`.
* The option would help during my investigation of the root cause of https://git.kernel.org/linus/09e43968db40c33a73e9ddbfd937f46d5c334924
* There is a need for relaxation control for AArch64 & RISC-V as well.
Implementing this for x86-64 improves consistency at little
target-specific cost (a two-line X86_64.cpp change).
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D113615
[NFC] This patch replaces master and slave with primary and secondary
respectively when referring to pseudoterminals/file descriptors.
Reviewed By: clayborg, teemperor
Differential Revision: https://reviews.llvm.org/D113687
Add support for v32i8/v64i8 converting shift-by-constant to multiply-by-constant. This helps us avoid the generic vXi8 shift lowering, and a lot of VPBLENDVB ops which can be particularly slow.
We also needed to reorder a few shift lowering patterns to prevent regressions, particularly for XOP+AVX2 (Excavator) targets (which can split to fast v16i8 shifts) and AVX512-BWI targets (which prefer to extend to fast v32i16 shifts).
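A scalar sketch of the per-lane identity the new lowering exploits, exhaustively checked over all i8 values and shift amounts:
```
#include <cassert>
#include <cstdint>

int main() {
  // For each i8 lane, a left shift by constant c equals a multiply by 1 << c.
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C = 0; C < 8; ++C)
      assert(uint8_t(X << C) == uint8_t(X * (1u << C)));
}
```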
Small patch that renames blacklisted_global to blocked_global, with a corresponding change in comments.
Reviewed By: pgousseau
Differential Revision: https://reviews.llvm.org/D113692