llvm-project

Commit Graph

Author	SHA1	Message	Date
Lang Hames	2e6c92c540	[examples] Fix LLJITWithRemoteDebugging example after `f341161689`.	2021-10-10 20:25:44 -07:00
Esme-Yi	a00ff71668	[XCOFF] Improve error message context. Summary: This patch improves the error message context of the XCOFF interfaces by providing more details. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D110320	2021-10-11 02:52:20 +00:00
Qiu Chaofan	2fc0d439a4	[Clang] [PowerPC] Fix header include typo in smmintrin.h The SSE4 header (smmintrin.h) should include SSSE3 (tmmintrin.h) instead of SSE2 (emmintrin.h). Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D111482	2021-10-11 10:44:08 +08:00
Lang Hames	771e69484a	[ORC] Add dependence on pthreads library to ORC. `f341161689` introduced a dependence (for builds with LLVM_ENABLE_THREADS) on pthreads. This commit updates the CMakeLists.txt file to include a LINK_LIBS entry for pthreads.	2021-10-10 19:34:34 -07:00
LLVM GN Syncbot	816e9d81e2	[gn build] Port `f341161689`	2021-10-11 02:15:38 +00:00
LLVM GN Syncbot	98c9b3362f	[gn build] Port `3df094d31e`	2021-10-11 02:15:37 +00:00
Lang Hames	1b410e0777	[ORC] Add missing headers. These were accidentally left out of `f341161689`.	2021-10-10 19:11:46 -07:00
Arthur O'Dwyer	3df094d31e	[libc++] [P1614] Implement std::compare_three_way. Differential Revision: https://reviews.llvm.org/D110735	2021-10-10 21:57:10 -04:00
Lang Hames	f341161689	[ORC] Add TaskDispatch API and thread it through ExecutorProcessControl. ExecutorProcessControl objects will now have a TaskDispatcher member which should be used to dispatch work (in particular, handling incoming packets in the implementation of remote EPC implementations like SimpleRemoteEPC). The GenericNamedTask template can be used to wrap function objects that are callable as 'void()' (along with an optional name to describe the task). The makeGenericNamedTask functions can be used to create GenericNamedTask instances without having to name the function object type. In a future patch ExecutionSession will be updated to use the ExecutorProcessControl's dispatcher, instead of its DispatchTaskFunction.	2021-10-10 18:39:55 -07:00
Arthur Eubanks	77bc3ba365	[NFC][llvm-reduce] Cleanup types Use Module& wherever possible. Since every reduction immediately turns Chunks into an Oracle, directly pass Oracle instead. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D111122	2021-10-10 18:07:28 -07:00
Amara Emerson	f1e9ecea44	[AArch64][GlobalISel] Legalize G_VECREDUCE_XOR. Treated same as other bitwise reductions.	2021-10-10 17:01:21 -07:00
Wenlei He	9978e0e475	[llvm-profdata] Allow overlap/similarity comparison to use custom hot threshold cutoff Allow overlap/similarity comparison to use custom hot threshold cutoff, instead of using hard coded 990000 as hot cutoff. Differential Revision: https://reviews.llvm.org/D111385	2021-10-10 13:30:18 -07:00
Wenlei He	da4e5fc861	[llvm-profgen] Deduplicate PID when processing perf input When parsing mmap to retrieve PID, deduplicate them before passing PID list to perf script. Perf script would error out when there's duplicated PID in the input, however raw perf data may contain duplicated PID for large binary where more than one mmap is needed to load executable segment. Differential Revision: https://reviews.llvm.org/D111384	2021-10-10 13:30:17 -07:00
Sylvestre Ledru	b07ea8a967	clang release notes: improve the wording	2021-10-10 22:26:11 +02:00
Lang Hames	da7f993a8d	[ORC] Reorder callWrapperAsync and callSPSWrapperAsync parameters. The callee address is now the first parameter and the 'SendResult' function the second. This change improves consistency with the non-async functions where the callee is the first address and the return value the second.	2021-10-10 13:10:43 -07:00
Lang Hames	a42d5c34d0	Revert "Add missing include after dfd74db9" This reverts commit `dd384d2814`. `dfd74db9` was reverted in `8fe3d9df0e`, so this is no longer needed.	2021-10-10 13:01:08 -07:00
Dawid Jurczak	9e65929a8e	[DSE] Re-enable calloc transformation with extra care (PR25892) Transformation from malloc+memset to calloc is always correct and in many situations it brings significant observable benefits in terms of execution speed and memory consumption [1][2]. Unfortunately there are cases when producing calloc causes performance drops [3]. As discussed here: https://reviews.llvm.org/D103009 it's possible to differentiate between those 2 scenarios. If optimizer is able to prove that after malloc call it's _very_ likely to reach memset branch then after calloc emission we shouldn't observe any performance hits. Therefore finding "null pointer check" pattern before memset basic block sounds like good justification for performing transformation. Also that method was already suggested by GCC folks [4]. Main reason for change is that for now to be safe we check for post dominance relation which is way too conservative approach making transformation "almost" disabled in practice. This patch tends to enable transformation again but with extra care. [1] https://stackoverflow.com/questions/2688466/why-mallocmemset-is-slower-than-calloc [2] https://vorpus.org/blog/why-does-calloc-exist/ [3] http://smalldatum.blogspot.com/2017/11/a-new-optimization-in-gcc-5x-and-mysql.html [4] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83022 Differential Revision: https://reviews.llvm.org/D110021	2021-10-10 21:47:14 +02:00
Sylvestre Ledru	9c8f950a04	clang release notes: document the -Wbool-operation improvement Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D111215	2021-10-10 21:28:40 +02:00
Nico Weber	62abc1842b	clang: Add range-based CFG::try_blocks() ..and use it. No behavior change.	2021-10-10 15:15:37 -04:00
Nico Weber	23d5fe6235	clang: Convert two loops to for-each And rewrap a line at 80 columns while here. No behavior change.	2021-10-10 14:55:46 -04:00
Joe Loser	65d62e52a7	[libc++][test] Replace a TEST_NOEXCEPT_FALSE with noexcept(false). NFC. Replace `TEST_NOEXCEPT_FALSE` directly with `noexcept(false)` in optional hash test which is only run in C++17 or later. `TEST_NOEXCEPT_FALSE` is only useful in C++03 context where `noexcept` isn't supported by clang. `TEST_NOEXCEPT_FALSE` now only has one remaining use in `hash_unique_ptr.pass.cpp`.	2021-10-10 14:46:35 -04:00
Joe Loser	e53c9251fa	[libc++] Remove empty namespace std in type_traits. NFCI. There is an empty `namespace std` in `type_traits` which was originally used when `std::byte` was added in `c97d8aa866`. At some point, the bitwise operators on `std::byte` got relocated but this empty namespace was left around. Remove it. Reviewed By: Quuxplusone, Mordante, #libc Differential Revision: https://reviews.llvm.org/D111512	2021-10-10 14:35:05 -04:00
Jean Perier	6eb7634f30	[fir] Add character conversion pass Upstream the character conversion pass. Translates entities of one CHARACTER KIND to another. By default the translation is to naively zero-extend or truncate a code point to fit the destination size. This patch is part of the upstreaming effort from fir-dev branch. Co-authored-by: Eric Schweitz <eschweitz@nvidia.com> Co-authored-by: Valentin Clement <clementval@gmail.com> Reviewed By: schweitz Differential Revision: https://reviews.llvm.org/D111405	2021-10-10 20:20:09 +02:00
Joe Loser	67964fc4b2	[libc++][NFC] Replace tab with whitespace in comment There is a stray tab character in a comment block. Replace the tab character with a space for consistency with other comments.	2021-10-10 12:53:35 -04:00
Kazu Hirata	0e9373a6a6	[Basic] Use llvm::is_contained (NFC)	2021-10-10 08:52:14 -07:00
Sanjay Patel	05281d95f2	[InstCombine] move fold for "(X-Y) == 0"; NFC This consolidates related folds that all have a similar use restriction that may not be necessary.	2021-10-10 11:26:03 -04:00
Sanjay Patel	cbd8041b0b	[InstCombine] add tests for (X - Y) == 0; NFC	2021-10-10 11:13:46 -04:00
Sanjay Patel	da210f5d34	[InstCombine] canonicalize "(C2 - Y) > C" as (Y + ~C2) < ~C The test diffs show that we have better analysis/folds for 'add' (although we should at least have the simplifications independently, so we don't have the one-use restriction). This is related to solving regressions that would appear in transforms related to D111410, and that is part of a series of enhancements that may eventually help solve PR34047. https://alive2.llvm.org/ce/z/3tB9KG define i1 @src(i8 %x, i8 %C, i8 %C2) { %sub = sub nuw i8 %C2, %x %r = icmp slt i8 %sub, %C ret i1 %r } define i1 @tgt(i8 %x, i8 %C, i8 %C2) { %Cnot = xor i8 %C, -1 %C2not = xor i8 %C2, -1 %add = add nuw i8 %x, %C2not %r = icmp sgt i8 %add, %Cnot ret i1 %r }	2021-10-10 11:06:49 -04:00
Sanjay Patel	c00cab878a	[InstCombine] add test for or-of-icmps; NFC	2021-10-10 11:06:49 -04:00
Chen Zheng	4ead32d1cf	[PowerPC] update test case using the scripts; nfc	2021-10-10 14:39:20 +00:00
Mark de Wever	dcbfceffde	[libc++][nfc] Remove a duplicated include.	2021-10-10 14:21:01 +02:00
Dávid Bolvanský	e6ce86bb62	[NFC] Added tests for PR52056	2021-10-10 11:34:39 +02:00
william woodruff	e7fc254875	[BitcodeAnalyzer] allow a motivated user to dump BLOCKINFO This adds the `--dump-blockinfo` flag to `llvm-bcanalyzer`, allowing a sufficiently motivated user to dump (parts of) the `BLOCKINFO_BLOCK` block. The default behavior is unchanged, and `--dump-blockinfo` only takes effect in the same context as other flags that control dump behavior (i.e., requires that `--dump` is also passed). Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D107536	2021-10-10 10:15:14 +05:30
Amara Emerson	f95d9c95bb	[GlobalISel] Fix the stores of truncates -> wide store combine for non-evenly dividing type sizes. If the wide store we'd generate is not a multiple of the memory type of the narrow stores (e.g. s48 and s32), we'd assert. Fix that.	2021-10-09 21:18:20 -07:00
william woodruff	451d0596d7	[clang] Fix JSON AST output when a filter is used Without this, the combination of `-ast-dump=json` and `-ast-dump-filter FILTER` produces invalid JSON: the first line is a string that says `Dumping $SOME_DECL_NAME: `. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D108441	2021-10-10 07:46:17 +05:30
Med Ismail Bennani	c26e53e129	[lldb/test] Disable 'TestScriptedProcess.py' on macOS This is disabling 'TestScriptedProcess.py' on macOS since it fails on Green Dragon: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/35974 Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2021-10-10 03:28:36 +02:00
Joe Loser	903b30fea2	[libc++][test] Remove empty {ind.move.subsumption.compile.pass.cpp} `{ind.move.subsumption.compile.pass.cpp}` was accidentally committed in https://reviews.llvm.org/D102639. Per the conversation on Discord in	2021-10-09 17:20:19 -04:00
Amy Zhuang	5ce368cfe2	[mlir] Vectorize induction variables 1. Add support to vectorize induction variables of loops that are not mapped to any vector dimension in SuperVectorize pass. 2. Fix a bug in getForInductionVarOwner. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D111370	2021-10-09 12:40:24 -07:00
mydeveloperday	3019898e0d	[clang-format][NFC] improve the visual of the "clang-formatted %" NOTE: some files are being removed from those files that are clang-formatted which means some lack of formatting is slipping through the net on reviews	2021-10-09 19:37:03 +01:00
Mehdi Amini	dda810c332	Fix a comment at call-site to match the declared parameter (NFC) (clang-tidy warning)	2021-10-09 17:57:53 +00:00
Ron Lieberman	d022f39d9f	[libomptarget][amdgpu][NFC] tweak a comment	2021-10-09 12:51:53 -04:00
Kazu Hirata	3e1c787b31	[IR] Remove arg_operands and getNumArgOperands (NFC) The last uses were removed on Oct 8, 2021 in commit `46ef2e0bf9`. This is a relanding of `b2ee408dde`.	2021-10-09 09:38:15 -07:00
Sanjay Patel	acafde09a3	[InstCombine] enhance icmp with sub folds There were 2 related but over-specified folds for: C1 - X == C One allowed multi-use but was limited to equal constants. The other allowed different constants but disallowed multi-use. This combines the 2 folds into a more general match. The test diffs show the multi-use cases that were falling through the cracks. https://alive2.llvm.org/ce/z/4_hEt2 define i1 @src(i8 %x, i8 %subC, i8 %C) { %s = sub i8 %subC, %x %r = icmp eq i8 %s, %C ret i1 %r } define i1 @tgt(i8 %x, i8 %subC, i8 %C) { %newC = sub i8 %subC, %C %isneg = icmp eq i8 %x, %newC ret i1 %isneg }	2021-10-09 11:39:49 -04:00
Sanjay Patel	cd76fa79b0	[InstCombine] add tests for icmp of negated op; NFC	2021-10-09 11:39:49 -04:00
Sanjay Patel	38e3b30bd6	[InstCombine] add tests for (iN X s>> N-1) \| Y; NFC These are for a sibling fold suggested in D111410. The tests correspond to the 'and' tests added with: `a35673f4cf`	2021-10-09 11:39:49 -04:00
Dávid Bolvanský	943b304848	Fixed some errors detected by PVS Studio	2021-10-09 17:27:41 +02:00
Dávid Bolvanský	3649fb14d1	Fixed some errors detected by PVS Studio	2021-10-09 17:20:04 +02:00
Nikita Popov	ea12adc169	[CanonicalizeFreeze] Drop IVUsers.h include (NFC) Looking for users of IVUsers, this was a false positive. Only LSR uses IVUsers.	2021-10-09 17:01:26 +02:00
David Green	adec922361	[AArch64] Make -mcpu=generic schedule for an in-order core We would like to start pushing -mcpu=generic towards enabling the set of features that improves performance for some CPUs, without hurting any others. A blend of the performance options hopefully beneficial to all CPUs. The largest part of that is enabling in-order scheduling using the Cortex-A55 schedule model. This is similar to the Arm backend change from `eecb353d0e` which made -mcpu=generic perform in-order scheduling using the cortex-a8 schedule model. The idea is that in-order cpu's require the most help in instruction scheduling, whereas out-of-order cpus can for the most part out-of-order schedule around different codegen. Our benchmarking suggests that hypothesis holds. When running on an in-order core this improved performance by 3.8% geomean on a set of DSP workloads, 2% geomean on some other embedded benchmark and between 1% and 1.8% on a set of singlecore and multicore workloads, all running on a Cortex-A55 cluster. On an out-of-order cpu the results are a lot more noisy but show flat performance or an improvement. On the set of DSP and embedded benchmarks, run on a Cortex-A78 there was a very noisy 1% speed improvement. Using the most detailed results I could find, SPEC2006 runs on a Neoverse N1 show a small increase in instruction count (+0.127%), but a decrease in cycle counts (-0.155%, on average). The instruction count is very low noise, the cycle count is more noisy with a 0.15% decrease not being significant. SPEC2k17 shows a small decrease (-0.2%) in instruction count leading to a -0.296% decrease in cycle count. These results are within noise margins but tend to show a small improvement in general. When specifying an Apple target, clang will set "-target-cpu apple-a7" on the command line, so should not be affected by this change when running from clang. This also doesn't enable more runtime unrolling like -mcpu=cortex-a55 does, only changing the schedule used. 
A lot of existing tests have been updated. This is a summary of the important differences: - Most changes are the same instructions in a different order. - Sometimes this leads to very minor inefficiencies, such as requiring an extra mov to move variables into r0/v0 for the return value of a test function. - misched-fusion.ll was no longer fusing the pairs of instructions it should, as per D110561. I've changed the schedule used in the test for now. - neon-mla-mls.ll now uses "mul; sub" as opposed to "neg; mla" due to the different latencies. This seems fine to me. - Some SVE tests do not always remove movprfx where they did before due to different register allocation giving different destructive forms. - The tests argument-blocks-array-of-struct.ll and arm64-windows-calls.ll produce two LDR where they previously produced an LDP due to store-pair-suppress kicking in. - arm64-ldp.ll and arm64-neon-copy.ll are missing pre/postinc on LDP. - Some tests such as arm64-neon-mul-div.ll and ragreedy-local-interval-cost.ll have more, less or just different spilling. - In aarch64_generated_funcs.ll.generated.expected one part of the function is no longer outlined. Interestingly if I switch this to use any other schedule even less is outlined. Some of these are expected to happen, such as differences in outlining or register spilling. There will be places where these result in worse codegen, places where they are better, with the SPEC instruction counts suggesting it is not a decrease overall, on average. Differential Revision: https://reviews.llvm.org/D110830	2021-10-09 15:58:31 +01:00
Nico Weber	e2a2e5475c	Revert "Reland "[gn build] (manually) port `6fe2beba7d` (ExceptionTests)"" This reverts commit `842035d8bd`. `1dba6b3` was reverted yet again in `04aff39504`.	2021-10-09 10:18:52 -04:00
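
The two InstCombine entries above (`acafde09a3` and `da210f5d34`) each quote a `@src`/`@tgt` pair of IR functions. As a rough illustration of what those pairs claim, the sketch below brute-forces both rewrites over small integer widths in Python. It is not part of either commit and is not how LLVM or Alive2 verifies folds; the helper names (`to_signed`, `check_eq_fold`, `check_slt_fold`) and the `bits` parameter are hypothetical, and the `nuw` flag is modelled simply by skipping inputs where the unsigned subtraction would wrap.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned `bits`-wide value as two's-complement signed."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def check_eq_fold(bits: int = 8) -> bool:
    """Check (subC - x) == C  <=>  x == (subC - C) with wrapping arithmetic.

    Mirrors the @src/@tgt pair quoted in commit acafde09a3.
    """
    mask = (1 << bits) - 1
    n = 1 << bits
    for x in range(n):
        for sub_c in range(n):
            for c in range(n):
                src = ((sub_c - x) & mask) == c
                tgt = x == ((sub_c - c) & mask)
                if src != tgt:
                    return False
    return True


def check_slt_fold(bits: int = 8) -> bool:
    """Check (C2 -nuw x) <s C  <=>  (x +nuw ~C2) >s ~C.

    Mirrors the @src/@tgt pair quoted in commit da210f5d34. Inputs where
    the unsigned subtraction would wrap are skipped, because `sub nuw`
    yields poison there and the source places no constraint on the target.
    """
    mask = (1 << bits) - 1
    n = 1 << bits
    for x in range(n):
        for c in range(n):
            for c2 in range(n):
                if c2 < x:
                    continue  # `sub nuw` would wrap: nothing to compare
                sub = c2 - x
                src = to_signed(sub, bits) < to_signed(c, bits)
                add = x + (c2 ^ mask)
                if add > mask:
                    return False  # the target's `add nuw` must not wrap
                tgt = to_signed(add, bits) > to_signed(c ^ mask, bits)
                if src != tgt:
                    return False
    return True
```

Calling `check_eq_fold(4)` or `check_slt_fold(4)` walks all 4,096 input combinations at 4 bits. The default of 8 bits matches the `i8` types in the quoted IR but covers about 16.7 million combinations, so it is noticeably slower in pure Python.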


401416 Commits
