llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	d4d459e747	[X86] AMD Zen 3: MULX w/ mem operand has the same throughput as with reg op Exegesis is faulty and sometimes when measuring throughput^-1 produces snippets that have loop-carried dependencies, which must be what caused me to incorrectly measure it originally. After looking much more carefully, the inverse throughput should match that of the MULX w/ reg op. As per llvm-exegesis measurements.	2021-08-27 13:27:05 +03:00
Roman Lebedev	0f04936a2d	[X86] AMD Zen 3: MULX produces low part of the result in 3cy, +1cy for high part As per llvm-exegesis measurements.	2021-08-27 13:27:05 +03:00
Roman Lebedev	db2c6cd99c	[NFC][X86][MCA] AMD Zen 3: improve MULX test coverage Latency for MULX isn't right	2021-08-27 13:27:05 +03:00
Yaron Keren	692ebe5395	[docs] Add DIA register instructions to Getting Started with Visual Studio page Since Visual Studio 2017 the DIA libs are not registered by default, see: https://docs.microsoft.com/en-us/visualstudio/extensibility/breaking-changes-2017?view=vs-2019#change-reduce-registry-impact The LLDB build instructions already specify registering these DLLs, which are required by both the LLVM PDB tests and the LLDB build. Differential Revision: https://reviews.llvm.org/D108811	2021-08-27 13:10:19 +03:00
Balazs Benics	6ad47e1c4f	[analyzer] Catch leaking stack addresses via stack variables Not only global variables can hold references to dead stack variables. Consider this example: void write_stack_address_to(char **q) { char local; *q = &local; } void test_stack() { char *p; write_stack_address_to(&p); } The address of 'local' is assigned to 'p', which becomes a dangling pointer after 'write_stack_address_to()' returns. The StackAddrEscapeChecker was looking for bindings in the store which referred to variables of the popped stack frame, but it only considered global variables in this regard. This patch relaxes this, catching stack variable bindings as well. --- This patch also works for temporary objects like: struct Bar { const int &ref; explicit Bar(int y) : ref(y) { // Okay. } // End of the constructor call, `ref` is dangling now. Warning! }; void test() { Bar{33}; // Temporary object, so the corresponding memregion is // *not* a VarRegion. } --- The return value optimization aka. copy-elision might kick in but that is modeled by passing an imaginary CXXThisRegion which refers to the parent stack frame which is supposed to be the 'return slot'. Objects residing in the 'return slot' outlive the scope of the inner call, thus we should expect no warning about them - except if we explicitly disable copy-elision. Reviewed By: NoQ, martong Differential Revision: https://reviews.llvm.org/D107078	2021-08-27 11:31:16 +02:00
Sylvestre Ledru	c22bd391bc	polly: remove the old reference to svn in the doc	2021-08-27 10:46:50 +02:00
Sylvestre Ledru	fe611b1da8	[clang] Move the soname declaration in a variable at the top of the file Currently, it is a bit buried in the file even if this is pretty important for distro. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D108533	2021-08-27 09:07:12 +02:00
Chuanqi Xu	a52cfb3523	[NFC] [ASTReader] Remove unused variables	2021-08-27 14:00:03 +08:00
LLVM GN Syncbot	f8df807653	[gn build] Port `b749ef9e22`	2021-08-27 04:42:51 +00:00
Lang Hames	b749ef9e22	[ORC][ORC-RT] Reapply "Introduce ELF/*nix Platform and runtime..." with fixes. This reapplies `e256445bff`, which was reverted in `45ac5f5441` due to bot errors (e.g. https://lab.llvm.org/buildbot/#/builders/112/builds/8599). The issue that caused the bot failure was fixed in `2e6a4fce35`.	2021-08-27 14:41:58 +10:00
Lang Hames	2e6a4fce35	[ORC][JITLink][ELF] Treat STB_GNU_UNIQUE as Weak in the JIT. This should fix the bot error in https://lab.llvm.org/buildbot/#/builders/112/builds/8599 which forced reversion of the ELFNixPlatform in `45ac5f5441`. This should allow us to re-enable the ELFNixPlatform in a follow-up patch.	2021-08-27 14:41:28 +10:00
Matt Arsenault	ca4be0f9a1	AMDGPU: Fix hardcoded registers in test	2021-08-26 22:09:31 -04:00
Matt Arsenault	a020581f2e	AMDGPU/GlobalISel: Add baseline test for new ABI attribute hints	2021-08-26 22:09:11 -04:00
Matt Arsenault	04ce2de330	AMDGPU: Remove implicit argument attributes when introducing new calls In a future patch, a new set of amdgpu-no-* attributes will be introduced to indicate when a function does not need an implicitly passed input. This pass introduces new instances of these intrinsic calls, and should remove the attributes if they were present before.	2021-08-26 22:08:04 -04:00
Matt Arsenault	a74278f21f	AMDGPU: Fix broken test	2021-08-26 22:08:04 -04:00
Chen Zheng	324bd467a2	[PowerPC][ELF] make sure local variable space does not overlap with parameter save area Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D105271	2021-08-27 01:58:41 +00:00
Matt Arsenault	088cc63640	AMDGPU: Invert AMDGPUAttributor Switch to using BitIntegerState for each of the inputs, and invert their meanings. This now diverges more from the old AMDGPUAnnotateKernelFeatures, but this isn't used yet anyway.	2021-08-26 21:32:13 -04:00
Matt Arsenault	0150597c67	AMDGPU: Fix broken check lines	2021-08-26 21:30:06 -04:00
Matt Arsenault	3fdcd9bb13	GlobalISel: Add CallBase to CallLoweringInfo The DAG version has this, and is necessary for call lowering to take advantage of any attributes at the call site.	2021-08-26 21:09:11 -04:00
Matt Arsenault	46d82e7357	AMDGPU: Restrict attributor transforms We only really want this to add the custom attributes. Theoretically the regular transforms were already run at this point. Touching undefined behavior breaks a lot of tests when this is enabled by default, many of which are expecting to test handling of undef operations.	2021-08-26 21:08:51 -04:00
George Rokos	3819aae6dd	[libomptarget][NFC] Replaced obsolete name "getOrAllocTgtPtr" with new "getTargetPointer" in debug messages.	2021-08-26 18:01:18 -07:00
Matt Arsenault	cf32d61a05	AMDGPU: Remove hacky attribute deduction from AMDGPUAttributor amdgpu-calls and amdgpu-stack-objects don't really belong as attributes, and are currently a hacky way of passing an analysis into the DAG. These don't really belong in the IR, and don't really fit in with the other attributes. Remove these to facilitate inverting the pass. I don't exactly understand the indirect call test changes. These tests are using calls which are trivially replaceable with a direct call, so I'm not sure what the point is.	2021-08-26 20:31:14 -04:00
Matt Arsenault	98d7aa435f	AMDGPU: Stop inferring use of llvm.amdgcn.kernarg.segment.ptr We no longer use this intrinsic outside of the backend and no longer support using it outside of kernels.	2021-08-26 20:30:03 -04:00
Heejin Ahn	f5cff292e2	[WebAssembly] Fix PHI when relaying longjmps When doing Emscripten EH, if SjLj is also enabled and used and if the thrown exception has a possibility of being a longjmp instead of an exception, we shouldn't swallow it; we should rethrow, or relay it. It was done in D106525 and the code is here: `8441a8eea8/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L858-L898)` Here is the pseudocode of that part: (copied from comments) ``` if (%__THREW__.val == 0 || %__THREW__.val == 1) goto %tail else goto %longjmp.rethrow longjmp.rethrow: ;; This is longjmp. Rethrow it %__threwValue.val = __threwValue emscripten_longjmp(%__THREW__.val, %__threwValue.val); tail: ;; Nothing happened or an exception is thrown ... Continue exception handling ... ``` If the current BB (where the `invoke` is created) has successors that have the current BB as their PHI incoming node, now that has to change to `tail` in the pseudocode, because `tail` is the latest BB that is connected with the next BB, but this was missing. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D108785	2021-08-26 17:25:26 -07:00
David Blaikie	3784fc493e	Remove set-but-unused variable	2021-08-26 16:58:47 -07:00
Vitaly Buka	f1bb30a495	[sanitizer] No THREADLOCAL in qsort and bsearch qsort can reuse qsort_r if available. bsearch always passes key as the first comparator argument, so we can use it to wrap the original comparator. Differential Revision: https://reviews.llvm.org/D108751	2021-08-26 16:55:06 -07:00
Matt Arsenault	04da89e652	AMDGPU: Remove unnecessary -NEXT checks This avoids spuriously breaking the test in a future change	2021-08-26 19:37:54 -04:00
Matt Arsenault	cab0ec5c45	AMDGPU: Fix amdgpu_gfx calling convention usage in test This was calling a regular C function from amdgpu_gfx, which isn't defined to have all of the necessary implicit arguments.	2021-08-26 19:37:54 -04:00
Jez Ng	c74eb05f21	[lld-macho][nfc] Clean up InputSection constructors	2021-08-26 19:07:48 -04:00
Artem Belevich	5c24a1e1db	[CUDA] update constraints on NVPTX builtins to include PTX73 and 74.	2021-08-26 16:01:57 -07:00
Matt Arsenault	ce51c5d4a9	AMDGPU: Fix crashing on kernel declarations when lowering LDS This was trying to insert the used marker into a declaration.	2021-08-26 19:01:10 -04:00
Jez Ng	9b5148d426	[lld-macho] Have -ObjC load archive members before symbol resolution This is what ld64 does. Deviating in behavior here can result in some subtle duplicate symbol errors, as detailed in the objc.s test. Differential Revision: https://reviews.llvm.org/D108781	2021-08-26 18:52:07 -04:00
Jez Ng	9065fe5591	[lld-macho] Refactor archive loading The previous logic was duplicated between symbol-initiated archive loads versus flag-initiated loads (i.e. `-force_load` and `-ObjC`). This resulted in code duplication as well as redundant work -- we would create Archive instances twice whenever we had one of those flags; once in `getArchiveMembers` and again when we constructed the ArchiveFile. This was motivated by an upcoming diff where we load archive members containing ObjC-related symbols before loading those containing ObjC-related sections, as well as before performing symbol resolution. Without this refactor, it would be difficult to do that while avoiding loading the same archive member twice. Differential Revision: https://reviews.llvm.org/D108780	2021-08-26 18:52:07 -04:00
Jez Ng	2179930868	[lld-macho] Fix unwind info personality size This was missed by {D107035}. This fix addresses the following warning: loop variable 'personality' has type 'const uint32_t &' (aka 'const unsigned int &') but is initialized with type 'const unsigned long long' resulting in a copy [-Wrange-loop-analysis] In addition to fixing the size, I also removed the const reference, since there's no performance benefit to avoiding copies of integer-sized values.	2021-08-26 18:52:06 -04:00
Butygin	1e35a7690d	[mlir][spirv] Initial support for 64 bit index type and builtins Differential Revision: https://reviews.llvm.org/D108516	2021-08-27 01:38:53 +03:00
Benson Chu	7bd92f5911	[AST] Pick last tentative definition as the acting definition Clang currently picks the second tentative definition when VarDecl::getActingDefinition is called. This can lead to attributes being dropped if they are attached to tentative definitions that appear after the second one. This is because VarDecl::getActingDefinition loops through VarDecl::redecls assuming that the last tentative definition is the last element in the iterator. However, it is the second element that would be the last tentative definition. This changeset modifies getActingDefinition to iterate through the declaration chain in reverse, so that it can immediately return when it encounters a tentative definition. Originally the unit test for this changeset did not have a -triple flag for the clang invocation, leading to this test being broken on MacOS, since Mach-O does not support the section attribute. Differential Revision: https://reviews.llvm.org/D99732	2021-08-26 16:49:54 -05:00
Arthur Eubanks	6eed1fb349	[clang][NewPM] Mention that legacy PM flags are deprecated Differential Revision: https://reviews.llvm.org/D108789	2021-08-26 14:42:55 -07:00
Yonghong Song	82d9cb34a2	[DebugInfo] convert btf_tag attrs to DI annotations for func parameters Generate btf_tag annotations for DILocalVariable. The annotations are represented as a DINodeArray in DebugInfo. Differential Revision: https://reviews.llvm.org/D106620	2021-08-26 14:27:58 -07:00
Fangrui Song	a42bd1b560	[CMake] Change -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=off to -DLLVM_ENABLE_NEW_PASS_MANAGER=off LLVM_ENABLE_NEW_PASS_MANAGER is set to ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER, so -DLLVM_ENABLE_NEW_PASS_MANAGER=off has no effect. Change the cache variable to LLVM_ENABLE_NEW_PASS_MANAGER instead. A user opting out of the new PM needs to switch from -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=off to -DLLVM_ENABLE_NEW_PASS_MANAGER=off. Also give a warning that -DLLVM_ENABLE_NEW_PASS_MANAGER=off is deprecated. Reviewed By: aeubanks, phosek Differential Revision: https://reviews.llvm.org/D108775	2021-08-26 14:25:31 -07:00
Yonghong Song	1bebc31c61	[DebugInfo] generate btf_tag annotations for func parameters Generate btf_tag annotations for function parameters. A field "annotations" is introduced to DILocalVariable, and annotations are represented as a DINodeArray, similar to DIComposite elements. The following example illustrates how annotations are encoded in IR: distinct !DILocalVariable(name: "info", arg: 1, ..., annotations: !10) !10 = !{!11, !12} !11 = !{!"btf_tag", !"a"} !12 = !{!"btf_tag", !"b"} Differential Revision: https://reviews.llvm.org/D106620	2021-08-26 14:18:30 -07:00
Artem Dergachev	7309359928	[analyzer] Fix scan-build report deduplication. The previous behavior was to deduplicate reports based on md5 of the html file. This algorithm might have worked originally but right now HTML reports contain information rich enough to make them virtually always distinct which breaks deduplication entirely. The new strategy is to (finally) take advantage of IssueHash - the stable report identifier provided by clang that is the same if and only if the reports are duplicates of each other. Additionally, scan-build no longer performs deduplication on its own. Instead, the report file name is now based on the issue hash, and clang instances will silently refuse to produce a new html file when a duplicate already exists. This eliminates the problem entirely. The '-analyzer-config stable-report-filename' option is deprecated because report filenames are no longer unstable. A new option is introduced, '-analyzer-config verbose-report-filename', to produce verbose file names that look similar to the old "stable" file names. The old option acts as an alias to the new option. Differential Revision: https://reviews.llvm.org/D105167	2021-08-26 13:34:29 -07:00
Kirill Stoimenov	a3f4139626	[asan] Implemented flag to emit intrinsics to optimize ASan callbacks. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D108377	2021-08-26 20:33:57 +00:00
Kirill Stoimenov	2e83a0efb9	[asan] Fixed a runtime crash. Looks like the NoRegister has some effect on the final code that is generated. My guess is that some optimization kicks in at the end? When I use -S to dump the assembly I get the correct version with 'shrq $3, %r8': movq %r9, %r8 shrq $3, %r8 movsbl 2147450880(%r8), %r8d But, when I disassemble the final binary I get RAX instead of R8: mov %r9,%r8 shr $0x3,%rax movsbl 0x7fff8000(%r8),%r8d Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D108745	2021-08-26 20:30:25 +00:00
Rob Suderman	90478251c7	[mlir][tosa] Tosa reverse to linalg supporting dynamic shapes Needed to switch to extract to support tosa.reverse using dynamic shapes. Reviewed By: NatashaKnk Differential Revision: https://reviews.llvm.org/D108744	2021-08-26 13:23:59 -07:00
Alexey Bataev	84cbd71c95	[SLP]Improve graph reordering. Reworked reordering algorithm. Originally, the compiler just tried to detect the most common order in the reorderable nodes (loads, stores, extractelements, extractvalues) and then fully rebuilt the graph in the best order. This was not efficient, since it required extra memory and time for building/rebuilding the tree and doubled the use of the scheduling budget, which could lead to missed vectorization due to exhausted scheduling resources. The patch provides a 2-way approach to the graph reordering problem. At first, all reordering is done in-place; it does not require deleting/rebuilding the tree, it just rotates the scalars/orders/reuses masks in the graph node. The first step (top-to-bottom) rotates the whole graph, similarly to the previous implementation. The compiler counts the number of the most used orders of the graph nodes with the same vectorization factor and then rotates the subgraph with the given vectorization factor to the most used order, if it is not empty. Then it repeats the same procedure for the subgraphs with the smaller vectorization factor. We can do this because we still need to reshuffle the smaller subgraph when building operands for the graph nodes with the larger vectorization factor, so we can rotate just the subgraph, not the whole graph. The second step (bottom-to-top) scans through the leaves and tries to detect the users of the leaves which can be reordered. If the leaves can be reordered in the best fashion, they are reordered and their user too. It allows removing double shuffles to the same ordering of the operands in many cases and just reordering the user operations instead. Plus, it moves the final shuffles closer to the top of the graph and in many cases allows removing an extra shuffle, because the same procedure is repeated again and we can again merge some reordering masks and reorder user nodes instead of the operands. 
Also, patch improves cost model for gathering of loads, which improves x264 benchmark in some cases. Gives about +2% on AVX512 + LTO (more expected for AVX/AVX2) for {625,525}x264, +3% for 508.namd, improves most of other benchmarks. The compile and link time are almost the same, though in some cases it should be better (we're not doing an extra instruction scheduling anymore) + we may vectorize more code for the large basic blocks again because of saving scheduling budget. Differential Revision: https://reviews.llvm.org/D105020	2021-08-26 12:31:18 -07:00
Nikita Popov	8441a8eea8	[MergeICmps] Add test for call before first load (NFC) If a clobbering call happens before all loads, that shouldn't block the transform.	2021-08-26 21:24:22 +02:00
Arthur Eubanks	14d45e41bf	[test] Update precommit tests for D108734	2021-08-26 12:05:56 -07:00
Vitaly Buka	96fa1eaae4	[sanitizer] Add basic qsort test	2021-08-26 12:03:26 -07:00
Jon Chesterfield	3d85342982	[libomptarget][amdgpu][nfc] Rename variables, delete dead code	2021-08-26 19:58:38 +01:00
Andrea Di Biagio	44a13f33be	Revert "[MCA][NFC] Remove redundant calls to std::move." This reverts commit `9cc0023fb8` due to buildbot failures.	2021-08-26 19:53:17 +01:00
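
The `[sanitizer]` commit `f1bb30a495` in the list above notes that `bsearch` always passes the key as the first comparator argument, so the key can be used to wrap the original comparator without thread-local state. The following is a minimal sketch of that idea only, not the sanitizer runtime's actual code: `WrappedKey`, `wrapped_compar`, `checked_bsearch`, and `compare_ints` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical alias for the comparator type that bsearch expects.
using Compar = int (*)(const void *, const void *);

// Hypothetical wrapper: carries the caller's key together with the
// caller's comparator, so no global or thread-local variable is needed.
struct WrappedKey {
  const void *key;
  Compar compar;
};

// bsearch calls compar(key, element) in that order, so the first
// argument is always the WrappedKey passed in below.
static int wrapped_compar(const void *a, const void *b) {
  const WrappedKey *w = static_cast<const WrappedKey *>(a);
  // An interceptor could do its extra work here before forwarding.
  return w->compar(w->key, b);
}

// Hypothetical entry point with the same shape as bsearch.
void *checked_bsearch(const void *key, const void *base, std::size_t nmemb,
                      std::size_t size, Compar compar) {
  assert(compar != nullptr);
  WrappedKey w = {key, compar};
  return std::bsearch(&w, base, nmemb, size, wrapped_compar);
}

// Example comparator for int arrays sorted in ascending order.
int compare_ints(const void *a, const void *b) {
  int x = *static_cast<const int *>(a);
  int y = *static_cast<const int *>(b);
  return (x > y) - (x < y);
}
```

Calling `checked_bsearch(&key, arr, n, sizeof(int), compare_ints)` on a sorted `int` array behaves like a plain `bsearch` call; the only difference in this sketch is that every comparison passes through `wrapped_compar` first.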


397574 Commits (All Branches)