llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	9477a308ca	[hwasan][test] Remove obsoleted/removed -fno-experimental-new-pass-manager	2022-02-01 13:24:39 -08:00
Florian Hahn	b1fb613924	[GVN] Add additional tests after `216d1a729`. Further extend test coverage added in `216d1a729`	2022-02-01 21:02:41 +00:00
Hongtao Yu	057e784b09	[llvm-profgen] Clean up unnecessary memory reservations between phases. Cleaning up data structures that are not used after a certain point. This further brings down peak memory usage by 15% for a large benchmark. Before: note: Before parsePerfTraces note: VM: 40.73 GB RSS: 39.18 GB note: Before parseAndAggregateTrace note: VM: 40.73 GB RSS: 39.18 GB note: After parseAndAggregateTrace note: VM: 88.93 GB RSS: 87.97 GB note: Before generateUnsymbolizedProfile note: VM: 88.95 GB RSS: 87.99 GB note: After generateUnsymbolizedProfile note: VM: 93.50 GB RSS: 92.53 GB note: After computeSizeForProfiledFunctions note: VM: 101.13 GB RSS: 99.36 GB note: After generateProbeBasedProfile note: VM: 215.61 GB RSS: 210.88 GB note: After postProcessProfiles note: VM: 237.48 GB RSS: 212.50 GB After: note: Before parsePerfTraces note: VM: 40.73 GB RSS: 39.18 GB note: Before parseAndAggregateTrace note: VM: 40.73 GB RSS: 39.18 GB note: After parseAndAggregateTrace note: VM: 88.93 GB RSS: 87.96 GB note: Before generateUnsymbolizedProfile note: VM: 88.95 GB RSS: 87.97 GB note: After generateUnsymbolizedProfile note: VM: 93.50 GB RSS: 92.51 GB note: After computeSizeForProfiledFunctions note: VM: 93.50 GB RSS: 92.53 GB note: After generateProbeBasedProfile note: VM: 164.87 GB RSS: 163.55 GB note: After postProcessProfiles note: VM: 182.28 GB RSS: 179.43 GB Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D118677	2022-02-01 12:48:08 -08:00
Sanjay Patel	267400c9b0	[x86] add tests for fmul/fdiv with identity constant in select arm; NFC	2022-02-01 15:43:28 -05:00
Sanjay Patel	8191472246	[x86] add more tests for select with identity constant; NFC D118644	2022-02-01 15:43:27 -05:00
Daniel Resnick	97fc568211	[mlir][capi] Add DialectRegistry to MLIR C-API Exposes mlir::DialectRegistry to the C API as MlirDialectRegistry along with helper functions. A hook has been added to MlirDialectHandle that inserts the dialect into a registry. A future possible change is removing mlirDialectHandleRegisterDialect in favor of using mlirDialectHandleInsertDialect, which it is now implemented with. Differential Revision: https://reviews.llvm.org/D118293	2022-02-01 13:42:06 -07:00
Stanislav Mekhanoshin	79606ee85c	[AMDGPU] Check atomics aliasing in the clobbering annotation MemorySSA considers any atomic a def to any operation it dominates just like a barrier or fence. That is correct from memory state perspective, but not required for the no-clobber metadata since we are not using it for reordering. Skip such atomics during the scan just like a barrier if it does not alias with the load. Differential Revision: https://reviews.llvm.org/D118661	2022-02-01 12:33:25 -08:00
Louis Dionne	4f67a90990	[libc++] Fix TOCTOU issue with std::filesystem::remove_all https://bugs.chromium.org/p/llvm/issues/detail?id=19 rdar://87912416 Differential Revision: https://reviews.llvm.org/D118134	2022-02-01 15:31:28 -05:00
Louis Dionne	c7b255e5a8	[libc++][ci] Re-enable the bootstrapping build Differential Revision: https://reviews.llvm.org/D118067	2022-02-01 15:29:00 -05:00
Florian Hahn	216d1a729c	[GVN] Add tests for D118143 not requiring loops.	2022-02-01 20:24:19 +00:00
David Green	c89cfbd4dd	Revert "[DAG] Extend SearchForAndLoads with any_extend handling" This reverts commit `100763a88f` as it was making incorrect assumptions about implicit zero_extends.	2022-02-01 20:18:40 +00:00
Arthur O'Dwyer	c0185ffaec	[clang] Don't typo-fix an expression in a SFINAE context. If this is a SFINAE context, then continuing to look up names (in particular, to treat a non-function as a function, and then do ADL) might too-eagerly complete a type that it's not safe to complete right now. We should just say "okay, that's a substitution failure" and not do any more work than absolutely required. Fixes #52970. Differential Revision: https://reviews.llvm.org/D117603	2022-02-01 15:17:28 -05:00
Arthur O'Dwyer	f6ce456707	[clang] Correctly(?) handle placeholder types in ExprRequirements. Bug #52905 was originally papered over in a different way, but I believe this is the actually proper fix, or at least closer to it. We need to detect placeholder types as close to the front-end as possible, and cause them to fail constraints, rather than letting them persist into later stages. Fixes #52905. Fixes #52909. Fixes #53075. Differential Revision: https://reviews.llvm.org/D118552	2022-02-01 15:16:17 -05:00
Arthur O'Dwyer	6a56d5cc25	[libc++] Fix LWG3589 "The const lvalue reference overload of get for subrange..." https://cplusplus.github.io/LWG/issue3589 Differential Revision: https://reviews.llvm.org/D117961	2022-02-01 15:14:44 -05:00
Florian Mayer	aefb2e134d	[hwasan] work around lifetime issue with setjmp. setjmp can return twice, but PostDominatorTree is unaware of this. as such, it overestimates postdominance, leaving some cases (see attached compiler-rt) where memory does not get untagged on return. this causes false positives later in the program execution. this is a crude workaround to unblock use-after-scope for now, in the longer term PostDominatorTree should be made aware of returns_twice functions, as this may cause problems elsewhere. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118647	2022-02-01 12:14:20 -08:00
Valentin Clement	aab4263ad6	[flang] Lower basic STOP statement This patch lowers STOP statement without arguments and ERROR STOP. STOP statement with arguments lowering will come in later patches since it requires some expression lowering to be added. STOP statement is lowered to a runtime call. Also makes sure we are creating a constant in the MLIR arith constant. This patch is part of the upstreaming effort from fir-dev branch. Reviewed By: kiranchandramohan, schweitz Differential Revision: https://reviews.llvm.org/D118697 Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>	2022-02-01 20:54:45 +01:00
Peter Klausler	82cf35bc89	[flang] Fix/work around warnings from GCC 11 Apply part of a pending patch for GCC 11 warnings, and rework a piece of code, to dodge warnings on flag from GCC 11 build bots exposed by a recent patch. Applying without review to get bots working again; changes also tested against GCC 9.3.0.	2022-02-01 11:54:04 -08:00
Stanislav Mekhanoshin	c2b18a3cc5	[AMDGPU] Allow scalar loads after barrier Currently we cannot convert a vector load into scalar if there is dominating barrier or fence. It is considered a clobbering memory access to prevent memory operations reordering. While reordering is not possible the actual memory is not being clobbered by a barrier or fence and we can still use a scalar load for a uniform pointer. The solution is not to bail on a first clobbering access but traverse MemorySSA to the root excluding barriers and fences. Differential Revision: https://reviews.llvm.org/D118419	2022-02-01 11:43:17 -08:00
Jeremy Morse	8e75536e51	[DebugInfo][InstrRef][NFC] Bypass a frequently-noop loop Bypass this loop if it would do nothing -- if there are no register masks to be examined, there's no point looking at each location to see if the location has been def'd. Awkwardly, this was responsible for almost an entire half a percent of performance improvement on CTMark. Differential Revision: https://reviews.llvm.org/D118613	2022-02-01 19:39:09 +00:00
Jeremy Morse	3fab2d138e	[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out In certain circumstances with things like autogenerated code and asan, you can end up with thousands of Values live at the same time, causing a large working set and a lot of information spilled to the stack. Unfortunately InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory when there are many many stack slots. See the reproducer in D116821. It seems very unlikely that a developer would be able to reason about hundreds of live named local variables at the same time, so a huge working set and many stack slots is an indicator that we're likely analysing autogenerated or instrumented code. In those cases: gracefully degrade by setting an upper bound on the number of stack slots to track. This limits peak memory consumption, at the cost of dropping some variable locations, but in a rare scenario where it's unlikely someone is actually going to use them. In terms of the patch, this adds a cl::opt for max number of stack slots to track, and has the stack-slot-numbering code optionally return None. That then filters through a number of code paths, which can then choose to not track a spill / restore if it touches an untracked spill slot. The added test checks that we drop variable locations that are on the stack, if we set the limit to zero. Differential Revision: https://reviews.llvm.org/D118601	2022-02-01 19:25:29 +00:00
Matt Morehouse	de4e8bc3ac	[HWASan] Properly handle musttail calls. Fixes a compile error when the `clang::musttail` attribute is used. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D118712	2022-02-01 11:23:43 -08:00
Chris Bieneman	7a0cbe11fb	[NFC] These tests require a default target These test cases all rely on a default target being specified. Adding the requirement gets the tests properly skipped when LLVM_DEFAULT_TARGET_TRIPLE is unset.	2022-02-01 13:18:39 -06:00
Shubham Sandeep Rastogi	466329d047	Change namespace llvm::swift to namespace llvm::binaryformat because of clashes with the apple/llvm-project repository The namespace llvm::swift is causing errors to pop up in the apple/llvm-project build when cherry-picking `4ce1f3d47c` into apple/llvm-project Differential Review: https://reviews.llvm.org/D118716	2022-02-01 11:15:21 -08:00
Chris Bieneman	bb808720bb	[NFC] Use llvm-as instead of llc llvm-as does everything this test requires, but doesn't depend on a target being registered. This gets the test passing when LLVM_DEFAULT_TARGET_TRIPLE is unset.	2022-02-01 13:07:22 -06:00
Anna Thomas	4fc52db116	[InstCombine] Remove weaker fence adjacent to a stronger fence We have an instCombine rule to remove identical consecutive fences. We can extend this to remove weaker fences when we have consecutive stronger fence. As stated in the LangRef, a fence with a stronger ordering also implies ordering weaker than itself: "A fence which has seq_cst ordering, in addition to having both acquire and release semantics specified above, participates in the global program order of other seq_cst operations and/or fences." Reviewed-By: reames Differential Revision: https://reviews.llvm.org/D118607	2022-02-01 11:05:34 -08:00
Jeremy Morse	91fb66cf91	[DebugInfo][InstrRef][NFC] Don't build a map of un-needed values When finding locations for variable values at the start of a block, we build a large map of every value to every location, and then pick out the locations for values that are desired. This takes up quite a lot of time, because, unsurprisingly, there are usually more values in registers and stack slots than there are variables. This patch instead creates a map of desired values to their locations, which are initially illegal locations. Then, as we examine every available value, we can select locations for values we care about, and ignore those that we don't. This substantially reduces the amount of work done (i.e., building a map up of values to locations that nothing wants or needs). Geomean performance improvement of 1% on CTMark, woo. Differential Revision: https://reviews.llvm.org/D118597	2022-02-01 18:58:06 +00:00
Joseph Huber	53d5757ea2	[OpenMP] Add kernel string attribute to kernel function This patch adds a function attribute to the kernel function generated in OpenMP offloading. We already create a `nvvm.annotations` metadata node indicating the kernels present in the program. However, this created some indirection when trying to identify if a specific function was an entry. We add a single function attribute for each function now to simplify this. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D118708	2022-02-01 13:49:31 -05:00
Jez Ng	3e951808d5	[lld-macho][nfc] Comments and style fixes Added some comments (particularly around finalize() and finalizeContents()) as well as doing some rephrasing / grammar fixes for existing comments. Also did some minor style fixups, such as by putting methods together in a class definition and having fields of similar types next to each other. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D118714	2022-02-01 13:45:59 -05:00
Tanya Lattner	769d634789	Update status of move.	2022-02-01 10:45:40 -08:00
Fangrui Song	30e8f83c84	[GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible Generalize D99629 for ELF. A default visibility non-local symbol is preemptible in a -shared link. `isInterposable` is an insufficient condition. Moreover, a non-preemptible alias may be referenced in a sub constant expression which intends to lower to a PC-relative relocation. Replacing the alias with a preemptible aliasee may introduce a linker error. Respect dso_preemptable and suppress optimization to fix the above issues. With the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic` compile. ``` int aliasee; extern int alias __attribute__((alias("aliasee"), visibility("hidden"))); void foo() { alias = 345; } // intended to access the local copy ``` While here, refine the condition for the alias as well. For some binary formats like COFF, `isInterposable` is a sufficient condition. But I think canonicalization for the changed case has little advantage, so I don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or `getPICLevel/getPIELevel` complexity. For instrumentations, it's recommended not to create aliases that refer to globals that have a weak linkage or are preemptible. However, the following is supported and the IR needs to handle such cases. ``` int aliasee __attribute__((weak)); extern int alias __attribute__((alias("aliasee"))); ``` There are other places where GlobalAlias isInterposable usage may need to be fixed. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D107249	2022-02-01 10:41:16 -08:00
Mahesh Ravishankar	a2361eb281	Avoid doing tile + fuse if tile sizes are zero. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D118576	2022-02-01 18:34:06 +00:00
Chris Bieneman	3615182025	[NFC] Add CFGuard to opt build If you don't include a target that directly references CFGuard it doesn't get built into opt or the llvm library build, which causes some test cases to fail. Including this in opt explicitly resolves those issues.	2022-02-01 12:32:27 -06:00
Fangrui Song	1494d064fa	[AMDGPU][test] Add dso_local to prevent preemptible alias resolution	2022-02-01 10:23:45 -08:00
Fangrui Song	fbf2f66400	[ELF] Update flag propagation rule to ignore discarded output sections See the updated insert-before.test for the effects: many synthetic sections are SHF_ALLOC|SHF_WRITE. If they are discarded, we don't want to propagate their flags to subsequent output section descriptions. `getFirstInputSection(sec) == nullptr` can technically be merged into `isDiscardable` but I'd like to postpone that as not sharing code may give more refactoring opportunity. Depends on D118529. Reviewed By: peter.smith, bluca Differential Revision: https://reviews.llvm.org/D118530	2022-02-01 10:19:30 -08:00
Fangrui Song	a0318711c8	[ELF] Rename adjustSectionsBeforeSorting to adjustOutputSections and make it affect INSERT commands adjustSectionsBeforeSorting updates some output section attributes (alignment/flags) and removes discardable empty sections. When it is called, INSERT commands have not been processed. Therefore the flags propagation rule may not affect output sections defined in an INSERT command properly. Fix this by moving processInsertCommands before adjustSectionsBeforeSorting. adjustSectionsBeforeSorting is somewhat misnamed. The order between it and sortInputSections does not matter. With the pass shuffle, the name of adjustSectionsBeforeSorting becomes wrong. Therefore rename it. The new name is not set in stone. The function mixes several tasks and the code may be refactored in a way that we may give them more meaningful names. With this patch, I think the behavior of attribute propagation becomes more reasonable. In particular, in the absence of non-INSERT SECTIONS, inserting a section after a SHF_ALLOC one will give us a SHF_ALLOC section, not a non-SHF_ALLOC one (see linkerscript/insert-after.test). Reviewed By: peter.smith, bluca Differential Revision: https://reviews.llvm.org/D118529	2022-02-01 10:16:12 -08:00
David Green	c40744d4d6	[AArch64] Add some CCMP testing. NFC	2022-02-01 18:15:34 +00:00
Fangrui Song	0c3704fdbd	[ELF] Deduplicate names of local symbols only with -O2 The deduplication requires a DenseMap of the same size of the local part of .strtab . I optimized it in `e205445434` but it is still quite slow. For Release build of clang, deduplication makes .strtab 1.1% smaller and makes the link 3% slower. For chrome, deduplication makes .strtab 0.1% smaller and makes the link 6% slower. I suggest that we only perform the optimization with -O2 (default is -O1). Not deduplicating local symbol names will simplify parallel symbol table write. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D118577	2022-02-01 10:10:22 -08:00
Fangrui Song	ceb9094a49	[llvm-ar] -s: don't convert a thin archive to a regular one In binutils, ar -s and ranlib don't convert a thin archive to a regular one. This behavior makes sense and this patch ports the behavior. Reviewed By: gbreynoo Differential Revision: https://reviews.llvm.org/D117443	2022-02-01 09:59:51 -08:00
Fangrui Song	dd6e7e0d57	[llvm-ar] Add --thin for creating a thin archive In GNU ar (since 2008), the modifier 'T' means creating a thin archive. In many other ar implementations (FreeBSD, macOS, elfutils, etc), -T means "allow filename truncation of extracted files", as specified by X/Open System Interface. For portability, 'T' with thin archive semantics should be avoided. See https://sourceware.org/bugzilla/show_bug.cgi?id=28759 binutils 2.38 will deprecate 'T' (without diagnostic) and add --thin. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D116979	2022-02-01 09:56:50 -08:00
Alexey Bataev	83620bd2ad	[SLP]Alternate vectorization for cmp instructions. Added support for alternate ops vectorization of the cmp instructions. It allows vectorizing either cmp instructions with same/swapped predicate but different (swapped) operand kinds or cmp instructions with different predicates and compatible operand kinds. Differential Revision: https://reviews.llvm.org/D115955	2022-02-01 09:54:20 -08:00
Fangrui Song	17a39aecd1	[ELF] Simplify code with invokeELFT. NFC	2022-02-01 09:53:29 -08:00
Krzysztof Parzyszek	c935f6e048	[Hexagon] Punt on registers without reaching defs in addr mode opt This fixes https://github.com/llvm/llvm-project/issues/52636.	2022-02-01 09:52:59 -08:00
Josh Mottley	ce8022faa3	[flang] Upstream partial lowering of EXIT intrinsic This patch adds partial lowering of the "EXIT" intrinsic to the backend runtime hook implemented in patch D110741. It also adds a helper function to the `RuntimeCallTestBase.h` for testing for an intrinsic function call in a `mlir::Block`. Differential Revision: https://reviews.llvm.org/D118141	2022-02-01 17:48:51 +00:00
Fangrui Song	7518d38f0a	[ELF] De-template LinkerDriver::link. NFC Replace `f<ELFT>(x)` with `InvokeELFT(f, x)`. The size reduction comes from turning `link` from 4 specializations into 1. My x86-64 lld executable is 26KiB smaller. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D118551	2022-02-01 09:47:56 -08:00
Jonas Paulsson	16978d853b	[TableGen] Fix reporting from CodeGenSchedModels::checkCompleteness(). Make the check for a complete SchedModel work as expected: report any supported instruction not having scheduler info. For unclear reasons there was a variable 'HadCompleteModel' that caused e.g. new instructions for a new subtarget not to be reported. This variable is now simply removed as all in-tree targets seem to build fine without it. Review: Simon Pilgrim Differential Revision: https://reviews.llvm.org/D118628	2022-02-01 11:32:38 -06:00
Alexey Bataev	0dc33c0a9c	[SLP][NFC]Add a test for alternate vectorization in cmp instructions with same/swapped predicate.	2022-02-01 09:28:06 -08:00
Alexander Belyaev	ebc8153786	Revert "Revert "[mlir] Purge `linalg.copy` and use `memref.copy` instead."" This reverts commit `25bf6a2a9b`.	2022-02-01 18:21:21 +01:00
Nikolas Klauser	9c52a19e32	[libc++][NFC] Add namespace comments in ranges With this patch there should be no more namespaces without closing comment Reviewed By: ldionne, Quuxplusone, #libc Spies: libcxx-commits Differential Revision: https://reviews.llvm.org/D118668	2022-02-01 18:18:13 +01:00
Peter Steinfeld	93ee588232	[flang] Rename the runtime routine that reports a fatal user error As per Steve Scalpone's suggestion, I've renamed the runtime routine to better evoke its purpose. I implemented a routine called "Crash" and added a test. Differential Revision: https://reviews.llvm.org/D118703	2022-02-01 09:01:50 -08:00
Steven Wan	245b8e5691	[NFC][AIX]Disable failed tests due to aggressive byval alignment warning on AIX These tests emit unexpected diagnostics on AIX because the byval alignment warning is emitted too aggressively. https://reviews.llvm.org/D118350 is supposed to provide a proper fix to the problem, but for the time being disable the tests to unblock. Differential Revision: https://reviews.llvm.org/D118670	2022-02-01 11:49:53 -05:00
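
The fence rule described in `4fc52db116` (drop a weaker fence that sits next to a stronger one) can be illustrated with a small sketch. This is not the InstCombine implementation: instructions are modelled as plain strings, `drop_redundant_fences` and `covers` are hypothetical helper names, and sync scopes are ignored entirely. The only facts assumed are that `acq_rel` implies both `acquire` and `release`, and that `seq_cst` implies all of them, as in the LangRef passage quoted in the commit message.

```python
# Hypothetical model: each fence ordering maps to the orderings it implies
# (itself included). acquire and release are incomparable with each other.
IMPLIES = {
    "acquire": {"acquire"},
    "release": {"release"},
    "acq_rel": {"acquire", "release", "acq_rel"},
    "seq_cst": {"acquire", "release", "acq_rel", "seq_cst"},
}


def covers(strong, weak):
    """True if a fence with ordering `strong` also provides ordering `weak`."""
    return weak in IMPLIES[strong]


def is_fence(ins):
    return ins.startswith("fence ")


def drop_redundant_fences(instrs):
    """Remove any fence adjacent to an identical or stronger fence.

    `instrs` is a list of strings such as "fence seq_cst" or "store";
    sync scopes are not modelled in this sketch.
    """
    out = []
    for ins in instrs:
        if is_fence(ins):
            order = ins.split()[1]
            # Drop directly preceding fences that the new fence covers.
            while out and is_fence(out[-1]) and covers(order, out[-1].split()[1]):
                out.pop()
            # Skip the new fence if the preceding fence already covers it.
            if out and is_fence(out[-1]) and covers(out[-1].split()[1], order):
                continue
        out.append(ins)
    return out


if __name__ == "__main__":
    block = ["fence acquire", "fence seq_cst", "store", "fence seq_cst", "fence release"]
    print(drop_redundant_fences(block))
```

In this model, a pair such as `fence acquire` followed by `fence release` is left alone, because neither ordering implies the other.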

413322 Commits All Branches Search