llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	39a0d6889d	[X86] Add a stub for Intel's alderlake. No scheduling, no autodetection.	2020-10-24 19:01:22 +02:00
Benjamin Kramer	bd2cf96c09	[X86] Add a stub for znver3 based on the little public information there is in AMD's manuals No scheduling, no autodetection. Just enough so -march=znver3 works.	2020-10-24 19:01:22 +02:00
Benjamin Kramer	b8d2b6f6cf	Unbreak the clang-interpreter example after `0aec49c853`	2020-10-24 19:01:21 +02:00
dfukalov	9068c20965	[AMDGPU][CostModel] Refine cost model for half- and quarter-rate instructions. 1. Throughput and codesize cost estimations were separated and updated. 2. Updated fdiv cost estimation for different cases. 3. Added scalarization processing for types that are treated as !isSimple() to improve codesize estimation in getArithmeticInstrCost() and getArithmeticInstrCost(). The code was borrowed from the TCK_RecipThroughput path of the base implementation. The next step is to unify the scalarization part in the base class, which currently works for the TCK_RecipThroughput path only. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D89973	2020-10-24 19:53:08 +03:00
David Green	92205bf122	[ARM] Remove some dead code. NFC	2020-10-24 17:22:49 +01:00
Andrzej Warzynski	cbb7f1420b	[flang][tests] Fix Python bug in the lit config Without this change LIT tests for Flang fail with: ``` TypeError: append() takes exactly one argument (2 given) ```	2020-10-24 17:04:25 +01:00
Stefan Gränitz	66abe650ff	Reapply "[jitlink][ELF] Add zero-fill blocks for symbols in section SHN_COMMON" Root cause of the test failure was fixed with: [JITLink][ELF] PCRel32GOTLoad edge offset can be smaller than three This reverts commit `10b1a61baf`.	2020-10-24 16:58:06 +02:00
Stefan Gränitz	b6ef40891c	[JITLink][ELF] PCRel32GOTLoad edge offset can be smaller than three Offset is 2 for MOVL instruction in test ELF_x86-64_common. This should fix the test failures. Differential Revision: https://reviews.llvm.org/D89795	2020-10-24 16:57:48 +02:00
Caroline Concatto	4c5906cffd	[Flang][Driver] Add infrastructure for basic frontend actions and file I/O This patch introduces the dependencies required to read and manage input files provided by the command line option. It also adds the infrastructure to create and write to output files. The output is sent to either stdout or a file (specified with the `-o` flag). Separately, in order to be able to test the code for file I/O, it adds infrastructure to create frontend actions. As a basic testable example, it adds the `InputOutputTest` FrontendAction. The sole purpose of this action is to read a file from the command line and print it either to stdout or the output file. This action is run by using the `-test-io` flag also introduced in this patch (available for `flang-new` and `flang-new -fc1`). With this patch: ``` flang-new -test-io input-file.f90 ``` will read input-file.f90 and print it in the output file. The `InputOutputTest` frontend action has been introduced primarily to facilitate testing. It is hidden from users (i.e. it's only displayed with `--help-hidden`). Currently Clang doesn’t have an equivalent action. `-test-io` is used to trigger the InputOutputTest action in the Flang frontend driver. This patch makes sure that “flang-new” forwards it to “flang-new -fc1” by creating a preprocessor job. However, in Flang.cpp, `-test-io` is passed to “flang-new -fc1” without `-E`. This way we make sure that the preprocessor is _not_ run in the frontend driver. This is the desired behaviour: `-test-io` should only read the input file and print it to the output stream. co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com> Differential Revision: https://reviews.llvm.org/D87989	2020-10-24 14:58:32 +01:00
TaWeiTu	65a36bbc3d	[NPM] Port -loop-versioning-licm to NPM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89371	2020-10-24 21:51:18 +08:00
Stefan Gränitz	10b1a61baf	Revert "[jitlink][ELF] Add zero-fill blocks for symbols in section SHN_COMMON" This reverts commit `e9955b0843`. Cannot reproduce the buildbot failures yet. Reverting in the meantime.	2020-10-24 15:43:06 +02:00
TaWeiTu	060a4fccf1	[LoopVersioning] Form dedicated exits for versioned loop to preserve simplify form The exit blocks of the versioned and non-versioned loops are not dedicated and thus the two loops are not in simplify form. Insert dummy exit blocks after loop versioning with `formDedicatedExits()` to preserve the simplify form for subsequent passes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89569	2020-10-24 21:40:46 +08:00
Stefan Gränitz	e9955b0843	[jitlink][ELF] Add zero-fill blocks for symbols in section SHN_COMMON Symbols with special section index SHN_COMMON (0xfff2) haven't been handled so far and caused an invalid section error. This is a more or less straightforward use of the code commented out at the end of the function. I checked with the ELF spec, that the symbol value gives the alignment. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D89795	2020-10-24 14:54:38 +02:00
Stefan Gränitz	138b9f1928	[JITLink][ELF] PCRel32GOTLoad relocations are resolved like regular PCRel32 ones The difference is that the former are indirect and go to the GOT while the latter go to the target directly. This info can be used to relax indirect ones that don't need the GOT (because the target is in range). We check for this optimization beforehand. For formal correctness and to avoid confusion, we should only change the relocation kind if we actually apply the relaxation.	2020-10-24 14:54:38 +02:00
Simon Pilgrim	b481e00bf4	Fix some signed/unsigned comparison gcc warnings from D87930	2020-10-24 12:51:51 +01:00
Simon Pilgrim	310f62b4ff	[InstCombine] narrowFunnelShift - fold trunc/zext or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) (PR35155) As discussed on PR35155, this extends narrowFunnelShift (recently renamed from narrowRotate) to support basic funnel shift patterns. Unlike matchFunnelShift we don't include the computeKnownBits limitation as extracting the pattern from the zext/trunc layers should be an indicator of reasonable funnel shift codegen; in D89139 we demonstrated how to efficiently promote funnel shifts to wider types. Differential Revision: https://reviews.llvm.org/D89542	2020-10-24 12:42:43 +01:00
Simon Pilgrim	ce356e1546	[DAG] Add BuildVectorSDNode::getRepeatedSequence helper to recognise multi-element splat patterns Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method. I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns. Differential Revision: https://reviews.llvm.org/D87930	2020-10-24 12:23:09 +01:00
Simon Pilgrim	62b17a7697	[LegalizeTypes] Legalize vector rotate operations Lower vector rotate operations as long as the legalization occurs outside of LegalizeVectorOps. This fixes https://bugs.llvm.org/show_bug.cgi?id=47320 Patch By: @rsanthir.quic (Ryan Santhirarajan) Differential Revision: https://reviews.llvm.org/D89497	2020-10-24 11:30:32 +01:00
Nikita Popov	1a7a9efec3	[BasicAA] Avoid duplicate cache lookup (NFCI) Rather than performing the cache lookup with both possible orders for the locations, use the same canonicalization as the other AliasCache lookups in BasicAA.	2020-10-24 10:19:02 +02:00
Nikita Popov	d09c592142	[BasicAA] Fix caching in the presence of phi cycles Any time we insert a block into VisitedPhiBBs, previously cached values may no longer be valid for the recursive alias queries. As such, perform them using an empty AAQueryInfo. Note that if we recurse to the same phi, the block will already be inserted, so we reuse the old AAQueryInfo, and thus still protect against infinite recursion. This problem can appear with and without BatchAA, but is more likely to occur with BatchAA, as more values are cached. Differential Revision: https://reviews.llvm.org/D90066	2020-10-24 09:58:02 +02:00
Jonas Paulsson	7c026a83ee	[SystemZ] Define MaxInstLength to have the value of 6. This value had the default value of 4 which caused branch relaxation to fail. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D90065	2020-10-24 09:19:34 +02:00
Michał Górny	d96cb52830	[lldb] [Process/NetBSD] Use XStateRegSet for all FPU registers Unify the x86 regset API to use XStateRegSet for all FPU registers, therefore eliminating the legacy API based on FPRegSet. This makes the code a little bit simpler but most notably, it provides future compatibility for register caching. Since the NetBSD kernel takes care of providing compatibility with pre-XSAVE processors, PT_{G,S}ETXSTATE can be used on systems supporting only FXSAVE or even plain FSAVE (and unlike PT_{G,S}ETXMMREGS, it clearly indicates that XMM registers are not supported). Differential Revision: https://reviews.llvm.org/D90034	2020-10-24 09:17:53 +02:00
Martin Storsjö	84ce6b9991	[lldb] Fix building with GCC 7. NFC.	2020-10-24 09:33:01 +03:00
Tony	bf6518a806	[AMDGPU] Cleanup AMDGPUUsage.rst - Layout and typo improvements. - Add memory spaces section. - reStructure syntax fixes. Differential Revision: https://reviews.llvm.org/D90002	2020-10-24 06:21:27 +00:00
Michael Kruse	d590c85430	[flang] Fix pimpl idiom for IntrinsicProcTable. The class IntrinsicProcTable uses the pimpl idiom and manages its own pointer-to-implementation. However, it violates the rule-of-five and does not implement a move-constructor or assignment-operator. Due to differences between compilers in their implementation of copy elision, these may or may not be used. Due to the missing user implementation for resource handling, using them results in runtime errors. Fix by using `std::unique_ptr` instead of custom resource management. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D88794	2020-10-24 00:28:05 -05:00
Med Ismail Bennani	64c4dac60e	[llvm/DebugInfo] Emit DW_OP_implicit_value when tuning for LLDB This patch enables emitting DWARF `DW_OP_implicit_value` opcode when tuning debug information for LLDB (`-debugger-tune=lldb`). This will also propagate to Darwin platforms, since they use LLDB tuning as a default. rdar://67406059 Differential Revision: https://reviews.llvm.org/D90001 Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2020-10-24 06:45:33 +02:00
Vitaly Buka	21d64c32ec	[NFC][UBSAN] Refine CHECK pattern in test As-is it was failed by unrelated linker warning with filename in the output.	2020-10-23 21:11:03 -07:00
Peter Collingbourne	fa66bcf4bc	hwasan: Disable operator {new,delete} interceptors when interceptors are disabled. Differential Revision: https://reviews.llvm.org/D89827	2020-10-23 21:03:47 -07:00
Michael Kruse	0b671a44ad	[flang][msvc] Fix lambda capture ambiguity. NFC. Patch D88695 introduces a new local variable inside a lambda with the same name as a variable outside of it. In some of the if constexpr regions, msvc prioritizes the outer declaration and emits the error. ``` C:\Users\meinersbur\src\llvm-project\flang\lib\Evaluate\fold-implementation.h(1200): error C3493: 'context' cannot be implicitly captured because no default capture mode has been specified ``` This is fixed by giving the inner variable a different name. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D89367	2020-10-23 22:58:40 -05:00
Michael Kruse	b57937861f	[flang][windows] Support platform-specific path separator. Remove the assumption that the path separator is `/`. Use functions from `llvm::sys::path` instead. Reviewed By: isuruf, klausler Differential Revision: https://reviews.llvm.org/D89369	2020-10-23 22:22:37 -05:00
Zequan Wu	e92eeaf3c2	[llvm-cov] don't include all source files when provided source files are filtered out When all provided source files are filtered out, either due to `--ignore-filename-regex` or because they are not part of the binary, don't generate coverage results for all source files. If users want to generate coverage results for all source files, they don't even need to provide selected source files or `--ignore-filename-regex`. Differential Revision: https://reviews.llvm.org/D89359	2020-10-23 19:32:16 -07:00
David Blaikie	0b05732045	fix lldb for recent libDebugInfoDWARF API change	2020-10-23 19:20:38 -07:00
Vitaly Buka	776a15d8ae	[NFC][UBSAN] Avoid "not FileCheck" in tests It's not clear if "not FileCheck" succeeded because input is empty or because input does not match "CHECK:" pattern.	2020-10-23 19:13:01 -07:00
Duncan P. N. Exon Smith	74910cbbd8	HeaderSearch: Simplify use of FileEntryRef in HeaderSearch::LookupFile, NFC Simplify `HeaderSearch::LookupFile`. Instead of deconstructing a `FileEntryRef` into a name and `FileEntry` and then rebuilding it later, use it as is. This helps to unblock making the constructor of `FileEntryRef` private to `FileManager`. Differential Revision:	2020-10-23 22:10:50 -04:00
David Blaikie	0ec5baa132	llvm-dwarfdump: Support verbose printing DW_OP_convert to print the CU local offset before the resolved absolute offset	2020-10-23 18:50:15 -07:00
Duncan P. N. Exon Smith	434f3774f6	clangd: Stop calling FileEntryRef::FileEntryRef In `ReplayPreamble::replay`, use `getFileRef` instead of `getFile`, and then use that `FileEntryRef` later to avoid needing `FileEntryRef::FileEntryRef`. The latter is going to become private to `FileManager` in a later commit.	2020-10-23 21:28:09 -04:00
Mehdi Amini	4bde9aa964	Add CMake dependency from MLIRJitRunner on all dialects This dependency was already existing indirectly, but is now more direct since the registration relies on an inline function. This fixes the link of the tools with BFD.	2020-10-24 01:24:05 +00:00
Duncan P. N. Exon Smith	81ac81f864	FileManager: Reorder declarations of FileEntry and FileEntryRef, NFC This reduces noise in a future patch, but shouldn't change anything otherwise. Differential Revision: https://reviews.llvm.org/D89521	2020-10-23 20:47:15 -04:00
Hongtao Yu	a16cbdd676	[AutoFDO] Remove a broken assert in merging inlinee samples Duplicated callsites share the same callee profile if the original callsite was inlined. The sharing also causes the profile of callee's callee to be shared. This breaks the assert introduced earlier by D84997 in a tricky way. To illustrate, I'm using an abstract example. Say we have three functions `A`, `B` and `C`. A calls B twice and B calls C once. Some optimization performed prior to the sample profile loader duplicates the first callsite to `B` and the program may look like ``` A() { B(); // with nested profile B1 and C1 B(); // duplicated, with nested profile B1 and C1 B(); // with nested profile B2 and C2 } ``` For some reason, the sample profile loader inliner then decides to only inline the first callsite in `A` and transforms `A` into ``` A() { C(); // with nested profile C1 B(); // duplicated, with nested profile B1 and C1 B(); // with nested profile B2 and C2. } ``` Here is what happens next: 1. Failing to inline the callsite `C()` results in `C1`'s samples returned to `C`'s base (outlined) profile. In the meantime, `C1`'s head samples are updated to `C1`'s entry sample. This also affects the profile of the middle callsite which shares `C1` with the first callsite. 2. Failing to inline the middle callsite results in `B1` returned to `B`'s base profile, which in turn will cause `C1` merged into `B`'s base profile. Note that the nested `C` profile in `B`'s base has a non-zero head sample count now. The value actually equals `C1`'s entry count. 3. Failing to inline the last callsite results in `B2` returned to `B`'s base profile. Note that the nested `C` profile in `B`'s base now has an entry count equal to the sum of that of `C1` and `C2`, with the head count equal to that of `C1`. This will trigger the assert later on. 4. Compiling `B` using `B`'s base profile. Failing to inline `C` there triggers the returning of the nested `C` profile.
Since the nested `C` profile has a non-zero head count, the returning doesn't go through. Instead, the assert goes off. It's good that `C1` is only returned once, based on using a non-zero head count to ensure an inline profile is only returned once. However, `C2` is never returned. While it seems hard to solve this perfectly within the current framework, I'm just removing the broken assert. This should be reasonably fixed by the upcoming CSSPGO work where counts returning is based on context-sensitivity and a distribution factor for callsite probes. The simple example is extracted from one of our internal services. In reality, why the original callsite `B()` and the duplicated one have different inline behavior is a bit of magic. It has to do with imperfect counts in the profile and extra complicated inlining that makes the hotness for them different. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D90056	2020-10-23 17:42:21 -07:00
Mehdi Amini	8f492f6467	Remove unused verifyRegStateMapping() function in RegAllocFast (NFC) This fixes compiler warning when building with assertions.	2020-10-24 00:36:51 +00:00
Mehdi Amini	e7021232e6	Remove global dialect registration This has been deprecated for >1month now and removal was announced in: https://llvm.discourse.group/t/rfc-revamp-dialect-registration/1559/11 Differential Revision: https://reviews.llvm.org/D86356	2020-10-24 00:35:55 +00:00
Mehdi Amini	3a4b832b1b	Topologically sort the library to link to mlir-cpu-runner which is required with some linkers like BFD (NFC)	2020-10-24 00:35:55 +00:00
Mehdi Amini	035a6b95c3	Fix a few warnings from GCC (NFC)	2020-10-24 00:35:55 +00:00
Walter Erquinigo	48d8af9825	[intel-pt] Disable/Enable tracing to guarantee the trace is correct As mentioned in the comment inside the code, the Intel documentation states that the internal CPU buffer is flushed out to RAM only when tracing is disabled. Otherwise, the buffer on RAM might be stale. This diff disables tracing when the trace buffer is going to be read. This is a quite safe operation, as the reading is done when the inferior is paused at a breakpoint, so we are not losing any packets because there's no code being executed. After the reading is finished, tracing is enabled back. It's a bit hard to write a test for this now, but Greg Clayton and I will refactor the PT support and writing tests for it will be easier. However I tested it manually by doing a script that automates the following flow ``` (lldb) b main Breakpoint 1: where = a.out`main + 15 at main.cpp:4:7, address = 0x000000000040050f (lldb) r Process 3078226 stopped * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x000000000040050f a.out`main at main.cpp:4:7 (lldb) processor-trace start (lldb) b 5 Breakpoint 2: where = a.out`main + 22 at main.cpp:5:12, address = 0x0000000000400516 (lldb) c Process 3078226 resuming Process 3078226 stopped * thread #1, name = 'a.out', stop reason = breakpoint 2.1 frame #0: 0x0000000000400516 a.out`main at main.cpp:5:12 (lldb) processor-trace show-instr-log thread #1: tid=3078226 0x40050f <+15>: movl $0x0, -0x8(%rbp) >>> Before, some runs of the script up to this point lead to empty traces (lldb) b 6 Breakpoint 3: where = a.out`main + 42 at main.cpp:6:14, address = 0x000000000040052a (lldb) c Process 3092991 resuming Process 3092991 stopped * thread #1, name = 'a.out', stop reason = breakpoint 3.1 frame #0: 0x000000000040052a a.out`main at main.cpp:6:14 (lldb) processor-trace show-instr-log thread #1: tid=3092991 0x40050f <+15>: movl $0x0, -0x8(%rbp) 0x400516 <+22>: movl $0x0, -0xc(%rbp) 0x40051d <+29>: cmpl $0x2710, -0xc(%rbp) ; imm = 0x2710 
0x400524 <+36>: jge 0x400546 ; <+70> at main.cpp 0x400524 <+36>: jge 0x400546 ; <+70> at main.cpp >>> The trace was re-enabled correctly and includes the instruction of the first reading. ``` Those instructions correspond to these lines ``` 3 int main() { 4 int z = 0; 5 for (int i = 0; i < 10000; i++) { 6 z += fun(z) ... ``` Differential Revision: https://reviews.llvm.org/D85241	2020-10-23 16:36:42 -07:00
Richard Smith	ccca93b5a2	Don't allow structured binding declarations to decompose a lambda-expression's captures. The built-in structured binding rules for classes require that all fields can be accessed by name, and the fields introduced for lambda captures are unnamed, so decomposing a capturing lambda is ill-formed.	2020-10-23 16:28:25 -07:00
Krzysztof Parzyszek	1b5baa42bc	[Hexagon] Handle selection between HVX vector predicates Make sure that (select i1 q0 q1) is handled properly.	2020-10-23 18:22:03 -05:00
Max Moroz	dc62d5ec97	[libFuzzer] Added -print_full_coverage flag. -print_full_coverage=1 produces a detailed branch coverage dump when run on a single file. Uses same infrastructure as -print_coverage flag, but prints all branches (regardless of coverage status) in an easy-to-parse format. Usage: For internal use with machine learning fuzzing models which require detailed coverage information on seed files to generate mutations. Differential Revision: https://reviews.llvm.org/D85928	2020-10-23 16:05:54 -07:00
Teresa Johnson	eeba325b12	[MemProf] Attempt to debug avr bot failure Reverts the XFAIL added in `b67a2aef8a`, which had no effect. Adjust the test to make sure all output is dumped to stderr, so that hopefully I can get a better idea of where/why this is failing. Remove some redundant checking while here.	2020-10-23 16:00:08 -07:00
Arthur Eubanks	baffd052b0	[StructurizeCFG][NewPM] Port -structurizecfg to NPM This doesn't support -structurizecfg-skip-uniform-regions since that would require porting LegacyDivergenceAnalysis. The NPM doesn't support adding a non-analysis pass as a dependency of another, so I had to add -lowerswitch to some tests or pin them to the legacy PM. This is the only RegionPass in tree, so I simply copied the logic for finding all Regions from the legacy PM's RGManager into StructurizeCFG::run(). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D89026	2020-10-23 15:54:03 -07:00
Arthur Eubanks	ba22c403b2	[Inliner][NPM] Properly pass callee AAResults Fixes noalias-calls.ll under NPM. Differential Revision: https://reviews.llvm.org/D89592	2020-10-23 15:37:18 -07:00
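
As a hedged illustration of the `TypeError` quoted in commit `cbb7f1420b` above: Python's `list.append` accepts exactly one item, so calling it with two raises that kind of error. The sketch below uses a hypothetical `suffixes` list and made-up file extensions; it is not the actual Flang lit config code, and the real fix in that commit may look different.

```python
# Hypothetical stand-in for a list kept in a lit config; names are illustrative only.
suffixes = []

try:
    # Wrong: list.append takes a single item, so two arguments raise TypeError.
    suffixes.append(".f90", ".F90")
except TypeError as err:
    print(f"TypeError: {err}")

# Two valid ways to add several items to a list:
suffixes.extend([".f90", ".F90"])  # add every item from an iterable
suffixes.append(".f")              # add exactly one item
print(suffixes)
```

Either `extend` with a list or one `append` call per item avoids the error; which one reads better depends on how the values are produced.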

369963 Commits