llvm-project

Commit Graph

Author	SHA1	Message	Date
Vedant Kumar	4960fbf391	[test] Require 'asserts' for a test which uses -debug-only Without this line, bots which run check-all on Release compilers will break. llvm-svn: 266386	2016-04-14 23:32:40 +00:00
Matt Arsenault	9c499c3a74	AMDGPU: Remove custom load/store scalarization llvm-svn: 266385	2016-04-14 23:31:26 +00:00
Adrian McCarthy	f6034c4f3d	Don't disable stdin buffering on Windows Disabling buffering exposes a bug in the MS VS 2015 CRT implementation of fgets, where you sometimes have to hit Enter twice, depending on if the input had an odd or even number of characters. This was hidden until a few days ago by the Python initialization which was re-enabling buffering on the streams. A few days ago, Enrico make the Python initialization on-demand, which exposed this problem. llvm-svn: 266384	2016-04-14 23:31:17 +00:00
Matt Arsenault	1dd2752e7a	AMDGPU: Add test for generic builtin behavior llvm-svn: 266383	2016-04-14 22:34:39 +00:00
Matt Arsenault	fd8ab09c0e	AMDGPU: Include LDS size in printed comment llvm-svn: 266382	2016-04-14 22:11:51 +00:00
Michael Kuperstein	16f13e252b	[AliasSetTracker] Correctly handle changing the size of an entry If the size of an AST entry changes, we also need to make sure we perform necessary alias set merges, as the new size may overlap pointers in other sets. We happen to run into this with memset, because memset allows an entry for a i8* pointer to have a decidedly non-i8 size. This fixes PR27262. Differential Revision: http://reviews.llvm.org/D18939 llvm-svn: 266381	2016-04-14 22:00:11 +00:00
Mehdi Amini	dc4c095d51	Nuke getGlobalContext() from LLVM (but the C API) The only use for getGlobalContext() is in the C API. Let's just move the static global here and nuke the C++ API. Differential Revision: http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266380	2016-04-14 21:59:18 +00:00
Mehdi Amini	03b42e41bf	Remove every uses of getGlobalContext() in LLVM (but the C API) At the same time, fixes InstructionsTest::CastInst unittest: yes you can leave the IR in an invalid state and exit when you don't destroy the context (like the global one), no longer now. This is the first part of http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266379	2016-04-14 21:59:01 +00:00
Matt Arsenault	3d1c1deb04	AMDGPU: Run SIFoldOperands after PeepholeOptimizer PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective. shader-db stats: Totals: SGPRS: 34200 -> 34336 (0.40 %) VGPRS: 22118 -> 21655 (-2.09 %) Code Size: 632144 -> 633460 (0.21 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 10240 -> 11264 (10.00 %) bytes per wave Max Waves: 8822 -> 8918 (1.09 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 7704 -> 7840 (1.77 %) VGPRS: 5169 -> 4706 (-8.96 %) Code Size: 234444 -> 235760 (0.56 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 0 -> 1024 (0.00 %) bytes per wave Max Waves: 1188 -> 1284 (8.08 %) Wait states: 0 -> 0 (0.00 %) Increases: SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %) Decreases: SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %) llvm-svn: 266378	2016-04-14 21:58:24 +00:00
Matt Arsenault	4ac341c8b3	AMDGPU: Directly emit m0 initialization with s_mov_b32 Currently what comes out of instruction selection is a register initialized to -1, and then copied to m0. MachineCSE doesn't consider copies, but we want these to be CSEed. This isn't much of a problem currently, because SIFoldOperands is run immediately after. This avoids regressions when SIFoldOperands is run later from leaving all copies to m0. llvm-svn: 266377	2016-04-14 21:58:15 +00:00
Matt Arsenault	7900334dd5	AMDGPU: Fold bitcasts of scalar constants to vectors This cleans up some messes since the individual scalar components can be CSEed. llvm-svn: 266376	2016-04-14 21:58:07 +00:00
Rui Ueyama	34b60f1c40	Do not use llvm::getGlobalContext(). llvm-svn: 266375	2016-04-14 21:41:44 +00:00
Mehdi Amini	03a5042e4c	Revert "Do not use llvm::getGlobalContext(), trying to nuke it from LLVM" This reverts commit r266365 and r266367, the contexts in the two places have to match. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266373	2016-04-14 21:31:46 +00:00
Geoff Berry	6381713b37	[ScheduleDAGInstrs] Re-factor for based on review feedback. NFC. Summary: Re-factor some code to improve clarity and style based on review comments from http://reviews.llvm.org/D18093. Reviewers: MatzeB, mcrosier Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19128 llvm-svn: 266372	2016-04-14 21:31:07 +00:00
Marcin Koscielnicki	9e09355477	[sanitizer] [SystemZ] Fix stack traces. On s390, the return address is in %r14, which is saved 14 words from the frame pointer. Unfortunately, there's no way to do a proper fast backtrace on SystemZ with current LLVM - the saved %r15 in fixed-layout register save area points to the containing frame itself, and not to the next one. Likewise for %r11 - it's identical to %r15, unless alloca is used (and even if it is, it's still useless). There's just no way to determine frame size / next frame pointer. -mbackchain would fix that (and make the current code just work), but that's not yet supported in LLVM. We will thus need to XFAIL some asan tests (Linux/stack-trace-dlclose.cc, deep_stack_uaf.cc). Differential Revision: http://reviews.llvm.org/D18895 llvm-svn: 266371	2016-04-14 21:19:27 +00:00
Marcin Koscielnicki	20bf94209e	[sanitizer] [SystemZ] Add/fix kernel and libc type definitions. This is the first part of upcoming asan support for s390 and s390x. Note that there are bits for 31-bit support in this and subsequent patches - while LLVM itself doesn't support it, gcc should be able to make use of it just fine. Differential Revision: http://reviews.llvm.org/D18888 llvm-svn: 266370	2016-04-14 21:17:19 +00:00
Samuel Benzaquen	4fa2d57c6d	[clang-tidy] Add check misc-multiple-statement-macro Summary: The check detects multi-statement macros that are used in unbraced conditionals. Only the first statement will be part of the conditionals and the rest will fall outside of it and executed unconditionally. Reviewers: alexfh Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18766 llvm-svn: 266369	2016-04-14 21:15:57 +00:00
Simon Atanasyan	1ca263c890	[ELF][MIPS] Make R_MIPS_LO16 a relative relocation if it references _gp_disp symbol The _gp_disp symbol designates offset between start of function and 'gp' pointer into GOT. The following code is a typical MIPS function preamble used to setup $gp register: lui $gp, %hi(_gp_disp) addi $gp, $gp, %lo(_gp_disp) To calculate R_MIPS_HI16 / R_MIPS_LO16 relocations results we use the following formulas: %hi(_gp - P + A) %lo(_gp - P + A + 4), where _gp is a value of _gp symbol, A is addend, and P current address. The R_MIPS_LO16 relocation references _gp_disp symbol is always the second instruction. That is why we need four byte adjustments. The patch assigns R_PC type for R_MIPS_LO16 relocation and adjusts its addend by 4. That fix R_MIPS_LO16 calculation. For details see p. 4-19 at ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf Differential Revision: http://reviews.llvm.org/D19115 llvm-svn: 266368	2016-04-14 21:10:05 +00:00
Mehdi Amini	9a15293ec1	Use fully qualified name to refer to LLVMContext From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266367	2016-04-14 21:09:10 +00:00
Reid Kleckner	e1a16467a3	In vector comparisons, handle scalar LHS just as we handle scalar RHS Summary: Fixes PR27258 Reviewers: rsmith Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D19123 llvm-svn: 266366	2016-04-14 21:03:38 +00:00
Mehdi Amini	f1e7ff9d0e	Do not use llvm::getGlobalContext(), trying to nuke it from LLVM From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266365	2016-04-14 20:53:39 +00:00
Rafael Espindola	7f0b727235	Specialize the symbol table data structure a bit. We never need to iterate over the K,V pairs, so we can avoid copying the key as MapVector does. This is a small speedup on most benchmarks. llvm-svn: 266364	2016-04-14 20:42:43 +00:00
Renato Golin	5cb666add7	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communitues and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Sanjay Patel	e998b91d86	[InstCombine] remove constant by inverting compare + logic (PR27105) https://llvm.org/bugs/show_bug.cgi?id=27105 We can check if all bits outside of a constant mask are set with a single constant. As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction. Differential Revision: http://reviews.llvm.org/D18842 llvm-svn: 266362	2016-04-14 20:17:40 +00:00
Greg Clayton	dfa63248c7	Fix Xcode project after recent s390x changes. llvm-svn: 266361	2016-04-14 20:05:21 +00:00
Dehao Chen	34cc676732	Fix null pointer access for discriminator assignment. Summary: This fixes the buildbot failure. Reviewers: dnovillo, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19129 llvm-svn: 266360	2016-04-14 19:46:38 +00:00
Richard Smith	f32cc293fa	Make this code less brittle. The benefits of a fixed-size array aren't worth the maintenance cost. llvm-svn: 266359	2016-04-14 19:45:19 +00:00
Aaron Ballman	b602ee7d8f	Add support for type aliases to modernize-redundant-void-arg.cpp Patch by Clement Courbet. llvm-svn: 266358	2016-04-14 19:28:13 +00:00
Rafael Espindola	c9157d35d9	Hash symbol names only once per global SymbolBody. The DenseMap doesn't store hash results. This means that when it is resized it has to recompute them. This patch is a small hack that wraps the StringRef in a struct that remembers the hash value. That way we can be sure it is only hashed once. llvm-svn: 266357	2016-04-14 19:17:16 +00:00
Tom Stellard	000c5af3e6	AMDGPU: Add skeleton GlobalIsel implementation Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 llvm-svn: 266356	2016-04-14 19:09:28 +00:00
Rafael Espindola	3151d89595	Simplify handling of size relocations. NFC. llvm-svn: 266355	2016-04-14 18:39:44 +00:00
Dehao Chen	46f8fbbb1b	Update discriminator assignment algorithm to handle nested call correctly. Summary: Add discriminator for nested call correctly. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19127 llvm-svn: 266354	2016-04-14 18:37:18 +00:00
Richard Smith	9c4fb0a833	Fix off-by-one error in worst-case number of offsets needed for an AST record. llvm-svn: 266353	2016-04-14 18:32:54 +00:00
Ulrich Weigand	acc50e0a99	Fix regression in gnu_libstdcpp.py introduced by r266313 CreateChildAtOffset needs a byte offset, not an element number. llvm-svn: 266352	2016-04-14 18:31:12 +00:00
Reid Kleckner	28865809fe	Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351	2016-04-14 18:29:59 +00:00
Davide Italiano	96d2a1c603	[ValueMapper] Range-loopify to improve readability. NFC. llvm-svn: 266350	2016-04-14 18:07:32 +00:00
Jacques Pienaar	ad1db3597e	[lanai] Add custom lowering for SRL_PARTS i32. llvm-svn: 266349	2016-04-14 17:59:22 +00:00
Tom Stellard	cef0fe4245	[GlobalISel] Move GISelAccessor class into public headers Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348	2016-04-14 17:45:38 +00:00
Nicolai Haehnle	13d90f324c	[DivergenceAnalysis] Treat PHI with incoming undef as constant Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this lead to branch conditions that were uniform before StructurizeCFG to become non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347	2016-04-14 17:42:47 +00:00
Nicolai Haehnle	05b127da06	[StructurizeCFG] Annotate branches that were treated as uniform Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346	2016-04-14 17:42:35 +00:00
Nicolai Haehnle	723b73b4eb	AMDGPU: Remove SIFixSGPRLiveRanges pass Summary: This pass is unnecessary and overly conservative. It was motivated by situations like def %vreg0:SGPR_32 ... if-block: .. def %vreg1:SGPR_32 ... else-block: ... use %vreg0:SGPR_32 ... and similar situations with uses after the non-uniform control flow, where we are not allowed to assign %vreg0 and %vreg1 to the same physical register, even though in the original, thread/workitem-based CFG, it looks like the live ranges of these registers do not overlap. However, by the time register allocation runs, we have moved to a wave-based CFG that accurately represents the fact that the wave may run through both the if- and the else-block. So the live ranges of %vreg0 and %vreg1 already overlap even without the SIFixSGPRLiveRanges pass. In addition to proving this change correct, I have tested it with Piglit and a small number of other tests. Reviewers: arsenm, tstellarAMD Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19041 llvm-svn: 266345	2016-04-14 17:42:29 +00:00
Nicolai Haehnle	19f0f5177d	AMDGPU: change a redundant if () to an assert(). NFC Summary: I've been carrying this change around with me for a while, because the if () managed to confuse me while following the code. All callers ensure that the assertion holds. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19042 llvm-svn: 266344	2016-04-14 17:42:18 +00:00
Ulrich Weigand	4f42310dfb	Disable LinuxCoreTestCase.test_s390x This seems to hang on non-s390x hosts. Disable for now to get the build bots going again. llvm-svn: 266343	2016-04-14 17:36:41 +00:00
Tom Stellard	b72a65ff53	[GlobalISel] Coding style and whitespace fixes Reviewers: qcolombet Subscribers: joker.eph, llvm-commits, vkalintiris Differential Revision: http://reviews.llvm.org/D19119 llvm-svn: 266342	2016-04-14 17:23:33 +00:00
Ulrich Weigand	da70c17bfc	Revert r266311 - Fix usage of APInt.getRawData for big-endian systems Try to get 32-bit build bots running again. llvm-svn: 266341	2016-04-14 17:22:18 +00:00
George Rimar	5cfd306e00	Move variables closer to code scopes that uses them. NFC. llvm-svn: 266340	2016-04-14 17:05:56 +00:00
Tim Northover	cdf1529c01	AArch64: expand cmpxchg after regalloc at -O0. FastRegAlloc works only at the basic-block level and spills all live-out registers. Unfortunately for a stack-based cmpxchg near the spill slots, this can perpetually clear the exclusive monitor, which means the cmpxchg will never succeed. I believe the only way to handle this within LLVM is by expanding the loop post-regalloc. We don't want this in general because it severely limits the optimisations that can be done, so we limit this to -O0 compilations. It's an ugly hack, and about the one good point in the whole mess is that we can treat all cmpxchg operations in the most naive way possible (seq_cst, no clrex faff) without affecting correctness. Should fix PR25526. llvm-svn: 266339	2016-04-14 17:03:29 +00:00
Jacques Pienaar	add4a274ba	[lanai] Add areMemAccessesTriviallyDisjoint, getMemOpBaseRegImmOfs and getMemOpBaseRegImmOfsWidth. Summary: Add getMemOpBaseRegImmOfsWidth to enable determining independence during MiSched. Reviewers: eliben, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18903 llvm-svn: 266338	2016-04-14 16:47:42 +00:00
Tom Stellard	79a1fd718c	AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337	2016-04-14 16:27:07 +00:00
Tom Stellard	f110f8f9f7	AMDGPU/SI: Use the correct scratch wave offset register for shaders. Summary: The code previously always used s1 as it was using the user + system SGPR information for compute kernels. This is incorrect for Mesa shaders though, The register should be the next SGPR after all user and system SGPR's. We use that Mesa adds arguments for all input and system SGPR's and take the next available SGPR for the scratch wave offset register. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewers: mareko, arsenm, nhaehnle, tstellarAMD Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18941 Patch By: Bas Nieuwenhuizen llvm-svn: 266336	2016-04-14 16:27:03 +00:00

1 2 3 4 5 ...

227885 Commits All Branches Search

227885 Commits

All Branches