llvm-project

Commit Graph

Author	SHA1	Message	Date
Sean Fertile	3c8c385a77	[PPC] cleanup of mayLoad/mayStore flags and memory operands. 1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store instructions. 2) Updated the flags on a number of intrinsics indicating that they write memory. 3) Added SDNPMemOperand flags for some target dependent SDNodes so that they propagate their memory operand Review: https://reviews.llvm.org/D28818 llvm-svn: 293200	2017-01-26 18:59:15 +00:00
Tony Jiang	3a2f00b024	[PowerPC] Implement missing ISA 2.06 instructions. Instructions: fctidu[.], fctiwu[.], ftdiv, ftsqrt are not implemented. Implement them and add corresponding test cases in this patch. llvm-svn: 291116	2017-01-05 15:00:45 +00:00
Nemanja Ivanovic	552c8e960e	[Power9] Allow AnyExt immediates for XXSPLTIB In some situations, the BUILD_VECTOR node that builds a v18i8 vector by a splat of an i8 constant will end up with signed 8-bit values and other situations, it'll end up with unsigned ones. Handle both situations. Fixes PR31340. llvm-svn: 289804	2016-12-15 11:16:20 +00:00
Nemanja Ivanovic	df1cb520df	[PowerPC] Improvements for BUILD_VECTOR Vol. 1 This patch corresponds to review: https://reviews.llvm.org/D25912 This is the first patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. llvm-svn: 288152	2016-11-29 16:11:34 +00:00
Ehsan Amiri	c90b02cf50	[PPC] Generate positive FP zero using xor insn instead of loading from constant area https://reviews.llvm.org/D23614 Currently we load +0.0 from constant area. That can change to be generated using XOR instruction. llvm-svn: 284995	2016-10-24 17:31:09 +00:00
Nemanja Ivanovic	11049f8f07	[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions This patch corresponds to review: https://reviews.llvm.org/D23155 This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes: Int to Fp conversions of 1 or 2-byte values loaded from memory Building vectors of 1 or 2-byte integers with values loaded from memory Storing individual 1 or 2-byte elements from integer vectors This patch implements all of those uses. llvm-svn: 283190	2016-10-04 06:59:23 +00:00
Nemanja Ivanovic	d2c3c51a70	[Power9] Exploit move and splat instructions for build_vector improvement This patch corresponds to review: https://reviews.llvm.org/D21135 This patch exploits the following instructions: mtvsrws lxvwsx mtvsrdd mfvsrld In order to improve some build_vector and extractelement patterns. llvm-svn: 282246	2016-09-23 13:25:31 +00:00
Hal Finkel	522e4d9d66	[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics PowerPC assembly code in the wild, so it seems, has things like this: bc+ 12, 28, .L9 This is a bit odd because the '+' here becomes part of the BO field, and the BO field is otherwise the first operand. Nevertheless, the ISA specification does clearly say that the +- hint syntax applies to all conditional-branch mnemonics (that test either CTR or a condition register, although not the forms which check both), both basic and extended, so this is supposed to be valid. This introduces some asm-parser-only definitions which take only the upper three bits from the specified BO value, and the lower two bits are implied by the +- suffix (via some associated aliases). Fixes PR23646. llvm-svn: 280571	2016-09-03 02:31:44 +00:00
Hal Finkel	28842b96f3	[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev These few book-III instructions are used by the Linux kernel. Partially fixes PR24796. llvm-svn: 280560	2016-09-02 23:42:01 +00:00
Hal Finkel	277736eee6	[PowerPC] Add support for the extended dcbf form and mnemonics dcbf has an optional hint-like field, add support for the extended form and the associated mnemonics (dcbfl and dcbflp). Partially fixes PR24796. llvm-svn: 280559	2016-09-02 23:41:54 +00:00
Hal Finkel	a39fd4bc53	[PowerPC] Add a pattern for a runtime bit check Following a suggestion by Sanjay, we should lower: %shl = shl i32 1, %y %and = and i32 %x, %shl %cmp = icmp eq i32 %and, %shl ret i1 %cmp into: subfic r4, r4, 32 rlwnm r3, r3, r4, 31, 31 Add this pattern and some associated patterns for the 64-bit case and the not-equal case. Fixes PR27356. llvm-svn: 280454	2016-09-02 02:34:44 +00:00
Hal Finkel	5728200f33	[PowerPC] Implement lowering for atomicrmw min/max/umin/umax Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818. llvm-svn: 279933	2016-08-28 16:17:58 +00:00
Michael Kuperstein	2bc3d4d46c	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129	2016-08-18 20:08:15 +00:00
Nemanja Ivanovic	b43bb6141e	[Power9] Add codegen for VSX word insert/extract instructions This patch corresponds to review: http://reviews.llvm.org/D20239 It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that are useful in some cases for inserting and extracting vector elements of v4[if]32 vectors. llvm-svn: 275215	2016-07-12 21:00:10 +00:00
David Majnemer	e61e4bfd87	Replace silly uses of 'signed' with 'int' llvm-svn: 273244	2016-06-21 05:10:24 +00:00
Eric Christopher	1dbb23e162	Add aliases for mfvrsave/mtvrsave. Update a test as we're now going to emit it for easier reading of generated assembly as well. llvm-svn: 272339	2016-06-09 23:27:48 +00:00
Nemanja Ivanovic	1a2b2f03e7	[PowerPC] Generate VSX version of splat word This patch corresponds to review: http://reviews.llvm.org/D18592 It allows the PPC back end to generate the xxspltw instruction where we previously only emitted vspltw. llvm-svn: 268516	2016-05-04 16:04:02 +00:00
Marcin Koscielnicki	7b32957852	[PowerPC] Fix the EH_SjLj_Setup pseudo. This instruction is just a control flow marker - it should not actually exist in the object file. Unfortunately, nothing catches it before it gets to AsmPrinter. If integrated assembler is used, it's considered to be a normal 4-byte instruction, and emitted as an all-0 word, crashing the program. With external assembler, a comment is emitted. Fixed by setting Size to 0 and handling it in MCCodeEmitter - this means the comment will still be emitted if integrated assembler is not used. This broke an ASan test, which has been disabled for a long time as a result (see the discussion on D19657). We can reenable it once this lands. llvm-svn: 267943	2016-04-28 21:24:37 +00:00
Kit Barton	7a1a9e01ad	This reverts commit r265505. Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance". This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe. llvm-svn: 267927	2016-04-28 20:00:42 +00:00
Nemanja Ivanovic	87bcae366d	[PowerPC] Basic support for P9 byte comparison and count trailing zero insns This patch corresponds to review: http://reviews.llvm.org/D17850 This patch implements the following instructions: cmprb, cmpeqb, cnttzw, cnttzw., cnttzd, cnttzd. llvm-svn: 266228	2016-04-13 18:51:18 +00:00
Chuang-Yu Cheng	024a623c55	[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance This patch implement the following instructions: - addpcis subpcis - maddhd maddhdu maddld - modsw moduw modsd modud - darn - extswsli extswsli. - setb - dtstsfi dtstsfiq Total 15 instructions Reviewers: nemanjai hfinkel tjablin amehsan kbarton http://reviews.llvm.org/D17885 llvm-svn: 265505	2016-04-06 01:47:02 +00:00
Chuang-Yu Cheng	eaf4b3d75c	[Power9] Implement copy-paste, msgsync, slb, and stop instructions This patch implements the following BookII and Book III instructions: - copy copy_first cp_abort paste paste. paste_last - msgsync - slbieg slbsync - stop Total 10 instructions Reviewers: nemanjai hfinkel tjablin amehsan kbarton llvm-svn: 265504	2016-04-06 01:46:45 +00:00
Nemanja Ivanovic	a621a7f9c3	[PowerPC] Basic support for P9 atomic loads and stores This patch corresponds to review: http://reviews.llvm.org/D18032 This patch provides asm implementation for the following instructions: lwat, ldat, stwat, stdat, ldmx, mcrxrx llvm-svn: 265022	2016-03-31 15:26:37 +00:00
Chuang-Yu Cheng	80722719eb	[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat This change implements the following vsx instructions: - Scalar Insert/Extract xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp - Vector Insert/Extract xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp xxextractuw xxinsertw - Scalar/Vector Test Data Class xststdcdp xststdcsp xststdcqp xvtstdcdp xvtstdcsp - Maximum/Minimum xsmaxcdp xsmaxjdp xsmincdp xsminjdp - Vector Byte-Reverse/Permute/Splat xxbrd xxbrh xxbrq xxbrw xxperm xxpermr xxspltib 30 instructions Thanks Nemanja for invaluable discussion! Thanks Kit's great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16842 llvm-svn: 264567	2016-03-28 08:34:28 +00:00
Kit Barton	ba532dc816	[Power9] Implement new vsx instructions: load, store instructions for vector and scalar We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to implement this new patch. This patch implements the following vsx instructions: Vector load/store: lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx stxv stxvb16x stxvh8x stxvl stxvll stxvx Scalar load/store: lxsd lxssp lxsibzx lxsihzx stxsd stxssp stxsibx stxsihx 21 instructions Phabricator: http://reviews.llvm.org/D16919 llvm-svn: 262906	2016-03-08 03:49:13 +00:00
Nemanja Ivanovic	2314e83227	Prevent renaming of CR fields in AADB when a CR restore is present This patch corresponds to review: http://reviews.llvm.org/D15930 Moves to and from CR fields depend on shifts/masks that depend on the target/source CR field. Thus, post-ra anti-dep breaking must not later change that CR register assignment. llvm-svn: 257168	2016-01-08 13:09:54 +00:00
Yury Gribov	d7dbb66eb8	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics. The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intendend for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404	2015-12-01 11:40:55 +00:00
Hal Finkel	f4052340a4	[PowerPC] Replace cntlz[.] with cntlzw[.] cntlz is the old POWER mnemonic. cntlzw is the PowerPC mnemonic. This change fixes an issue when -no-integrated-as: The opcode cntlz is unrecognized by gas Alias the POWER mnemonic cntlz[.] to the PowerPC mnemonic cntlzw[.] This is done for because the POWER cntlz mnemonic has be used by LLVM for a very long time. We need to make sure that assembly programs that are using the cntlz[.] do not break with this change. Change PowerPC tests to reflect the insn change from cntlz to cntlzw. Add assembly test to verify cntlz[.] is encoded correctly. Patch by Tom Rix! llvm-svn: 251489	2015-10-28 03:26:45 +00:00
Hal Finkel	a2cdbce661	[PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands There were really two problems here. The first was that we had the truth tables for signed i1 comparisons backward. I imagine these are not very common, but if you have: setcc i1 x, y, LT this has the '0 1' and the '1 0' results flipped compared to: setcc i1 x, y, ULT because, in the signed case, '1 0' is really '-1 0', and the answer is not the same as in the unsigned case. The second problem was that we did not have patterns (at all) for the unsigned comparisons select_cc nodes for i1 comparison operands. This was the specific cause of PR24552. These had to be added (and a missing Altivec promotion added as well) to make sure these function for all types. I've added a bunch more test cases for these patterns, and there are a few FIXMEs in the test case regarding code-quality. Fixes PR24552. llvm-svn: 246400	2015-08-30 22:12:50 +00:00
Kit Barton	4f79f96fd7	Properly handle the mftb instruction. The mftb instruction was incorrectly marked as deprecated in the PPC Backend. Instead, it should not be treated as deprecated, but rather be implemented using the mfspr instruction. A similar patch was put into GCC last year. Details can be found at: https://sourceware.org/ml/binutils/2014-11/msg00383.html. This change will replace instances of the mftb instruction with the mfspr instruction for all CPUs except 601 and pwr3. This will also be the default behaviour. Additional details can be found in: https://llvm.org/bugs/show_bug.cgi?id=23680 Phabricator review: http://reviews.llvm.org/D10419 llvm-svn: 239827	2015-06-16 16:01:15 +00:00
Bill Schmidt	e26236eed9	[PPC64] Add support for clrbhrb, mfbhrbe, rfebb. This patch adds support for the ISA 2.07 additions involving the branch history rolling buffer and event-based branching. These will not be used by typical applications, so built-in support is not required. They will only be available via inline assembly. Assembly/disassembly tests are included in the patch. llvm-svn: 238032	2015-05-22 16:44:10 +00:00
Sergey Dmitrouk	842a51bad8	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989	2015-04-28 14:05:47 +00:00
Daniel Jasper	48e93f7181	Revert "[DebugInfo] Add debug locations to constant SD nodes" This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987	2015-04-28 13:38:35 +00:00
Sergey Dmitrouk	adb4c69d5c	[DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977	2015-04-28 11:56:37 +00:00
Hal Finkel	d86e90abdd	[PowerPC] Use sync inst alias when printing So long as the choice between printing msync and sync is not ambiguous, we can print 'sync 0' and just 'sync'. llvm-svn: 235663	2015-04-23 23:05:08 +00:00
Hal Finkel	fefcfffe68	[PowerPC] Add asm/disasm support for dcbt with hint Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint field specified (non-zero). Unforunately, the syntax for this instruction is special in that it differs for server vs. embedded cores: dcbt ra, rb, th [server] dcbt th, ra, rb [embedded] where th can be omitted when it is 0. dcbtst is the same. Thus we need to play games in the parser and the printer to flip the operands around on the embedded cores. We'll use the server syntax as the default (binutils currently uses the embedded form by default, but IBM is changing that). We also stop marking dcbtst as having unmodeled side effects (this is not necessary, it is just a hint like dcbt -- noticed by inspection, so no separate test case). llvm-svn: 235657	2015-04-23 22:47:57 +00:00
Nemanja Ivanovic	c09047916a	Add LLVM support for remaining integer divide and permute instructions from ISA 2.06 This is the patch corresponding to review: http://reviews.llvm.org/D8406 It adds some missing instructions from ISA 2.06 to the PPC back end. llvm-svn: 234546	2015-04-09 23:54:37 +00:00
Hal Finkel	6e9110abe9	[PowerPC] Add asm parser support for bitmask forms of rotate-and-mask instructions The asm syntax for the 32-bit rotate-and-mask instructions can take a 32-bit bitmask instead of an (mb, me) pair. This syntax is not specified in the Power ISA manual, but is accepted by GNU as, and is documented in IBM's Assembler Language Reference. The GNU Multiple Precision Arithmetic Library (gmp) contains assembly that uses this syntax. To implement this, I moved the isRunOfOnes utility function from PPCISelDAGToDAG.cpp to PPCMCTargetDesc.h. llvm-svn: 233483	2015-03-28 19:42:41 +00:00
Kit Barton	535e69de34	Add Hardware Transactional Memory (HTM) Support This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07 (POWER8). The intrinsic support is based on GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Function' are implemented. The HTM instructions follows the RC ones and the transaction initiation result is set on RC0 (with exception of tcheck). Currently approach is to create a register copy from CR0 to GPR and comapring. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates code correctly on both test and branch and just return value. A possible future optimization could be elimitate the MFCR instruction to branch directly. The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le. This is send along a clang patch to enabled the builtins and option switch. [1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html Phabricator Review: http://reviews.llvm.org/D8247 llvm-svn: 233204	2015-03-25 19:36:23 +00:00
Hal Finkel	6a778fb7c2	[PowerPC] Remove canFoldAsLoad from instruction definitions The PowerPC backend had a number of loads that were marked as canFoldAsLoad (and I'm partially at fault here for copying around the relevant line of TableGen definitions without really looking at what it meant). This is not right; PPC (non-memory) instructions don't support direct memory operands, and so there is nothing a 'foldable' instruction could be folded into. Noticed by inspection, no test case. The one thing we might lose by doing this is ability to fold some loads into stackmap/patchpoint pseudo-instructions. However, this was untested, and would not obviously have worked for extending loads, and I'd rather re-add support for that once it can be tested. llvm-svn: 231982	2015-03-11 23:28:38 +00:00
Nemanja Ivanovic	0adf26b9b0	Add support for part-word atomics for PPC http://reviews.llvm.org/D8090#inline-67337 llvm-svn: 231843	2015-03-10 20:51:07 +00:00
Nemanja Ivanovic	e8effe1edb	Add LLVM support for PPC cryptography builtins Review: http://reviews.llvm.org/D7955 llvm-svn: 231285	2015-03-04 20:44:33 +00:00
Hal Finkel	cf59921670	[PowerPC] Make LDtocL and friends invariant loads LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node, represent loads from the TOC, which is invariant. As a result, these loads can be hoisted out of loops, etc. In order to do this, we need to generate GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory intrinsic node type. Once this is done, the MMO transfer is automatically handled for TableGen-driven instruction selection, and for nodes generated directly in PPCISelDAGToDAG, we need to transfer the MMOs manually. Also, we were not transferring MMOs associated with pre-increment loads, so do that too. Lastly, this fixes an exposed bug where R30 was not added as a defined operand of UpdateGBR. This problem was highlighted by an example (used to generate the test case) posted to llvmdev by Francois Pichet. llvm-svn: 230553	2015-02-25 21:36:59 +00:00
Hal Finkel	0746211811	[PowerPC] Cleanup unused target-specific SDAG nodes We had somehow accumulated a few target-specific SDAG nodes dealing with PPC64 TOC access that were referenced only in TableGen patterns. The associated (pseudo-)instructions are used, but are being generated directly. NFC. llvm-svn: 230518	2015-02-25 18:06:45 +00:00
Hal Finkel	c93a9a2cb4	[PowerPC] Add support for the QPX vector instruction set This adds support for the QPX vector instruction set, which is used by the enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bytes wide, holding 4 double-precision floating-point values. Boolean values, modeled here as <4 x i1> are actually also represented as floating-point values (essentially { -1, 1 } for { false, true }). QPX shares many features with Altivec and VSX, but is distinct from both of them. One major difference is that, instead of adding completely-separate vector registers, QPX vector registers are extensions of the scalar floating-point registers (lane 0 is the corresponding scalar floating-point value). The operations supported on QPX vectors mirrors that supported on the scalar floating-point values (with some additional ones for permutations and logical/comparison operations). I've been maintaining this support out-of-tree, as part of the bgclang project, for several years. This is not the entire bgclang patch set, but is most of the subset that can be cleanly integrated into LLVM proper at this time. Adding this to the LLVM backend is part of my efforts to rebase bgclang to the current LLVM trunk, but is independently useful (especially for codes that use LLVM as a JIT in library form). The assembler/disassembler test coverage is complete. The CodeGen test coverage is not, but I've included some tests, and more will be added as follow-up work. llvm-svn: 230413	2015-02-25 01:06:45 +00:00
Bill Schmidt	82f1c775a0	[PowerPC] Fix reverted patch r227976 to avoid register assignment issues See full discussion in http://reviews.llvm.org/D7491. We now hide the add-immediate and call instructions together in a separate pseudo-op, which is tagged to define GPR3 and clobber the call-killed registers. The PPCTLSDynamicCall pass prior to RA now expands this op into the two separate addi and call ops, with explicit definitions of GPR3 on both instructions, and explicit clobbers on the call instruction. The pass is now marked as requiring and preserving the LiveIntervals and SlotIndexes analyses, and fixes these up after the replacement sequences are introduced. Self-hosting has been verified on LE P8 and BE P7 with various optimization levels, etc. It has also been verified with the --no-tls-optimize flag workaround removed. llvm-svn: 228725	2015-02-10 19:09:05 +00:00
Hal Finkel	57c6ac5e41	[PowerPC] Support the (old) cntlz instruction alias Some old assembly code uses the cntlz alias for cntlzw, binutils supports this, and we should too. Fixes PR22519. llvm-svn: 228719	2015-02-10 18:45:02 +00:00
Hal Finkel	0d2a1515d5	Revert "r227976 - [PowerPC] Yet another approach to __tls_get_addr" and related fixups Unfortunately, even with the workaround of disabling the linker TLS optimizations in Clang restored (which has already been done), this still breaks self-hosting on my P7 machine (-O3 -DNDEBUG -mcpu=native). Bill is currently working on an alternate implementation to address the TLS issue in a way that also fully elides the linker bug (which, unfortunately, this approach did not fully), so I'm reverting this now. llvm-svn: 228460	2015-02-06 23:07:40 +00:00
Sylvestre Ledru	9be0b77f3f	Fix an incorrect identifier Summary: EIEIO is not a correct declaration and breaks the build under Debian HURD. Instead, E_IEIO is used. // http://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html Some additional classes of identifier names are reserved for future extensions to the C language or the POSIX.1 environment. While using these names for your own purposes right now might not cause a problem, they do raise the possibility of conflict with future versions of the C or POSIX standards, so you should avoid these names. ... Names beginning with a capital ‘E’ followed a digit or uppercase letter may be used for additional error code names. See Error Reporting.// Reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776965 And patch wrote by Svante Signell With this patch, LLVM, Clang & LLDB build under Debian HURD: https://buildd.debian.org/status/fetch.php?pkg=llvm-toolchain-3.6&arch=hurd-i386&ver=1%3A3.6~%2Brc2-2&stamp=1423040039 Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7437 llvm-svn: 228331	2015-02-05 18:57:02 +00:00
Bill Schmidt	685aa8b0c5	[PowerPC] Yet another approach to __tls_get_addr This patch is a third attempt to properly handle the local-dynamic and global-dynamic TLS models. In my original implementation, calls to __tls_get_addr were hidden from view until the asm-printer phase, at which point the underlying branch-and-link instruction was created with proper relocations. This mostly worked well, but I used some repellent techniques to ensure that the TLS_GET_ADDR nodes at the SD and MI levels correctly received input from GPR3 and produced output into GPR3. This proved to work badly in the presence of multiple TLS variable accesses, with the copies to and from GPR3 being scheduled incorrectly and generally creating havoc. In r221703, I addressed that problem by representing the calls to __tls_get_addr as true calls during instruction lowering. This had the advantage of removing all of the bad hacks and relying on the existing call machinery to properly glue the copies in place. It looked like this was going to be the right way to go. However, as a side effect of the recent discovery of problems with linker optimizations for TLS, we discovered cases of suboptimal code generation with this strategy. The problem comes when tls_get_addr is called for the same address, and there is a resulting CSE opportunity. It turns out that in such cases MachineCSE will common the addis/addi instructions that set up the input value to tls_get_addr, but will not common the calls themselves. MachineCSE does not have any machinery to common idempotent calls. This is perfectly sensible, since presumably this would be done at the IR level, and introducing calls in the back end isn't commonplace. In any case, we end up with two calls to __tls_get_addr when one would suffice, and that isn't good. I presumed that the original design would have allowed commoning of the machine-specific nodes that hid the __tls_get_addr calls, so as suggested by Ulrich Weigand, I went back to that design and cleaned it up so that the copies were properly held together by glue nodes. However, it turned out that this didn't work either...the presence of copies to physical registers kept the machine-specific nodes from being commoned also. All of which leads to the design presented here. This is a return to the original design, except that no attempt is made to introduce copies to and from GPR3 during instruction lowering. Virtual registers are used until prior to register allocation. At that point, a special pass is run that identifies the machine-specific nodes that hide the tls_get_addr calls and introduces the copies to and from GPR3 around them. The register allocator then coalesces these copies away. With this design, MachineCSE succeeds in commoning tls_get_addr calls where possible, and we get nice optimal code generation (better than GCC at the moment, which does not common these calls). One additional problem must be dealt with: After introducing the mentions of the physical register GPR3, the aggressive anti-dependence breaker sees opportunities to improve scheduling by selecting a different register instead. Flags must be used on the instruction descriptions to tell the anti-dependence breaker to keep its hands in its pockets. One thing missing from the original design was recording a definition of the link register on the GET_TLS_ADDR nodes. Doing this was found to be insufficient to force a stack frame to be created, which led to looping behavior because two different LR values were stored at the same address. This appears to have been an oversight in PPCFrameLowering::determineFrameLayout(), which is repaired here. Because MustSaveLR() returns true for calls to builtin_return_address, this changed the expected behavior of test/CodeGen/PowerPC/retaddr2.ll, which now stacks a frame but formerly did not. I've fixed the test case to reflect this. There are existing TLS tests to catch regressions; the checks in test/CodeGen/PowerPC/tls-store2.ll proved to be too restrictive in the face of instruction scheduling with these changes, so I fixed that up. I've added a new test case based on the PrettyStackTrace module that demonstrated the original problem. This checks that we get correct code generation and that CSE of the calls to __get_tls_addr has taken place. llvm-svn: 227976	2015-02-03 16:16:01 +00:00

1 2 3 4 5 ...

496 Commits