llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	04c5a34f4f	[CGObjCGNU] Rename GetSelector helper method to fix -Woverloaded-virtual warning (PR38210) As suggested by @theraven on PR38210, this patch fixes the gcc -Woverloaded-virtual warnings by renaming the extra CGObjCGNU::GetSelector method to CGObjCGNU::GetTypedSelector Differential Revision: https://reviews.llvm.org/D50448 llvm-svn: 339264	2018-08-08 15:53:14 +00:00
Balaji V. Iyer	749e8285a5	[CodeGen] IncompleteArray Support Added code to support ArrayType that is not ConstantArray. https://reviews.llvm.org/D49952 rdar://42476155 llvm-svn: 339207	2018-08-08 00:01:21 +00:00
JF Bastien	b9cc1fcf6b	[NFC] CGDecl factor out constant emission The code is cleaner this way, and with some changes I'm playing with it makes sense to split it out so we can reuse it. llvm-svn: 339191	2018-08-07 21:55:13 +00:00
Alexey Bataev	bf8fe71b91	[OPENMP] Mark variables captured in declare target region as implicitly declare target. According to OpenMP 5.0, variables captured in lambdas in declare target regions must be considered as implicitly declare target. llvm-svn: 339152	2018-08-07 16:14:36 +00:00
Scott Linder	f8b3df4dec	[OpenCL] Restore r338899 (reverted in r338904), fixing stack-use-after-return Always emit alloca in entry block for enqueue_kernel builtin. Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. llvm-svn: 339150	2018-08-07 15:52:49 +00:00
David Chisnall	9e31036302	[objc-gnustep] Don't emit .guess ivar offset vars. These were intended to allow non-fragile and fragile ABI code to be mixed, as long as the fragile classes were higher up the hierarchy than the non-fragile ones. Unfortunately: - No one actually wants to do this. - Recent versions of Linux's run-time linker break it. llvm-svn: 339128	2018-08-07 12:02:46 +00:00
Hsiangkai Wang	ea1b0e0960	Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)" Build failed in http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/27258 In lib/CodeGen/LiveDebugVariables.cpp:589, it uses std::prev(MBBI) to get DebugValue's SlotIndex. however, the previous instruction may be also a debug instruction. llvm-svn: 338992	2018-08-06 07:07:18 +00:00
Hsiangkai Wang	3bec3abf38	[DebugInfo] Generate debug information for labels. (Fix PR37395) Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. After fixing PR37395. Differential Revision: https://reviews.llvm.org/D45045 llvm-svn: 338989	2018-08-06 05:58:59 +00:00
Hsiangkai Wang	e7b3da2dc5	[DebugInfo] Use DbgVariableIntrinsic as the base class of variables. After refactoring DbgInfoIntrinsic class hierarchy, we use DbgVariableIntrinsic as the base class of variable debug info. In resolveTopLevelMetadata() in CGVTables.cpp, we only care about dbg.value, so we try to cast the instructions to DbgVariableIntrinsic before resolving variables. Differential Revision: https://reviews.llvm.org/D50226 llvm-svn: 338985	2018-08-06 04:00:08 +00:00
Richard Smith	aa140bf164	Avoid creating conditional cleanup blocks that contain only @llvm.lifetime.end calls When a non-extended temporary object is created in a conditional branch, the lifetime of that temporary ends outside the conditional (at the end of the full-expression). If we're inserting lifetime markers, this means we could end up generating if (some_cond) { lifetime.start(&tmp); Tmp::Tmp(&tmp); } // ... if (some_cond) { lifetime.end(&tmp); } ... for a full-expression containing a subexpression of the form `some_cond ? Tmp().x : 0`. This patch moves the lifetime start for such a temporary out of the conditional branch so that we don't need to generate an additional basic block to hold the lifetime end marker. This is disabled if we want precise lifetime markers (for asan's stack-use-after-scope checks) or of the temporary has a non-trivial destructor (in which case we'd generate an extra basic block anyway to hold the destructor call). Differential Revision: https://reviews.llvm.org/D50286 llvm-svn: 338945	2018-08-04 01:25:06 +00:00
Sergey Dmitriev	bde9cf942b	[OpenMP] Encode offload target triples into comdat key for offload initialization code Encoding offload target triples onto comdat group key for offload initialization code guarantees that it will be executed once per each unique combination of offload targets. Differential Revision: https://reviews.llvm.org/D50218 llvm-svn: 338916	2018-08-03 20:19:28 +00:00
Erich Keane	ed69e1bc98	[NFC] Initialize a variable to prevent future invalid deref. Found by KlockWorks, this variable is properly protected, however the conditions in the test that initializes it and the one that uses it could diverge, it seems to me that this is a 'free' init that will prevent issues if one of the conditions is ever modified without the other. llvm-svn: 338909	2018-08-03 18:08:36 +00:00
Vlad Tsyrklevich	c7d3d34b98	Revert "[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin" This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 338904	2018-08-03 17:47:58 +00:00
Scott Linder	91f578467c	[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. Differential Revision: https://reviews.llvm.org/D50104 llvm-svn: 338899	2018-08-03 15:50:52 +00:00
Michael Kruse	cba47b4978	[CodeGen] Emit parallel_loop_access for each loop in the loop stack. Summary: Emit !llvm.mem.parallel_loop_access metadata for memory accesses even if the parallel loop is not the top on the loop stack. Fixes llvm.org/PR37558. Reviewers: ABataev, hfinkel, amusman, tyler.nowicki Reviewed By: hfinkel Subscribers: Meinersbur, hfinkel, cfe-commits Differential Revision: https://reviews.llvm.org/D48808 llvm-svn: 338810	2018-08-03 04:42:52 +00:00
Heejin Ahn	00aa81b4df	[WebAssembly] Support for atomic.wait / atomic.wake builtins Summary: Add support for atomic.wait / atomic.wake builtins based on the Wasm thread proposal. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits Differential Revision: https://reviews.llvm.org/D49396 llvm-svn: 338771	2018-08-02 21:44:40 +00:00
Matt Arsenault	c65f966d76	Try to make builtin address space declarations not useless The way address space declarations for builtins currently work is nearly useless. The code assumes the address spaces used for builtins is a confusingly named "target address space" from user code using __attribute__((address_space(N))) that matches the builtin declaration. There's no way to use this to declare a builtin that returns a language specific address space. The terminology used is highly cofusing since it has nothing to do with the the address space selected by the target to use for a language address space. This feature is essentially unused as-is. AMDGPU and NVPTX are the only in-tree targets attempting to use this. The AMDGPU builtins certainly do not behave as intended (i.e. all of the builtins returning pointers can never compile because the numbered address space never matches the expected named address space). The NVPTX builtins are missing tests for some, and the others seem to rely on an implicit addrspacecast. Change the used address space for builtins based on a target hook to allow using a language address space for a builtin. This allows the same builtin declaration to be used for multiple languages with similarly purposed address spaces (e.g. the same AMDGPU builtin can be used in OpenCL and CUDA even though the constant address spaces are arbitarily different). This breaks the possibility of using arbitrary numbered address spaces alongside the named address spaces for builtins. If this is an issue we probably need to introduce another builtin declaration character to distinguish language address spaces from so-called "target address spaces". llvm-svn: 338707	2018-08-02 12:14:28 +00:00
David Green	c8e3924b3b	[UnrollAndJam] Add unroll_and_jam pragma handling This adds support for the unroll_and_jam pragma, to go with the recently added unroll and jam pass. The name of the pragma is the same as is used in the Intel compiler, and most of the code works the same as for unroll. #pragma clang loop unroll_and_jam has been separated into a different patch. This part adds #pragma unroll_and_jam with an optional count, and #pragma no_unroll_and_jam to disable the transform. Differential Revision: https://reviews.llvm.org/D47267 llvm-svn: 338566	2018-08-01 14:36:12 +00:00
Alexey Bataev	62a4cb06a9	[OPENMP] Change linkage of offloading symbols to support dropping offload targets. Changed the linkage of omp_offloading.img_start.<triple> and omp_offloading.img_end.<triple> symbols from external to external weak to allow dropping of some targets during linking. llvm-svn: 338413	2018-07-31 18:27:42 +00:00
Alexey Bataev	3823514b56	[OPENMP] Prevent problems with linking of the static variables. No need to change the linkage, we can avoid the problem using special variable. That points to the original variable and, thus, prevent some of the optimizations that might break the compilation. llvm-svn: 338399	2018-07-31 16:40:15 +00:00
Eric Christopher	a2227a3f9a	Revert "Add a definition for FieldSize that seems to make sense here." This reverts commit r338327, the problem was previously fixed in r338321. llvm-svn: 338328	2018-07-30 23:21:51 +00:00
Eric Christopher	8f02a9ed3c	Add a definition for FieldSize that seems to make sense here. This could be sunk out of the if statements, but fix the warning for now. llvm-svn: 338327	2018-07-30 23:17:27 +00:00
Scott Linder	a7c4568583	Fix use of uninitialized variable in r338299 llvm-svn: 338321	2018-07-30 22:52:07 +00:00
Scott Linder	2b5cf04180	[DebugInfo][OpenCL] Generate correct block literal debug info for OpenCL OpenCL block literal structs have different fields which are now correctly identified in the debug info. Differential Revision: https://reviews.llvm.org/D49930 llvm-svn: 338299	2018-07-30 20:31:11 +00:00
Fangrui Song	6907ce2f8f	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338291	2018-07-30 19:24:48 +00:00
Roman Lebedev	b69ba22773	[clang][ubsan] Implicit Conversion Sanitizer - integer truncation - clang part Summary: C and C++ are interesting languages. They are statically typed, but weakly. The implicit conversions are allowed. This is nice, allows to write code while balancing between getting drowned in everything being convertible, and nothing being convertible. As usual, this comes with a price: ``` unsigned char store = 0; bool consume(unsigned int val); void test(unsigned long val) { if (consume(val)) { // the 'val' is `unsigned long`, but `consume()` takes `unsigned int`. // If their bit widths are different on this platform, the implicit // truncation happens. And if that `unsigned long` had a value bigger // than UINT_MAX, then you may or may not have a bug. // Similarly, integer addition happens on `int`s, so `store` will // be promoted to an `int`, the sum calculated (0+768=768), // and the result demoted to `unsigned char`, and stored to `store`. // In this case, the `store` will still be 0. Again, not always intended. store = store + 768; // before addition, 'store' was promoted to int. } // But yes, sometimes this is intentional. // You can either make the conversion explicit (void)consume((unsigned int)val); // or mask the value so no bits will be implicitly lost. (void)consume((~((unsigned int)0)) & val); } ``` Yes, there is a `-Wconversion`` diagnostic group, but first, it is kinda noisy, since it warns on everything (unlike sanitizers, warning on an actual issues), and second, there are cases where it does not warn. So a Sanitizer is needed. I don't have any motivational numbers, but i know i had this kind of problem 10-20 times, and it was never easy to track down. The logic to detect whether an truncation has happened is pretty simple if you think about it - https://godbolt.org/g/NEzXbb - basically, just extend (using the new, not original!, signedness) the 'truncated' value back to it's original width, and equality-compare it with the original value. The most non-trivial thing here is the logic to detect whether this `ImplicitCastExpr` AST node is actually an implicit conversion, //or// part of an explicit cast. Because the explicit casts are modeled as an outer `ExplicitCastExpr` with some `ImplicitCastExpr`'s as direct children. https://godbolt.org/g/eE1GkJ Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`. So if that flag is not set, then it is an actual implicit conversion. As you may have noted, this isn't just named `-fsanitize=implicit-integer-truncation`. There are potentially some more implicit conversions to be warned about. Namely, implicit conversions that result in sign change; implicit conversion between different floating point types, or between fp and an integer, when again, that conversion is lossy. One thing i know isn't handled is bitfields. This is a clang part. The compiler-rt part is D48959. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 \| PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 \| PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 \| PR35409 ]]. Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 \| PR9821 ]]. Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions) Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane Reviewed By: rsmith, vsk, erichkeane Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D48958 llvm-svn: 338288	2018-07-30 18:58:30 +00:00
Momchil Velikov	20208cc046	[ARM, AArch64]: Use unadjusted alignment when passing composites as arguments The "Procedure Call Procedure Call Standard for the ARM® Architecture" (https://static.docs.arm.com/ihi0042/f/IHI0042F_aapcs.pdf), specifies that composite types are passed according to their "natural alignment", i.e. the alignment before alignment adjustment on the entire composite is applied. The same applies for AArch64 ABI. Clang, however, used the adjusted alignment. GCC already implements the ABI correctly. With this patch Clang becomes compatible with GCC and passes such arguments in accordance with AAPCS. Differential Revision: https://reviews.llvm.org/D46013 llvm-svn: 338279	2018-07-30 17:48:23 +00:00
Stefan Maksimovic	b9da8a5dff	[mips64][clang] Provide the signext attribute for i32 return values Additional info: see r338019. Differential Revision: https://reviews.llvm.org/D49289 llvm-svn: 338239	2018-07-30 10:44:46 +00:00
Chandler Carruth	1f82d9ba6e	Revert r337456: [CodeGen] Disable aggressive structor optimizations at -O0, take 3 This commit increases the number of sections and overall output size of .o files by 10% and sometimes a bit more. This alone is challenging for some users, but it also appears to trigger an as-yet unexplained behavior in the Gold linker where the memory usage increases considerably more than 10% (we think). The increase is also frustrating because in many (if not all) cases we end up with almost all of the growth coming from the ELF overhead of -ffunction-sections and such, not from actual extra code being emitted. Richard Smith and Eric Christopher are both going to investigate this and try to get to the bottom of what is triggering this and whether the kinds of increases here are sustainable or what options we might have to minimize the impact they have. However, this is currently breaking a pretty large number of our users' builds so reverting it while we sort out how to make progress here. I've seen a longer and more detailed update to the commit thread. llvm-svn: 338209	2018-07-29 03:05:07 +00:00
Serge Pavlov	376051820d	[UBSan] Strengthen pointer checks in 'new' expressions With this change compiler generates alignment checks for wider range of types. Previously such checks were generated only for the record types with non-trivial default constructor. So the types like: struct alignas(32) S2 { int x; }; typedef __attribute__((ext_vector_type(2), aligned(32))) float float32x2_t; did not get checks when allocated by 'new' expression. This change also optimizes the checks generated for the arrays created in 'new' expressions. Previously the check was generated for each invocation of type constructor. Now the check is generated only once for entire array. Differential Revision: https://reviews.llvm.org/D49589 llvm-svn: 338199	2018-07-28 15:33:03 +00:00
Yaxun Liu	a4005e13f7	[CUDA][HIP] Allow function-scope static const variable CUDA 8.0 E.3.9.4 says: Within the body of a __device__ or __global__ function, only __shared__ variables or variables without any device memory qualifiers may be declared with static storage class. It is unclear how a function-scope non-const static variable without device memory qualifier is implemented, therefore only static const variable without device memory qualifier is allowed, which can be emitted as a global variable in constant address space. Currently clang only allows function-scope static variable with __shared__ qualifier. This patch also allows function-scope static const variable without device memory qualifier and emits it as a global variable in constant address space. Differential Revision: https://reviews.llvm.org/D49931 llvm-svn: 338188	2018-07-28 03:05:25 +00:00
George Karpenkov	39e5137f43	[AST] Add a convenient getter from QualType to RecordDecl Differential Revision: https://reviews.llvm.org/D49951 llvm-svn: 338187	2018-07-28 02:16:13 +00:00
Sanjin Sijaric	56391d6f84	[ARM64] [Windows] Follow MS X86_64 C++ ABI when passing structs Summary: Microsoft's C++ object model for ARM64 is the same as that for X86_64. For example, small structs with non-trivial copy constructors or virtual function tables are passed indirectly. Currently, they are passed in registers when compiled with clang. Reviewers: rnk, mstorsjo, TomTan, haripul, javed.absar Reviewed By: rnk, mstorsjo Subscribers: kristof.beyls, chrib, llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D49770 llvm-svn: 338076	2018-07-26 22:18:28 +00:00
Mandeep Singh Grang	2a153101bf	[COFF, ARM64] Decide when to mark struct returns as SRet Summary: Refer the MS ARM64 ABI Convention for the behavior for struct returns: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions#return-values Reviewers: mstorsjo, compnerd, rnk, javed.absar, yinma, efriedma Reviewed By: rnk, efriedma Subscribers: haripul, TomTan, yinma, efriedma, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49464 llvm-svn: 338050	2018-07-26 18:07:59 +00:00
Ana Pazos	1eee1b771f	[RISCV] Add support for interrupt attribute Summary: Clang supports the GNU style ``__attribute__((interrupt))`` attribute on RISCV targets. Permissible values for this parameter are user, supervisor, and machine. If there is no parameter, then it defaults to machine. Reference: https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html Based on initial patch by Zhaoshi Zheng. Reviewers: asb, aaron.ballman Reviewed By: asb, aaron.ballman Subscribers: rkruppe, the_o, aaron.ballman, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, cfe-commits Differential Revision: https://reviews.llvm.org/D48412 llvm-svn: 338045	2018-07-26 17:37:45 +00:00
Akira Hatanaka	cb6a933c9b	[CodeGen][ObjC] Make block copy/dispose helper functions exception-safe. When an exception is thrown in a block copy helper function, captured objects that have previously been copied should be destructed or released. Similarly, captured objects that are yet to be released should be released when an exception is thrown in a dispose helper function. rdar://problem/42410255 Differential Revision: https://reviews.llvm.org/D49718 llvm-svn: 338041	2018-07-26 16:51:21 +00:00
Alexey Bataev	8521ff6ec4	[OPENMP] ThreadId in serialized parallel regions is 0. The first argument for the parallel outlined functions, called as serialized parallel regions, should be a pointer to the global thread id that always is 0. llvm-svn: 337957	2018-07-25 20:03:01 +00:00
JF Bastien	6508929da9	CodeGen: use non-zero memset when possible for automatic variables Summary: Right now automatic variables are either initialized with bzero followed by a few stores, or memcpy'd from a synthesized global. We end up encountering a fair amount of code where memcpy of non-zero byte patterns would be better than memcpy from a global because it touches less memory and generates a smaller binary. The optimizer could reason about this, but it's not really worth it when clang already knows. This code could definitely be more clever but I'm not sure it's worth it. In particular we could track a histogram of bytes seen and figure out (as we do with bzero) if a memset could be followed by a handful of stores. Similarly, we could tune the heuristics for GlobalSize, but using the same as for bzero seems conservatively OK for now. <rdar://problem/42563091> Reviewers: dexonsmith Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D49771 llvm-svn: 337887	2018-07-25 04:29:03 +00:00
Shiva Chen	0ed11a9792	Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)" This reverts commit 4288dd3bf082482e02c8a044c611c18168cb0180. llvm-svn: 337803	2018-07-24 02:57:11 +00:00
Shiva Chen	c50fbb9da7	[DebugInfo] Generate debug information for labels. (Fix PR37395) Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. After fixing PR37395. Differential Revision: https://reviews.llvm.org/D45045 Patch by Hsiangkai Wang. llvm-svn: 337800	2018-07-24 02:23:59 +00:00
Thomas Anderson	b6d87cfe5f	Borrow visibility from __fundamental_type_info for generated fundamental type infos This is necessary so the clang gives hidden visibility to fundamental types when -fvisibility=hidden is passed. Fixes https://bugs.llvm.org/show_bug.cgi?id=35066 Differential Revision: https://reviews.llvm.org/D49109 llvm-svn: 337788	2018-07-24 00:43:47 +00:00
Richard Smith	f66e4f7dbd	Support lifetime-extension of conditional temporaries. llvm-svn: 337767	2018-07-23 22:56:45 +00:00
Aaron Smith	044326c7fc	[CodeGen] Record if a C++ record is a trivial type Summary: This has a dependence on D45122 Reviewers: rnk, zturner, llvm-commits, aleksandr.urakov Reviewed By: rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D45124 llvm-svn: 337736	2018-07-23 20:49:07 +00:00
Ivan A. Kosarev	8264bb8d34	[NEON] Fix support for vrndi_f32(), vrndiq_f32() and vrndns_f32() intrinsics This patch adds support for vrndi_f32() and vrndiq_f32() intrinsics in AArch32 mode and for vrndns_f32() intrinsic in AArch64 mode. Differential Revision: https://reviews.llvm.org/D48829 llvm-svn: 337690	2018-07-23 13:26:37 +00:00
Yaxun Liu	e1bfbc589f	[HIP] Support -fcuda-flush-denormals-to-zero for amdgcn Differential Revision: https://reviews.llvm.org/D48287 llvm-svn: 337639	2018-07-21 02:02:22 +00:00
JF Bastien	ed92f608bb	[NFC] CodeGen: rename memset to bzero The optimization looks for opportunities to emit bzero, not memset. Rename the functions accordingly (and clang-format the diff) because I want to add a fallback optimization which actually tries to generate memset. bzero is still better and it would confuse the code to merge both. llvm-svn: 337636	2018-07-20 23:37:12 +00:00
Yaxun Liu	f99752b66b	[HIP] Register/unregister device fat binary only once HIP generates one fat binary for all devices after linking. However, for each compilation unit a ctor function is emitted which register the same fat binary. Measures need to be taken to make sure the fat binary is only registered once. Currently each ctor function calls __hipRegisterFatBinary and stores the returned value to __hip_gpubin_handle. This patch changes the linkage of __hip_gpubin_handle to be linkonce so that they are shared between LLVM modules. Then this patch adds check of value of __hip_gpubin_handle to make sure __hipRegisterFatBinary is only called once. The code is equivalent to void *_gpubin_handle; void ctor() { if (__hip_gpubin_handle == 0) { __hip_gpubin_handle = __hipRegisterFatBinary(...); } // register kernels and variables. } The patch also does similar change to dtors so that __hipUnregisterFatBinary is called once. Differential Revision: https://reviews.llvm.org/D49083 llvm-svn: 337631	2018-07-20 22:45:24 +00:00
Reid Kleckner	891b2714bc	[codeview] Don't emit variable templates as class members MSVC doesn't, so neither should we. Fixes PR38004, which is a crash that happens when we try to emit debug info for a still-dependent partial variable template specialization. As a follow-up, we should review what we're doing for function and class member templates. It looks like we don't filter those out, but I can't seem to get clang to emit any. llvm-svn: 337616	2018-07-20 20:55:00 +00:00
Akira Hatanaka	dbfa453e41	[CodeGen][ObjC] Make copying and disposing of a non-escaping block no-ops. A non-escaping block on the stack will never be called after its lifetime ends, so it doesn't have to be copied to the heap. To prevent a non-escaping block from being copied to the heap, this patch sets field 'isa' of the block object to NSConcreteGlobalBlock and sets the BLOCK_IS_GLOBAL bit of field 'flags', which causes the runtime to treat the block as if it were a global block (calling _Block_copy on the block just returns the original block and calling _Block_release is a no-op). Also, a new flag bit 'BLOCK_IS_NOESCAPE' is added, which allows the runtime or tools to distinguish between true global blocks and non-escaping blocks. rdar://problem/39352313 Differential Revision: https://reviews.llvm.org/D49303 llvm-svn: 337580	2018-07-20 17:10:32 +00:00
Erich Keane	3efe00206f	Implement cpu_dispatch/cpu_specific Multiversioning As documented here: https://software.intel.com/en-us/node/682969 and https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning is an ICC feature that provides for function multiversioning. This feature is implemented with two attributes: First, cpu_specific, which specifies the individual function versions. Second, cpu_dispatch, which specifies the location of the resolver function and the list of resolvable functions. This is valuable since it provides a mechanism where the resolver's TU can be specified in one location, and the individual implementions each in their own translation units. The goal of this patch is to be source-compatible with ICC, so this implementation diverges from the ICC implementation in a few ways: 1- Linux x86/64 only: This implementation uses ifuncs in order to properly dispatch functions. This is is a valuable performance benefit over the ICC implementation. A future patch will be provided to enable this feature on Windows, but it will obviously more closely fit ICC's implementation. 2- CPU Identification functions: ICC uses a set of custom functions to identify the feature list of the host processor. This patch uses the cpu_supports functionality in order to better align with 'target' multiversioning. 1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function marked cpu_dispatch be an empty definition. This patch supports that as well, however declarations are also permitted, since the linker will solve the issue of multiple emissions. Differential Revision: https://reviews.llvm.org/D47474 llvm-svn: 337552	2018-07-20 14:13:28 +00:00

1 2 3 4 5 ...

11775 Commits