llvm-project

Commit Graph

Author	SHA1	Message	Date
Shuai Wang	02bfd89c93	Fix build failure caused by D52157 llvm-svn: 342408	2018-09-17 20:10:33 +00:00
Shuai Wang	e0248aecbe	[ASTMatchers] Let isArrow also support UnresolvedMemberExpr, CXXDependentScopeMemberExpr Reviewers: aaron.ballman Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D52157 llvm-svn: 342407	2018-09-17 18:48:43 +00:00
Michael Kruse	534c87df82	[Loopinfo] Remove one latch-case in getLoopID. NFC. getLoopID has different control flow for two cases: If there is a single loop latch and for any other number of loop latches (0 and more than one). The latter case should return the same result if there is only a single latch. We can save the preceding redundant search for a latch by handling both cases with the same code. Differential Revision: https://reviews.llvm.org/D52118 llvm-svn: 342406	2018-09-17 18:40:29 +00:00
Jessica Paquette	bd72988c3a	[MachineOutliner][NFC] Don't map more illegal instrs than you have to We were mapping an instruction every time we saw something we couldn't map before this. Since each illegal mapping is unique, we only have to do this once. This makes it so that we don't map illegal instructions when the previous mapped instruction was illegal. In CTMark (AArch64), this results in 240 fewer instruction mappings on average over 619 files in total. The largest improvement is 12576 fewer mappings in one file, and the smallest is 0. The median improvement is 101 fewer mappings. llvm-svn: 342405	2018-09-17 18:40:21 +00:00
Davide Italiano	d405d2792d	Revert "[IRInterpreter] Minor cleanups, add comments. NFCI." This breaks buildbots. llvm-svn: 342404	2018-09-17 18:14:38 +00:00
Shuai Wang	4b8452998a	[clang-tidy] Remove duplicated logic in UnnecessaryValueParamCheck and use FunctionParmMutationAnalyzer instead. Reviewers: alexfh, JonasToth, george.karpenkov Subscribers: xazax.hun, kristof.beyls, chrib, a.sidorin, Szelethus, cfe-commits Differential Revision: https://reviews.llvm.org/D52158 llvm-svn: 342403	2018-09-17 17:59:51 +00:00
Keno Fischer	c8ccaed325	[X86ISel] Implement byval lowering for Win64 calling convention Summary: The IR reference for the `byval` attribute states: ``` This indicates that the pointer parameter should really be passed by value to the function. The attribute implies that a hidden copy of the pointee is made between the caller and the callee, so the callee is unable to modify the value in the caller. This attribute is only valid on LLVM pointer arguments. ``` However, on Win64, this attribute is unimplemented and the raw pointer is passed to the callee instead. This is problematic, because frontend authors relying on the implicit hidden copy (as happens for every other calling convention) will see the passed value silently (if mutable memory) or loudly (by means of a crash) modified because the callee treats the location as scratch memory space it is allowed to mutate. At this point, it's worth taking a step back to understand the context. In most calling conventions, aggregates that are too large to be passed in registers instead get copied to the stack at a fixed (computable from the signature) offset of the stack pointer. At the LLVM level, we hide this copy behind the `byval` attribute. The caller passes a pointer to the desired data and the callee receives a pointer, but these pointers are not the same. In particular, the pointer that the callee receives points to temporary stack memory allocated as part of the call lowering. In most calling conventions, this pointer is never realized in registers or memory. The temporary memory is simply defined by an implicit offset from the stack pointer at function entry. Win64, uniquely, works differently. The structure is still passed in memory, but instead of being stored at an implicit memory offset, the caller computes a pointer to the temporary memory and passes it to the callee as a regular pointer (taking up a register, or if all registers are taken up, an additional stack slot). 
Presumably, this was done to allow eliding the copy when passing aggregates through several functions on the stack. This explains why ignoring the `byval` attribute mostly works on Win64. The argument simply gets passed as a pointer and as long as we're ok with the callee trampling all over that memory, there are no ill effects. However, it does contradict the documentation of the `byval` attribute which specifies that there is to be an implicit copy. Frontends can of course work around this by never emitting the `byval` attribute for Win64 and creating `alloca`s for the requisite temporary stack slots (and that does appear to be what frontends are doing). However, the presence of the `byval` attribute is a trap for frontend authors, since it seems to work, but silently modifies the passed memory contrary to documentation. I see two solutions: - Disallow the `byval` attribute in the verifier if using the Win64 calling convention. - Make it work by simply emitting a temporary stack copy as we would with any other calling convention (frontends can of course always not use the attribute if they want to elide the copy). This patch implements the second option (make it work), though I would be fine with the first also. Ref: https://github.com/JuliaLang/julia/issues/28338 Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51842 llvm-svn: 342402	2018-09-17 17:37:14 +00:00
Nico Weber	5ffd8cedf4	lld-link: Also demangle undefined dllimported symbols. dllimported symbols go through an import stub that's called __imp_ followed by the name the stub points to. Make that work. Differential Revision: https://reviews.llvm.org/D52145 llvm-svn: 342401	2018-09-17 16:31:20 +00:00
Stanislav Mekhanoshin	06d3b4139e	[AMDGPU] Initialize instruction itinerary from GCNSubtarget I need to use it in the GCN codegen. Differential Revision: https://reviews.llvm.org/D52123 llvm-svn: 342400	2018-09-17 16:04:32 +00:00
Alexander Kornienko	e74e0f11d1	Revert "[DWARF] reposting r342048, which was reverted in r342056 due to buildbot errors. Adjusted 2 test cases for ARM and darwin and fixed a bug with the original change in dsymutil." This reverts commit r342218. Due to a number of failures under TSAN. An isolated test case is being worked on. llvm-svn: 342399	2018-09-17 15:40:01 +00:00
Xin Tong	8a505c64b0	[CVP] Handle instructions with no user. No need to create CVPLattice state. This handles terminator instructions and more. Summary: I tested this patch by compiling sqlite3.ll (clang -O3 -mllvm -disable-llvm-optzns sqlite3.c.) opt -called-value-propagation sqlite3.ll -time-passes -f -o out.ll I get 10+% speedup for the pass. I expect some of the gain come from skipping terminator instructions. === BEFORE THE PATCH === ===-------------------------------------------------------------------------=== ... Pass execution timing report ... ===-------------------------------------------------------------------------=== Total Execution Time: 0.5562 seconds (0.5582 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 0.2485 ( 46.4%) 0.0120 ( 57.7%) 0.2605 ( 46.8%) 0.2615 ( 46.8%) Bitcode Writer 0.1607 ( 30.0%) 0.0079 ( 37.7%) 0.1685 ( 30.3%) 0.1693 ( 30.3%) Called Value Propagation 0.1262 ( 23.6%) 0.0009 ( 4.5%) 0.1271 ( 22.9%) 0.1275 ( 22.8%) Module Verifier 0.5353 (100.0%) 0.0209 (100.0%) 0.5562 (100.0%) 0.5582 (100.0%) Total === AFTER THE PATCH === ===-------------------------------------------------------------------------=== ... Pass execution timing report ... ===-------------------------------------------------------------------------=== Total Execution Time: 0.5338 seconds (0.5355 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 0.2498 ( 48.6%) 0.0118 ( 59.3%) 0.2615 ( 49.0%) 0.2629 ( 49.1%) Bitcode Writer 0.1377 ( 26.8%) 0.0075 ( 37.8%) 0.1452 ( 27.2%) 0.1455 ( 27.2%) Called Value Propagation 0.1264 ( 24.6%) 0.0006 ( 3.0%) 0.1270 ( 23.8%) 0.1271 ( 23.7%) Module Verifier 0.5139 (100.0%) 0.0199 (100.0%) 0.5338 (100.0%) 0.5355 (100.0%) Total Reviewers: davide, mssimpso Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49108 llvm-svn: 342398	2018-09-17 15:28:01 +00:00
Amara Emerson	91c2913522	Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index."" Fixed the assertion failure. llvm-svn: 342397	2018-09-17 14:40:13 +00:00
Jonas Devlieghere	9d7cecfcbf	[DebugInfo] Remove redundant argument. [NFC] Removes the redundant UnitType parameter from verifyUnitContents. I also fixed some formatting issues as I was touching the file. llvm-svn: 342396	2018-09-17 14:23:47 +00:00
Sam Parker	481cdab919	[ARM] Cleanup ARM CGP isSupportedValue isSupportedValue explicitly checked and accepted many types of value, primarily for debugging reasons. Remove most of these checks and do a bit of refactoring now that the pass is more stable. This also enables ZExts to be sources, but this has very little practical benefit at the moment, as extend instructions will still be introduced. Differential Revision: https://reviews.llvm.org/D52080 llvm-svn: 342395	2018-09-17 13:57:39 +00:00
Simon Pilgrim	a2fd56c3e4	Fix "not all control paths return a value" MSVC warning. NFCI. llvm-svn: 342394	2018-09-17 13:56:42 +00:00
Jonas Toth	b1efe51dd9	[clang-tidy] fix PR37913, templated exception factory diagnosed correctly Summary: PR37913 documents wrong behaviour for a templated exception factory function. The check does misidentify dependent types as not derived from std::exception. The fix to this problem is to ignore dependent types, the analysis works correctly on the instantiated function. Reviewers: aaron.ballman, alexfh, hokein, ilya-biryukov Reviewed By: alexfh Subscribers: lebedev.ri, nemanjai, mgorny, kbarton, xazax.hun, cfe-commits Differential Revision: https://reviews.llvm.org/D48714 llvm-svn: 342393	2018-09-17 13:55:10 +00:00
Sam Parker	76d25d7f55	[ARM] Disallow icmp with negative imm and overflow We allow overflowing instructions if they're decreasing and only used by an unsigned compare. Add the extra condition that the icmp cannot be using a negative immediate. Differential Revision: https://reviews.llvm.org/D52102 llvm-svn: 342392	2018-09-17 13:48:25 +00:00
Dan Liew	fb310c0af9	[UBSan] Partially fix `test/ubsan/TestCases/Misc/log-path_test.cc` so that it can run on devices. Summary: In order for this test to work the log file needs to be removed from both the host and the device. To fix this the `rm` `RUN` lines have been replaced with `RUN: rm` followed by `RUN: %device_rm`. Initially I tried having it so that `RUN: %run rm` implicitly runs `rm` on the host as well so that only one `RUN` line is needed. This simplified writing the test; however, it had two large drawbacks. * It's potentially very confusing (e.g. for use of the device scripts outside of the lit tests) if asking for `rm` to run on device also causes files on the host to be deleted. * This doesn't work well with the glob patterns used in the test. The host shell, rather than the device, expands the `%t.log.` glob pattern, so we could easily miss deleting old log files from previous test runs if the corresponding file doesn't exist on the host. So instead deletion of files on the device and host are explicitly separate commands. The command to delete files from a device is provided by a new substitution `%device_rm` as suggested by Filipe Cabecinhas. The semantics of `%device_rm` are that: * It provides a way to remove files from a target device when the host is not the same as the target. In the case that the host and target are the same it is a no-op. * It interprets shell glob patterns in the context of the device file system instead of the host file system. This solves the globbing problem provided the argument is quoted so that lit's underlying shell doesn't try to expand the glob pattern. * It supports the `-r` and `-f` flags of the `rm` command, with the same semantics. Right now an implementation of `%device_rm` is provided only for ios devices. For all other devices a lit warning is emitted and the `%device_rm` is treated as a no-op. This is done to avoid changing the behaviour for other device types but leaves room for others to implement `%device_rm`. 
The ios device implementation uses the `%run` wrapper to do the work of removing files on a device. The `iossim_run.py` script has been fixed so that it just runs `rm` on the host operating system because the device and host file system are the same. rdar://problem/41126835 Reviewers: vsk, kubamracek, george.karpenkov, eugenis Subscribers: #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D51648 llvm-svn: 342391	2018-09-17 13:33:44 +00:00
Matt Arsenault	80ea6dd1d5	Fix vectorization of canonicalize llvm-svn: 342390	2018-09-17 13:24:30 +00:00
Idriss Riouak	87242f1052	Fix llvm-svn: 342389	2018-09-17 12:58:19 +00:00
Idriss Riouak	09767acae1	[Clang-Tidy: modernize] Fix for modernize-redundant-void-arg: complains about variable cast to void Summary: Hello, I would like to suggest a fix for one of the checks in clang-tidy. The bug was reported in https://bugs.llvm.org/show_bug.cgi?id=32575 where you can find more information. For example: ``` template <typename T0> struct S { template <typename T> void g() const { int a; (void)a; } }; void f() { S<int>().g<int>(); } ``` this piece of code should not trigger any warning by the check modernize-redundant-void-arg but when we execute the following command ``` clang_tidy -checks=-*,modernize-redundant-void-arg test.cpp -- -std=c++11 ``` we obtain the following warning: /Users/eco419/Desktop/clang-tidy.project/void-redundand_2/test.cpp:6:6: warning: redundant void argument list in function declaration [modernize-redundant-void-arg] (void)a; ^~~~ Reviewers: aaron.ballman, hokein, alexfh, JonasToth Reviewed By: aaron.ballman, JonasToth Subscribers: JonasToth, lebedev.ri, cfe-commits Tags: #clang-tools-extra Differential Revision: https://reviews.llvm.org/D52135 llvm-svn: 342388	2018-09-17 12:29:29 +00:00
Alexandros Lamprineas	8a1c374b2e	[GVNHoist] Re-enable GVNHoist by default Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912 has been fixed by rL342055. Precommit testing performed: * Overnight runs of csmith comparing the output between programs compiled with gvn-hoist enabled/disabled. * Bootstrap builds of clang with UbSan/ASan configurations. llvm-svn: 342387	2018-09-17 12:24:55 +00:00
Alexander Kornienko	a195de8659	Use createTemporaryFile in SampleProfTest Create a temporary file in the system temporary directory instead of creating a file in the current directory, which may be not writable. (Fix for an issue introduced in r342283.) llvm-svn: 342386	2018-09-17 12:11:01 +00:00
Raphael Isemann	9dd34c8385	Add descriptions to completed expressions Summary: Completing inside the expression command now uses the new description API to also provide additional information to the user. For now, this information consists of the types of variables/fields and the signatures of completed function calls. Reviewers: #lldb, JDevlieghere Reviewed By: JDevlieghere Subscribers: JDevlieghere, lldb-commits Differential Revision: https://reviews.llvm.org/D52103 llvm-svn: 342385	2018-09-17 12:06:07 +00:00
Gabor Marton	ac3a5d66c6	[ASTImporter] Fix import of VarDecl init Summary: The init expression of a VarDecl is overwritten in the "To" context if we import a VarDecl without an init expression (and with a definition). Please refer to the added tests, especially InitAndDefinitionAreInDifferentTUs. This patch fixes the malfunction by importing the whole Decl chain similarly as we did that in case of FunctionDecls. We handle the init expression similarly to a definition, alas only one init expression will be in the merged ast. Reviewers: a_sidorin, xazax.hun, r.stahl, a.sidorin Subscribers: rnkovacs, dkrupp, cfe-commits Differential Revision: https://reviews.llvm.org/D51597 llvm-svn: 342384	2018-09-17 12:04:52 +00:00
Andrew Savonichev	83ace12e86	[OpenCL] Allow blocks to capture arrays in OpenCL Summary: Patch by Egor Churaev Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: asavonic, bader, cfe-commits Differential Revision: https://reviews.llvm.org/D51722 llvm-svn: 342370	2018-09-17 11:19:42 +00:00
Guillaume Chatelet	cd488efe7e	[llvm-exegesis] Add predefined floating point values so we can test impact of special values on latency. Summary: This will be useful to generate many configurations and test instruction regimes (NaN, Inf, subnormal, normal). Reviewers: courbet Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D51858 llvm-svn: 342369	2018-09-17 11:09:32 +00:00
Strahinja Petrovic	488fd4e625	[PowerPC] Fix label address calculation for ppc64 This patch fixes calculating address of label for non-pic ppc64. Differential Revision: https://reviews.llvm.org/D50965 llvm-svn: 342368	2018-09-17 11:03:40 +00:00
Andrew Savonichev	1a5623489b	Merge two attribute diagnostics into one Summary: Merged the recently added `err_attribute_argument_negative` diagnostic with existing `err_attribute_requires_positive_integer` diagnostic: the former allows only strictly positive integer, while the latter also allows zero. Reviewers: aaron.ballman Reviewed By: aaron.ballman Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D51853 llvm-svn: 342367	2018-09-17 10:39:46 +00:00
James Henderson	e29e40854b	Reland r342233: [ThinLTO] Allow setting of maximum cache size with 64-bit number The original was reverted due to an apparent build-bot test failure, but it looks like this is just a flaky test. Also added a C-interface function for large values, and updated llvm-lto's --thinlto-cache-max-size-bytes switch to take a type larger than int. The maximum cache size in terms of bytes is a 64-bit number. However, the methods to set it only took unsigned previously, which meant that the maximum cache size could not be specified above 4GB. That's quite small compared to the output of some projects, so it makes sense to provide the ability to set larger values in that field. We also needed a C-interface function that provides a greater range than the existing thinlto_codegen_set_cache_size_bytes, which also only takes an unsigned, so this change also adds thinlto_codegen_set_cache_size_megabytes. Reviewed by: mehdi_amini, tejohnson, steven_wu Differential Revision: https://reviews.llvm.org/D52023 llvm-svn: 342366	2018-09-17 10:21:26 +00:00
Mikhail Maltsev	c704f4d561	[Analyzer] Define and use diff_plist in tests, NFC This patch defines a new substitution and uses it to reduce duplication in the Clang Analyzer test cases. Differential Revision: https://reviews.llvm.org/D52036 llvm-svn: 342365	2018-09-17 10:19:46 +00:00
Alexander Shaposhnikov	1de445c71c	[llvm-objcopy] Add missing alias for --strip-all-gnu This diff adds -S as an alias for --strip-all-gnu (for compatibility with binutils' objcopy). Patch by Dmitry Golovin! Test plan: make check-all Differential revision: https://reviews.llvm.org/D52163 llvm-svn: 342364	2018-09-17 09:45:12 +00:00
Ilya Biryukov	370eff85b9	[clang-Format] Fix indentation of member call after block Summary: before patch: > echo "test() {([]() -> {int b = 32;return 3;}).as("");});" \| clang-format -style=Google ``` test() { ([]() -> { int b = 32; return 3; }) .as(); }); ``` after patch: > echo "test() {([]() -> {int b = 32;return 3;}).as("");});" \| clang-format -style=Google ``` test() { ([]() -> { int b = 32; return 3; }).as(); }); ``` Patch by Anders Karlsson (ank)! Reviewers: klimek Reviewed By: klimek Subscribers: danilaml, acoomans, klimek, cfe-commits Differential Revision: https://reviews.llvm.org/D45719 llvm-svn: 342363	2018-09-17 07:46:20 +00:00
Eric Liu	a57afd091f	[clangd] Get rid of AST matchers in SymbolCollector. NFC Reviewers: ilya-biryukov, kadircet Subscribers: MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D52089 llvm-svn: 342362	2018-09-17 07:43:49 +00:00
Fangrui Song	cac6d21731	Fix typo llvm-svn: 342361	2018-09-17 07:40:42 +00:00
Max Kazantsev	5fe3620261	[NFC] Turn unsigned counters into boolean flags llvm-svn: 342360	2018-09-17 06:33:29 +00:00
Sylvestre Ledru	ae48d78054	scan-build: Add support of the option --exclude like in scan-build-py Summary: To exclude thirdparty code. To test: With /tmp/foo.c ``` void test() { int x; x = 1; // warn } ``` ``` $ scan-build --exclude non-existing/ --exclude /tmp/ -v gcc -c foo.c scan-build: Using '/usr/lib/llvm-7/bin/clang' for static analysis scan-build: Emitting reports for this run to '/tmp/scan-build-2018-09-16-214531-8410-1'. foo.c:3:3: warning: Value stored to 'x' is never read x = 1; // warn ^ ~ 1 warning generated. scan-build: File '/tmp/foo.c' deleted: part of an ignored directory. scan-build: 0 bugs found. ``` Reviewers: jroelofs Reviewed By: jroelofs Subscribers: whisperity, cfe-commits Differential Revision: https://reviews.llvm.org/D52153 llvm-svn: 342359	2018-09-17 06:31:46 +00:00
Petr Hosek	00f51c0904	[Lexer] Add xray_instrument feature This can be used to detect whether the code is being built with XRay instrumentation using the __has_feature(xray_instrument) predicate. Differential Revision: https://reviews.llvm.org/D52159 llvm-svn: 342358	2018-09-17 05:25:47 +00:00
Petr Hosek	040ab65c53	[sanitizer_common] Fuchsia now supports .preinit_array Support for .preinit_array has been implemented in Fuchsia's libc, add Fuchsia to the list of platforms that support this feature. Differential Revision: https://reviews.llvm.org/D52155 llvm-svn: 342357	2018-09-17 05:22:26 +00:00
Dean Michael Berris	1a23d3bbce	[XRay] Simplify FDR buffer management Summary: This change makes XRay FDR mode use a single backing store for the buffer queue, and have indexes into that backing store instead. We also remove the reliance on the internal allocator implementation in the FDR mode logging implementation. In the process of making this change we found an inconsistency with the way we're returning buffers to the queue, and how we're setting the extents. We take the chance to simplify the way we're managing the extents of each buffer. It turns out we do not need the indirection for the extents, so we co-host the atomic 64-bit int with the buffer object. It also seems that we've not been returning the buffers for the thread running the flush functionality when writing out the files, so we can run into a situation where we could be missing data. We consolidate all the allocation routines now into xray_allocator.h, where we used to have routines defined in xray_buffer_queue.cc. Reviewers: mboerger, eizan Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52077 llvm-svn: 342356	2018-09-17 03:09:01 +00:00
Dean Michael Berris	d5577aea07	[XRay] Fix FDR initialization Follow-up to D51606. llvm-svn: 342355	2018-09-17 02:49:17 +00:00
Kristina Brooks	46c6d3fe75	[DebugInfo] Fix build when std::vector::iterator is a pointer std::vector::iterator type may be a pointer, then iterator::value_type fails to compile since iterator is not a class, namespace, or enumeration. Patch by orivej (Orivej Desh) Differential Revision: https://reviews.llvm.org/D52142 llvm-svn: 342354	2018-09-16 22:21:59 +00:00
Shuai Wang	aaaa310de2	[NFC] Minor refactoring to setup the stage for supporting pointers in ExprMutationAnalyzer llvm-svn: 342353	2018-09-16 21:09:50 +00:00
Simon Pilgrim	cffa206423	[X86][SSE] Always enable ISD::SRL -> ISD::MULHU for v8i16 For constant non-uniform cases we'll never introduce more and/andn/or selects than already occur in generic pre-SSE41 ISD::SRL lowering. llvm-svn: 342352	2018-09-16 20:28:38 +00:00
Sylvestre Ledru	9e3818af26	scan-build: remove trailing whitespaces llvm-svn: 342351	2018-09-16 19:51:16 +00:00
Sylvestre Ledru	757a1fae34	Also manages clang-X as tool for scan-build Summary: This will make scan-build-7 clang-7 -c foo.c &> /dev/null Reviewers: jroelofs Subscribers: kristina, cfe-commits Differential Revision: https://reviews.llvm.org/D52151 llvm-svn: 342350	2018-09-16 19:36:59 +00:00
Simon Pilgrim	ea069ffd44	[X86][AVX] Enable ISD::SRL -> ISD::MULHU for v16i16 Now that rL340913 has landed with improved v16i16 selects as shuffles. llvm-svn: 342349	2018-09-16 19:20:47 +00:00
Sanjay Patel	3eaf500a6d	[DAGCombiner] try to convert pow(x, 1/3) to cbrt(x) This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040. Copying the motivational statement by @evandro from that patch: "This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64." I'm proposing to add only the minimum support for a DAG node here. Since we don't have an LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I don't think we need to worry about DAG builder, legalization, a strict variant, etc. We should be able to expand as needed when adding more functionality/transforms. For reference, these are transform suggestions currently listed in SimplifyLibCalls.cpp: // * cbrt(expN(X)) -> expN(x/3) // * cbrt(sqrt(x)) -> pow(x,1/6) // * cbrt(cbrt(x)) -> pow(x,1/9) Also, given that we bail out on long double for now, there should not be any logical differences between platforms (unless there's some platform out there that has pow() but not cbrt()). Differential Revision: https://reviews.llvm.org/D51753 llvm-svn: 342348	2018-09-16 16:50:26 +00:00
Sanjay Patel	bfee5a9b42	[x86] fix uses check in broadcast transform (PR38949) https://bugs.llvm.org/show_bug.cgi?id=38949 It's not clear to me that we even need a one-use check in this fold. Ie, 2 independent loads might be better than a load+dependent shuffle. Note that the existing re-use tests are not affected. We actually do form a broadcast node in those tests now because there's no extra use of the insert_subvector node in those cases. But something later in isel pattern matching decides that it is not worth using a broadcast for the full load in those tests: Legalized selection DAG: %bb.0 'test_broadcast_2f64_4f64_reuse:' t7: v2f64,ch = load<(load 16 from %ir.p0)> t0, t2, undef:i64 t4: i64,ch = CopyFromReg t0, Register:i64 %1 t10: ch = store<(store 16 into %ir.p1)> t7:1, t7, t4, undef:i64 t18: v4f64 = insert_subvector undef:v4f64, t7, Constant:i64<0> t20: v4f64 = insert_subvector t18, t7, Constant:i64<2> Becomes: t7: v2f64,ch = load<(load 16 from %ir.p0)> t0, t2, undef:i64 t4: i64,ch = CopyFromReg t0, Register:i64 %1 t10: ch = store<(store 16 into %ir.p1)> t7:1, t7, t4, undef:i64 t21: v4f64 = X86ISD::SUBV_BROADCAST t7 ISEL: Starting selection on root node: t21: v4f64 = X86ISD::SUBV_BROADCAST t7 ... Created node: t27: v4f64 = INSERT_SUBREG IMPLICIT_DEF:v4f64, t7, TargetConstant:i32<7> Morphed node: t21: v4f64 = VINSERTF128rr t27, t7, TargetConstant:i8<1> llvm-svn: 342347	2018-09-16 15:41:56 +00:00
Sanjay Patel	3e095174b0	[x86] add failure to splat test (PR38949); NFC llvm-svn: 342346	2018-09-16 14:59:04 +00:00
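
The `byval` commit above (c8ccaed325) turns on the difference between handing the callee the caller's own memory and handing it a temporary stack copy. The sketch below is only an illustration of that difference in plain C++, not the X86 lowering from the patch; `Big`, `calleeScribbles`, `callWithoutCopy`, `callWithCopy` and `checkByvalSketch` are hypothetical names introduced here.

```cpp
#include <cassert>

// Hypothetical aggregate, assumed too large to be passed in registers.
struct Big {
  int data[16];
};

// Hypothetical callee that treats the pointed-to memory as scratch space.
void calleeScribbles(Big *p) { p->data[0] = -1; }

// The raw pointer is passed straight through, as the commit message describes
// for Win64 before the patch: the callee's write lands in the caller's object.
int callWithoutCopy(Big &arg) {
  calleeScribbles(&arg);
  return arg.data[0];
}

// A temporary stack copy is made at the call site, which is what the `byval`
// documentation promises and what a frontend-created `alloca` would achieve:
// the caller's object is left untouched.
int callWithCopy(Big &arg) {
  Big tmp = arg;          // the "hidden copy"
  calleeScribbles(&tmp);  // callee only sees the temporary
  return arg.data[0];
}

// Small self-check contrasting the two call styles.
void checkByvalSketch() {
  Big b{};
  b.data[0] = 7;
  assert(callWithCopy(b) == 7);      // the copy protects the caller's value
  assert(callWithoutCopy(b) == -1);  // without it, the value is overwritten
}
```

Calling `checkByvalSketch()` exercises both paths. The copy costs one aggregate-sized store per call, which is the trade-off the commit message mentions when it notes that frontends can omit the attribute to elide the copy.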

299035 Commits, All Branches