llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	41f89c3484	[NFC][InstCombine] New tests: unrecognized_three-way-comparison.ll is ignorant about commutative variants D66232 "exposes" the problem. llvm-svn: 369667	2019-08-22 16:46:16 +00:00
Craig Topper	898a0e9b84	[X86] Remove MCInstLower code that drops operands from some CALL and TAILJMP instructions. Add asserts to verify operand count It appears the FIXME here was handled at some point. r159728 from 2012 seems to be at least aportion of fixing it. Differential Revision: https://reviews.llvm.org/D66570 llvm-svn: 369665	2019-08-22 16:23:35 +00:00
Guozhi Wei	51f48295cb	[MBP] Disable aggressive loop rotate in plain mode Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664	2019-08-22 16:21:32 +00:00
Amaury Sechet	95cf66de7c	[DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal computations Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66548 llvm-svn: 369662	2019-08-22 15:35:45 +00:00
Andrea Di Biagio	c9649eb9da	[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions. Single operand MUL instructions that implicitly set EAX have the following latency/throughput profile (see below): imul %cl # latency: 3cy - uOPs: 1 - 1 JMul imul %cx # latency: 3cy - uOPs: 3 - 3 JMul imul %ecx # latency: 3cy - uOPs: 2 - 2 JMul imul %rcx # latency: 6cy - uOPs: 2 - 4 JMul mul %cl # latency: 3cy - uOPs: 1 - 1 JMul mul %cx # latency: 3cy - uOPs: 3 - 3 JMul mul %ecx # latency: 3cy - uOPs: 2 - 2 JMul mul %rcx # latency: 6cy - uOPs: 2 - 4 JMul Excluding the 64bit variant, which has a latency of 6cy, every other instruction has a latency of 3cy. However, the number of decoded macro-opcodes (as well as the resource cyles) depend on the MUL size. The two operand MULs have a more predictable profile (see below): imul %dx, %dx # latency: 3cy - uOPs: 1 - 1 JMul imul %edx, %edx # latency: 3cy - uOPs: 1 - 1 JMul imul %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul imul $3, %dx, %dx # latency: 4cy - uOPs: 2 - 2 JMul imul $3, %ecx, %ecx # latency: 3cy - uOPs: 1 - 1 JMul imul $3, %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul This patch updates the values in the Jaguar scheduling model and regenerates llvm-mca tests. Differential Revision: https://reviews.llvm.org/D66547 llvm-svn: 369661	2019-08-22 15:20:16 +00:00
Simon Pilgrim	ab2f68d5ad	[PowerPC] Regenerate reciprocal tests, as discussed on D66548 llvm-svn: 369659	2019-08-22 15:14:52 +00:00
Sean Fertile	5f85a7b1cf	[PowerPC] Add combined ELF ABI and 32/64 bit queries to the subtarget. [NFC] A lot of places in the code combine checks for both ABI (SVR4/Darwin/AIX) and addressing mode (64-bit vs 32-bit). In an attempt to make some of the code more readable I've added a couple functions that combine checking for the ELF abi and 64-bit/32-bit code at once. As we add more AIX support I intend to add similar functions for the AIX ABI. Differential Revision: https://reviews.llvm.org/D65814 llvm-svn: 369658	2019-08-22 15:11:28 +00:00
Sean Fertile	18fd1b0b49	[PowerPC][XCOFF][MC] Explicitly set containing csect on symbols. [NFC] Previously we would get the csect a symbol was contained in through its fragment. This works only if we are writing an object file, and only for defined symbols. To fix this we set the contating csect explicitly on the MCSymbolXCOFF object. Differential Revision: https://reviews.llvm.org/D66032 llvm-svn: 369657	2019-08-22 15:11:23 +00:00
Hideto Ueno	70576cac52	[Attributor][NFC] Move DerefState to header and use StateWrapper Summary: In D65402, I want to get DerefState from AADereferenceable but it was not allowed. This patch moves DerefState definition into Attributor.h and makes AADerefenceable inherit StateWrapper. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66585 llvm-svn: 369653	2019-08-22 14:18:29 +00:00
Jinsong Ji	545e993b8b	[SlotIndexes] Add print-slotindexes to disable printing slotindexes Summary: When we print the IR with --print-after/before-*, SlotIndexes will be printed whenever available (We haven't freed it). This introduces some noises when we try to compare the IR among different optimizations. eg: -print-before=machine-cp will print SlotIndexes for 1st machine-cp pass, but NOT for 2nd machine-cp; -print-after=machine-cp will NOT print SlotIndexes for both machine-cp passes. So SlotIndexes in 1st pass introduce noises when differing these IRs. This patch introduces an option to hide indexes. Reviewers: stoklund, thegameg, qcolombet Reviewed By: thegameg Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66500 llvm-svn: 369650	2019-08-22 13:44:47 +00:00
Andrea Di Biagio	589cb004de	[MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI llvm-svn: 369648	2019-08-22 13:32:17 +00:00
George Rimar	91208447d0	[yaml2obj] - Lookup relocation symbols in dynamic symbol when .dynsym referenced. This fixes https://bugs.llvm.org/show_bug.cgi?id=40337. Previously, it was always assumed that relocations referenced symbols in the static symbol table. Now, if the Link field references a section called ".dynsym" it will look up these symbols in the dynamic symbol table. This patch is heavily based on D59097 by James Henderson Differential revision: https://reviews.llvm.org/D66532 llvm-svn: 369645	2019-08-22 12:39:56 +00:00
Sylvestre Ledru	c2ca965c89	Fix some regressions caused by r369553 on old versions of Debian and Ubuntu It was causing some errors like: Encoding error: 'ascii' codec can't decode byte 0xe2 in position 341: ordinal not in range(128) The full traceback has been saved in /tmp/sphinx-err-y2fq4dtb.log, if you want to report the issue to the developers. llvm-svn: 369644	2019-08-22 12:16:08 +00:00
Andrea Di Biagio	c6744055ad	[X86][BtVer2] Fix latency and throughput of XCHG and XADD. On Jaguar, XCHG has a latency of 1cy and decodes to 2 macro-opcodes. Maximum throughput for XCHG is 1 IPC. The byte exchange has worse latency and decodes to 1 extra uOP; maximum observed throughput is 0.5 IPC. ``` xchgb %cl, %dl # Latency: 2cy - uOPs: 3 - 2 ALU xchgw %cx, %dx # Latency: 1cy - uOPs: 2 - 2 ALU xchgl %ecx, %edx # Latency: 1cy - uOPs: 2 - 2 ALU xchgq %rcx, %rdx # Latency: 1cy - uOPs: 2 - 2 ALU ``` The reg-mem forms of XCHG are atomic operations with an observed latency of 16cy. The resource usage is similar to the XCHGrr variants. The biggest difference is obviously the bus-locking, which prevents the LS to issue other memory uOPs in parallel until the unlocking store uOP is executed. ``` xchgb %cl, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgw %cx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgl %ecx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgq %rcx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy ``` The exchanged in/out register operand becomes available after 11cy from the start of execution. Added test xchg.s to verify that we correctly see that register write committed in 11cy (and not 16cy). Reg-reg XADD instructions have the same latency/throughput than the byte exchange (register-register variant). ``` xaddb %cl, %dl # latency: 2cy - uOPs: 3 - 3 ALU xaddw %cx, %dx # latency: 2cy - uOPs: 3 - 3 ALU xaddl %ecx, %edx # latency: 2cy - uOPs: 3 - 3 ALU xaddq %rcx, %rdx # latency: 2cy - uOPs: 3 - 3 ALU ``` The non-atomic RM variants have a latency of 11cy, and decode to 4 macro-opcodes. They still consume 2 ALU pipes, and the exchange in/out register operand becomes available in 3cy (it matches the 'load-to-use latency'). ``` xaddb %cl, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddw %cx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddl %ecx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddq %rcx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU ``` The atomic XADD variants execute in 16cy. The in/out register operand is available after 11cy from the start of execution. ``` lock xaddb %cl, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddw %cx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddl %ecx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddq %rcx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy ``` Added test xadd.s to verify those latencies as well as read-advance values. Differential Revision: https://reviews.llvm.org/D66535 llvm-svn: 369642	2019-08-22 11:32:47 +00:00
Simon Pilgrim	6dd51c2f19	[MVT] Add MVT equivalent to EVT::getHalfNumVectorElementsVT() helper. NFCI. Allows for some cleanup in a lot of SSE/AVX vector splitting code llvm-svn: 369640	2019-08-22 11:14:30 +00:00
Sam Tebbs	a69d9d6156	Reapply: [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32 The CodeGen/Thumb2/mve-vaddv.ll test needed to be amended to reflect the changes from the above patch. This reverts commit `cd53ff6`, reapplying `7c6b229`. llvm-svn: 369638	2019-08-22 10:29:20 +00:00
Serguei Katkov	036e636aa7	[Loop Peeling] Fix silly bug in metadata update. We must update loop metedata before we moved to parent loop if it is present. llvm-svn: 369637	2019-08-22 10:06:46 +00:00
Hans Wennborg	cd53ff6c0d	Revert r369626 "[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32" It broke the bots, see e.g. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/36275/ > This patch fixes shifts by a 128/256 bit shift amount. It also fixes > codegen for shifts of 32 by delegating to LLVM's default optimisation > instead of emitting a long shift. > > Tests that used to generate long shifts of 32 are updated to check for the > more optimised codegen. > > Differential revision: https://reviews.llvm.org/D66519 > > llvm-svn: 369626 llvm-svn: 369636	2019-08-22 09:16:53 +00:00
George Rimar	26f4262398	[llvm-objdump] - Remove an outdated "FIXME". NFC. The bug mentioned in this test case was fixed in D63779 (r364955), which also provides a test case. llvm-svn: 369634	2019-08-22 09:10:17 +00:00
George Rimar	e54d37153d	[llvm-readobj] - Remove `reportError(std::error_code EC, StringRef Input)` helper. We do not need it, std::error_code is used mostly for COFF and this patch rewrites the calls to use a different overload. Having reportError(std::error_code EC, ... is excessive by itself, because API that use error codes actually needs refactoring to use Error/Expected<> instead. DIfferential revision: https://reviews.llvm.org/D66521 llvm-svn: 369630	2019-08-22 08:56:24 +00:00
Craig Topper	d420616313	[X86] Lower the cost of v2i32->v2f64 sint_to_fp under vector widening legalization. I don't really understand the costs we're using for fp_to_sint, but prior to widening legalization we used 20 as the cost for this via the v2i64->v2f64 entry. That number seems better than the 40 we got with widening legalization. So now we need either a v2i32->v2f64 entry or a v4i32->v2f64 entry depending on whether AVX is enabled or not since we skip the first SSE2 table look up under AVX. llvm-svn: 369628	2019-08-22 08:18:45 +00:00
Pavel Labath	1b30ea2c50	[Support] Improve readNativeFile(Slice) interface Summary: There was a subtle, but pretty important difference between the Slice and regular versions of this function. The Slice function was zero-initializing the rest of the buffer when the read syscall returned less bytes than expected, while the regular function did not. This patch removes the inconsistency by making both functions not zero-initialize the buffer. The zeroing code is moved to the MemoryBuffer class, which is currently the only user of this code. This makes the API more consistent, and the code shorter. While in there, I also refactor the functions to return the number of bytes through the regular return value (via Expected<size_t>) instead of a separate by-ref argument. Reviewers: aganea, rnk Subscribers: kristina, Bigcheese, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66471 llvm-svn: 369627	2019-08-22 08:13:30 +00:00
Sam Tebbs	7c6b229204	[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32 This patch fixes shifts by a 128/256 bit shift amount. It also fixes codegen for shifts of 32 by delegating to LLVM's default optimisation instead of emitting a long shift. Tests that used to generate long shifts of 32 are updated to check for the more optimised codegen. Differential revision: https://reviews.llvm.org/D66519 llvm-svn: 369626	2019-08-22 08:12:06 +00:00
Shiva Chen	72a41e7b0d	[TargetLowering] Remove optional arguments passing to makeLibCall The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contain argument flags which will pass to makeLibCall function. The patch should not has any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622	2019-08-22 04:59:43 +00:00
Joel E. Denny	3c577bb415	[lit] Diagnose insufficient args to internal env Without this patch, failing to provide a subcommand to lit's internal `env` results in either a python `IndexError` or an attempt to execute the final `env` argument, such as `FOO=1`, as a command. This patch diagnoses those cases with a more helpful message. Reviewed By: stella.stamenova Differential Revision: https://reviews.llvm.org/D66482 llvm-svn: 369620	2019-08-22 03:42:01 +00:00
Pengfei Wang	7630e24492	[X86] Making X86OptimizeLEAs pass public. NFC Reviewers: wxiao3, LuoYuanke, andrew.w.kaylor, craig.topper, annita.zhang, liutianle, pengfei, xiangzhangllvm, RKSimon, spatel, andreadb Reviewed By: RKSimon Subscribers: andreadb, hiraditya, llvm-commits Tags: #llvm Patch by Gen Pei (gpei) Differential Revision: https://reviews.llvm.org/D65933 llvm-svn: 369612	2019-08-22 02:29:27 +00:00
Fangrui Song	246750c2a9	[COFF] Fix section name for constants larger than 64 bits on Windows APIntToHexString returns wrong value ("0000000000000000ffffffffffffffff") for integer larger than 64 bits, and thus TargetLoweringObjectFileCOFF::getSectionForConstant returns same section name for all numbers larger than 64 bits. This patch tries to fix it. Differential Revision: https://reviews.llvm.org/D66458 Patch by Senran Zhang llvm-svn: 369610	2019-08-22 01:48:34 +00:00
Nico Weber	6e8b79e308	gn build: Merge r369605 llvm-svn: 369608	2019-08-22 00:40:55 +00:00
Nico Weber	edb08da450	gn build: Merge r369600 llvm-svn: 369603	2019-08-22 00:01:59 +00:00
Cyndy Ishida	9443d0e2c0	[Object] FIX: update PlatformKind name in TapiFile Buildbots that use GCC failed to compile because overwritten namespace with variable name llvm-svn: 369602	2019-08-21 23:57:57 +00:00
Cyndy Ishida	c20d1f90b5	[Object] Add tapi files to object Summary: The intention for this is to allow reading and printing symbols out from llvm-nm. Tapi file, and Tapi universal follow a similiar format to their respective MachO Object format. The tests are dependent on llvm-nm processing tbd files which is why its in D66160 Reviewers: ributzka, steven_wu, lhames Reviewed By: ributzka, lhames Subscribers: mgorny, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66159 llvm-svn: 369600	2019-08-21 23:30:53 +00:00
Craig Topper	78e6507b0a	[X86] Correct the scheduler classes for TAILJMP and TCRETURN CodeGenOnly instructions. We had an odd combination of WriteJump applied to some memory instructions and WriteJumpLd applied to register and immediate instructions. Thsi should hopefully assign them all correctly. llvm-svn: 369599	2019-08-21 23:17:52 +00:00
Craig Topper	303bbc3be2	[X86] Replace a couple hardcoded '5's with X86::AddrNumOperands for readability. NFC llvm-svn: 369598	2019-08-21 22:40:07 +00:00
Nico Weber	40902b48dd	gn build: Merge r369591 llvm-svn: 369594	2019-08-21 22:26:02 +00:00
Nico Weber	e1f27e4ad1	gn build: Merge r369587 llvm-svn: 369593	2019-08-21 22:25:57 +00:00
Johannes Doerfert	92dee44d77	[Attributor] FIX: Try to make bots happy Locally the tight iterations bounds work fine but the bots seem unhappy. Try to get green bots and some time to determine the underlying problem. llvm-svn: 369592	2019-08-21 22:21:13 +00:00
Luis Marques	f7cdff4ffd	[RISCV] Remove fix introduced by r369573, superseded by r369580 llvm-svn: 369590	2019-08-21 22:02:56 +00:00
Johannes Doerfert	d98f975089	[Attributor] Fix: Gracefully handle non-instruction users Function can have users that are not instructions, e.g., bitcasts. For now, we simply give up when we see them. llvm-svn: 369588	2019-08-21 21:48:56 +00:00
Greg Clayton	bf9ee07afa	Add FileWriter to GSYM and encode/decode functions to AddressRange and AddressRanges The full GSYM patch started with: https://reviews.llvm.org/D53379 This patch add the ability to encode data using the new llvm::gsym::FileWriter class. FileWriter is a simplified binary data writer class that doesn't require targets, target definitions, architectures, or require any other optional compile time libraries to be enabled via the build process. This class needs the ability to seek to different spots in the binary data that it produces to fix up offsets and sizes in GSYM data. It currently uses std::ostream over llvm::raw_ostream because llvm::raw_ostream doesn't support seeking which is required when encoding and decoding GSYM data. AddressRange objects are encoded and decoded to be relative to a base address. This will be the FunctionInfo's start address if the AddressRange is directly contained in a FunctionInfo, or a base address of the containing parent AddressRange or AddressRanges. This allows address ranges to be efficiently encoded using ULEB128 encodings as we encode the offset and size of each range instead of full addresses. This also makes encoded addresses easy to relocate as we just need to relocate one base address. Differential Revision: https://reviews.llvm.org/D63828 llvm-svn: 369587	2019-08-21 21:48:11 +00:00
Johannes Doerfert	a41b239081	[Attributor][NFCI] Introduce tight iteration bounds in the tests Summary: To be able to track how many iterations we need to manifest all information we check for we now make the maximum iteration count explicit. The count is set tightly now and should be kept that way. Reviewers: uenoku, sstefan1 Subscribers: bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66554 llvm-svn: 369586	2019-08-21 21:42:46 +00:00
Benjamin Kramer	81faa5e6a2	Use C++14 heteregenous lookup for a couple of std::map<std::string, ...> These call find with a StringRef, heterogenous lookup saves a temporary std::string there. llvm-svn: 369581	2019-08-21 21:17:34 +00:00
Luis Marques	4f488b594a	[RISCV] Fix use of side-effects in asserts in decoder functions llvm-svn: 369580	2019-08-21 21:11:37 +00:00
Cyndy Ishida	359840a6e4	[BinaryFormat] Teach identify_magic about Tapi files. Summary: Tapi files are YAML files that start with the !tapi tag. The only execption are TBD v1 files, which don't have a tag. In that case we have to scan a little further and check if the first key "archs" exists. This is the first patch in a series of patches to add libObject support for text-based dynamic library (.tbd) files. This patch is practically exactly the same as D37820, that was never pushed to master, and is needed for future commits related to reading tbd files for llvm-nm Reviewers: ributzka, steven_wu, bollu, espindola, jfb, shafik, jdoerfert Reviewed By: steven_wu Subscribers: dexonsmith, llvm-commits Tags: #llvm, #clang, #sanitizers, #lldb, #libc, #openmp Differential Revision: https://reviews.llvm.org/D66149 llvm-svn: 369579	2019-08-21 21:00:16 +00:00
Johannes Doerfert	5427aa843b	[Attributor][NFC] Fix copy & paste error llvm-svn: 369577	2019-08-21 20:57:20 +00:00
Johannes Doerfert	2db8528fb4	[Attributor][NFC] Remove leftover semicolon llvm-svn: 369576	2019-08-21 20:56:56 +00:00
Johannes Doerfert	d410805d57	[Attributor] Use existing unreachable instead of introducing new ones So far we split the unreachable off and placed a new one, this is not necessary. llvm-svn: 369575	2019-08-21 20:56:41 +00:00
Richard Smith	b73cd33625	Fix -Werror=unused-variable error after r369528. llvm-svn: 369573	2019-08-21 20:42:37 +00:00
Nico Weber	d7887cf849	gn build: Merge r369568 llvm-svn: 369572	2019-08-21 20:20:36 +00:00
Nico Weber	fe7eca239b	gn build: Make sync script not exit 1 if it writes changes llvm-svn: 369571	2019-08-21 20:13:00 +00:00
Florian Hahn	b5e52bfd83	[GVN] Do PHI translations across all edges between the load and the unavailable pred. Currently we do not properly translate addresses with PHIs if LoadBB != LI->getParent(), because PHITranslateAddr expects a direct predecessor as argument, because it considers all instructions outside of the current block to not requiring translation. The amount of cases that trigger this should be very low, as most single predecessor blocks should be folded into their predecessor by GVN before we actually start with value numbering. It is still not guaranteed to happen, so we should do PHI translation along all edges between the loads' block and the predecessor where we have to place a load. There are a few test cases showing current limits of the PHI translation, which could be improved later. Reviewers: spatel, reames, efriedma, john.brawn Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D65020 llvm-svn: 369570	2019-08-21 20:06:50 +00:00

1 2 3 4 5 ...

183773 Commits