llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Bataev	897129dc3f	[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target. Enable full support for the debug info. Differential revision: https://reviews.llvm.org/D46189 llvm-svn: 351974	2019-01-23 18:59:54 +00:00
Alexey Bataev	25624e2e5b	Revert "[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target." This reverts commit r351972. Some pieces of the patch was not applied correctly. llvm-svn: 351973	2019-01-23 18:48:36 +00:00
Alexey Bataev	fe0b356063	[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target. Enable full support for the debug info. Recommit to fix the emission of the not required closing brace. Differential revision: https://reviews.llvm.org/D46189 llvm-svn: 351972	2019-01-23 18:28:59 +00:00
Marshall Clow	8681a3bc85	Commit D11348: 'Win32 support: wcsnrtombs and mbsnrtowcs don't handle null output buffers correctly' which has been hanging around for a long time llvm-svn: 351971	2019-01-23 18:27:22 +00:00
Craig Topper	aa0e74c1fc	[X86] Autogenerate complete checks. NFC llvm-svn: 351970	2019-01-23 18:25:49 +00:00
Aaron Ballman	b0d74bfe81	Merge similar target diagnostics for interrupt attribute into one; NFC Patch by Kristina Bessonova! llvm-svn: 351969	2019-01-23 18:02:17 +00:00
James Henderson	25ce596cd1	[llvm-symbolizer] Improve compatibility of --functions with GNU addr2line This fixes https://bugs.llvm.org/show_bug.cgi?id=40072. GNU addr2line's --functions switch is off by default, has a short alias of -f, and does not take an argument. This patch changes llvm-symbolizer to allow the second and third point (changing the default behaviour may have negative impacts on users). If the option is missing a value, it now treats it as "linkage". This change does cause one previously valid command-line to behave differently. Before --functions <value> was accepted, but now only --functions=<value> is allowed (as well as --functions). The old behaviour will result in the value being treated as a positional argument. The previous testing for --functions=short has been pulled out into a new test that also tests the other accepted values and option formats. Reviewed by: ruiu Differential Revision: https://reviews.llvm.org/D57049 llvm-svn: 351968	2019-01-23 17:27:48 +00:00
Haojian Wu	15a77418a9	Revert "[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target." This reverts commit r351846. This patch may generate illegal assembly code, see ``` $ ./bin/clang -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-grtev4-linux-gnu -S -disable-free -disable-llvm-verifier -discard-value-names -main-file-name new.cc -mrelocation-model pic -pic-level 2 -mthread-model posix -fmerge-all-constants -mdisable-fp-elim -relaxed-aliasing -no-integrated-as -mpie-copy-relocations -munwind-tables -fcuda-is-device -target-feature +ptx60 -target-cpu sm_35 -dwarf-column-info -debug-info-kind=line-directives-only -dwarf-version=2 -debugger-tuning=gdb -o empty.s -x cuda empty.cc $ cat empty.s // // Generated by LLVM NVPTX Back-End // .version 6.0 .target sm_35 .address_size 64 } ``` llvm-svn: 351966	2019-01-23 16:39:57 +00:00
Andrea Di Biagio	d768d35515	[MC][X86] Correctly model additional operand latency caused by transfer delays from the integer to the floating point unit. This patch adds a new ReadAdvance definition named ReadInt2Fpu. ReadInt2Fpu allows x86 scheduling models to accurately describe delays caused by data transfers from the integer unit to the floating point unit. ReadInt2Fpu currently defaults to a delay of zero cycles (i.e. no delay) for all x86 models excluding BtVer2. That means, this patch is only a functional change for the Jaguar cpu model only. Tablegen definitions for instructions (V)PINSR* have been updated to account for the new ReadInt2Fpu. That read is mapped to the the GPR input operand. On Jaguar, int-to-fpu transfers are modeled as a +6cy delay. Before this patch, that extra delay was added to the opcode latency. In practice, the insert opcode only executes for 1cy. Most of the actual latency is actually contributed by the so-called operand-latency. According to the AMD SOG for family 16h, (V)PINSR* latency is defined by expression f+1, where f is defined as a forwarding delay from the integer unit to the fpu. When printing instruction latency from MCA (see InstructionInfoView.cpp) and LLC (only when flag -print-schedule is speified), we now need to account for any extra forwarding delays. We do this by checking if scheduling classes declare any negative ReadAdvance entries. Quoting a code comment in TargetSchedule.td: "A negative advance effectively increases latency, which may be used for cross-domain stalls". When computing the instruction latency for the purpose of our scheduling tests, we now add any extra delay to the formula. This avoids regressing existing codegen and mca schedule tests. It comes with the cost of an extra (but very simple) hook in MCSchedModel. Differential Revision: https://reviews.llvm.org/D57056 llvm-svn: 351965	2019-01-23 16:35:07 +00:00
James Henderson	21ed868390	[llvm-readelf] Don't suppress static symbol table with --dyn-symbols + --symbols In r287786, a bug was introduced into llvm-readelf where it didn't print the static symbol table if both --symbols and --dyn-symbols were specified, even if there was no dynamic symbol table. This is obviously incorrect. This patch fixes this issue, by delegating the decision of which symbol tables should be printed to the final dumper, rather than trying to decide in the command-line option handling layer. The decision was made to follow the approach taken in this patch because the LLVM style dumper uses a different order to the original GNU style behaviour (and GNU readelf) for ELF output. Other approaches resulted in behaviour changes for other dumpers which felt wrong. In particular, I wanted to avoid changing the order of the output for --symbols --dyn-symbols for LLVM style, keep what is emitted by --symbols unchanged for all dumpers, and avoid having different orders of .dynsym and .symtab dumping for GNU "--symbols" and "--symbols --dyn-symbols". Reviewed by: grimar, rupprecht Differential Revision: https://reviews.llvm.org/D57016 llvm-svn: 351960	2019-01-23 16:15:39 +00:00
Simon Pilgrim	ac5b775522	Fix indentation. NFCI. llvm-svn: 351958	2019-01-23 16:01:19 +00:00
Simon Pilgrim	f87226eb70	[IR] Match intrinsic parameter by scalar/vectorwidth This patch replaces the existing LLVMVectorSameWidth matcher with LLVMScalarOrSameVectorWidth. The matching args must be either scalars or vectors with the same number of elements, but in either case the scalar/element type can differ, specified by LLVMScalarOrSameVectorWidth. I've updated the _overflow intrinsics to demonstrate this - allowing it to return a i1 or <N x i1> overflow result, matching the scalar/vectorwidth of the other (add/sub/mul) result type. The masked load/store/gather/scatter intrinsics have also been updated to use this, although as we specify the reference type to be llvm_anyvector_ty we guarantee the mask will be <N x i1> so no change in behaviour Differential Revision: https://reviews.llvm.org/D57090 llvm-svn: 351957	2019-01-23 16:00:22 +00:00
Krzysztof Parzyszek	036715408a	[Hexagon] Remove incorrect bit negation llvm-svn: 351956	2019-01-23 15:36:33 +00:00
Benjamin Kramer	4ebed81fc4	[AArch64] Fix out of bounds strlen CFIInst is not zero-terminated. This is one of more annoying functional differences between StringRef and ArrayRef. Found by asan. llvm-svn: 351955	2019-01-23 14:51:21 +00:00
Clement Courbet	c7956346da	Re-land rL322538 "Add a value_type to ArrayRef." llvm-svn: 351954	2019-01-23 14:20:59 +00:00
Simon Pilgrim	0e08b6f017	Move saturated arithmetic intrinsics to other integer intrinsics. NFCI. They were in the floating point group. llvm-svn: 351953	2019-01-23 13:49:10 +00:00
Nico Weber	8874aef822	Disable test better. llvm-svn: 351952	2019-01-23 13:43:42 +00:00
George Rimar	617adef933	[llvm-objdump] - Move common code to a new printRelocation() helper. NFC. This extracts the common code for printing relocations into a new helper function. llvm-svn: 351951	2019-01-23 13:39:12 +00:00
Tim Renouf	f64f8efe13	[AMDGPU] With XNACK, cannot clause a load with result coalesced with operand Summary: With XNACK, an smem load whose result is coalesced with an operand (thus it overwrites its own operand) cannot appear in a clause, because some other instruction might XNACK and restart the whole clause. The clause breaker already realized that an smem that overwrites an operand cannot appear in a clause, and broke the clause. The problem that this commit fixes is that the SIFormMemoryClauses optimization formed a bundle with early clobber, which caused the earlier code that set up the coalesced operand to be removed as dead. Differential Revision: https://reviews.llvm.org/D57008 Change-Id: I703c4d5b0bf7d6060222bec491f45c18bb3c0016 llvm-svn: 351950	2019-01-23 13:38:06 +00:00
Nico Weber	53c3c2c61c	Disable test added in r351916. It doesn't pass on Windows: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/3627 FAIL: lld :: ELF/stdout.s (1521 of 1966) ****************** TEST 'lld :: ELF/stdout.s' FAILED **************** Script: -- : 'RUN: at line 3'; C:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\llvm-mc.EXE -filetype=obj -triple=x86_64-unknown-linux C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\lld\test\ELF\stdout.s -o C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp.o : 'RUN: at line 4'; c:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\ld.lld.EXE C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp.o -o - > C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp1 : 'RUN: at line 5'; C:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\llvm-objdump.EXE -d C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp1 \| C:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\FileCheck.EXE C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\lld\test\ELF\stdout.s : 'RUN: at line 10'; c:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\ld.lld.EXE C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp.o -o C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp2 : 'RUN: at line 11'; diff C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp1 C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp2 -- Exit Code: 1 Command Output (stdout): -- $ ":" "RUN: at line 3" $ "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\llvm-mc.EXE" "-filetype=obj" "-triple=x86_64-unknown-linux" "C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\lld\test\ELF\stdout.s" "-o" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp.o" $ ":" "RUN: at line 4" $ "c:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\ld.lld.EXE" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp.o" "-o" "-" $ ":" "RUN: at line 5" $ "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\llvm-objdump.EXE" "-d" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp1" $ "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\FileCheck.EXE" "C:\b\slave\clang-x64-windows-msvc\build\llvm.src\tools\lld\test\ELF\stdout.s" $ ":" "RUN: at line 10" $ "c:\b\slave\clang-x64-windows-msvc\build\build\stage1\bin\ld.lld.EXE" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp.o" "-o" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp2" $ ":" "RUN: at line 11" $ "diff" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp1" "C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp2" # command output: * C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp1 --- C:\b\slave\clang-x64-windows-msvc\build\build\stage1\tools\lld\test\ELF\Output\stdout.s.tmp2 ************* * 1 **** llvm-svn: 351949	2019-01-23 12:35:08 +00:00
Martin Storsjo	0d19a399a3	[llvm-objcopy] [COFF] Error out on use of unhandled options Prefer erroring out than silently not doing what was requested. Differential Revision: https://reviews.llvm.org/D57045 llvm-svn: 351948	2019-01-23 11:54:55 +00:00
Martin Storsjo	1be91958b3	[llvm-objcopy] [COFF] Fix handling of aux symbols for big objects The aux symbols were stored in an opaque std::vector<uint8_t>, with contents interpreted according to the rest of the symbol. All aux symbol types but one fit in 18 bytes (sizeof(coff_symbol16)), and if written to a bigobj, two extra padding bytes are written (as sizeof(coff_symbol32) is 20). In the storage agnostic intermediate representation, store the aux symbols as a series of coff_symbol16 sized opaque blobs. (In practice, all such aux symbols only consist of one aux symbol, so this is more flexible than what reality needs.) The special case is the file aux symbols, which are written in potentially more than one aux symbol slot, without any padding, as one single long string. This can't be stored in the same opaque vector of fixed sized aux symbol entries. The file aux symbols will occupy a different number of aux symbol slots depending on the type of output object file. As nothing in the intermediate process needs to have accurate raw symbol indices, updating that is moved into the writer class. Differential Revision: https://reviews.llvm.org/D57009 llvm-svn: 351947	2019-01-23 11:54:51 +00:00
Martin Storsjo	481334056f	[llvm-objcopy] [COFF] Remove testcase debugging lines. NFC. These are no longer necessary as the testcase now seems to run fine on the buildbots that previously failed on this case, after SVN r351934. llvm-svn: 351946	2019-01-23 11:54:36 +00:00
Florian Hahn	68cea130df	[HotColdSplitting] Remove unused SSAUpdater.h include (NFC). llvm-svn: 351945	2019-01-23 11:51:38 +00:00
Kamil Rytarowski	776cf71d4f	Mark more tests flaky Reported on the NetBSD 8 buildbot llvm-svn: 351944	2019-01-23 11:36:19 +00:00
Ilya Biryukov	88d7d296f4	[clangd] Workaround a test failure after r351941 This should fix failing buildbots. llvm-svn: 351943	2019-01-23 11:32:07 +00:00
George Rimar	fd383e7e22	[llvm-objdump] - Move variable. NFC. It was too far from the place where it is used. llvm-svn: 351942	2019-01-23 10:52:38 +00:00
Ilya Biryukov	4e0c400a47	[clangd] Fix crash due to ObjCPropertyDecl With ObjCPropertyDecl, ASTNode.OrigD can be a ObjCPropertyImplDecl which is not a NamedDecl, leading to a crash since the code incorrectly assumes ASTNode.OrigD will always be a NamedDecl. Change by dgoldman (David Goldman)! Differential Revision: https://reviews.llvm.org/D56916 llvm-svn: 351941	2019-01-23 10:35:12 +00:00
George Rimar	bcbe98bcb9	[llvm-objdump] - Split disassembleObject() into two methods. NFCI. Currently, disassembleObject() is a ~550 lines length function. This patch splits it into two, where first do all helper objects initializations and calls the second which does all the rest job. This is a straightforward split. Differential revision: https://reviews.llvm.org/D57020 llvm-svn: 351940	2019-01-23 10:33:26 +00:00
Jonas Paulsson	6046d087c5	[SystemZ] Fix test case for buildbot. llvm-clang-x86_64-expensive-checks-win triggered this assert: "llvm.dbg.value intrinsic requires a !dbg attachment" Hopefully, adding reasonable !dbg operands solves this. llvm-svn: 351939	2019-01-23 10:29:12 +00:00
David Green	6a858a9425	[ARM] Alter the register allocation order for minsize on Thumb2 Currently in Arm code, we allocate LR first, under the assumption that it needs to be saved anyway. Unfortunately this has the disadvantage that it will require any instructions using it to be the longer thumb2 instructions, not the shorter thumb1 ones. This switches the order when we are optimising for minsize, returning to the default order so that more lower registers can be used. It can end up requiring more pushed registers, but on average produces smaller code. Differential Revision: https://reviews.llvm.org/D56008 llvm-svn: 351938	2019-01-23 10:18:30 +00:00
Kamil Rytarowski	fe1991b55e	Mark thread.condition.condvarany/wait_for.pass.cpp as flaky Reported on the NetBSD 8 buildbot. llvm-svn: 351937	2019-01-23 10:11:41 +00:00
Dmitry Venikov	cce66874a8	[llvm-symbolizer] Allow single letter command flags grouping. Summary: Currently llvm-symbolizer doesn't allow flags combining. This patch allows such grouping behavior just like addr2line. Motivation: https://bugs.llvm.org/show_bug.cgi?id=40304 Reviewers: jhenderson, ruiu Reviewed By: jhenderson Subscribers: rupprecht, llvm-commits Differential Revision: https://reviews.llvm.org/D57046 llvm-svn: 351936	2019-01-23 09:49:37 +00:00
Sam Parker	31bef63bb4	[ARM][CGP] Check trunc type before replacing In the last stage of type promotion, we replace any zext that uses a new trunc with the operand of the trunc. This is okay when we only allowed one type to be optimised, but now its the case that the trunc maybe needed to produce a more narrow type than the one we were optimising for. So we need to check this before doing the replacement. Differential Revision: https://reviews.llvm.org/D57041 llvm-svn: 351935	2019-01-23 09:18:44 +00:00
Martin Storsjo	8f52c0376b	[llvm-objcopy] [COFF] Clear the unwritten tail of coff_section::Header::Name This should fix the add-gnu-debuglink test on all buildbots. llvm-svn: 351934	2019-01-23 09:12:53 +00:00
Sam Parker	9a2a89d58f	[DAGCombine] Enable more pre-indexed stores The current check in CombineToPreIndexedLoadStore is too conversative, preventing a pre-indexed store when the base pointer is a predecessor of the value being stored. Instead, we should check the pointer operand of the store. Differential Revision: https://reviews.llvm.org/D56719 llvm-svn: 351933	2019-01-23 09:11:49 +00:00
Kristof Beyls	e70ba0a1bc	[SLH][AArch64] Remove accidentally retained -debug-only line from test. llvm-svn: 351932	2019-01-23 09:10:12 +00:00
Martin Storsjo	12b6b80208	Reapply: [llvm-objcopy] [COFF] Implement --add-gnu-debuglink This was reverted since it broke a couple buildbots. The reason for the breakage is not yet known, but this time, the test has got more diagnostics added, to hopefully allow figuring out what goes wrong. Differential Revision: https://reviews.llvm.org/D57007 llvm-svn: 351931	2019-01-23 08:25:28 +00:00
Kristof Beyls	3ff5dfd735	[SLH] AArch64: correctly pick temporary register to mask SP As part of speculation hardening, the stack pointer gets masked with the taint register (X16) before a function call or before a function return. Since there are no instructions that can directly mask writing to the stack pointer, the stack pointer must first be transferred to another register, where it can be masked, before that value is transferred back to the stack pointer. Before, that temporary register was always picked to be x17, since the ABI allows clobbering x17 on any function call, resulting in the following instruction pattern being inserted before function calls and returns/tail calls: mov x17, sp and x17, x17, x16 mov sp, x17 However, x17 can be live in those locations, for example when the call is an indirect call, using x17 as the target address (blr x17). To fix this, this patch looks for an available register just before the call or terminator instruction and uses that. In the rare case when no register turns out to be available (this situation is only encountered twice across the whole test-suite), just insert a full speculation barrier at the start of the basic block where this occurs. Differential Revision: https://reviews.llvm.org/D56717 llvm-svn: 351930	2019-01-23 08:18:39 +00:00
Haojian Wu	87b0e3fda9	[clangd] Link clangTidy into clangd tests Patch by Nathan Ridge! Differential Revision: https://reviews.llvm.org/D57077 llvm-svn: 351929	2019-01-23 08:04:17 +00:00
Jonas Paulsson	961c47ec98	[SystemZ] Handle DBG_VALUE instructions in two places in backend. Two backend optimizations failed to handle cases when compiled with -g, due to failing to consider DBG_VALUE instructions. This was in SystemZTargetLowering::emitSelect() and SystemZElimCompare::getRegReferences(). This patch makes sure that DBG_VALUEs are recognized so that they do not affect these optimizations. Tests for branch-on-count, load-and-trap and consecutive selects. Review: Ulrich Weigand https://reviews.llvm.org/D57048 llvm-svn: 351928	2019-01-23 07:42:26 +00:00
Martin Storsjo	b5a5055704	Fix building sanitizers for MinGW The /EHsc flag is MSVC specific, not generic to the windows target. llvm-svn: 351927	2019-01-23 07:23:16 +00:00
Max Kazantsev	d9aee3c0d1	[IRCE] Support narrow latch condition for wide range checks This patch relaxes restrictions on types of latch condition and range check. In current implementation, they should match. This patch allows to handle wide range checks against narrow condition. The motivating example is the following: int N = ... for (long i = 0; (int) i < N; i++) { if (i >= length) deopt; } In this patch, the option that enables this support is turned off by default. We'll wait until it is switched to true. Differential Revision: https://reviews.llvm.org/D56837 Reviewed By: reames llvm-svn: 351926	2019-01-23 07:20:56 +00:00
Zinovy Nis	79639e9737	[doc] Fix svn property for bugprone-parent-virtual-call.rst llvm-svn: 351925	2019-01-23 06:46:27 +00:00
Richard Smith	cfa79b27b5	[ubsan] Check the correct size when sanitizing array new. We previously forgot to multiply the element size by the array bound. llvm-svn: 351924	2019-01-23 03:37:29 +00:00
Brendon Cahoon	59d9973146	[Pipeliner] Add two pragmas to control software pipelining optimization #pragma clang loop pipeline(disable) Disable SWP optimization for the next loop. “disable” is the only possible value. #pragma clang loop pipeline_initiation_interval(number) Set value of initiation interval for SWP optimization to specified number value for the next loop. Number is the positive value greater than 0. These pragmas could be used for debugging or reducing compile time purposes. It is possible to disable SWP for concrete loops to save compilation time or to find bugs by not doing SWP to certain loops. It is possible to set value of initiation interval to concrete number to save compilation time by not doing extra pipeliner passes or to check created schedule for specific initiation interval. That is llvm part of the fix Clang part of fix: https://reviews.llvm.org/D55710 Patch by Alexey Lapshin! Differential Revision: https://reviews.llvm.org/D56403 llvm-svn: 351923	2019-01-23 03:26:10 +00:00
Stephane Moore	05a449b481	Revert rCTE351921 to fix documentation geneeration. Original review: https://reviews.llvm.org/D56945 llvm-svn: 351922	2019-01-23 02:58:59 +00:00
Stephane Moore	be9eca442e	[clang-tidy] Delete obsolete objc-property-declaration options ✂️ Summary: The `Acronyms` and `IncludeDefaultAcronyms` options were deprecated in https://reviews.llvm.org/D51832. These options can be removed. Tested by running the clang-tidy tests. Reviewers: benhamilton, aaron.ballman Reviewed By: aaron.ballman Subscribers: Eugene.Zelenko, xazax.hun, cfe-commits Tags: #clang-tools-extra Differential Revision: https://reviews.llvm.org/D56945 llvm-svn: 351921	2019-01-23 02:34:21 +00:00
Peter Collingbourne	73078ecd38	hwasan: Move memory access checks into small outlined functions on aarch64. Each hwasan check requires emitting a small piece of code like this: https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#memory-accesses The problem with this is that these code blocks typically bloat code size significantly. An obvious solution is to outline these blocks of code. In fact, this has already been implemented under the -hwasan-instrument-with-calls flag. However, as currently implemented this has a number of problems: - The functions use the same calling convention as regular C functions. This means that the backend must spill all temporary registers as required by the platform's C calling convention, even though the check only needs two registers on the hot path. - The functions take the address to be checked in a fixed register, which increases register pressure. Both of these factors can diminish the code size effect and increase the performance hit of -hwasan-instrument-with-calls. The solution that this patch implements is to involve the aarch64 backend in outlining the checks. An intrinsic and pseudo-instruction are created to represent a hwasan check. The pseudo-instruction is register allocated like any other instruction, and we allow the register allocator to select almost any register for the address to check. A particular combination of (register selection, type of check) triggers the creation in the backend of a function to handle the check for specifically that pair. The resulting functions are deduplicated by the linker. The pseudo-instruction (really the function) is specified to preserve all registers except for the registers that the AAPCS specifies may be clobbered by a call. To measure the code size and performance effect of this change, I took a number of measurements using Chromium for Android on aarch64, comparing a browser with inlined checks (the baseline) against a browser with outlined checks. Code size: Size of .text decreases from 243897420 to 171619972 bytes, or a 30% decrease. Performance: Using Chromium's blink_perf.layout microbenchmarks I measured a median performance regression of 6.24%. The fact that a perf/size tradeoff is evident here suggests that we might want to make the new behaviour conditional on -Os/-Oz. But for now I've enabled it unconditionally, my reasoning being that hwasan users typically expect a relatively large perf hit, and ~6% isn't really adding much. We may want to revisit this decision in the future, though. I also tried experimenting with varying the number of registers selectable by the hwasan check pseudo-instruction (which would result in fewer variants being created), on the hypothesis that creating fewer variants of the function would expose another perf/size tradeoff by reducing icache pressure from the check functions at the cost of register pressure. Although I did observe a code size increase with fewer registers, I did not observe a strong correlation between the number of registers and the performance of the resulting browser on the microbenchmarks, so I conclude that we might as well use ~all registers to get the maximum code size improvement. My results are below: Regs \| .text size \| Perf hit -----+------------+--------- ~all \| 171619972 \| 6.24% 16 \| 171765192 \| 7.03% 8 \| 172917788 \| 5.82% 4 \| 177054016 \| 6.89% Differential Revision: https://reviews.llvm.org/D56954 llvm-svn: 351920	2019-01-23 02:20:10 +00:00
Peter Collingbourne	7cf27205df	gn build: Merge r351820. llvm-svn: 351919	2019-01-23 02:19:56 +00:00

1 2 3 4 5 ...

308125 Commits All Branches Search

308125 Commits

All Branches