llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	33d44b762e	[OpaquePtr][Inline] Use byval type instead of pointee type Reviewed By: #opaque-pointers, dblaikie Differential Revision: https://reviews.llvm.org/D105711	2021-08-19 09:56:08 -07:00
Owen Anderson	06a4c85890	Use v16i8 rather than v2i64 as the VT for memset expansion on AArch64. This allows the instruction selector to realize that it can directly broadcast the low byte of the memset value, rather than replicating it to a 64-bit GPR before broadcasting. This fixes PR50985. Differential Revision: https://reviews.llvm.org/D108354	2021-08-19 16:54:07 +00:00
Simon Pilgrim	94e1442d78	Fix unknown parameter Wdocumentation warnings. NFC.	2021-08-19 17:49:32 +01:00
Yi Kong	ca6d5813d1	[clang] Do not warn unused -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang Android enables zero initialisation globally by default, but also allows subprojects to override with a different option. Clang complains about the above flag being unused in this case. Instead of adding a 75 char long -no-* flag, don't warn about an unused argument for this flag. Differential Revision: https://reviews.llvm.org/D108278	2021-08-20 00:37:01 +08:00
Augie Fackler	e59c88294b	MemoryBuiltins: trailing , on collection literal This was probably bugging more than is reasonable, but it makes merging changes in this file slightly less annoying to have the trailing comma here. I only noticed this because Rust is currently carrying a patch to this file and it kept making life a little difficult.	2021-08-19 17:59:23 +02:00
Thomas Preud'homme	9d476f0af9	Fix CodeGen/X86/fsafdo_test2.ll fail in release Require debug build for CodeGen/X86/fsafdo_test2.ll since it checks for messages only printed in debug mode. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D108364	2021-08-19 16:54:04 +01:00
Simon Pilgrim	ff69c65b05	Fix empty paragraph passed to parameter Wdocumentation warning. NFC.	2021-08-19 16:48:28 +01:00
Craig Topper	84cea602f9	Revert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand." This reverts commit `add08c8741`. There was a compile time jump on tramp3d-v4 on https://llvm-compile-time-tracker.com/ Want to see if it goes away with this reverted.	2021-08-19 08:42:05 -07:00
Jinsong Ji	0541ce4ef9	[CRT][LIT] build the target_cflags for Popen properly We recently enabled crt for powerpc in https://reviews.llvm.org/rGb7611ad0b16769d3bf172e84fa9296158f8f1910. And we started to see some unexpected error message when running check-runtimes. eg: https://lab.llvm.org/buildbot/#/builders/57/builds/9488/steps/6/logs/stdio line 100 - 103: " clang-14: error: unknown argument: '-m64 -fno-function-sections' clang-14: error: unknown argument: '-m64 -fno-function-sections' clang-14: error: unknown argument: '-m64 -fno-function-sections' clang-14: error: unknown argument: '-m64 -fno-function-sections' " Looks like we shouldn't strip the space at the beginning, or else the command line passed to subprocess won't work well. Reviewed By: phosek, MaskRay Differential Revision: https://reviews.llvm.org/D108329	2021-08-19 15:39:53 +00:00
Alfsonso Gregory	b0bf0b2e79	[Clang][AST][NFC] Resolve FIXME: Make CXXRecordDecl *Record const. Differential Revision: https://reviews.llvm.org/D107477	2021-08-19 16:36:32 +01:00
Yaron Keren	1987eb9e9c	[docs] Document how to install sphinx and recommonmark on Ubuntu Differential Revision: https://reviews.llvm.org/D108374	2021-08-19 18:24:17 +03:00
David Green	d10f23a25d	[ISel] Expand saddsat and ssubsat via asr and xor This changes the lowering of saddsat and ssubsat so that instead of using: r,o = saddo x, y c = setcc r < 0 s = c ? INTMAX : INTMIN ret o ? s : r into using asr and xor to materialize the INTMAX/INTMIN constants: r,o = saddo x, y s = ashr r, BW-1 x = xor s, INTMIN ret o ? x : r https://alive2.llvm.org/ce/z/TYufgD This seems to reduce the instruction count in most testcases across most architectures. X86 has some custom lowering added to compensate for cases where it can increase instruction count. Differential Revision: https://reviews.llvm.org/D105853	2021-08-19 16:08:07 +01:00
Jinsong Ji	a9cc662722	[AIX] Remove XFAIL from macro-same-context We have enabled inline asm integrated assembler support, this test is passing now.	2021-08-19 14:52:58 +00:00
Simon Pilgrim	87c8c8ae97	Fix unknown parameter Wdocumentation warnings. NFC.	2021-08-19 15:40:10 +01:00
Simon Pilgrim	ae691648b4	Fix unknown parameter Wdocumentation warning. NFC.	2021-08-19 15:40:10 +01:00
Simon Pilgrim	fd37ead386	Fix unknown parameter Wdocumentation warning. NFC.	2021-08-19 15:40:10 +01:00
Simon Pilgrim	caa282a449	Fix unknown parameter Wdocumentation warning. NFC.	2021-08-19 15:40:09 +01:00
Simon Pilgrim	9419729b6a	[CostModel][X86] Add VPOPCNTDQ/BITALG ctpop costs VPOPCNTDQ + BITALG add ctpop instructions for vXi64/vXi32 + vXi16/vXi8 vector types respectively	2021-08-19 15:40:09 +01:00
Craig Topper	add08c8741	[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand. Previously we pre-calculated this and cached it for every instruction in the function. Most of the calculated results will never be used. So instead calculate it only on the first use, and then cache it. The cache was originally added to fix a compile time issue which caused r216066 to be reverted. This change exposed that we weren't pre-computing the Value for Arguments. I've explicitly disabled that for now as it seemed to regress some tests on AArch64 which has sext built into its compare instructions. Spotted while investigating how to improve heuristics to work better with RISCV preferring sign extend for unsigned compares for i32 on RV64. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107976	2021-08-19 07:18:33 -07:00
Craig Topper	c60a4c1ba5	[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC This matches how they are called and allows some isa/cast/dyn_cast to be removed. Differential Revision: https://reviews.llvm.org/D108333	2021-08-19 07:09:38 -07:00
Craig Topper	36d8316cc8	[RISCV] Reduce duplicate code for calling SimplifyDemandedBits. This encapsulates the APInt creation and worklist management into a helper function. To keep one common interface I've used Log2_32 in places that previously created a mask by subtracting 1 from a power of 2. Differential Revision: https://reviews.llvm.org/D108324	2021-08-19 07:09:38 -07:00
David Green	765a421276	[ARM] Add MVE min/max intrinsic tests. NFC	2021-08-19 14:33:34 +01:00
Alexey Lapshin	ab9d506be3	[DWARF][Verifier][NFC] Use reference to DWARFAddressRangesVector to avoid copying. Avoid copying while accessing RangesOrError.get().	2021-08-19 16:23:05 +03:00
Simon Pilgrim	2d60fdd7aa	[CostModel][X86] Add VPOPCNT/BITALG test coverage for ctpop/cttz costs	2021-08-19 14:05:58 +01:00
Ben Shi	b10e74389e	[RISCV][test] Improve tests for (add (mul x, c1), c2) Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D107710	2021-08-19 21:04:35 +08:00
Matthias Springer	76a1861816	[mlir][SparseTensor] Split scf.for loop into masked/unmasked parts Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores. Differential Revision: https://reviews.llvm.org/D107733	2021-08-19 21:53:11 +09:00
Sanjay Patel	ec54e275f5	Revert "[CVP] processSwitch: Remove default case when switch cover all possible values." This reverts commit `9934a5b2ed`. This patch may cause miscompiles because it missed a constraint as shown in the examples from: https://llvm.org/PR51531	2021-08-19 08:43:51 -04:00
Sanjay Patel	eee0ded337	[InstCombine] add min/max intrinsics as freely invertible candidates In the optimized test, we are able to peek through the min/max that has 2 min/max operands and invert them all: https://alive2.llvm.org/ce/z/7gYMN5	2021-08-19 08:41:38 -04:00
Sanjay Patel	610d3d512a	[InstCombine] add tests for min/max with inverts; NFC	2021-08-19 08:41:38 -04:00
Sanjay Patel	e10c3beca5	[InstCombine] add one-use check for min/max fold with not operands; NFC This makes the intrinsic logic match the cmp+select idiom folds just below. It's not clearly a win either way unless we think that a 'not' op costs more than min/max. The cmp+select folds on these patterns are more extensive than the intrinsics currently and may have some complicated interactions, so I'm trying to make those line up and bring the optimizations for intrinsics up to parity.	2021-08-19 08:41:38 -04:00
Jon Chesterfield	77579b99e9	[openmp][nfc] Replace OMPGridValues array with struct [nfc] Replaces enum indices into an array with a struct. Named the fields to match the enum, leaves memory layout and initialization unchanged. Motivation is to later safely remove dead fields and replace redundant ones with (compile time) computation. It should also be possible to factor some common fields into a base and introduce a gfx10 amdgpu instance with less duplication than the arrays of integers require. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D108339	2021-08-19 13:25:42 +01:00
Rosie Sumpter	d1aa075129	[LoopFlatten] Fix assertion failure There is an assertion failure in computeOverflowForUnsignedMul (used in checkOverflow) due to the inner and outer trip counts having different types. This occurs when the IV has been widened, but the loop components are not successfully rediscovered. This is fixed by some refactoring of the code in findLoopComponents which identifies the trip count of the loop. Differential Revision: https://reviews.llvm.org/D108107	2021-08-19 13:18:57 +01:00
Fraser Cormack	e6b1ac8546	[LegalizeTypes][VP] Add widening support for binary VP ops This patch adds the beginnings of more thorough support in the legalizers for vector-predicated (VP) operations. The first step is the ability to widen illegal vectors. The more complicated scenario in which the result/operands need widening but the mask doesn't has not been handled here. That would require a lot of code without an in-tree target on which to test it. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107904	2021-08-19 13:08:47 +01:00
Sam McCall	cab7c52acd	[CodeCompletion] Provide placeholders for known attribute arguments Completion now looks more like function/member completion: used alias(Aliasee) abi_tag(Tags...) Differential Revision: https://reviews.llvm.org/D108109	2021-08-19 14:03:41 +02:00
Matthew Devereau	734708e04f	[AArch64][SVE] Teach cost model that masked loads/stores are cheap Reduce the cost of VLS masked loads/stores to make the vectorizer emit them more frequently.	2021-08-19 13:01:33 +01:00
Ben Shi	9e40a32620	[RISCV][test] Add new tests for add optimization in the zba extension Reviewed By: asb Differential Revision: https://reviews.llvm.org/D108188	2021-08-19 19:59:23 +08:00
Simon Pilgrim	c1d9c2fb87	[X86] Regenerate store_op_load_fold.ll test checks	2021-08-19 12:42:09 +01:00
Sam McCall	a1ebae08f4	[CodeComplete] Only complete attributes that match the current LangOpts Differential Revision: https://reviews.llvm.org/D108111	2021-08-19 13:35:07 +02:00
Marco Elver	303d278ad2	[tsan] Fix pthread_once() on Mac OS X Change `636428c727` enabled BlockingRegion hooks for pthread_once(). Unfortunately this seems to cause crashes on Mac OS X which uses pthread_once() from locations that seem to result in crashes: | ThreadSanitizer:DEADLYSIGNAL | ==31465==ERROR: ThreadSanitizer: stack-overflow on address 0x7ffee73fffd8 (pc 0x00010807fd2a bp 0x7ffee7400050 sp 0x7ffee73fffb0 T93815) | #0 __tsan::MetaMap::GetSync(__tsan::ThreadState, unsigned long, unsigned long, bool, bool) tsan_sync.cpp:195 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x78d2a) | #1 __tsan::MutexPreLock(__tsan::ThreadState, unsigned long, unsigned long, unsigned int) tsan_rtl_mutex.cpp:143 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x6cefc) | #2 wrap_pthread_mutex_lock sanitizer_common_interceptors.inc:4240 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x3dae0) | #3 flockfile <null>:2 (libsystem_c.dylib:x86_64+0x38a69) | #4 puts <null>:2 (libsystem_c.dylib:x86_64+0x3f69b) | #5 wrap_puts sanitizer_common_interceptors.inc (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x34d83) | #6 __tsan::OnPotentiallyBlockingRegionBegin() cxa_guard_acquire.cpp:8 (foo:x86_64+0x100000e48) | #7 wrap_pthread_once tsan_interceptors_posix.cpp:1512 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x2f6e6) From the stack trace it can be seen that the caller is unknown, and the resulting stack-overflow seems to indicate that whoever the caller is does not have enough stack space or otherwise is running in a limited environment not yet ready for full instrumentation. Fix it by reverting behaviour on Mac OS X to not call BlockingRegion hooks from pthread_once(). Reported-by: azharudd Reviewed By: glider Differential Revision: https://reviews.llvm.org/D108305	2021-08-19 13:17:45 +02:00
Frederik Gossen	c20cb5547d	Avoid unused variable when NDEBUG	2021-08-19 13:00:16 +02:00
Sven van Haastregt	7bda1a0711	[OpenCL] Fix as_type(vec3) invalid store creation With -fpreserve-vec3-type enabled, a cast was not created when converting from a vec3 type to a non-vec3 type, even though a conversion to vec4 was performed. This resulted in creation of invalid store instructions. Differential Revision: https://reviews.llvm.org/D107963	2021-08-19 11:57:09 +01:00
Bjorn Pettersson	36d5138619	[NewPM] Make some sanitizer passes parameterized in the PassRegistry Refactored implementation of AddressSanitizerPass and HWAddressSanitizerPass to use pass options similar to passes like MemorySanitizerPass. This makes sure that there is a single mapping from class name to pass name (needed by D108298), and options like -debug-only and -print-after makes a bit more sense when (despite that it is the unparameterized pass name that should be used in those options). A result of the above is that some pass names are removed in favor of the parameterized versions: - "khwasan" is now "hwasan<kernel;recover>" - "kasan" is now "asan<kernel>" - "kmsan" is now "msan<kernel>" Differential Revision: https://reviews.llvm.org/D105007	2021-08-19 12:43:37 +02:00
Yaron Keren	23b16d2453	[docs] Document that psutil should be installed in non-user location Differential Revision: https://reviews.llvm.org/D108356	2021-08-19 13:42:31 +03:00
Renato Golin	894ad26bd5	Update {Small}BitVector size_type definition SmallBitVector implements a level of indirection over BitVector by storing a smaller bit-vector in a pointer-sized element, or in case the number of elements exceeds the bucket size, it creates a new pointer to a BitVector and uses that as its storage. However, the functions returning the vector size were using `unsigned`, which is ok for BitVector, but not for SmallBitVector, which is actually `uintptr_t`. This commit reuses the `size_type` definition to more than just `count` and propagates them into range iteration, size calculation, etc. This is a continuation of D108124. I haven't changed all occurrences of `unsigned` or `uintptr_t` to `size_type`, just those that were directly related. Following directions from clang-tidy on case of variables. Differential Revision: https://reviews.llvm.org/D108290	2021-08-19 11:13:38 +01:00
Andrzej Warzynski	dcc6b7b1d5	[OptTable] Refine how `printHelp` treats empty help texts Currently, `printHelp` behaves differently for options that: * do not define `HelpText` (such options _are not printed_), and * define its `HelpText` as `HelpText<"">` (such options _are printed_). In practice, both approaches lead to no help text and `printHelp` should treat them consistently. This patch addresses that by making `printHelp` check the length of the help text to be printed. All affected tests have been updated accordingly. The option definitions for llvm-cvtres have been updated with a short description or "Not implemented" for options that are ignored by the tool. Differential Revision: https://reviews.llvm.org/D107557	2021-08-19 09:30:15 +00:00
Martin Storsjö	cc3affd8b0	[clang] [MSVC] Implement __mulh and __umulh builtins for aarch64 The code is based on the same __mulh and __umulh intrinsics for x86. This should fix PR51128. Differential Revision: https://reviews.llvm.org/D106721	2021-08-19 11:29:55 +03:00
David Sherwood	f4122398e7	[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64 I have added a new TTI interface called enableOrderedReductions() that controls whether or not ordered reductions should be enabled for a given target. By default this returns false, whereas for AArch64 it returns true and we rely upon the cost model to make sensible vectorisation choices. It is still possible to override the new TTI interface by setting the command line flag: -force-ordered-reductions=true|false I have added a new RUN line to show that we use ordered reductions by default for SVE and Neon: Transforms/LoopVectorize/AArch64/strict-fadd.ll Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll Differential Revision: https://reviews.llvm.org/D106653	2021-08-19 09:29:40 +01:00
Stuart Ellis	520e5db26a	[flang][driver] Add print function name Plugin example Replacing Hello World example Plugin with one that counts and prints the names of functions and subroutines. This involves changing the `PluginParseTreeAction` Plugin base class to inherit from `PrescanAndSemaAction` class to get access to the Parse Tree so that the Plugin can walk it. Additionally, there are tests of this new Plugin to check it prints the correct things in different circumstances. Depends on: D106137 Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D107089	2021-08-19 08:25:34 +00:00
Matthias Springer	8e8b70aa84	[mlir][scf] Simplify affine.min ops after loop peeling Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body. affine.min ops such as: ``` map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)> %r = affine.min #affine.min #map(%iv)[%step, %ub] ``` are rewritten into (in the case of the peeled loop): ``` %r = %step ``` To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized. Differential Revision: https://reviews.llvm.org/D107222	2021-08-19 17:24:53 +09:00
Diana Picus	3330b2532f	[flang] Add POSIX implementation for SYSTEM_CLOCK This is very similar to CPU_TIME, except that we return nanoseconds rather than seconds. This means we're potentially dealing with rather large numbers, so we'll have to wrap around to avoid overflows. Differential Revision: https://reviews.llvm.org/D105970	2021-08-19 07:39:37 +00:00
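
The saddsat lowering described in commit d10f23a25d above can be illustrated with a small model. This is only a sketch in Python, not the SelectionDAG code from that change: the helper names (`to_signed`, `saddo`, `saddsat_select`, `saddsat_ashr_xor`) and the explicit bit-width parameter `bw` are assumptions made for the illustration. It models both expansions from the commit message so they can be compared.

```python
def to_signed(value: int, bw: int) -> int:
    """Interpret the low `bw` bits of `value` as a two's-complement integer."""
    value &= (1 << bw) - 1
    return value - (1 << bw) if value >> (bw - 1) else value


def saddo(x: int, y: int, bw: int):
    """Model of `r,o = saddo x, y`: wrapped sum plus a signed-overflow flag."""
    exact = x + y
    r = to_signed(exact, bw)
    return r, r != exact


def saddsat_select(x: int, y: int, bw: int) -> int:
    """Earlier expansion: c = setcc r < 0; s = c ? INTMAX : INTMIN; ret o ? s : r."""
    int_min = -(1 << (bw - 1))
    int_max = (1 << (bw - 1)) - 1
    r, o = saddo(x, y, bw)
    s = int_max if r < 0 else int_min
    return s if o else r


def saddsat_ashr_xor(x: int, y: int, bw: int) -> int:
    """Newer expansion: s = ashr r, BW-1; x = xor s, INTMIN; ret o ? x : r."""
    int_min = -(1 << (bw - 1))
    r, o = saddo(x, y, bw)
    s = r >> (bw - 1)                  # arithmetic shift: 0 if r >= 0, -1 if r < 0
    sat = to_signed(s ^ int_min, bw)   # 0 -> INTMIN, -1 -> INTMAX
    return sat if o else r
```

On overflow the wrapped result has the opposite sign of the true sum, so its sign bit alone selects the saturation constant; the shift and xor compute that constant without a compare and select.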

397106 Commits

All Branches