The function strcmp() can return any value, not just {-1,0,1} : "The strcmp(const char *s1, const char *s2) function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2." [C11 7.24.4.2p3]
https://llvm.org/bugs/show_bug.cgi?id=23790http://reviews.llvm.org/D16317
llvm-svn: 270154
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
Reviewers: tra, rsmith
Subscribers: jhen, cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D19990
llvm-svn: 270150
According to Cuda Programming guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared__ variables defined in namespace
> scope, that are of class type, cannot have a non-empty constructor or a
> non-empty destructor.
Clang already deals with device-side constructors (see D15305).
This patch enforces similar rules for destructors.
Differential Revision: http://reviews.llvm.org/D20140
llvm-svn: 270108
Codegen tests for device-side variable initialization are subset of test
cases used to verify Sema's part of the job.
Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to
keep both sides in sync.
Differential Revision: http://reviews.llvm.org/D20139
llvm-svn: 270107
This matches default nvcc behavior and gives substantial
performance boost on GPU where fmad is much cheaper compared to add+mul.
Differential Revision: http://reviews.llvm.org/D20341
llvm-svn: 270094
All additional include directories are relative to the toolchain install
folder. So let's do not pass this folder to each callback to simplify
and slightly reduce the code.
llvm-svn: 270069
CodeSourcery toolchain is a standalone toolchain which always uses
the same triple name in its paths. It is independent from target
triple used by the driver.
llvm-svn: 270067
- Fixed cdp intrinsic to only accept compile time
constant values previously you could pass in a
variable to the builtin which would result in
illegal llvm assembly output
Differential Revision: http://reviews.llvm.org/D20394
llvm-svn: 270058
This reversal is being done with r267453's author's (i.e. Richard Smith's) permission.
This fixes https://llvm.org/bugs/show_bug.cgi?id=27601
Also, per Richard's request the examples from the bug report have been added to our test suite.
llvm-svn: 270016
an identifier table lookup, *and* copy the LangOptions (including various
std::vector<std::string>s). Twice. We call this function once each time we start
parsing a declaration specifier sequence, and once for each call to Sema::Diag.
This reduces the compile time for a sample .c file from the linux kernel by 20%.
llvm-svn: 270009
instance method.
When diagnosing unimplemented class property, make sure we emit
a warning when we only see an instance method with the right selector.
Also warn when we only see a class method for an instance property.
rdar://26141719
llvm-svn: 269968
Summary:
-fembed-bitcode was only checking for old style LTO flag (-flto) but not
considering the new -flto= style option. That makes clang output bitcode
embedded in bitcode object when using -flto= and -fembed-bitcode= together.
Now clang should output normal bitcode file when using LTO and ignores
-fembed-bitcode option.
Reviewers: joker.eph
Subscribers: joker.eph, cfe-commits
Differential Revision: http://reviews.llvm.org/D20374
llvm-svn: 269961
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction
execution and enter an implementation-dependent optimized state until
occurrence of a class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper
Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits
Differential Revision: http://reviews.llvm.org/D19796
llvm-svn: 269907