llvm-project/clang/lib
Chandler Carruth 4c5e8ccf74 [x86] Fix a really nasty bug introduced in r276417 where alignment
constraints were added to _mm256_broadcast_{pd,ps} Intel intrinsics.

The spec for these intrinsics is ... pretty much silent on alignment.
This is especially frustrating considering the amount of discussion of
alignment in the load and store intrinsics. So I was forced to rely on
the specification for the VBROADCASTF128 instruction.

That instruction's spec is *also* completely silent on alignment.
Fortunately, when it comes to the instruction's spec, silence is enough.
There is no #GP fault option for an underaligned address, so this
instruction, and by inference the intrinsic, can read any alignment.

As it happens, the old code worked exactly this way and in fact we have
plenty of code that hands pointers with less than 16-byte alignment to
these intrinsics. This code broke pretty spectacularly with this commit.
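
To illustrate the kind of caller described above, here is a minimal sketch in
portable C. It is not the clang header or the CodeGen change itself; the names
broadcast128_model and demo_underaligned are hypothetical, and memcpy stands in
for the 128-bit read so the example runs without AVX. On AVX hardware the
corresponding call would be _mm256_broadcast_ps((const __m128 *)p).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical portable model of a 128-bit broadcast: read 16 bytes from
   src at any alignment and duplicate them into both halves of out. */
static void broadcast128_model(const void *src, float out[8]) {
    float lane[4];
    memcpy(lane, src, sizeof lane);      /* alignment-agnostic 16-byte read */
    memcpy(out, lane, sizeof lane);      /* low 128 bits */
    memcpy(out + 4, lane, sizeof lane);  /* high 128 bits */
}

/* Returns 1 if a broadcast from a pointer that is only 4-byte aligned
   yields the expected duplicated lanes, 0 otherwise. */
static int demo_underaligned(void) {
    _Alignas(16) float buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    float out[8];
    const float *p = buf + 1;  /* 4 bytes past a 16-byte boundary */

    broadcast128_model(p, out);
    for (size_t i = 0; i < 8; ++i) {
        if (out[i] != buf[1 + (i % 4)])
            return 0;
    }
    return 1;
}
```

A compiler that assumes 16-byte alignment for this read is free to emit an
aligned load, which is where a pointer like p above would go wrong.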

Fortunately, the fix is super simple! Change a 16 to a 1, and ta da!

Anyways, a lot of debugging for a really boring fix. =]

llvm-svn: 278202
2016-08-10 07:32:47 +00:00
..
ARCMigrate [NFC] Header cleanup 2016-07-18 19:02:11 +00:00
AST Fix typos from r277797 and unused variable from r277889. 2016-08-06 01:44:06 +00:00
ASTMatchers [ASTMatchers] Add matchers canReferToDecl() and hasUnderlyingDecl() 2016-08-09 15:07:52 +00:00
Analysis [analyzer] Try to fix coverity CID 1360469. 2016-08-09 10:00:23 +00:00
Basic [Diag] Fix idiom in comment: "on the lam", not "on the lamb". 2016-08-10 01:09:07 +00:00
CodeGen [x86] Fix a really nasty bug introduced in r276417 where alignment 2016-08-10 07:32:47 +00:00
Driver [OpenCL] Handle -cl-fp32-correctly-rounded-divide-sqrt 2016-08-09 20:10:18 +00:00
Edit [OpenCL] Generate opaque type for sampler_t and function call for the initializer 2016-07-28 19:26:30 +00:00
Format clang-format: Add SpaceAfterTemplate 2016-08-09 14:24:40 +00:00
Frontend [OpenCL] Handle -cl-fp32-correctly-rounded-divide-sqrt 2016-08-09 20:10:18 +00:00
FrontendTool [analyzer] Command line option to show enabled checker list. 2016-08-08 13:41:04 +00:00
Headers [CUDA] Add __device__ overloads for placement new and delete. 2016-08-10 01:09:14 +00:00
Index [index] Fix crash with indexing designated init expressions inside templates. 2016-08-03 05:38:53 +00:00
Lex Reapply r276973 "Adjust Registry interface to not require plugins to export a registry" 2016-08-05 11:01:08 +00:00
Parse Pass information in a record instead of stack. NFC 2016-08-08 04:02:15 +00:00
Rewrite Remove use of builtin comma operator. 2016-02-18 22:34:54 +00:00
Sema [CUDA] Reject calls to __device__ functions from host variable global initializers. 2016-08-10 01:09:21 +00:00
Serialization [ASTReader] Use real move semantics instead of emulating them in the copy ctor. 2016-08-06 12:45:16 +00:00
StaticAnalyzer [analyzer] Change -analyze-function to accept qualified names. 2016-08-08 16:01:02 +00:00
Tooling Fixes calculateRangesAfterReplacements crash when Replacements is empty. 2016-08-08 13:37:39 +00:00
CMakeLists.txt Fix build with various feature flag combinations 2014-07-14 22:17:22 +00:00