llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	e8f7a95941	Add a test case that I've been using to clarify the bitfield layout for both LE and BE targets. AFAICT, Clang gets this correct for PPC64. I've compared it to GCC 4.8 output for PPC64 (thanks Roman!) and to my limited ability to read Power assembly, it looks functionally equivalent. It would be really good to fill in the assertions on this test case for x86-32, PPC32, ARM, etc., but I've reached the limit of my time and energy... Hopefully other folks can chip in as it would be good to have this in place to test any subsequent changes. To those who care about PPC64 performance, a side note: there is some obnoxiously bad code generated for these test cases. It would be worth someone's time to sit down and teach the PPC backend to pattern match these IR constructs better. It appears that things like '(shr %foo, <imm>)' turn into 'rldicl R, R, 64-<imm>, <imm>' or some such. They don't even get combined with other 'rldicl' instructions immediately adjacent. I'll add a couple of these patterns to the README, but I think it would be better to look at all the patterns produced by this and other bitfield access code, and systematically build up a collection of patterns that efficiently reduce them to the minimal code. llvm-svn: 169693	2012-12-09 10:08:22 +00:00
Chandler Carruth	fd8eca202f	Fix the bitfield record layout in codegen for big endian targets. This was an egregious bug due to the several iterations of refactorings that took place. Size no longer meant what it originally did by the time I finished, but this line of code never got updated. Unfortunately we had essentially zero tests for this in the regression test suite. =[ I've added a PPC64 run over the bitfield test case I've been primarily using. I'm still looking at adding more tests and making sure this is the correct bitfield access code on PPC64 Linux, but it looks pretty close to me, and it is worlds better than before this patch as it no longer asserts! =] More commits to follow with at least additional tests and maybe more fixes. Sorry for the long breakage due to this.... llvm-svn: 169691	2012-12-09 07:26:04 +00:00
Chandler Carruth	ff0e3a1e1c	Rework the bitfield access IR generation to address PR13619 and generally support the C++11 memory model requirements for bitfield accesses by relying more heavily on LLVM's memory model. The primary change this introduces is to move from a manually aligned and strided access pattern across the bits of the bitfield to a much simpler lump access of all bits in the bitfield followed by math to extract the bits relevant for the particular field. This simplifies the code significantly, but relies on LLVM to intelligently lower these integers. I have tested LLVM's lowering both synthetically and in benchmarks. The lowering appears to be functional, and there are no really significant performance regressions. Different code patterns accessing bitfields will vary in how this impacts them. The only real regressions I'm seeing are a few patterns where the LLVM code generation for loads that feed directly into a mask operation doesn't take advantage of the x86 ability to do a smaller load and a cheap zero-extension. This doesn't regress any benchmark in the nightly test suite on my box past the noise threshold, but my box is quite noisy. I'll be watching the LNT numbers, and will look into further improvements to the LLVM lowering as needed. llvm-svn: 169489	2012-12-06 11:14:44 +00:00
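
The "lump access followed by math" approach described in ff0e3a1e1c can be illustrated with a minimal C sketch. This is not Clang's code: the helper names (`low_mask`, `field_load`, `field_store`, `be_shift`), the 64-bit storage unit, and the offset conventions are all assumptions made for illustration. The last helper assumes the common big-endian convention that fields are allocated starting from the most significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask with the low `width` bits set (1 <= width <= 64). */
static uint64_t low_mask(unsigned width) {
    return width >= 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* Read a field: one load of the whole storage unit, then shift and mask.
   `shift` is counted from the least significant bit. */
static uint64_t field_load(const uint64_t *storage, unsigned shift, unsigned width) {
    assert(width >= 1 && shift + width <= 64);
    uint64_t all_bits = *storage;                 /* single "lump" load */
    return (all_bits >> shift) & low_mask(width);
}

/* Write a field: load all bits, clear the field, merge the new value, store all bits. */
static void field_store(uint64_t *storage, unsigned shift, unsigned width, uint64_t value) {
    assert(width >= 1 && shift + width <= 64);
    uint64_t mask = low_mask(width) << shift;
    uint64_t all_bits = *storage;
    all_bits = (all_bits & ~mask) | ((value << shift) & mask);
    *storage = all_bits;
}

/* Assumed big-endian convention: a field's offset is counted from the most
   significant bit, so it must be converted before shifting. */
static unsigned be_shift(unsigned storage_bits, unsigned offset_from_msb, unsigned width) {
    assert(offset_from_msb + width <= storage_bits);
    return storage_bits - offset_from_msb - width;
}
```

For example, a 7-bit field at bit offset 3 would be read with `field_load(&unit, 3, 7)`. If the same field sat 3 bits from the top of a 32-bit big-endian unit, `be_shift(32, 3, 7)` would give the shift to use instead.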

3 Commits
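
The `rldicl` observation in e8f7a95941 can be made concrete with a small C model. This is a sketch only: `rotl64`, `rldicl`, `extract_two_steps`, and `extract_combined` are hypothetical names, and the model assumes the usual reading of `rldicl RA, RS, SH, MB` as "rotate left by SH, then clear the high MB bits". It shows that a shift followed by a mask can be expressed as a single rotate-and-clear, which is the kind of combination the commit message says the backend was missing.

```c
#include <stdint.h>

/* Rotate left by n bits; the `& 63` keeps both shift counts in range. */
static uint64_t rotl64(uint64_t x, unsigned n) {
    n &= 63;
    return (x << n) | (x >> ((64 - n) & 63));
}

/* Model of `rldicl RA, RS, SH, MB` (assumed semantics, with mb in 0..63):
   rotate left by sh, then clear the high mb bits. */
static uint64_t rldicl(uint64_t rs, unsigned sh, unsigned mb) {
    return rotl64(rs, sh) & (UINT64_MAX >> (mb & 63));
}

/* Two adjacent instructions: a logical shift right, then a separate mask.
   Requires width >= 1 and shift + width <= 64. */
static uint64_t extract_two_steps(uint64_t x, unsigned shift, unsigned width) {
    uint64_t shifted = rldicl(x, 64 - shift, shift);  /* same as x >> shift */
    return rldicl(shifted, 0, 64 - width);            /* keep the low `width` bits */
}

/* One instruction with the same result under the same preconditions. */
static uint64_t extract_combined(uint64_t x, unsigned shift, unsigned width) {
    return rldicl(x, 64 - shift, 64 - width);
}
```

The combined form works because the bits that the rotate wraps around land above the field and are cleared by the mask.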