Commit Graph

1976 Commits

Author SHA1 Message Date
Ahmed Bougacha 4020363746 MC CFG: Remap enough for data too, analoguous to r188873.
llvm-svn: 188925
2013-08-21 19:40:28 +00:00
Ahmed Bougacha 47c2a75e8d Style cleanup following David's review for r188876.
llvm-svn: 188924
2013-08-21 19:40:25 +00:00
Ahmed Bougacha 1792647942 MC CFG: Add YAML MCModule representation to enable MC CFG testing.
Like yaml ObjectFiles, this will be very useful for testing the MC CFG
implementation (mostly MCObjectDisassembler), by matching the output
with YAML, and for potential users of the MC CFG, by using it as an input.

There isn't much to the actual format, it is just a serialization of the
MCModule class. Of note:
  - Basic block references (pred/succ, ..) are represented by the BB's
    start address.
  - Just as in the MC CFG, instructions are MCInsts with a size.
  - Operands have a prefix representing the type (only register and
    immediate supported here).
  - Instruction opcodes are represented by their names; enum values aren't
    stable, enum names mostly are: usually, a change to a name would need
    lots of changes in the backend anyway.
    Same with registers.

All in all, an example is better than 1000 words, here goes:

A simple binary:

  Disassembly of section __TEXT,__text:
  _main:
  100000f9c:      48 8b 46 08             movq    8(%rsi), %rax
  100000fa0:      0f be 00                movsbl  (%rax), %eax
  100000fa3:      3b 04 25 48 00 00 00    cmpl    72, %eax
  100000faa:      0f 8c 07 00 00 00       jl      7 <.Lend>
  100000fb0:      2b 04 25 48 00 00 00    subl    72, %eax
  .Lend:
  100000fb7:      c3                      ret

And the (pretty verbose) generated YAML:

  ---
  Atoms:
    - StartAddress:    0x0000000100000F9C
      Size:            20
      Type:            Text
      Content:
        - Inst:            MOV64rm
          Size:            4
          Ops:             [ RRAX, RRSI, I1, R, I8, R ]
        - Inst:            MOVSX32rm8
          Size:            3
          Ops:             [ REAX, RRAX, I1, R, I0, R ]
        - Inst:            CMP32rm
          Size:            7
          Ops:             [ REAX, R, I1, R, I72, R ]
        - Inst:            JL_4
          Size:            6
          Ops:             [ I7 ]
    - StartAddress:    0x0000000100000FB0
      Size:            7
      Type:            Text
      Content:
        - Inst:            SUB32rm
          Size:            7
          Ops:             [ REAX, REAX, R, I1, R, I72, R ]
    - StartAddress:    0x0000000100000FB7
      Size:            1
      Type:            Text
      Content:
        - Inst:            RET
          Size:            1
          Ops:             [  ]
  Functions:
    - Name:            __text
      BasicBlocks:
        - Address:         0x0000000100000F9C
          Preds:           [  ]
          Succs:           [ 0x0000000100000FB7, 0x0000000100000FB0 ]
     <snip>
  ...

llvm-svn: 188890
2013-08-21 07:29:02 +00:00
Ahmed Bougacha 69a7562335 MC CFG: Support disassembly at arbitrary addresses in MCObjectDisassembler.
llvm-svn: 188889
2013-08-21 07:28:55 +00:00
Ahmed Bougacha 518cc6f811 MC CFG: Use data structures more appropriate than std::set.
llvm-svn: 188888
2013-08-21 07:28:51 +00:00
Ahmed Bougacha 58ed11341b MC CFG: Add an MCObjectSymbolizer in the MCObjectDisassembler.
Used to detect calls to function symbol stubs (future commit).

llvm-svn: 188887
2013-08-21 07:28:48 +00:00
Ahmed Bougacha b09d140f6b MC CFG: Add MCObjectDisassembler Mach-O implementation.
Supports:
- entrypoint, using LC_MAIN.
- static ctors/dtors, using __mod_{init,exit}_func
- translation between effective and object load address, using
  dyld's VM address slide.

llvm-svn: 188886
2013-08-21 07:28:44 +00:00
Ahmed Bougacha 2eb593682a MC CFG: Add "dynamic disassembly" support to MCObjectDisassembler.
It can now disassemble code in situations where the effective load
address is different than the load address declared in the object file.
This happens for PIC, hence "dynamic".

llvm-svn: 188884
2013-08-21 07:28:37 +00:00
Ahmed Bougacha 57bc9677cd MC CFG: When disassembly is impossible, fallback to data bytes.
This is the behavior of sequential disassemblers (llvm-objdump, ...),
when there is no instruction size hint (fixed-length, ...)

While there, also do some minor cleanup.

llvm-svn: 188883
2013-08-21 07:28:32 +00:00
Ahmed Bougacha a376353346 MC CFG: Add MCObjectDisassembler support for entrypoint + static ctors.
For now, this isn't implemented for any format.

llvm-svn: 188882
2013-08-21 07:28:29 +00:00
Ahmed Bougacha ff12d02d51 MC CFG: Split MCBasicBlocks to mirror atom splitting.
When an MCTextAtom is split, all MCBasicBlocks backed by it are
automatically split, with a fallthrough between both blocks, and
the successors moved to the second block.

llvm-svn: 188881
2013-08-21 07:28:24 +00:00
Ahmed Bougacha d3fc5b9648 MC CFG: Add a few needed methods, mainly MCModule::findFirstAtomAfter.
While there, do some minor cleanup.

llvm-svn: 188880
2013-08-21 07:28:17 +00:00
Ahmed Bougacha ffeecb5c80 MC: ObjectSymbolizer can now recognize external function stubs.
Only implemented in the Mach-O ObjectSymbolizer.
The testcase sadly introduces a new binary.

llvm-svn: 188879
2013-08-21 07:28:13 +00:00
Ahmed Bougacha 382a6d7562 MC: Refactor ObjectSymbolizer to make relocation/section info generation lazy.
llvm-svn: 188878
2013-08-21 07:28:07 +00:00
Ahmed Bougacha 3012ac5387 MC CFG: Add more MCFunction container methods (find, empty).
llvm-svn: 188876
2013-08-21 07:27:59 +00:00
Ahmed Bougacha 7bfc7da6e8 MC CFG: Keep pointer to parent MCModule in created MCFunctions.
Also, drive-by cleaning around createFunction.

llvm-svn: 188875
2013-08-21 07:27:55 +00:00
Ahmed Bougacha d6351e76d5 MC CFG: Don't insert preds/succs again.
llvm-svn: 188874
2013-08-21 07:27:50 +00:00
Ahmed Bougacha c43aa4e88c MC CFG: Remap enough for the inserted instruction.
llvm-svn: 188873
2013-08-21 07:27:47 +00:00
Vladimir Medic 9bad0d33b6 Fix style issues in AsmParser.cpp
llvm-svn: 188798
2013-08-20 13:33:18 +00:00
Tim Northover 1f25623449 Support C99 hexadecimal floating-point literals in assembly
It's useful to be able to write down floating-point numbers without having to
worry about what they'll be rounded to (as C99 discovered), this extends that
ability to the MC assembly parsers.

llvm-svn: 188370
2013-08-14 14:23:31 +00:00
David Majnemer 3d96acb735 [-cxx-abi microsoft] Stick zero initialized symbols into the .bss section for COFF
Summary:
We need to do two things:

- Initialize BSSSection in MCObjectFileInfo::InitCOFFMCObjectFileInfo
- Teach TargetLoweringObjectFileCOFF::SelectSectionForGlobal what to do
  with it

This fixes PR16861.

Reviewers: rnk

Reviewed By: rnk

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1361

llvm-svn: 188244
2013-08-13 01:23:53 +00:00
Saleem Abdulrasool 4208b61858 [CodeGen] prevent abnormal on invalid attributes
Currently, when an invalid attribute is encountered on processing a .s file,
clang will abort due to llvm_unreachable.  Invalid user input should not cause
an abnormal termination of the compiler.  Change the interface to return a
boolean to indicate the failure as a first step towards improving hanlding of
malformed user input to clang.

Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
llvm-svn: 188047
2013-08-09 01:52:03 +00:00
Michael J. Spencer 126973ba93 [Object] Split the ELF interface into 3 parts.
* ELFTypes.h contains template magic for defining types based on endianess, size, and alignment.
* ELFFile.h defines the ELFFile class which provides low level ELF specific access.
* ELFObjectFile.h contains ELFObjectFile which uses ELFFile to implement the ObjectFile interface.

llvm-svn: 188022
2013-08-08 22:27:13 +00:00
Eric Christopher 42ae459174 Fix a FIXME, on darwin all virtual sections have a zerofill type.
llvm-svn: 187912
2013-08-07 21:13:01 +00:00
Eric Christopher ed420bb5c3 Move assert above first use of variable that we'd be asserting on.
llvm-svn: 187899
2013-08-07 18:51:09 +00:00
Benjamin Kramer 1df3a1f678 AsmParser: Store MacroLikeBodies on the side so they don't get leaked.
llvm-svn: 187702
2013-08-04 09:06:29 +00:00
Duncan Sands 3194fca45f Pacify GCC, which worries about falling off the end of the switch.
llvm-svn: 187649
2013-08-02 09:37:20 +00:00
Daniel Malea a3d4245a72 Fixed the Intel-syntax X86 disassembler to respect the (existing) option for hexadecimal immediates, to match AT&T syntax. This also brings a new option for C-vs-MASM-style hex.
Patch by Richard Mitton
Reviewed: http://llvm-reviews.chandlerc.com/D1243

llvm-svn: 187614
2013-08-01 21:18:16 +00:00
Nico Rieck 2c9c89b21d MC: Support larger COFF string tables
Single-slash encoded entries do not require a terminating null. This bumps
the maximum table size from ~1MB to ~9.5MB.

llvm-svn: 187352
2013-07-29 12:30:12 +00:00
Bill Schmidt 0a9170d931 [PowerPC] Support powerpc64le as a syntax-checking target.
This patch provides basic support for powerpc64le as an LLVM target.
However, use of this target will not actually generate little-endian
code.  Instead, use of the target will cause the correct little-endian
built-in defines to be generated, so that code that tests for
__LITTLE_ENDIAN__, for example, will be correctly parsed for
syntax-only testing.  Code generation will otherwise be the same as
powerpc64 (big-endian), for now.

The patch leaves open the possibility of creating a little-endian
PowerPC64 back end, but there is no immediate intent to create such a
thing.

The LLVM portions of this patch simply add ppc64le coverage everywhere
that ppc64 coverage currently exists.  There is nothing of any import
worth testing until such time as little-endian code generation is
implemented.  In the corresponding Clang patch, there is a new test
case variant to ensure that correct built-in defines for little-endian
code are generated.

llvm-svn: 187179
2013-07-26 01:35:43 +00:00
Ahmed Bougacha 555575c32b Revert "Remove use of asymmetric std::lower_bound comparator."
This reverts commit r185676.
Originally done because of VS 2008.

llvm-svn: 186969
2013-07-23 17:44:11 +00:00
Rafael Espindola 6d35481c94 Add a wrapper for open.
This centralizes the handling of O_BINARY and opens the way for hiding more
differences (like how open behaves with directories).

llvm-svn: 186447
2013-07-16 19:44:17 +00:00
Craig Topper d3a34f81f8 Add 'const' qualifiers to static const char* variables.
llvm-svn: 186371
2013-07-16 01:17:10 +00:00
Reid Kleckner dae7b4e4d1 [mc-coff] Resolve aliases when emitting COFF relocations
This is consistent with the ELF object writer.

Add some COFF tests that relocate against an alias.

Reviewers: espindola

Differential Revision: http://llvm-reviews.chandlerc.com/D1079

llvm-svn: 186341
2013-07-15 19:41:21 +00:00
Craig Topper 5871321e49 Use llvm::array_lengthof to replace sizeof(array)/sizeof(array[0]).
llvm-svn: 186301
2013-07-15 04:27:47 +00:00
Tim Northover a630fb0b67 Put ELF COMDAT relocations into the relevant COMDAT group.
Patch from Игорь Пашев  (I do hope we support utf-8 commit messages; I
also hope he'll forgive me for transliterating it as Igor Pashev in
case things go horribly wrong).

llvm-svn: 186034
2013-07-10 20:58:17 +00:00
Ulrich Weigand 52cf8e4488 [PowerPC] Revert r185476 and fix up TLS variant kinds
In the commit message to r185476 I wrote:

>The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD
>correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD.
>This causes some confusion with the asm parser, since VK_PPC_TLSGD
>is output as @tlsgd, which is then read back in as VK_TLSGD.
>
>To avoid this confusion, this patch removes the PowerPC-specific
>modifiers and uses the generic modifiers throughout.  (The only
>drawback is that the generic modifiers are printed in upper case
>while the usual convention on PowerPC is to use lower-case modifiers.
>But this is just a cosmetic issue.)

This was unfortunately incorrect, there is is fact another,
serious drawback to using the default VK_TLSLD/VK_TLSGD
variant kinds: using these causes ELFObjectWriter::RelocNeedsGOT
to return true, which in turn causes the ELFObjectWriter to emit
an undefined reference to _GLOBAL_OFFSET_TABLE_.

This is a problem on powerpc64, because it uses the TOC instead
of the GOT, and the linker does not provide _GLOBAL_OFFSET_TABLE_,
so the symbol remains undefined.  This means shared libraries
using TLS built with the integrated assembler are currently
broken.

While the whole RelocNeedsGOT / _GLOBAL_OFFSET_TABLE_ situation
probably ought to be properly fixed at some point, for now I'm
simply reverting the r185476 commit.  Now this in turn exposes
the breakage of handling @tlsgd/@tlsld in the asm parser that
this check-in was originally intended to fix.

To avoid this regression, I'm also adding a different fix for
this problem: while common code now parses @tlsgd as VK_TLSGD,
a special hack in the asm parser translates this code to the
platform-specific VK_PPC_TLSGD that the back-end now expects.
While this is not really pretty, it's self-contained and
shouldn't hurt anything else for now.  One the underlying
problem is fixed, this hack can be reverted again.

llvm-svn: 185945
2013-07-09 16:41:09 +00:00
Kai Nacke c5cca5ab42 Revert: Fix wrong code offset for unwind code SET_FPREG.
llvm-svn: 185793
2013-07-08 04:48:34 +00:00
Kai Nacke 939ecd7ea0 Revert: Generate IMAGE_REL_AMD64_ADDR32NB relocations for SEH data structures.
llvm-svn: 185791
2013-07-08 04:46:55 +00:00
Kai Nacke 07bad44e9b Revert: Fix alignment of unwind data.
llvm-svn: 185790
2013-07-08 04:45:05 +00:00
Kai Nacke 42097301f6 Revert: Emit personality function and Dwarf EH data for Win64 SEH.
llvm-svn: 185788
2013-07-08 04:43:23 +00:00
Kai Nacke c947ad2a2d Emit personality function and Dwarf EH data for Win64 SEH.
Obviously the personality function should be emitted as language handler
instead of the hard coded _GCC_specific_handler. The language specific
data must be placed after the unwind information therefore it must not
be emitted into a separate section.

Reviewed by Charles Davis and Nico Rieck.

llvm-svn: 185761
2013-07-06 17:17:31 +00:00
Kai Nacke 4417cccba3 Fix alignment of unwind data.
For alignment purposes, the instruction array will always have an even
number of entries, with the final entry potentially unused (in which
case the array will be one longer than indicated by the count of unwind
codes field).

Reviewed by Charles Davis and Nico Rieck.

llvm-svn: 185760
2013-07-06 17:16:50 +00:00
Kai Nacke 2a933a6549 Generate IMAGE_REL_AMD64_ADDR32NB relocations for SEH
data structures.

The Win64 EH data structures must be of type IMAGE_REL_AMD64_ADDR32NB
instead of IMAGE_REL_AMD64_ADDR32. This is easiely achieved by adding
the VK_COFF_IMGREL32 modifier to the symbol reference.
Change also references to start and end of the SEH range of a function
as offsets to start of the function.

Reviewed by Charles Davis and Nico Rieck.

llvm-svn: 185759
2013-07-06 17:16:12 +00:00
Kai Nacke 66bfdb8354 Fix wrong code offset for unwind code SET_FPREG.
The code offset for unwind code SET_FPREG is wrong because it is set
to constant 0. The fix is to do the same as for the other unwind
codes: emit a label and later the absolute difference between the
label and the begin of the prologue.
Also enables the failing test case MC/COFF/seh.s

Reviewed by Charles Davis and Nico Rieck.

llvm-svn: 185758
2013-07-06 17:15:36 +00:00
Nico Rieck a37acf702d MC: Implement COFF .linkonce directive
llvm-svn: 185753
2013-07-06 12:13:10 +00:00
Ahmed Bougacha 156a2deafe Remove use of asymmetric std::lower_bound comparator.
VS 2008 doesn't like it when in debug mode.

llvm-svn: 185676
2013-07-04 23:20:12 +00:00
Nico Rieck 1558c5a6ee MC: Add .section directive to COFF
Supports GAS flags "abdnrswxy". No support for alignment or subsections.

Fixes PR16366.

llvm-svn: 185669
2013-07-04 21:32:07 +00:00
Ulrich Weigand 4050995650 [PowerPC] Remove VK_PPC_TLSGD and VK_PPC_TLSLD
The PowerPC-specific modifiers VK_PPC_TLSGD and VK_PPC_TLSLD
correspond exactly to the generic modifiers VK_TLSGD and VK_TLSLD.
This causes some confusion with the asm parser, since VK_PPC_TLSGD
is output as @tlsgd, which is then read back in as VK_TLSGD.

To avoid this confusion, this patch removes the PowerPC-specific
modifiers and uses the generic modifiers throughout.  (The only
drawback is that the generic modifiers are printed in upper case
while the usual convention on PowerPC is to use lower-case modifiers.
But this is just a cosmetic issue.)

llvm-svn: 185476
2013-07-02 21:29:06 +00:00
Rafael Espindola 64e1af8eb9 Remove address spaces from MC.
This is dead code since PIC16 was removed in 2010. The result was an odd mix,
where some parts would carefully pass it along and others would assert it was
zero (most of the object streamer for example).

llvm-svn: 185436
2013-07-02 15:49:13 +00:00