(turns out I had regressed this when sinking handling of this type down
into GetElementPtrInst::Create - since that asserted before the error
handling was performed)
llvm-svn: 232420
Similar to gep (r230786) and load (r230794) changes.
Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.
(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)
import fileinput
import sys
import re
rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)
def conv(match):
line = match.group(1)
line += match.group(4)
line += ", "
line += match.group(2)
return line
line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
sys.stdout.write(line[off:match.start()])
sys.stdout.write(conv(match))
off = match.end()
sys.stdout.write(line[off:])
llvm-svn: 232184
Summary:
DataLayout keeps the string used for its creation.
As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().
Get rid of DataLayoutPass: the DataLayout is in the Module
The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.
Make DataLayout Non-Optional in the Module
Module->getDataLayout() will never returns nullptr anymore.
Reviewers: echristo
Subscribers: resistor, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D7992
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
Essentially the same as the GEP change in r230786.
A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)
import fileinput
import sys
import re
pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")
for line in sys.stdin:
sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))
Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7649
llvm-svn: 230794
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.
This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.
* This doesn't modify gep operators, only instructions (operators will be
handled separately)
* Textual IR changes only. Bitcode (including upgrade) and changing the
in-memory representation will be in separate changes.
* geps of vectors are transformed as:
getelementptr <4 x float*> %x, ...
->getelementptr float, <4 x float*> %x, ...
Then, once the opaque pointer type is introduced, this will ultimately look
like:
getelementptr float, <4 x ptr> %x
with the unambiguous interpretation that it is a vector of pointers to float.
* address spaces remain on the pointer, not the type:
getelementptr float addrspace(1)* %x
->getelementptr float, float addrspace(1)* %x
Then, eventually:
getelementptr float, ptr addrspace(1) %x
Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.
update.py:
import fileinput
import sys
import re
ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
def conv(match, line):
if not match:
return line
line = match.groups()[0]
if len(match.groups()[5]) == 0:
line += match.groups()[2]
line += match.groups()[3]
line += ", "
line += match.groups()[1]
line += "\n"
return line
for line in sys.stdin:
if line.find("getelementptr ") == line.find("getelementptr inbounds"):
if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
line = conv(re.match(ibrep, line), line)
elif line.find("getelementptr ") != line.find("getelementptr ("):
line = conv(re.match(normrep, line), line)
sys.stdout.write(line)
apply.sh:
for name in "$@"
do
python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
rm -f "$name.tmp"
done
The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh
After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).
The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.
Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7636
llvm-svn: 230786
Since r199356, we've printed a warning when dropping debug info.
r225562 started crashing on that, since it registered a diagnostic
handler that only expected errors. This fixes the handler to expect
other severities. As a side effect, it now prints "error: " at the
start of error messages, similar to `llvm-as`.
There was a testcase for r199356, but it only really checked the
assembler. Move `test/Bitcode/drop-debug-info.ll` to `test/Assembler`,
and introduce `test/Bitcode/drop-debug-info.3.5.ll` (and companion
`.bc`) to test the bitcode reader.
Note: tools/gold/gold-plugin.cpp has an equivalent bug, but I'm not sure
what the best fix is there. I'll file a PR.
llvm-svn: 230416
Like r230414, add bitcode support including backwards compatibility, for
an explicit type parameter to GEP.
At the suggestion of Duncan I tried coalescing the two older bitcodes into a
single new bitcode, though I did hit a wrinkle: I couldn't figure out how to
create an explicit abbreviation for a record with a variable number of
arguments (the indicies to the gep). This means the discriminator between
inbounds and non-inbounds gep is a full variable-length field I believe? Is my
understanding correct? Is there a way to create such an abbreviation? Should I
just use two bitcodes as before?
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7736
llvm-svn: 230415
While fuzzing LLVM bitcode files, I discovered that (1) the bitcode reader doesn't check that alignments are no larger than 2**29; (2) downstream code doesn't check the range; and (3) for values out of range, corresponding large memory requests (based on alignment size) will fail. This code fixes the bitcode reader to check for valid alignments, fixing this problem.
This CL fixes alignment value on global variables, functions, and instructions: alloca, load, load atomic, store, store atomic.
Patch by Karl Schimpf (kschimpf@google.com).
llvm-svn: 230180
Summary:
When creating {insert,extract}value instructions from a BitcodeReader, we
weren't verifying the fields were valid.
Bugs found with afl-fuzz
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7325
llvm-svn: 229345
Eventually we can make some of these pass the error along to the caller.
Reports a fatal error if:
We find an invalid abbrev record
We try to get an invalid abbrev number
We can't fill the current word due to an EOF
Fixed an invalid bitcode test to check for output with FileCheck
Bugs found with afl-fuzz
llvm-svn: 226986
No change in this commit, but clang was changed to also produce trivial comdats when
needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226467
This reverts commit r226173, adding r226038 back.
No change in this commit, but clang was changed to also produce trivial comdats for
costructors, destructors and vtables when needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226242
This commit moves `MDLocation`, finishing off PR21433. There's an
accompanying clang commit for frontend testcases. I'll attach the
testcase upgrade script I used to PR21433 to help out-of-tree
frontends/backends.
This changes the schema for `DebugLoc` and `DILocation` from:
!{i32 3, i32 7, !7, !8}
to:
!MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)
Note that empty fields (line/column: 0 and inlinedAt: null) don't get
printed by the assembly writer.
llvm-svn: 226048
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226038
The bitcode reading interface used std::error_code to report an error to the
callers and it is the callers job to print diagnostics.
This is not ideal for error handling or diagnostic reporting:
* For error handling, all that the callers care about is 3 possibilities:
* It worked
* The bitcode file is corrupted/invalid.
* The file is not bitcode at all.
* For diagnostic, it is user friendly to include far more information
about the invalid case so the user can find out what is wrong with the
bitcode file. This comes up, for example, when a developer introduces a
bug while extending the format.
The compromise we had was to have a lot of error codes.
With this patch we use the DiagnosticHandler to communicate with the
human and std::error_code to communicate with the caller.
This allows us to have far fewer error codes and adds the infrastructure to
print better diagnostics. This is so because the diagnostics are printed when
he issue is found. The code that detected the problem in alive in the stack and
can pass down as much context as needed. As an example the patch updates
test/Bitcode/invalid.ll.
Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the
caller. A simple one like llvm-dis can just use fatal errors. The gold plugin
needs a bit more complex treatment because of being passed non-bitcode files. An
hypothetical interactive tool would make all bitcode errors non-fatal.
llvm-svn: 225562
Now that `Metadata` is typeless, reflect that in the assembly. These
are the matching assembly changes for the metadata/value split in
r223802.
- Only use the `metadata` type when referencing metadata from a call
intrinsic -- i.e., only when it's used as a `Value`.
- Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode`
when referencing it from call intrinsics.
So, assembly like this:
define @foo(i32 %v) {
call void @llvm.foo(metadata !{i32 %v}, metadata !0)
call void @llvm.foo(metadata !{i32 7}, metadata !0)
call void @llvm.foo(metadata !1, metadata !0)
call void @llvm.foo(metadata !3, metadata !0)
call void @llvm.foo(metadata !{metadata !3}, metadata !0)
ret void, !bar !2
}
!0 = metadata !{metadata !2}
!1 = metadata !{i32* @global}
!2 = metadata !{metadata !3}
!3 = metadata !{}
turns into this:
define @foo(i32 %v) {
call void @llvm.foo(metadata i32 %v, metadata !0)
call void @llvm.foo(metadata i32 7, metadata !0)
call void @llvm.foo(metadata i32* @global, metadata !0)
call void @llvm.foo(metadata !3, metadata !0)
call void @llvm.foo(metadata !{!3}, metadata !0)
ret void, !bar !2
}
!0 = !{!2}
!1 = !{i32* @global}
!2 = !{!3}
!3 = !{}
I wrote an upgrade script that handled almost all of the tests in llvm
and many of the tests in cfe (even handling many `CHECK` lines). I've
attached it (or will attach it in a moment if you're speedy) to PR21532
to help everyone update their out-of-tree testcases.
This is part of PR21532.
llvm-svn: 224257
`MDString`s can have arbitrary characters in them. Prevent an assertion
that fired in `BitcodeWriter` because of sign extension by copying the
characters into the record as `unsigned char`s.
Based on a patch by Keno Fischer; fixes PR21882.
llvm-svn: 224077
This reflects the typelessness of `Metadata` in the bitcode format,
removing types from all metadata operands.
`METADATA_VALUE` represents a `ValueAsMetadata`, and always has two
fields: the type and the value.
`METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`,
doesn't store types. It stores operands at their ID+1 so that `0` can
reference `nullptr` operands.
Part of PR21532.
llvm-svn: 224073
As a fixup to r223616, follow the convention of naming the files after
the LLVM release whose bitcode they're maintaining compatability with.
llvm-svn: 223623
Add assembly and bitcode tests that I neglected to add in r223564 (IR:
Disallow complicated function-local metadata) and r223574 (IR: Disallow
function-local metadata attachments).
Found a couple of bugs:
- The error message for function-local attachments gave the wrong line
number -- it indicated the next token (typically on the next line)
instead of the token that started the attachment. Fixed.
- Metadata arguments of the form `!{i32 0, i32 %v}` (or with the
arguments reversed) fired an assertion in `ValueEnumerator` in LLVM
v3.5, so I suppose this never really worked. I suppose this was
"fixed" by r223564.
(Thanks to dblaikie for pointing out my omission.)
Part of PR21532.
llvm-svn: 223616
This reverts commit r218918, effectively reapplying r218914 after fixing
an Ocaml bindings test and an Asan crash. The root cause of the latter
was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a
PR to investigate who requires the loose check (and why).
Original commit message follows.
--
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString. Integers are stringified and
a `\0` character is used as a separator.
Part of PR17891.
Note: I've attached my testcases upgrade scripts to the PR. If I've
just broken your out-of-tree testcases, they might help.
llvm-svn: 219010
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString. Integers are stringified and
a `\0` character is used as a separator.
Part of PR17891.
Note: I've attached my testcases upgrade scripts to the PR. If I've
just broken your out-of-tree testcases, they might help.
llvm-svn: 218914
This includes constants, attributes, and some additional instructions not covered by previous tests.
Work was done by lama.saba@intel.com.
llvm-svn: 218297
This allows streams that only use BLOCKINFO for debugging purposes to omit
the block entirely. As long as another stream is available with the correct
BLOCKINFO, the first stream can still be analyzed and dumped.
As part of this commit, BitstreamReader gets a move constructor and move
assignment operator, as well as a takeBlockInfo method.
llvm-svn: 216826
Call `verifyModule()` after parsing and after every transformation.
Also convert some `DEBUG(dbgs())` to `errs()` to increase visibility
into what's going on.
llvm-svn: 215951
Block address forward-references are implemented by creating a
`BasicBlock` ahead of time that gets inserted in the `Function` when
it's eventually encountered.
However, if the same blockaddress was used in two separate functions
that were parsed *before* the referenced function (and the blockaddress
was never used at global scope), two separate basic blocks would get
created, one of which would be forgotten creating invalid IR.
This commit changes the forward-reference logic to create only one basic
block (and always return the same blockaddress).
llvm-svn: 215805
An optional third field was added to `llvm.global_ctors` (and
`llvm.global_dtors`) in r209015. Most of the code has been changed to
deal with both versions of the variables. Users of the C API might
create either version, the helper functions in LLVM create the two-field
version, and clang now creates the three-field version.
However, the BitcodeReader was changed to always upgrade to the
three-field version. This created an unnecessary inconsistency in the
IR before/after serializing to bitcode.
This commit resolves the inconsistency by making the third field truly
optional (and not upgrading in the bitcode reader). Since `llvm-link`
was relying on this upgrade code, rather than deleting it I've moved it
into `ModuleLinker`, where it upgrades these arrays as necessary to
resolve inconsistencies between modules.
The ideal resolution would be to remove the 2-field version and make the
third field required. I filed PR20506 to track that.
I changed `test/Bitcode/upgrade-global-ctors.ll` to a negative test and
duplicated the `llvm-link` check in `test/Linker/global_ctors.ll` to
check both upgrade directions.
Since I came across this as part of PR5680 (serializing use-list order),
I've also added the missing `verify-uselistorder` RUN line to
`test/Bitcode/metadata-2.ll`.
llvm-svn: 215457
`parseBitcodeFile()` uses the generic `getLazyBitcodeFile()` function as
a helper. Since `parseBitcodeFile()` isn't actually lazy -- it calls
`MaterializeAllPermanently()` -- bypass the unnecessary call to
`materializeForwardReferencedFunctions()` by extracting out a common
helper function. This removes the last of the use-list churn caused by
blockaddresses.
This highlights that we can't reproduce use-list order of globals and
constants when parsing lazily -- but that's necessarily out of scope.
When we're parsing lazily, we never have all the functions in memory, so
the use-lists of globals (and constants that reference globals) are
always incomplete.
This is part of PR5680.
llvm-svn: 214581
Correctly sort self-users (such as PHI nodes). I added a targeted test
in `test/Bitcode/use-list-order.ll` and the final missing RUN line to
tests in `test/Assembly`.
This is part of PR5680.
llvm-svn: 214417
Since initializers of GlobalValues are being assigned IDs before
GlobalValues themselves, explicitly exclude GlobalValues from the
constant pool. Added targeted test in `test/Bitcode/use-list-order.ll`
and added two more RUN lines in `test/Assembly`.
This is part of PR5680.
llvm-svn: 214368
Before this patch we had
@a = weak global ...
but
@b = alias weak ...
The patch changes aliases to look more like global variables.
Looking at some really old code suggests that the reason was that the old
bison based parser had a reduction for alias linkages and another one for
global variable linkages. Putting the alias first avoided the reduce/reduce
conflict.
The days of the old .ll parser are long gone. The new one parses just "linkage"
and a later check is responsible for deciding if a linkage is valid in a
given context.
llvm-svn: 214355
When predicting use-list order, we visit functions in reverse order
followed by `GlobalValue`s and write out use-lists at the first
opportunity. In the reader, this will translate to *after* the last use
has been added.
For this to work, we actually need to descend into `GlobalValue`s.
Added a targeted test in `use-list-order.ll` and `RUN` lines to the
newly passing tests in `test/Bitcode`.
There are two remaining failures in `test/Bitcode`:
- blockaddress.ll: I haven't thought through how to model the way
block addresses change the order of use-lists (or how to work around
it).
- metadata-2.ll: There's an old-style `@llvm.used` global array here
that I suspect the .ll parser isn't upgrading properly. When it
round-trips through bitcode, the .bc reader *does* upgrade it, so
the extra variable (`i8* null`) has an extra use, and the shuffle
vector doesn't match.
I think the fix is to upgrade old-style global arrays (or reject
them?) in the .ll parser.
This is part of PR5680.
llvm-svn: 214321
r214242 was subtle enough it really deserves a targeted test with
comments. This adds some global variables that trigger the relevant
code path. Sorry this wasn't committed with the fix.
llvm-svn: 214243