[lit] Fix UnicodeEncodeError when test commands contain non-ASCII chars

Ensure that the bash script written by lit TestRunner is opened with
UTF-8 encoding when using Python 3.  Otherwise, an attempt to write
non-ASCII characters causes a UnicodeEncodeError.  This happened, for
example, with the following LLD test:

UNRESOLVED: lld :: ELF/format-binary-non-ascii.s (657 of 2119)
******************** TEST 'lld :: ELF/format-binary-non-ascii.s' FAILED ********************
Exception during script execution:
Traceback (most recent call last):
  File "/home/mgorny/llvm-project/llvm/utils/lit/lit/worker.py", line 63, in _execute_test
    result = test.config.test_format.execute(test, lit_config)
  File "/home/mgorny/llvm-project/llvm/utils/lit/lit/formats/shtest.py", line 25, in execute
    self.execute_external)
  File "/home/mgorny/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1644, in executeShTest
    res = _runShTest(test, litConfig, useExternalSh, script, tmpBase)
  File "/home/mgorny/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1590, in _runShTest
    res = executeScript(test, litConfig, tmpBase, script, execdir)
  File "/home/mgorny/llvm-project/llvm/utils/lit/lit/TestRunner.py", line 1157, in executeScript
    f.write('{ ' + '; } &&\n{ '.join(commands) + '; }')
UnicodeEncodeError: 'ascii' codec can't encode character '\xa3' in position 274: ordinal not in range(128)

Differential Revision: https://reviews.llvm.org/D63254

llvm-svn: 363388
Michal Gorny 2019-06-14 13:31:48 +00:00
parent 6b78e4d0a4
commit 0c28a8f628
3 changed files with 10 additions and 4 deletions


@@ -1133,9 +1133,12 @@ def executeScript(test, litConfig, tmpBase, commands, cwd):
     # Write script file
     mode = 'w'
+    open_kwargs = {}
     if litConfig.isWindows and not isWin32CMDEXE:
-      mode += 'b' # Avoid CRLFs when writing bash scripts.
-    f = open(script, mode)
+        mode += 'b' # Avoid CRLFs when writing bash scripts.
+    elif sys.version_info > (3,0):
+        open_kwargs['encoding'] = 'utf-8'
+    f = open(script, mode, **open_kwargs)
     if isWin32CMDEXE:
         for i, ln in enumerate(commands):
             commands[i] = re.sub(kPdbgRegex, "echo '\\1' > nul && ", ln)
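
The mode and encoding selection in this hunk can be read as the following standalone sketch. The helper name `open_script` and its parameters are hypothetical, introduced only for illustration; lit performs this logic inline in `executeScript`.

```python
import sys


def open_script(path, is_windows, is_win32_cmd_exe):
    """Open a script file for writing (hypothetical helper, not part of lit).

    Mirrors the mode/encoding choice made inline in executeScript.
    """
    mode = 'w'
    open_kwargs = {}
    if is_windows and not is_win32_cmd_exe:
        # Binary mode avoids CRLF translation when writing bash scripts.
        mode += 'b'
    elif sys.version_info > (3, 0):
        # Text mode on Python 3: pin the encoding rather than
        # depending on the locale's default codec.
        open_kwargs['encoding'] = 'utf-8'
    return open(path, mode, **open_kwargs)
```

Building the keyword arguments in a dict keeps a single `open()` call for every branch, and the `encoding` argument is only passed on the Python 3 text-mode path.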


@@ -0,0 +1,3 @@
+# Run a command including UTF-8 characters.
+#
+# RUN: echo £
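
Why this input failed before the change can be shown with a minimal sketch. It assumes an environment whose default codec is ASCII, simulated here by passing `encoding='ascii'` explicitly; the temporary file name is arbitrary.

```python
import os
import tempfile

command = 'echo \u00a3'  # same text as the RUN line above ('£' is U+00A3)

path = os.path.join(tempfile.mkdtemp(), 'example.script')

# With an ASCII codec, writing the command raises UnicodeEncodeError.
try:
    with open(path, 'w', encoding='ascii') as f:
        f.write(command)
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# With UTF-8, '£' is written as the two bytes 0xC2 0xA3.
with open(path, 'w', encoding='utf-8') as f:
    f.write(command)
with open(path, 'rb') as f:
    written = f.read()
```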


@@ -80,7 +80,7 @@
 # CHECK: shtest-format :: external_shell/fail_with_bad_encoding.txt
 # CHECK: shtest-format :: fail.txt
-# CHECK: Expected Passes : 7
+# CHECK: Expected Passes : 8
 # CHECK: Expected Failures : 4
 # CHECK: Unsupported Tests : 5
 # CHECK: Unresolved Tests : 3
@@ -90,7 +90,7 @@
 # XUNIT: <?xml version="1.0" encoding="UTF-8" ?>
 # XUNIT-NEXT: <testsuites>
-# XUNIT-NEXT: <testsuite name="shtest-format" tests="23" failures="7" skipped="5">
+# XUNIT-NEXT: <testsuite name="shtest-format" tests="24" failures="7" skipped="5">
 # XUNIT: <testcase classname="shtest-format.shtest-format" name="argv0.txt" time="{{[0-9]+\.[0-9]+}}"/>