[test] Use system locale for mri-utf8.test

Summary:
llvm-ar's mri-utf8.test relies on the en_US.UTF-8 locale being
installed for its last RUN line to work. If that locale is not installed,
the Unicode string is encoded as ASCII, which fails because the pound
sign (U+00A3) lies outside the 7-bit ASCII range. This commit changes the
test to rely only on the system being able to encode the pound sign in
its default encoding (e.g. UTF-16 for Microsoft Windows) by always
opening the file via input/output redirection. This avoids requiring a
particular locale to be present and supported. A Byte Order Mark is also
added to help recognize the encoding of the file and its endianness.
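
As an illustration of the failure described above (not part of the commit itself), the following sketch shows why the pound sign cannot be encoded as ASCII while UTF-8 and UTF-16 handle it. The helper name try_encode and the choice of codecs are assumptions made for this example only.

```python
# Illustrative sketch only; try_encode is a hypothetical helper, not code from the test.
POUND = u"\u00a3"  # the pound sign used in the test's file name


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ASCII is a 7-bit encoding, so code point 0xA3 cannot be represented.
ascii_bytes = try_encode(POUND, "ascii")
# UTF-8 encodes U+00A3 as two bytes.
utf8_bytes = try_encode(POUND, "utf-8")
# Little-endian UTF-16 (no BOM) encodes it as one 16-bit unit.
utf16_bytes = try_encode(POUND, "utf-16-le")

print(ascii_bytes, utf8_bytes, utf16_bytes)
```

Under an ASCII default encoding the first call is the case that made the old RUN line fail; the other two correspond to encodings a system might use for file names.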

Reviewers: gbreynoo, MaskRay, rupprecht, JamesNagurne, jfb

Subscribers: dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68472

llvm-svn: 374318
Thomas Preud'homme 2019-10-10 11:48:30 +00:00
parent 6430adbe64
commit b6f1d1fa0e
2 changed files with 22 additions and 23 deletions

@@ -0,0 +1,22 @@
# Test non-ascii archive members
# XFAIL: system-darwin
RUN: rm -rf %t && mkdir -p %t/extracted
# Note: lit's Python will read this UTF-8 encoded mri-nonascii.txt file,
# decode it to unicode. The filename in the redirection below will then
# be encoded in the system's filename encoding (e.g. UTF-16 for
# Microsoft Windows).
RUN: echo "contents" > %t/£.txt
RUN: echo "CREATE %t/mri.ar" > %t/script.mri
RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri
RUN: echo "SAVE" >> %t/script.mri
RUN: llvm-ar -M < %t/script.mri
RUN: cd %t/extracted && llvm-ar x %t/mri.ar
# Same as above.
RUN: FileCheck --strict-whitespace %s <£.txt
CHECK:{{^}}
CHECK-SAME:{{^}}contents{{$}}

@@ -1,23 +0,0 @@
# Test non-ascii archive members
# XFAIL: system-darwin
RUN: rm -rf %t && mkdir -p %t/extracted
RUN: echo "contents" > %t/£.txt
RUN: echo "CREATE %t/mri.ar" > %t/script.mri
RUN: echo "ADDMOD %t/£.txt" >> %t/script.mri
RUN: echo "SAVE" >> %t/script.mri
RUN: llvm-ar -M < %t/script.mri
RUN: cd %t/extracted && llvm-ar x %t/mri.ar
# This works around problems launching processes that
# include arguments with non-ascii characters.
# Python on Linux defaults to ASCII encoding unless the
# environment specifies otherwise, so it is explicitly set.
# The reliance the test has on this locale is not ideal,
# however alternate solutions have been difficult due to
# behaviour differences with python 2 vs python 3,
# and linux vs windows.
RUN: env LANG=en_US.UTF-8 %python -c "assert open(u'\U000000A3.txt', 'rb').read() == b'contents\n'"