[z/OS] treat text files as text files so auto-conversion is done (#90128)
To support auto-conversion on z/OS text files need to be opened as text files. These changes will fix a number of LIT failures due to text files not being converted to the internal code page.
update a number of tools so they open the text files as text files
add support in the cat.py to open a text file as a text file (Windows will continue to treat all files as binary so new lines are handled correctly)
add env var definitions to enable auto-conversion in the lit config file.
[UndefOrPoison] [CompileTime] Avoid IDom walk unless required. NFC (#90092)
If the value is not boolean and we are checking for `Undef` or
`UndefOrPoison`, we can avoid the potentially expensive IDom walk.
This should improve compile time for isGuaranteedNotToBeUndefOrPoison
and isGuaranteedNotToBeUndef.
[OpenMP][TR12] change property of map-type modifier. (#90499)
map-type change to "default" instead "ultimate" from [OpenMP5.2]
The change is allowed map-type to be placed any locations within map
modifiers, besides the last location in the modifiers-list, also
map-type can be omitted afterward.
[CUDA] make kernel stub ICF-proof (#90155)
MSVC linker merges functions having comdat which have identical set of
instructions. CUDA uses kernel stub function as key to look up kernels
in device executables. If kernel stub function for different kernels are
merged by ICF, incorrect kernels will be launched.
To prevent ICF from merging kernel stub functions, an unique global
variable is created for each kernel stub function having comdat and a
store is added to the kernel stub function. This makes the set of
instructions in each kernel function unique.
Fixes: https://github.com/llvm/llvm-project/issues/88883
[lldb] Teach LocateExecutableSymbolFile to look into LOCALBASE on FreeBSD (#81355)
FreeBSD ports will now install debuginfo under $LOCALBASE/lib/debug/, where $LOCALBASE is typically /usr/local. On FreeBSD search this path in addition to existing debug info paths.
Relevant change on the FreeBSD side: https://reviews.freebsd.org/D43515
[AArch64][TargetParser] autogen ArchExtKind enum (#90314)
Re-land 61b2a0e3336aaa0132bbed06dc185aca4ff5d2db. Some Windows builds
were failing because AArch64TargetParserDef.inc is a generated header
which is included transitively into some clang components, but this
information is not available to the build system and therefore there is
a missing edge in the dependency graph. This patch incorporates the
fixes described in ac1ffd3caca12c254e0b8c847aa8ce8e51b6cfbf/D142403.
Thanks to ExtensionSet::toLLVMFeatureList, all values of ArchExtKind
should correspond to a particular -target-feature. The valid values of
-target-feature are in turn defined by SubtargetFeature defs.
Therefore we can generate ArchExtKind from the tablegen data. This is
done by adding an Extension class which derives from SubtargetFeature.
Because the Has* FieldNames do not always correspond to the AEK_
names ("extensions", as defined in TargetParser), and AEK_ names do
not always correspond to -march strings, some additional enum entries
[3 lines not shown]
[libcxx][ci] In picolib build, ask clang for the normalised triple (#90722)
This is needed for a workaround to make sure the link later succeeds. I
don't know the reason for that but it is definitely needed.
https://github.com/llvm/llvm-project/pull/89234 will/wants to correct
the triple normalisation for -none- and this means that clang prior to
19, and clang 19 and above will have different answers and therefore
different library paths.
I don't want to bootstrap a clang just for libcxx CI, or require that
anyone building for Arm do the same, so ask the compiler what the triple
should be.
This will be compatible with 17 and 19 when we do update to that
version.
I'm assuming $CC is what anyone locally would set to override the
compiler, and `cc` is the binary name in our CI containers. It's not
perfect but it should cover most use cases.
[Offload] Fix CMake detection when it is not found (#90729)
Summary:
This variable could be unset if not found or when building standalone.
We should check for that and set it to true or false.
Fixes: https://github.com/llvm/llvm-project/issues/90708
[gn] port 088aa81a5454 (LLVM_HAS_LOGF128)
If we want to turn this on on some platforms, we'll also want to
define HAS_LOGF128 for AnalysisTest, see
llvm/unittests/Analysis/CMakeLists.txt
[Flang][OpenMP] Handle more character allocatable cases in privatization (#90449)
Fixes #84732, #81947, #81946
Note: This is a fix till we enable delayed privatization.
Constant Fold logf128 calls
This is a second attempt to land #84501 which failed on several targets.
This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is available
and checking if the host does not use __ibm128 which is prevalent on power pc
targets and replaces IEEE754 float128s.
[z/OS] add support for z/OS system headers to clang std header wrappers (#89995)
Update the wrappers for the C std headers so that they always forward to
the z/OS system headers.
[SLP]Transform stores + reverse to strided stores with stride -1, if profitable.
Adds transformation of consecutive vector store + reverse to strided
stores with stride -1, if it is profitable
Reviewers: RKSimon, preames
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/90464
[AArch64] NFC: Add RUN lines for streaming-compatible code. (#90617)
The intent is to test lowering of vector operations by scalarization,
for functions that are streaming-compatible (and thus cannot use NEON)
and also don't have the +sve attribute.
The generated code is clearly wrong at the moment, but a series of
patches will follow to fix up all cases to use scalar instructions.
A bit of context:
This work will form the base to decouple SME from SVE later on, as it
will make sure that no NEON instructions are used in
streaming[-compatible] mode. Later this will be followed by a patch that
changes `useSVEForFixedLengthVectors` to only return `true` if SVE is
available for the given runtime mode, at which point I'll change the
`-mattr=+sme -force-streaming-compatible-sve` to `-mattr=+sme
-force-streaming-sve` in the RUN lines, so that the tests are considered
to be executed in Streaming-SVE mode.
[AMDGPU] Fix image_msaa_load waitcnt insertion for pre-gfx12 (#90710)
https://github.com/llvm/llvm-project/pull/90201 made some fixes for
gfx12
image_msaa_load waitcnt insertion.
That fix might break in some situations for pre-gfx12 - this fixes that
by
explitly checking for VSAMPLE which always requires a s_wait_samplecnt
and
leaves the previous logic intact for non-gfx12.
[AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595)
Code to determine if a waitcnt is required before a barrier instruction
only
considered S_BARRIER.
gfx12 adds barrier_signal/wait so need to enhance the existing code to
look for
a barrier start (which is just an S_BARRIER for earlier architectures).
[AMDGPU] Do not optimize away pre-existing waitcnt instructions at -O0 (#90716)
The autogenerated memory legalizer tests use -O0 so this allows us to
see the exact waitcnts that were inserted by the memory legalizer
without them being optimized away.