LLVM/project f958a73 llvm/test/Transforms/InstCombine call-guard.ll fast-math.ll

[InstCombine] Fix name clashes in check lines (NFC)

These used both lower and upper case variants of the same name,
resulting in malformed check lines when regenerated.
Delta File
+8-8 llvm/test/Transforms/InstCombine/call-guard.ll
+4-4 llvm/test/Transforms/InstCombine/fast-math.ll
+12-12 2 files
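
To illustrate the clash, here is a minimal sketch. It assumes the check-line generator derives a FileCheck variable by upper-casing the IR value name, which is an assumption about the tooling and is not stated in the commit message. `find_case_clashes` is a hypothetical helper, not part of any LLVM script.

```python
from collections import defaultdict

def find_case_clashes(names):
    """Group IR value names that collide once upper-cased.

    Assumption: value names are upper-cased to form FileCheck variables,
    so `%a` and `%A` would both map to the same variable name.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.upper()].append(name)
    # Keep only groups where distinct spellings share one upper-cased form.
    return {key: sorted(set(vals)) for key, vals in groups.items()
            if len(set(vals)) > 1}

clashes = find_case_clashes(["a", "A", "b", "cond"])
print(clashes)
```

Renaming one value in each reported group is enough to remove the collision.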

LLVM/project 0d335f7 llvm/lib/Transforms/InstCombine InstCombineAddSub.cpp, llvm/test/Transforms/InstCombine add.ll

[InstCombine] Handle more commuted cases in matchesSquareSum()
Delta File
+8-21 llvm/test/Transforms/InstCombine/add.ll
+10-10 llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+18-31 2 files

LLVM/project a39a382 llvm/test/Transforms/InstCombine add.ll

[InstCombine] Thwart complexity-based canonicalization (NFC)

These tests did not test what they were supposed to. The transform
fails to actually handle the commuted cases.
Delta File
+40-19 llvm/test/Transforms/InstCombine/add.ll
+40-19 1 files

LLVM/project bce9393 llvm/lib/Target/AMDGPU SOPInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.s.wait.event.ll

[AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (#89622)

As well as flipping the sense of the bit, GFX12 moved it from bit 0 to
bit 1 in the encoded simm16 operand.

(cherry picked from commit e0a763c490d8ef58dca867e0ef834978ccf8e17d)
Delta File
+3-7 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.wait.event.ll
+1-1 llvm/lib/Target/AMDGPU/SOPInstructions.td
+4-8 2 files
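
The bit move and sense flip described above can be sketched as follows. The commit message does not say which polarity each generation uses, so the polarity here is an assumption: before GFX12, bit 0 set means "do not wait for export_ready"; on GFX12, bit 1 set means "wait for export_ready". `encode_wait_event` is a hypothetical helper for illustration only.

```python
def encode_wait_event(wait_export_ready: bool, gfx12: bool) -> int:
    """Return an illustrative simm16 value for s_wait_event.

    Assumed polarity (not confirmed by the commit message):
      pre-GFX12: bit 0 set -> do NOT wait for export_ready
      GFX12:     bit 1 set -> wait for export_ready
    """
    if gfx12:
        return (1 << 1) if wait_export_ready else 0
    return 0 if wait_export_ready else (1 << 0)

for gfx12 in (False, True):
    for wait in (False, True):
        print(gfx12, wait, hex(encode_wait_event(wait, gfx12)))
```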

LLVM/project f5f572f llvm/include/llvm/CodeGen MachineFrameInfo.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[SelectionDAG] Mark frame index as "aliased" at argument copy elision (#89712)

This fixes the miscompiles reported in
  https://github.com/llvm/llvm-project/issues/89060

After argument copy elision, the IR value for the eliminated alloca
aliases the fixed stack object. This patch marks the fixed stack
object as aliased with IR values so that, for example, schedulers do
not reorder accesses to it. Reordering could otherwise happen when
there is a mix of MemOperands referring to the shared fixed stack slot
via both the IR value for the elided alloca and a fixed stack pseudo
source value (as would be the case when lowering the arguments).

(cherry picked from commit d8b253be56b3e9073b3e59123cf2da0bcde20c63)
Delta File
+39-0 llvm/test/CodeGen/Hexagon/arg-copy-elison.ll
+7-0 llvm/include/llvm/CodeGen/MachineFrameInfo.h
+2-1 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+48-1 3 files

LLVM/project dfc89f8 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 pr91005.ll

[X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125)

AVX doesn't provide a 16-bit BROADCAST instruction.

Fixes #91005
Delta File
+40-0 llvm/test/CodeGen/X86/pr91005.ll
+1-1 llvm/lib/Target/X86/X86ISelLowering.cpp
+41-1 2 files

LLVM/project 047cd91 llvm/lib/Target/X86 X86InstrAVX512.td X86ISelLowering.cpp, llvm/test/CodeGen/X86 pr90844.ll

[X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106)

With KNL/KNC being deprecated, we no longer need to care about such
no-VLX cases. We may remove these patterns in the future.

Fixes #90844

(cherry picked from commit 7963d9a2b3c20561278a85b19e156e013231342c)
Delta File
+21-21 llvm/lib/Target/X86/X86InstrAVX512.td
+19-0 llvm/test/CodeGen/X86/pr90844.ll
+3-1 llvm/lib/Target/X86/X86ISelLowering.cpp
+43-22 3 files

LLVM/project 58e44d3 llvm/lib/Target/AMDGPU SIInstrInfo.h SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.s.barrier.wait.ll llvm.amdgcn.s.barrier.ll

[AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (#90595)

The code that determines whether a waitcnt is required before a barrier
instruction only considered S_BARRIER. gfx12 adds barrier_signal/wait,
so the existing code needs to look for a barrier start instead (which
is just an S_BARRIER on earlier architectures).
Delta File
+22-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.wait.ll
+11-0 llvm/lib/Target/AMDGPU/SIInstrInfo.h
+1-1 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+2-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.barrier.ll
+36-1 4 files

LLVM/project d1d7131 .github/workflows release-binaries.yml set-release-binary-outputs.sh

[Workflows] Re-write release-binaries workflow (#89521)

This updates the release-binaries workflow so that the different build
stages are split across multiple jobs. This saves money by reducing the
time spent on the larger GitHub runners and also makes it easier to
debug, because now it's possible to build a smaller release package
(with clang and lld) using only the free GitHub runners.

The workflow no longer uses the test-release.sh script but instead uses
the Release.cmake cache. This gives the workflow more flexibility and
ensures that the binary package will always be created even if the tests
fail.

The idea of splitting the stages comes from the "LLVM Precommit CI through
Github Actions" RFC:

https://discourse.llvm.org/t/rfc-llvm-precommit-ci-through-github-actions/76456
(cherry picked from commit abac98479b81cc0cc717bb6cdbae6f774e3b0232)
Delta File
+190-73 .github/workflows/release-binaries.yml
+0-7 .github/workflows/set-release-binary-outputs.sh
+190-80 2 files

LLVM/project f2c5a10 clang/cmake/caches Release.cmake

[CMake][Release] Add stage2-package target (#89517)

This target will be used to generate the release binary package for
uploading to GitHub.

(cherry picked from commit a38f201f1ec70c2b1f3cf46e7f291c53bb16753e)
Delta File
+2-0 clang/cmake/caches/Release.cmake
+2-0 1 files

LLVM/project 211cdc6 .github/workflows release-binaries.yml

workflows: Fix incorrect input name in release-binaries.yml (#84604)

In aa02002491333c42060373bc84f1ff5d2c76b4ce the input name was changed
from tag to release-version, but the code was never updated.

(cherry picked from commit 8d220d109d28dac352c563ab062fb72132b7eca1)
Delta File
+2-2 .github/workflows/release-binaries.yml
+2-2 1 files

LLVM/project d9661e1 .github/workflows release-binaries.yml

[Github] Add repository checks to release-binaries workflow (#84437)

This patch adds repository checks to the release-binaries workflow jobs.
People were observing that the job was running on a schedule in their
forks. This only happens on old forks, but those probably exist in great
numbers given how prolific LLVM is. This is also good practice anyway,
on top of solving the direct problem of these jobs running with the cron
schedule on people's forks.

(cherry picked from commit 9f5be5f0092a636274953389cd5771c45ac0a568)
Delta File
+3-0 .github/workflows/release-binaries.yml
+3-0 1 files

LLVM/project b7e2397 clang/cmake/caches Release.cmake, llvm/utils/release test-release.sh

[CMake][Release] Enable CMAKE_POSITION_INDEPENDENT_CODE (#90139)

Set this in the cache file directly instead of via the test-release.sh
script so that the release builds can be reproduced with just the cache
file.

(cherry picked from commit 53ff002c6f7ec64a75ab0990b1314cc6b4bb67cf)
Delta File
+1-2 llvm/utils/release/test-release.sh
+1-0 clang/cmake/caches/Release.cmake
+2-2 2 files

LLVM/project ce88e86 clang/cmake/caches Release.cmake

[CMake][Release] Refactor cache file and use two stages for non-PGO builds (#89812)

Completely refactor the cache file to simplify it and remove unnecessary
variables. The main functional change here is that the non-PGO builds
now use two stages, so `ninja -C build stage2-package` can be used with
both PGO and non-PGO builds.

(cherry picked from commit 6473fbf2d68c8486d168f29afc35d3e8a6fabe69)
Delta File
+71-73 clang/cmake/caches/Release.cmake
+71-73 1 files

LLVM/project 0ec1bc4 .github/workflows release-binaries.yml

workflows: Fixes for building the release binaries (#83694)

Since aa02002491333c42060373bc84f1ff5d2c76b4ce we weren't installing the
correct dependencies, and since 2836d8edbfbcd461b25101ed58f93c862d65903a
we must pass a custom token to github-upload-release.py for verifying
permissions.

(cherry picked from commit 51207756b0692f325cf75560185cf0336239b3e0)
Delta File
+6-1 .github/workflows/release-binaries.yml
+6-1 1 files

LLVM/project dd3aa6d llvm CMakeLists.txt, llvm/utils/lit/lit __init__.py

Bump version to 18.1.6 (#91094)

Delta File
+1-1 llvm/CMakeLists.txt
+1-1 llvm/utils/lit/lit/__init__.py
+2-2 2 files

LLVM/project b910beb llvm/include/llvm/Object MachO.h, llvm/lib/Object MachOObjectFile.cpp

[llvm][MachO] Fix integer truncation in rebase/bind parsing (#89337)

`Count` and `Skip` should use `uint64_t` as they are encoded/decoded
using 64-bit ULEB128.

In `*_OPCODE_DO_*_ULEB_TIMES_SKIPPING_ULEB`, `Skip` could be encoded as
a two's-complement value for moving `SegmentOffset` backwards. Having a
32-bit `Skip` truncates the encoded value and leads to a malformed
`AdvanceAmount` and an invalid `SegmentOffset` that extends past valid
sections.
Delta File
+499-0 llvm/test/Object/Inputs/MachO/bind-negative-skip.yaml
+10-10 llvm/lib/Object/MachOObjectFile.cpp
+17-0 llvm/test/Object/macho-bind-negative-skip.test
+8-7 llvm/include/llvm/Object/MachO.h
+534-17 4 files
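
The truncation described above can be reproduced with a small sketch. The ULEB128 helpers are generic; the `advance` function assumes the advance amount is `Skip` plus the pointer size, added to a 64-bit offset with wrap-around. That formula and all names here are illustrative assumptions, not the parser's actual code.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def encode_uleb128(value):
    """Encode value (taken modulo 2**64) as ULEB128 bytes."""
    value &= MASK64
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_uleb128(data):
    """Decode ULEB128 bytes into an unsigned 64-bit value."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return result & MASK64

def advance(segment_offset, skip, pointer_size=8):
    """Assumed model: offset += skip + pointer_size, wrapping at 2**64."""
    return (segment_offset + skip + pointer_size) & MASK64

# A backwards skip of 16 bytes, encoded as 64-bit two's complement.
skip = decode_uleb128(encode_uleb128(-16))
good = advance(0x100, skip)           # full 64-bit Skip
bad = advance(0x100, skip & MASK32)   # Skip truncated to 32 bits
print(hex(good), hex(bad))
```

With the full 64-bit value the offset moves back as intended; with the truncated value the "negative" skip becomes a large positive one and the offset lands far past the original position.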

LLVM/project ea126ae llvm/lib/Target/PowerPC PPCISelLowering.cpp PPCAsmPrinter.cpp, llvm/test/CodeGen/PowerPC aix-shared-lib-tls-model-opt.ll aix-shared-lib-tls-model-opt-small-local-dynamic-tls.ll

[PowerPC] Tune AIX shared library TLS model at function level (#84132)

Under some circumstances (library loaded with the main program), the TLS
initial-exec model can be applied to local-dynamic accesses. A simple
heuristic decides the update at function level:
* If the function contains no more than a given number of TLS local-dynamic
  accesses, use the TLS initial-exec model. (The threshold, which defaults
  to 1, is controlled by a hidden option.)
Delta File
+627-0 llvm/test/CodeGen/PowerPC/aix-shared-lib-tls-model-opt.ll
+74-0 llvm/test/CodeGen/PowerPC/aix-shared-lib-tls-model-opt-small-local-dynamic-tls.ll
+58-0 llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+22-0 llvm/test/CodeGen/PowerPC/check-aix-shared-lib-tls-model-opt-Option.ll
+21-0 llvm/test/CodeGen/PowerPC/check-aix-shared-lib-tls-model-opt-IRattribute.ll
+14-1 llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+816-1 7 files not shown
+859-3 13 files
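
The per-function heuristic above amounts to a threshold comparison. A minimal sketch, in which the function name and the way accesses are counted are hypothetical; only the "at most threshold, default 1" rule comes from the commit message:

```python
def pick_tls_model(num_local_dynamic_accesses: int, threshold: int = 1) -> str:
    """Choose the TLS model for one function.

    Uses initial-exec when the function has at most `threshold`
    local-dynamic accesses; otherwise keeps local-dynamic.
    """
    if num_local_dynamic_accesses <= threshold:
        return "initial-exec"
    return "local-dynamic"

print(pick_tls_model(1))               # at the default threshold
print(pick_tls_model(2))               # above the default threshold
print(pick_tls_model(2, threshold=3))  # raised threshold
```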

LLVM/project 51f178d clang/lib/StaticAnalyzer/Checkers MallocChecker.cpp, clang/test/Analysis NewDelete-atomics.cpp

[analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (#90918)

Fixes #90498.

Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this
time. This is useful because even though the actual builtin atomic is
still there, it may be buried beyond the inlining depth limit.

Also add one popular custom smart pointer class name to the name-based
heuristics, which isn't necessary to fix the bug but is arguably a good
idea regardless.
Delta File
+109-9 clang/test/Analysis/NewDelete-atomics.cpp
+15-4 clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+7-0 clang/test/Analysis/Inputs/system-header-simulator-cxx.h
+131-13 3 files

LLVM/project 73a0144 bolt/test/X86 jump-table-fixed-ref-pic.test, bolt/test/X86/Inputs jump-table-fixed-ref-pic.s

[BOLT] Add test case for PIC fixed indirect jump (#91547)

A compiler can generate a redundant indirection for a jump via a fixed
jump table target. Add a test case that covers this pattern in the PIC
case. We already detect the non-PIC case.

Currently XFAIL.
Delta File
+35-0 bolt/test/X86/Inputs/jump-table-fixed-ref-pic.s
+9-0 bolt/test/X86/jump-table-fixed-ref-pic.test
+44-0 2 files

LLVM/project 62b5b61 clang/lib/Sema SemaLookup.cpp SemaTemplate.cpp, clang/test/CXX/temp/temp.res/temp.dep/temp.dep.type p4.cpp

[Clang][Sema] Fix lookup of dependent operator= outside of complete-class contexts (#91498)

Fixes a crash caused by #90152.
Delta File
+20-15 clang/lib/Sema/SemaLookup.cpp
+13-0 clang/test/CXX/temp/temp.res/temp.dep/temp.dep.type/p4.cpp
+2-5 clang/lib/Sema/SemaTemplate.cpp
+35-20 3 files

LLVM/project ba5170f llvm/test/Transforms/InstCombine lshr.ll

[InstCombine] Thwart complexity-based canonicalization in shl-add test (NFC) (#91413)

Fixed test for #88193
Delta File
+4-2 llvm/test/Transforms/InstCombine/lshr.ll
+4-2 1 files

LLVM/project 409ff97 llvm/lib/Transforms/InstCombine InstCombineShifts.cpp

[InstCombine] Fix comment from #88193 (NFC) (#91427)

It is inaccurate and needs to be corrected.
Delta File
+2-2 llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+2-2 1 files

LLVM/project 1aaab33 llvm/lib/TargetParser RISCVISAInfo.cpp

[RISCV] Don't use std::vector<std::string> for split extensions in RISCVISAInfo::parseArchString. NFC (#91538)

We can use a SmallVector<StringRef>.

Adjust the code so we check for empty strings in the loop instead of
making a copy of the vector returned from StringRef::split.

This overlaps with #91532 which also removed the std::vector, but
that PR may be more controversial.
Delta File
+18-31 llvm/lib/TargetParser/RISCVISAInfo.cpp
+18-31 1 files
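
The "check for empty strings in the loop" approach can be sketched as below. This is only an analogue in Python: the `_` separator, the error text, and the helper name are assumptions for illustration, and Python strings do not capture the copy-avoidance that `SmallVector<StringRef>` provides in the real code.

```python
def split_extensions(exts: str):
    """Split an extension list on '_' and reject empty entries in one pass.

    The empty-entry check happens inside the loop, so no filtered copy of
    the split result is built first.
    """
    parts = []
    for ext in exts.split("_"):
        if not ext:
            raise ValueError("empty extension name between separators")
        parts.append(ext)
    return parts

print(split_extensions("zba_zbb_zbs"))
```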

LLVM/project 96568f3 llvm/docs LangRef.rst, llvm/include/llvm/Transforms/Instrumentation PGOCtxProfLowering.h

[llvm][ctx_profile] Add instrumentation lowering (#90821)

This adds the instrumentation lowering pass.

(Tracking Issue: #89287, RFC referenced there)
Delta File
+326-0 llvm/lib/Transforms/Instrumentation/PGOCtxProfLowering.cpp
+229-0 llvm/test/Transforms/PGOProfile/ctx-instrumentation.ll
+43-5 llvm/docs/LangRef.rst
+17-0 llvm/test/Transforms/PGOProfile/ctx-instrumentation-invalid-roots.ll
+4-1 llvm/include/llvm/Transforms/Instrumentation/PGOCtxProfLowering.h
+5-0 llvm/lib/Passes/PassBuilderPipelines.cpp
+624-6 2 files not shown
+626-6 8 files

LLVM/project 1710c8c flang/lib/Lower Bridge.cpp, flang/test/Lower/HLFIR custom-intrinsic.f90 binary-ops.f90

[flang] Lowering changes for assigning dummy_scope to hlfir.declare. (#90989)

The lowering produces a fir.dummy_scope operation if the current
function has dummy arguments. Each hlfir.declare generated
for a dummy argument then uses the result of fir.dummy_scope
as its dummy_scope operand. This is only done for HLFIR.

I was not able to find a reliable way to identify dummy symbols
in `genDeclareSymbol`, so I added a set of registered dummy symbols
that is alive during variable instantiation for the current
function. The set is initialized during the mapping of the dummy
argument symbols to their MLIR values. It is reset right after
all variables are instantiated, to avoid generating
hlfir.declare operations with dummy_scope for the clones of
the dummy symbols (e.g. this happens with OpenMP privatization).

If this can be done in a cleaner way, please advise.
Delta File
+52-51 flang/test/Lower/HLFIR/custom-intrinsic.f90
+42-42 flang/test/Lower/HLFIR/binary-ops.f90
+62-11 flang/lib/Lower/Bridge.cpp
+23-23 flang/test/Lower/HLFIR/assignment-intrinsics.f90
+21-21 flang/test/Lower/HLFIR/designators.f90
+21-21 flang/test/Lower/OpenMP/parallel-firstprivate-clause-scalar.f90
+221-169 131 files not shown
+806-696 137 files

LLVM/project 36d8b37 llvm/test/CodeGen/RISCV imm.ll, llvm/test/CodeGen/RISCV/rv64-legal-i32 imm.ll

[RISCV] Add another missed Zbs constant materialization test. NFC

This can be LI+BCLRI+BCLRI.
Delta File
+67-0 llvm/test/CodeGen/RISCV/imm.ll
+43-0 llvm/test/CodeGen/RISCV/rv64-legal-i32/imm.ll
+110-0 2 files
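
To show which constants fit an LI+BCLRI+BCLRI sequence, here is a rough sketch. It assumes 64-bit registers, that LI is a single instruction loading a negative signed 12-bit immediate, and that BCLRI clears one bit by index. The helper names are hypothetical and this is not the backend's actual matching logic.

```python
MASK64 = (1 << 64) - 1

def li_bclri_bclri(const):
    """Return (imm, i, j) so that LI imm; BCLRI i; BCLRI j yields const.

    Under the stated assumptions this works exactly when bits 11..63 of
    const are all ones except for two cleared bits. Returns None otherwise.
    """
    const &= MASK64
    zero_bits = [b for b in range(11, 64) if not (const >> b) & 1]
    if len(zero_bits) != 2:
        return None
    i, j = zero_bits
    filled = const | (1 << i) | (1 << j)
    # Bits 11..63 of `filled` are all set, so as a signed 64-bit value it
    # lies in [-2048, -1] and fits a signed 12-bit immediate.
    imm = filled - (1 << 64)
    return imm, i, j

def simulate(imm, i, j):
    """Apply LI followed by two bit clears on a 64-bit register."""
    value = imm & MASK64
    value &= ~(1 << i) & MASK64
    value &= ~(1 << j) & MASK64
    return value

seq = li_bclri_bclri(0xFFFFFFFF7FFFF7FF)
print(seq, hex(simulate(*seq)))
```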

LLVM/project c0b5a96 llvm/test/CodeGen/RISCV imm.ll, llvm/test/CodeGen/RISCV/rv64-legal-i32 imm.ll

[RISCV] Add tests where we could use Zbs instructions in constant materialization. NFC
Delta File
+137-0 llvm/test/CodeGen/RISCV/rv64-legal-i32/imm.ll
+116-0 llvm/test/CodeGen/RISCV/imm.ll
+253-0 2 files

LLVM/project 2fb3774 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 gather-with-minbith-user.ll user-node-not-in-bitwidths.ll

Revert "[SLP]Fix PR91467: Look through scalar cast, when trying to cast to another type."

This reverts commit 2475efa91d8b4fa8f1a2d16052cb6d14be7d5dc6.

Causes crashes, see comments on https://github.com/llvm/llvm-project/commit/2475efa91d8b4fa8f1a2d16052cb6d14be7d5dc6.
Delta File
+8-1 llvm/test/Transforms/SLPVectorizer/AArch64/gather-with-minbith-user.ll
+6-1 llvm/test/Transforms/SLPVectorizer/AArch64/user-node-not-in-bitwidths.ll
+1-5 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+2-1 llvm/test/Transforms/SLPVectorizer/SystemZ/minbitwidth-root-trunc.ll
+17-8 4 files

LLVM/project 99052c4 llvm/unittests/IR DebugInfoTest.cpp

[gardening][DebugInfo][NFC] Improve comment on HashingDISubprogram test (#91543)

Delta File
+3-2 llvm/unittests/IR/DebugInfoTest.cpp
+3-2 1 files