LLVM/project fc7e74ellvm/test/Analysis/CostModel/X86 trunc-sizelatency.ll trunc-codesize.ll

[CostModel][X86] getCastInstrCost - improve CostKind adjustment when splitting src/dst types

Noticed in #90883 review - for non-Throughput costs, we weren't applying the split count to the '0 or 1' cost value.

This still doesn't work well, as many of the type legalizations are hidden, so we don't have the split count. Really we need to move to a CostKindCosts-based cost table, but that's going to be a lot of work :/
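
A rough sketch of the adjustment described above, not the actual X86 cost-model code. The `CostKind` enum and `adjustCost` helper are hypothetical names introduced for illustration. The assumption is that for non-throughput cost kinds a table entry only says "free or not" (0 or 1), and that this value should still be scaled by the number of parts the type was split into.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the cost kinds; not the real LLVM enum.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// Scale a per-part table cost by the number of legalized parts.
// Throughput costs are multiplied by the split count. For the other kinds
// the table cost is reduced to "0 or 1", and that value is then scaled by
// the split count instead of being returned unscaled.
inline uint64_t adjustCost(CostKind Kind, uint64_t TableCost,
                           uint64_t SplitCount = 1) {
  assert(SplitCount >= 1 && "a type legalizes to at least one part");
  if (Kind != CostKind::RecipThroughput)
    return TableCost == 0 ? 0 : SplitCount;
  return TableCost * SplitCount;
}
```

With a split count of 4, a non-free cast is reported as 4 for latency or code size rather than 1, while a free cast stays at 0.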
DeltaFile
+807-277llvm/test/Analysis/CostModel/X86/trunc-sizelatency.ll
+807-277llvm/test/Analysis/CostModel/X86/trunc-codesize.ll
+807-277llvm/test/Analysis/CostModel/X86/trunc-latency.ll
+290-290llvm/test/Analysis/CostModel/X86/shuffle-replication-i1-latency.ll
+290-290llvm/test/Analysis/CostModel/X86/shuffle-replication-i1-sizelatency.ll
+290-290llvm/test/Analysis/CostModel/X86/shuffle-replication-i1-codesize.ll
+3,291-1,70111 files not shown
+3,584-1,99317 files

LLVM/project bcdbd0bllvm/lib/Transforms/Instrumentation DataFlowSanitizer.cpp

[llvm][DataFlowSanitizer] Don't pass vector by value (NFC)

Closes #89201
DeltaFile
+1-1llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
+1-11 files

LLVM/project 2933ef2clang/lib/Driver/ToolChains HIPUtility.cpp

[clang][HIPUtility] Iterate by const reference (NFC)

Closes #90284
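
For readers unfamiliar with this kind of NFC cleanup, here is a minimal illustration of the pattern; it is not the code touched by the commit. `totalLength` is a hypothetical helper. Both the parameter and the loop variable are const references, so neither the container nor its elements are copied.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Taking the vector by const reference avoids copying it at every call.
inline std::size_t totalLength(const std::vector<std::string> &Names) {
  std::size_t Total = 0;
  // `const auto &` binds to each element in place; a plain `auto` loop
  // variable would copy every std::string.
  for (const auto &Name : Names)
    Total += Name.size();
  return Total;
}
```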
DeltaFile
+2-2clang/lib/Driver/ToolChains/HIPUtility.cpp
+2-21 files

LLVM/project 256797ellvm/include/llvm/IR DebugProgramInstruction.h

[NFC][RemoveDIs] Fix some comments in DebugProgramInstruction.h
DeltaFile
+5-7llvm/include/llvm/IR/DebugProgramInstruction.h
+5-71 files

LLVM/project 1efc191clang/lib/Driver/ToolChains Clang.cpp

[clang][Driver] Iterate with const reference (NFC)

Closes #90282
DeltaFile
+1-1clang/lib/Driver/ToolChains/Clang.cpp
+1-11 files

LLVM/project 6086f69clang-tools-extra/clang-tidy/cert CERTTidyModule.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Add 'cert-int09-c' alias for 'readability-enum-initial-value' (#90868)

The check's ruling exactly matches the corresponding CERT C
Recommendation, and, as such, worth a trivial alias.
DeltaFile
+49-36clang-tools-extra/docs/clang-tidy/checks/readability/enum-initial-value.rst
+10-0clang-tools-extra/docs/clang-tidy/checks/cert/int09-c.rst
+4-1clang-tools-extra/docs/clang-tidy/checks/list.rst
+4-0clang-tools-extra/docs/ReleaseNotes.rst
+4-0clang-tools-extra/clang-tidy/cert/CERTTidyModule.cpp
+71-375 files

LLVM/project fb1c2dbclang/lib/AST/Interp Interp.h Program.cpp, clang/test/AST/Interp builtin-align-cxx.cpp c.c

Revert "Reapply "[clang][Interp] Create full type info for dummy pointers""

This reverts commit 1aeb64c8ec7b96b2301929d8a325a6e1d9ddaa2f.

Due to failures in 32 bit Arm builds:
https://lab.llvm.org/buildbot/#/builders/245/builds/24041
DeltaFile
+22-14clang/lib/AST/Interp/Interp.h
+9-12clang/lib/AST/Interp/Program.cpp
+14-1clang/test/AST/Interp/builtin-align-cxx.cpp
+8-0clang/lib/AST/Interp/Descriptor.cpp
+3-3clang/lib/AST/Interp/Descriptor.h
+0-3clang/test/AST/Interp/c.c
+56-336 files

LLVM/project 4e67b50llvm/test/Transforms/AtomicExpand/AMDGPU expand-atomic-f32-agent.ll expand-atomic-f64-agent.ll

AMDGPU: Add more tests for atomicrmw handling

Add agent scope copies of atomicrmw atomics tests.
Expand testing for the undo identity atomicrmw case.
Test 16-bit atomic expansions.
DeltaFile
+3,717-0llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f32-agent.ll
+1,685-0llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f64-agent.ll
+859-0llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-v2f16-agent.ll
+859-0llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-v2bf16-agent.ll
+668-0llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i64-agent.ll
+668-0llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-agent.ll
+8,456-03 files not shown
+8,869-219 files

LLVM/project 9f9856dllvm/test/CodeGen/AMDGPU global_atomics_i64_system.ll flat_atomics_i32_system.ll, llvm/test/Transforms/AtomicExpand/AMDGPU expand-atomic-f32-system.ll expand-atomic-f64-system.ll

AMDGPU: Update name for amdgpu.no.remote.memory metadata
DeltaFile
+218-218llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f32-system.ll
+140-140llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll
+140-140llvm/test/CodeGen/AMDGPU/flat_atomics_i32_system.ll
+140-140llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system.ll
+140-140llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll
+98-98llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-f64-system.ll
+876-87610 files not shown
+1,277-1,27716 files

LLVM/project 385f59fllvm/include/llvm/MC MCRegisterInfo.h, llvm/lib/MCA InstrBuilder.cpp

[llvm-mca] Teach MCA constant registers do not create dependencies (#89387)

Constant registers like the zero registers XZR and WZR are treated as
any other register by LLVM-MCA. This can create nonexistent dependency
chains.
Currently there is no method in MCA to query if a register is constant.
This patch fixes the issue by adding a bool Constant
variable to MCRegisterDesc that is true for constant registers. Since
constant registers do not create dependencies, it makes sense to add
this check to MCA.
DeltaFile
+76-0llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-zero-dependency.s
+12-12llvm/test/tools/llvm-mca/AArch64/HiSilicon/tsv110-forwarding.s
+16-5llvm/lib/MCA/InstrBuilder.cpp
+6-0llvm/include/llvm/MC/MCRegisterInfo.h
+3-2llvm/utils/TableGen/RegisterInfoEmitter.cpp
+113-195 files

LLVM/project b4e751ellvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.set.rounding.ll

AMDGPU: Optimize set_rounding if input is known to fit in 2 bits (#88588)

We don't need to figure out the weird extended rounding modes or
handle offsets to keep the lookup table in 64-bits.
    
https://reviews.llvm.org/D153258

Depends #88587
DeltaFile
+104-308llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
+41-21llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+145-3292 files

LLVM/project 6218992clang/lib/Sema SemaTemplate.cpp, clang/test/SemaCXX cxx20-ctad-type-alias.cpp

[clang] CTAD: fix the aggregate deduction guide for alias templates.

For alias templates, the way we construct their aggregate deduction guides
does not follow the standard approach. We should do the same thing as we do for
implicit deduction guides.
DeltaFile
+2-60clang/lib/Sema/SemaTemplate.cpp
+14-0clang/test/SemaCXX/cxx20-ctad-type-alias.cpp
+7-0clang/test/SemaTemplate/deduction-guide.cpp
+23-603 files

LLVM/project 4036514clang/lib/Sema SemaTemplate.cpp

Refactor: Extract the core deduction-guide construction implementation from DeclareImplicitDeductionGuidesForTypeAlias

We move the core implementation to a dedicated function, so that it can
be reused in other places.
DeltaFile
+203-187clang/lib/Sema/SemaTemplate.cpp
+203-1871 files

LLVM/project 7c64b53llvm/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel BUILD.gn

[gn build] Port ed299b3efd66
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/unittests/CodeGen/GlobalISel/BUILD.gn
+1-01 files

LLVM/project e47d7c6llvm/lib/Target/AMDGPU AMDGPUInsertSingleUseVDST.cpp

Fix MSVC signed/unsigned mismatch warning. NFC.
DeltaFile
+1-1llvm/lib/Target/AMDGPU/AMDGPUInsertSingleUseVDST.cpp
+1-11 files

LLVM/project ed299b3llvm/include/llvm/CodeGen/GlobalISel GIMatchTableExecutor.h GIMatchTableExecutorImpl.h, llvm/include/llvm/Support Compiler.h

[GlobalISel] Optimize ULEB128 usage (#90565)

- Remove some cases where ULEB128 isn't needed
- Add a fastDecodeULEB128 tailored for GlobalISel which does unchecked
decoding optimized for the common case, which is 1 byte values. We
rarely have >1 byte Inst IDs, OpIdx, etc. and those are the most common
ULEB users by far.

This specific LEB128 decode function generates almost 2x fewer
instructions than the generic one.
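
The idea of an unchecked decoder with a one-byte fast path can be sketched as follows. This is an illustration under stated assumptions, not the function added by the patch: the name `fastDecodeULEB128Sketch` is hypothetical, and the code assumes the buffer holds a complete, valid encoding that fits in 64 bits (no bounds or overflow checks).

```cpp
#include <cassert>
#include <cstdint>

// Unchecked ULEB128 decode. Advances Ptr past the bytes it consumes.
// Assumes a well-formed encoding of a value that fits in 64 bits.
inline uint64_t fastDecodeULEB128Sketch(const uint8_t *&Ptr) {
  uint64_t Value = *Ptr++;
  // Fast path: values below 0x80 have no continuation bit, so one byte
  // is the whole value.
  if (Value < 0x80)
    return Value;

  // Slow path: strip the continuation bit and accumulate 7 bits per byte.
  Value &= 0x7f;
  unsigned Shift = 7;
  uint8_t Byte;
  do {
    Byte = *Ptr++;
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  return Value;
}
```

The common single-byte case costs one load, one compare, and one branch, which is where the savings over a fully checked decoder would come from.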
DeltaFile
+49-0llvm/unittests/CodeGen/GlobalISel/GIMatchTableExecutorTest.cpp
+24-2llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+6-9llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+8-0llvm/include/llvm/Support/Compiler.h
+6-2llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+1-0llvm/unittests/CodeGen/GlobalISel/CMakeLists.txt
+94-136 files

LLVM/project 8480c93clang/docs ReleaseNotes.rst, clang/include/clang/Basic DiagnosticSemaKinds.td

[clang] pointer to member with qualified-id enclosed in parentheses in unevaluated context should be invalid (#89713)

Clang doesn't check whether the operand of the & operator is enclosed in
parentheses when a pointer to member is formed in an unevaluated context, for
example:

```cpp
struct foo { int val; };

int main() { decltype(&(foo::val)) ptr; }
```

`decltype(&(foo::val))` should be invalid, but clang accepts it. This PR
fixes this issue.

Fixes #40906.

---------

Co-authored-by: cor3ntin <corentinjabot at gmail.com>
DeltaFile
+17-0clang/test/CXX/expr/expr.unary/expr.unary.op/p4.cpp
+16-0clang/lib/Sema/SemaExpr.cpp
+3-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+2-0clang/docs/ReleaseNotes.rst
+38-04 files

LLVM/project e4b04b3mlir/include/mlir/Dialect/Transform/IR TransformOps.td, mlir/include/mlir/Dialect/Transform/Interfaces TransformInterfaces.h

[mlir] make transform.foreach_match forward arguments (#89920)

It may be useful to have access to additional handles or parameters when
performing matches and actions in `foreach_match`, for example, to
parameterize the matcher by rank or restrict it in a non-trivial way.
Enable `foreach_match` to forward additional handles from operands to
matcher symbols and from action symbols to results.
DeltaFile
+123-37mlir/lib/Dialect/Transform/IR/TransformOps.cpp
+110-0mlir/test/Dialect/Transform/foreach-match.mlir
+81-4mlir/test/Dialect/Transform/ops-invalid.mlir
+38-22mlir/include/mlir/Dialect/Transform/IR/TransformOps.td
+29-5mlir/lib/Dialect/Transform/Interfaces/TransformInterfaces.cpp
+13-0mlir/include/mlir/Dialect/Transform/Interfaces/TransformInterfaces.h
+394-686 files

LLVM/project edbe6ebllvm/lib/CodeGen/SelectionDAG LegalizeFloatTypes.cpp, llvm/lib/Target/SystemZ SystemZISelLowering.cpp

SystemZ: Don't promote atomic store in IR (#90899)

This is the mirror to the recent atomic load change. The same
bitcast-back-to-integer case is a small code quality regression for the
same reason. This would disappear with a bitcastable legal 128-bit type.
DeltaFile
+35-11llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+16-7llvm/test/CodeGen/SystemZ/atomic-store-08.ll
+0-1llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+51-193 files

LLVM/project 6535e7allvm/test/CodeGen/SystemZ copy-phys-reg-gr128-to-vr128.mir

SystemZ: Remove redundant copy tests from 75f4baa70
DeltaFile
+0-31llvm/test/CodeGen/SystemZ/copy-phys-reg-gr128-to-vr128.mir
+0-311 files

LLVM/project 44648ccllvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp, llvm/test/CodeGen/AMDGPU pal-metadata-3.0.ll

[AMDGPU] Always emit lds_size in PAL ELF Metadata 3.0 (#87222)

Emit lds_size for all shader types in PAL metadata.
DeltaFile
+39-0llvm/test/CodeGen/AMDGPU/pal-metadata-3.0.ll
+4-4llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+43-42 files

LLVM/project 9731b77llvm/docs AMDGPUUsage.rst ReleaseNotes.rst, llvm/lib/Target/AMDGPU SIModeRegisterDefaults.cpp SIISelLowering.cpp

AMDGPU: Implement llvm.set.rounding (#88587)

Use a shift of a magic constant and some offsetting to convert from
flt_rounds values.

I don't know why the enum defines Dynamic = 7. The standard suggests -1
is the "cannot determine" value. If we could start the extended values at
4 we wouldn't need the extra compare, sub, and select.

https://reviews.llvm.org/D153257
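
The "shift of a magic constant" technique can be sketched as a lookup table packed into an integer. This is only an illustration: the FLT_ROUNDS input values (0 toward zero, 1 to nearest, 2 toward +infinity, 3 toward -infinity) come from the C standard, but the 2-bit hardware encoding below is an assumption made for the example, and the names are hypothetical. It covers only the four standard modes, not the extended values or the offsetting the commit mentions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2-bit target encoding, assumed for illustration only.
enum : uint32_t { HwNearest = 0, HwPlusInf = 1, HwMinusInf = 2, HwZero = 3 };

// Table indexed by the FLT_ROUNDS value, two bits per entry:
//   0 (toward zero) -> HwZero,    1 (to nearest) -> HwNearest,
//   2 (toward +inf) -> HwPlusInf, 3 (toward -inf) -> HwMinusInf.
constexpr uint32_t RoundingTable =
    (HwZero << 0) | (HwNearest << 2) | (HwPlusInf << 4) | (HwMinusInf << 6);

// Convert a FLT_ROUNDS value to the assumed hardware mode with one shift
// and one mask, instead of a branch or a memory lookup.
inline uint32_t toHardwareMode(uint32_t FltRounds) {
  assert(FltRounds < 4 && "only the four standard modes are handled here");
  return (RoundingTable >> (FltRounds * 2)) & 0x3;
}
```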
DeltaFile
+1,919-0llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
+113-0llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.cpp
+72-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+12-0llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.h
+6-0llvm/docs/AMDGPUUsage.rst
+2-0llvm/docs/ReleaseNotes.rst
+2,124-02 files not shown
+2,127-08 files

LLVM/project 13a6fe8clang/include/clang/CIRFrontendAction CIRGenAction.h, clang/lib/CIR/CodeGen CIRGenModule.h

fix comments

Created using spr 1.3.5
DeltaFile
+4-4clang/include/clang/CIRFrontendAction/CIRGenAction.h
+3-5clang/lib/CIR/CodeGen/CIRGenModule.h
+6-0clang/lib/FrontendTool/CMakeLists.txt
+1-1clang/lib/CIR/FrontendAction/CIRGenAction.cpp
+2-0clang/lib/CIR/FrontendAction/CMakeLists.txt
+16-105 files

LLVM/project 70b5a22llvm/lib/Transforms/Utils MemoryTaggingSupport.cpp, llvm/test/Instrumentation/HWAddressSanitizer alloca.ll

[hwasan] Don't crash on vscale allocas (#90932)

getAllocaSizeInBytes will crash when casting the size to a
constant.
DeltaFile
+18-0llvm/test/Instrumentation/HWAddressSanitizer/alloca.ll
+2-0llvm/lib/Transforms/Utils/MemoryTaggingSupport.cpp
+20-02 files

LLVM/project 4300febclang/include/clang/CIR CIRGenerator.h, clang/include/clang/CIRFrontendAction CIRGenAction.h

get buildTopLevelDecl to run

Created using spr 1.3.5
DeltaFile
+45-2clang/lib/CIR/FrontendAction/CIRGenAction.cpp
+31-0clang/include/clang/CIR/CIRGenerator.h
+24-3clang/lib/CIR/CodeGen/CIRGenModule.h
+20-0clang/lib/CIR/CodeGen/CIRGenerator.cpp
+16-0clang/lib/CIR/CodeGen/CIRGenModule.cpp
+3-4clang/include/clang/CIRFrontendAction/CIRGenAction.h
+139-97 files not shown
+158-1513 files

LLVM/project e450f98lldb/source/Utility Scalar.cpp, lldb/test/API/python_api/type TestTypeList.py main.cpp

[lldb] Fix Scalar::GetData for non-multiple-of-8-bits values (#90846)

It was aligning the byte size down. Now it aligns up. This manifested
itself as SBTypeStaticField::GetConstantValue returning a zero-sized
value for `bool` fields (because clang represents bool as a 1-bit
value).

I've changed the code for float Scalars as well, although I'm not aware
of floating point values that are not multiples of 8 bits.
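
The difference between aligning down and aligning up is easy to see in isolation. The helpers below are hypothetical and are not the lldb code; they only show the arithmetic for converting a bit width to a byte count.

```cpp
#include <cassert>
#include <cstddef>

// Truncating division drops any partial byte: a 1-bit value gets 0 bytes.
inline std::size_t byteSizeRoundedDown(std::size_t Bits) { return Bits / 8; }

// Adding 7 before dividing rounds up: a 1-bit value gets 1 byte.
inline std::size_t byteSizeRoundedUp(std::size_t Bits) { return (Bits + 7) / 8; }
```

For widths that are already multiples of 8 the two agree, which is why the bug only showed up for values such as a 1-bit `bool`.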
DeltaFile
+30-0lldb/unittests/Utility/ScalarTest.cpp
+13-0lldb/test/API/python_api/type/TestTypeList.py
+2-2lldb/source/Utility/Scalar.cpp
+1-0lldb/test/API/python_api/type/main.cpp
+46-24 files

LLVM/project 0ddf974flang/test/Lower/OpenMP default-clause.f90, llvm/lib/Transforms/AggressiveInstCombine AggressiveInstCombine.cpp

rebase

Created using spr 1.3.4
DeltaFile
+243-23llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+0-219llvm/test/Transforms/AggressiveInstCombine/strcmp.ll
+216-0llvm/test/Transforms/AggressiveInstCombine/strncmp-1.ll
+147-0llvm/test/Transforms/AggressiveInstCombine/strncmp-2.ll
+59-20mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+51-4flang/test/Lower/OpenMP/default-clause.f90
+716-26616 files not shown
+855-31422 files

LLVM/project b03e7a5llvm/test/Instrumentation/HWAddressSanitizer alloca.ll

[HWASAN] Regenerate a test (#90943)

DeltaFile
+81-81llvm/test/Instrumentation/HWAddressSanitizer/alloca.ll
+81-811 files

LLVM/project 922ab70llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp, mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

[MLIR][OpenMP] Extend omp.private materialization support: `dealloc` (#90841)

Extends current support for delayed privatization during translation to
LLVM IR. This adds support for materializing the `dealloc` region in
`omp.private` ops when this region contains clean-up/deallocation logic
that needs to be executed at the end of the parallel region.

This changes the `OMPIRBuilder` slightly to execute the finalization
callback **after** the privatization callback. This allows us to collect
information about privatized variables on the MLIR and LLVM sides so
that we can properly emit deallocation logic.
DeltaFile
+59-20mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+53-0mlir/test/Target/LLVMIR/openmp-omp.private-dealloc.mlir
+13-13llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+125-333 files

LLVM/project f8fedfblldb/packages/Python/lldbsuite/test/make Makefile.rules

[lldb] Fix TestSharedLibStrippedSymbols for #90622

`ifeq` needs to be at the beginning of a line, otherwise it's
interpreted as part of the recipe.
DeltaFile
+2-2lldb/packages/Python/lldbsuite/test/make/Makefile.rules
+2-21 files