[CostModel][X86] getCastInstrCost - improve CostKind adjustment when splitting src/dst types
Noticed in #90883 review - for non-Throughput costs, we weren't applying the split count to the '0 or 1' cost value.
This still doesn't work well, as many of the type legalizations are hidden so we don't have the split count. Really we need to move to a CostKindCosts-based cost table, but that's going to be a lot of work :/
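A minimal sketch of the adjustment described above, for illustration only. The names `ToyCostKind` and `adjustCastCostSketch` are hypothetical and are not the X86 cost model's API; the assumption is that non-Throughput kinds collapse the table cost to '0 or 1' before the split count is applied.

```cpp
// Hypothetical stand-in for the cost kinds; not the real TargetTransformInfo enum.
enum class ToyCostKind { Throughput, CodeSize };

// Scales a per-piece table cost by the number of pieces the type was split into.
// For non-Throughput kinds the table cost is first collapsed to 0 or 1, so a
// free cast stays free and any other cast costs one unit per split piece.
inline unsigned adjustCastCostSketch(unsigned tableCost, unsigned splitCount,
                                     ToyCostKind kind) {
  if (kind != ToyCostKind::Throughput)
    return tableCost == 0 ? 0 : splitCount;
  return tableCost * splitCount;
}
```

With a split count of 4, a table cost of 3 becomes 12 for Throughput and 4 for the other kinds, instead of staying at 1.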
[clang-tidy] Add 'cert-int09-c' alias for 'readability-enum-initial-value' (#90868)
The check's ruling exactly matches the corresponding CERT C
Recommendation, and, as such, worth a trivial alias.
AMDGPU: Add more tests for atomicrmw handling
Add agent scope copies of atomicrmw atomics tests.
Expand testing for the undo identity atomicrmw case.
Test 16-bit atomic expansions.
[llvm-mca] Teach MCA constant registers do not create dependencies (#89387)
Constant registers like the zero registers XZR and WZR are treated as
any other register by LLVM-MCA. This can create nonexistent dependency
chains.
Currently there is no method in MCA to query if a register is constant.
This patch fixes the issue by adding a bool Constant
variable to MCRegisterDesc that is true for constant registers. Since
constant registers do not create dependencies, it makes sense to add
this check to MCA.
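A toy sketch of the idea, not the MCA implementation: `ToyRegisterDesc`, `ToyInst` and `hasRawDependencySketch` are hypothetical names, and the `Constant` field only mirrors the flag described above.

```cpp
#include <vector>

// Hypothetical register description; Constant is true for registers such as
// a zero register whose value never changes.
struct ToyRegisterDesc {
  bool Constant;
};

// Hypothetical instruction: register indices written (Defs) and read (Uses).
struct ToyInst {
  std::vector<unsigned> Defs;
  std::vector<unsigned> Uses;
};

// Returns true if `later` reads a register that `earlier` writes.
// Writes to constant registers are skipped, since they cannot feed a real
// read-after-write dependency.
inline bool hasRawDependencySketch(const ToyInst &earlier, const ToyInst &later,
                                   const std::vector<ToyRegisterDesc> &regs) {
  for (unsigned def : earlier.Defs) {
    if (regs[def].Constant)
      continue;
    for (unsigned use : later.Uses)
      if (use == def)
        return true;
  }
  return false;
}
```

Without the `Constant` check, two unrelated instructions that both name the zero register would be chained together.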
AMDGPU: Optimize set_rounding if input is known to fit in 2 bits (#88588)
We don't need to figure out the weird extended rounding modes or
handle offsets to keep the lookup table in 64 bits.
https://reviews.llvm.org/D153258
Depends #88587
[clang] CTAD: fix the aggregate deduction guide for alias templates.
For alias templates, the way we construct their aggregate deduction guides
does not follow the standard approach. We should do the same thing as we do for
implicit deduction guides.
Refactor: Extract the core deduction-guide construction implementation from DeclareImplicitDeductionGuidesForTypeAlias
We move the core implementation to a dedicated function, so that it can
be reused in other places.
[GlobalISel] Optimize ULEB128 usage (#90565)
- Remove some cases where ULEB128 isn't needed
- Add a fastDecodeULEB128 tailored for GlobalISel which does unchecked
decoding optimized for the common case, which is 1 byte values. We
rarely have >1 byte Inst IDs, OpIdx, etc. and those are the most common
ULEB users by far.
This specific LEB128 decode function generates almost 2x fewer
instructions than the generic one.
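A rough sketch of what an unchecked decoder with a 1-byte fast path can look like. `fastDecodeULEB128Sketch` is a hypothetical name and is not the function added by this patch; it assumes the input is well formed and does no bounds or overflow checking.

```cpp
#include <cstdint>

// Decodes an unsigned LEB128 value starting at `p` and stores the number of
// bytes consumed in `size`. Unchecked: the caller must guarantee valid input.
inline uint64_t fastDecodeULEB128Sketch(const uint8_t *p, unsigned &size) {
  // Fast path: a single byte with the continuation bit (0x80) clear.
  uint64_t value = p[0];
  if ((value & 0x80) == 0) {
    size = 1;
    return value;
  }

  // Slow path: accumulate 7 payload bits per byte until the continuation
  // bit is clear.
  value &= 0x7f;
  unsigned shift = 7;
  unsigned i = 1;
  uint8_t byte;
  do {
    byte = p[i++];
    value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  size = i;
  return value;
}
```

The common case costs one load, one test and one branch; only multi-byte values reach the loop.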
[clang] pointer to member with qualified-id enclosed in parentheses in unevaluated context should be invalid (#89713)
clang doesn't check whether the operand of the & operator is enclosed in
parentheses when a pointer to member is formed in an unevaluated context, for
example:
```cpp
struct foo { int val; };
int main() { decltype(&(foo::val)) ptr; }
```
`decltype(&(foo::val))` should be invalid, but clang accepts it. This PR
fixes this issue.
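For contrast, the unparenthesized form is the valid way to form a pointer to member. A small sketch (the helper `readValSketch` is only illustrative):

```cpp
#include <type_traits>

struct foo { int val; };

// Without parentheses, &foo::val names a pointer to data member.
static_assert(std::is_same<decltype(&foo::val), int foo::*>::value,
              "&foo::val has type int foo::*");

// Reads foo::val through a pointer to member.
inline int readValSketch(const foo &f) {
  int foo::*pm = &foo::val;
  return f.*pm;
}
```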
Fixes #40906.
---------
Co-authored-by: cor3ntin <corentinjabot at gmail.com>
[mlir] make transform.foreach_match forward arguments (#89920)
It may be useful to have access to additional handles or parameters when
performing matches and actions in `foreach_match`, for example, to
parameterize the matcher by rank or restrict it in a non-trivial way.
Enable `foreach_match` to forward additional handles from operands to
matcher symbols and from action symbols to results.
SystemZ: Don't promote atomic store in IR (#90899)
This is the mirror to the recent atomic load change. The same
bitcast-back-to-integer case is a small code quality regression for the
same reason. This would disappear with a bitcastable legal 128-bit type.
AMDGPU: Implement llvm.set.rounding (#88587)
Use a shift of a magic constant and some offsetting to convert from
flt_rounds values.
I don't know why the enum defines Dynamic = 7. The standard suggests -1
is the "cannot determine" value. If we could start the extended values at
4 we wouldn't need the extra compare, sub, and select.
https://reviews.llvm.org/D153257
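The "shift of a magic constant" trick can be sketched generically as a small lookup table packed into one 64-bit integer. The table contents below are placeholders, not the real flt_rounds to hardware mode mapping, and `kToyTable` and `lookupNibbleSketch` are hypothetical names.

```cpp
#include <cstdint>

// Placeholder table: entry i lives in bits [4*i, 4*i + 3].
// Entries 0..3 are 3, 0, 1, 2 (made-up values for illustration).
constexpr uint64_t kToyTable = (uint64_t{0x3} << 0) | (uint64_t{0x0} << 4) |
                               (uint64_t{0x1} << 8) | (uint64_t{0x2} << 12);

// Looks up entry `index` by shifting the packed table right and masking off
// one 4-bit entry. Valid for index < 16.
constexpr unsigned lookupNibbleSketch(uint64_t table, unsigned index) {
  return static_cast<unsigned>((table >> (index * 4)) & 0xF);
}
```

Sixteen 4-bit entries fit in 64 bits, so the whole conversion is one shift and one mask as long as the index stays in range.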
[lldb] Fix Scalar::GetData for non-multiple-of-8-bits values (#90846)
It was aligning the byte size down. Now it aligns up. This manifested
itself as SBTypeStaticField::GetConstantValue returning a zero-sized
value for `bool` fields (because clang represents bool as a 1-bit
value).
I've changed the code for float Scalars as well, although I'm not aware
of floating point values that are not multiples of 8 bits.
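The difference between the two roundings, as a minimal sketch (the helper names are hypothetical and are not part of lldb's Scalar class):

```cpp
#include <cstddef>

// Old behaviour: truncating division, so a 1-bit value gets 0 bytes.
constexpr std::size_t byteSizeRoundedDown(std::size_t bits) {
  return bits / 8;
}

// New behaviour: round up, so any partial byte still occupies a full byte.
constexpr std::size_t byteSizeRoundedUp(std::size_t bits) {
  return (bits + 7) / 8;
}
```

For a 1-bit `bool`, rounding down gives 0 bytes and rounding up gives 1 byte; multiples of 8 bits are unaffected.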
[MLIR][OpenMP] Extend omp.private materialization support: `dealloc` (#90841)
Extends current support for delayed privatization during translation to
LLVM IR. This adds support for materializing the `dealloc` region in
`omp.private` ops when this region contains clean-up/deallocation logic
that needs to be executed at the end of the parallel region.
This changes the `OMPIRBuilder` slightly to execute the finalization
callback **after** the privatization callback. This allows us to collect
information about privatized variables on the MLIR and LLVM sides so
that we can properly emit deallocation logic.