[mlir][tosa] Rename Tosa Div op to IntDiv Op (#80047)
This patch renames Tosa Div Op to IntDiv Op to align with the TOSA Spec.
Signed-off-by: Tai Ly <tai.ly at arm.com>
[lldb] Use Python script to generate SBLanguages.h (#90753)
Use a Python script to generate SBLanguages.h instead of piggybacking on
LLDB TableGen. This addresses Nico Weber's post-commit feedback.
[SLP]Improve comparison of shuffled loads/masked gathers by adding GEP cost.
In some cases masked gather is less profitable than insert-subvector of
consecutive/strided stores. SLP already has this kind of analysis, but it
needs to be improved by including the GEP cost.
Also, the GEP cost estimation for masked gather is fixed.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/90737
[SLP]Do not include the cost of and -1, <v> and emit just <v> after MinBitWidth.
After minbitwidth analysis, and <v>, (power_of_2 - 1 const) can be
transformed into just and <v>, (all_ones const), which can be ignored
during both cost estimation and codegen. The x264 benchmark has this
pattern.
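As an illustration of why the mask becomes redundant, here is a minimal Python sketch. It is not SLP code: `all_ones`, `truncate`, and `and_is_noop` are hypothetical helpers, and the bit width is an arbitrary choice. Once a value has been narrowed to `bits` bits, the constant `power_of_2 - 1` is the all-ones value of the narrow type, so the `and` returns its input unchanged.

```python
def all_ones(bits: int) -> int:
    """Return the all-ones constant for an integer type of the given width."""
    return (1 << bits) - 1


def truncate(value: int, bits: int) -> int:
    """Model narrowing a value to `bits` bits, as minbitwidth analysis would."""
    return value & all_ones(bits)


def and_is_noop(bits: int) -> bool:
    """Check that masking a narrowed value with all-ones never changes it."""
    mask = all_ones(bits)
    # Try more inputs than the narrow type can hold to exercise truncation.
    return all(
        truncate(v, bits) & mask == truncate(v, bits)
        for v in range(1 << (bits + 2))
    )
```

For example, with 8-bit values the mask 255 is all ones, so `and_is_noop(8)` holds for every input tried.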
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/90739
[VPlan] Make CallInst optional for VPWidenCallRecipe (NFCI).
Instead of relying on the underlying CallInst to look up the called
function and its types, add the called function as an operand, in line
with how called functions are handled in CallInst.
Operand bundles, metadata and fast-math flags are optionally used if
there's an underlying CallInst.
This enables creating VPWidenCallRecipes without requiring an underlying
IR instruction.
[flang] Catch missing "not a dummy argument" cases (#90268)
Declaration checking is looking for inappropriate usage of the INTENT,
VALUE, & OPTIONAL attributes in multiple places, and some oddball cases
like ENTRY points are not checked. Centralize the check for attributes
that apply only to dummy arguments into one spot.
[flang] Fix CHECK() crash in module file generator (#90234)
A sanity CHECK() in mod-file.cpp needs to allow for USE association of a
derived type that has the same name as a locally defined generic
interface.
Fixes https://github.com/llvm/llvm-project/issues/90192.
[clang-tidy] Enable plugin tests with LLVM_INSTALL_TOOLCHAIN_ONLY (#90370)
The only reason for the removed condition was that there was a
dependency for `CTTestTidyModule` on the `clang-tidy-headers` target,
which was only created under the same `NOT LLVM_INSTALL_TOOLCHAIN_ONLY`
condition. It looks like this target is not needed for
`CTTestTidyModule` to build and run, so the dependency can be removed
along with the condition.
See also https://reviews.llvm.org/D111100 for earlier discussions.
[ELF] Adjust --compress-sections to support compression level
zstd excels at scaling from low-ratio-very-fast to
high-ratio-pretty-slow. Some users prioritize speed and prefer disk read
speed, while others focus on achieving the highest compression ratio
possible, similar to traditional high-ratio codecs like LZMA.
Add an optional `level` to `--compress-sections` (#84855) to cater to
these diverse needs. While we initially aimed for a one-size-fits-all
approach, this no longer seems to work.
(https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)
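The speed/ratio trade-off behind a per-option level can be sketched in Python. This is an assumption-laden illustration, not lld code: it uses the standard library `zlib` module because zstd bindings are not assumed to be available, and `compress_at_levels` and the sample data are made up for the example.

```python
import zlib


def compress_at_levels(data: bytes, levels=(1, 9)) -> dict:
    """Compress the same payload at several zlib levels (1 = fastest, 9 = best ratio)."""
    return {level: zlib.compress(data, level) for level in levels}


# Hypothetical section contents: highly repetitive, like many debug sections.
sample = b".debug_str entry " * 4096
blobs = compress_at_levels(sample)

for level, blob in blobs.items():
    # Every level must round-trip; only the size and time spent differ.
    assert zlib.decompress(blob) == sample
    print(f"level {level}: {len(sample)} -> {len(blob)} bytes")
```

Every level produces a valid stream. The level only moves the result along the speed/ratio curve, which is why exposing it per option lets users pick their own point.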
When --compress-debug-sections is used together with it, make
--compress-sections take precedence since --compress-sections is usually
more specific.
Remove the level distinction between -O/-O1 and -O2 for
--compress-debug-sections=zlib for a more consistent user experience.
Pull Request: https://github.com/llvm/llvm-project/pull/90567
[GlobalISel] Fix store merging incorrectly classifying an unknown index expr as 0. (#90375)
During analysis, we incorrectly leave the offset part of an address info
struct as zero when in fact we failed to decompose it into base + offset.
This results in incorrectly assuming that the address is adjacent to
another store address. To fix this, we wrap the offset in an optional<> so
we can distinguish between a real zero and unknown.
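The zero-versus-unknown distinction can be sketched in Python, with `Optional[int]` standing in for the C++ optional<>. The `AddrInfo` and `is_adjacent` names are hypothetical and do not mirror the GlobalISel data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddrInfo:
    base: str
    # None means "could not decompose into base + offset", which is
    # different from a genuine offset of 0.
    offset: Optional[int]


def is_adjacent(first: AddrInfo, second: AddrInfo, size_bytes: int) -> bool:
    """Return True only if `second` provably starts right after `first`."""
    if first.offset is None or second.offset is None:
        # Unknown offset: never claim adjacency, so the stores are not merged.
        return False
    return (first.base == second.base
            and second.offset == first.offset + size_bytes)
```

With a plain integer defaulting to 0, the unknown case would be indistinguishable from `AddrInfo("p", 0)` and could wrongly pass the adjacency test against a store at offset 4.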
Fixes issue #90242
(cherry picked from commit 19f4d68252b70c81ebb1686a5a31069eda5373de)
[ELF] Catch zlib deflateInit2 error
The function may return Z_MEM_ERROR or Z_STREAM_ERR. The former does not
have a good way of testing. The latter will be possible with a pending
change that allows setting the compression level, which will come with a
test.
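For illustration, the stream-error path can be triggered from Python's `zlib` binding by passing an out-of-range level. This is a sketch of the failure mode only, not lld's C++ code, and `try_compress` is a hypothetical helper.

```python
import zlib
from typing import Optional


def try_compress(data: bytes, level: int) -> Optional[bytes]:
    """Compress `data`, returning None if zlib rejects the parameters."""
    try:
        return zlib.compress(data, level)
    except (zlib.error, ValueError):
        # zlib only accepts levels 0..9 (or -1 for the default); anything
        # else makes stream initialization fail instead of compressing.
        return None
```

A valid level such as 6 returns compressed bytes, while a level of 10 is rejected, so the caller has to check the result rather than assume initialization succeeded.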
[X86] Enable EVEX512 when host CPU has AVX512 (#90479)
This is used when -march=native is run on a CPU that is unknown to an
older version of LLVM.
(cherry picked from commit b3291793f11924a3b62601aabebebdcfbb12a9a1)
[GlobalISel] Don't form anyextending atomic loads.
Until we can reliably check the legality and improve our selection of these,
don't form them at all.
(cherry picked from commit 60fc4ac67a613e4e36cef019fb2d13d70a06cfe8)
[alpha.webkit.UncountedCallArgsChecker] Support more trivial expressions. (#90414)
Treat compound operators such as |=, array subscripts, sizeof, and
non-type template parameters as trivial so long as their subexpressions
are also trivial.
Also treat the true/false boolean literals as trivial.
[flang] always run PolymorphicOpConversion sequentially (#90721)
It was pointed out in post commit review of
https://github.com/llvm/llvm-project/pull/90597 that the pass should
never have been run in parallel over all functions (and now other top
level operations) in the first place. The mutex used in the pass was
ineffective at preventing races since each instance of the pass would
have a different mutex.
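Why a per-instance mutex gives no protection can be shown with a small Python sketch. The class names are hypothetical and this is not the flang pass. Note that the commit itself fixes the problem by running the pass sequentially, not by sharing a lock.

```python
import threading


class PassWithOwnLock:
    """Each instance creates its own lock, like the ineffective mutex."""

    def __init__(self):
        self.lock = threading.Lock()


_shared_lock = threading.Lock()


class PassWithSharedLock:
    """All instances refer to one lock, which does give mutual exclusion."""

    lock = _shared_lock


def can_both_enter(a, b) -> bool:
    """Return True if `b` can take its lock while `a` still holds its own."""
    a.lock.acquire()
    try:
        got = b.lock.acquire(blocking=False)
        if got:
            b.lock.release()
        return got
    finally:
        a.lock.release()
```

Two `PassWithOwnLock` instances can both enter their critical sections at once, so shared state is unprotected. Two `PassWithSharedLock` instances cannot.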
Reapply "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summaries" (#90610) (#90692)
This reverts commit 2aabfc811670beb843074c765c056fff4a7b443b.
Add fixes to LLD and Gold tests missed in original change.
Co-authored-by: Jan Voung <jvoung at google.com>
[RISCV] Refactor profile selection in RISCVISAInfo::parseArchString. (#90700)
Instead of hardcoding the 4 current profile prefixes, treat profile
selection as a fallback if we don't find "rv32" or "rv64".
Update the error message accordingly.
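The fallback order can be sketched as follows. This is a rough Python illustration, not the RISCVISAInfo implementation: the profile table, the profile name, its expansion, and the error text are all hypothetical placeholders.

```python
# Hypothetical table for illustration only; the real profile names and
# their expansions are defined in LLVM.
HYPOTHETICAL_PROFILES = {
    "demoprofile64": "rv64i_zicsr",
}


def resolve_arch(arch: str) -> str:
    """Return an arch string that starts with rv32/rv64, expanding profiles."""
    if arch.startswith(("rv32", "rv64")):
        return arch
    # Fallback: only consult the profile table when no base ISA prefix is
    # present, preferring the longest matching profile name.
    for name in sorted(HYPOTHETICAL_PROFILES, key=len, reverse=True):
        if arch.startswith(name):
            return HYPOTHETICAL_PROFILES[name] + arch[len(name):]
    raise ValueError(
        "arch string must begin with rv32, rv64, or a supported profile name"
    )
```

Because profiles are looked up in a table rather than matched against hardcoded prefixes, adding a new profile family needs no change to the prefix check.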
[NVPTX] Fix 64 bits rotations with large shift values (#89399)
ROTL and ROTR can take a shift amount larger than the element size, in
which case the effective shift amount should be the shift amount modulo
the element size.
This patch adds the modulo step when the shift amount isn't known at
compile time. Without it the existing implementation would end up
shifting beyond the type size and give incorrect results.
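The modulo step can be sketched in Python for 64-bit values. This models the intended ROTL/ROTR semantics only. It is not the NVPTX lowering, and `rotl64`/`rotr64` are illustrative names.

```python
MASK64 = (1 << 64) - 1


def rotl64(value: int, amount: int) -> int:
    """Rotate a 64-bit value left; the amount is reduced modulo 64 first."""
    amount %= 64
    value &= MASK64
    return ((value << amount) | (value >> (64 - amount))) & MASK64


def rotr64(value: int, amount: int) -> int:
    """Rotate a 64-bit value right; the amount is reduced modulo 64 first."""
    amount %= 64
    value &= MASK64
    return ((value >> amount) | (value << (64 - amount))) & MASK64
```

With the reduction in place, rotating by 65 gives the same result as rotating by 1, and rotating by 64 returns the input unchanged.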
[MIR] Serialize MachineFrameInfo::isCalleeSavedInfoValid() (#90561)
For functions without a stack frame, no "stack" field is serialized into
MIR, which leads to isCalleeSavedInfoValid being false when reading a MIR
file back in. To fix this, serialize
MachineFrameInfo::isCalleeSavedInfoValid() into MIR.