PC-BSD/trueos 36c0c43 sys/amd64/amd64/support.S

amd64: check for small size in memmove, memcpy and memset

If the size is 15 bytes or less, avoid spinning up rep just to copy 8
bytes. In my tests on EPYC and on old Intel microarchitectures without ERMS
(like Westmere) this was a nice win over the current version (e.g. on EPYC,
memset with a size of 15 bytes goes from 59712651 ops/s to 70600095), with
almost no pessimization of the other cases.
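
The idea can be sketched in C, although the actual change is nine lines of
assembly in support.S that are not reproduced here. The function name
sketch_memset, the byte loop for the small case, and the use of libc memset
as a stand-in for the rep-based bulk path are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the code from support.S: branch on the size
 * up front so that small requests never reach the bulk path.
 */
void *sketch_memset(void *dst, int c, size_t len)
{
    unsigned char *p = dst;

    assert(dst != NULL);

    if (len <= 15) {
        /* Small size: fill byte by byte and skip the bulk setup. */
        while (len-- > 0)
            *p++ = (unsigned char)c;
        return dst;
    }

    /* 16 bytes or more: stand-in for the rep-based bulk path. */
    return memset(dst, c, len);
}
```

An optimizing C compiler may turn the byte loop back into a library call, so
this only shows the shape of the check and is not a way to reproduce the
measured numbers.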

Data collected during package building shows that sizes below 16 bytes are
pretty common.

Verified with the glibc test suite.

Approved by:    re (kib)
Delta   File
+9 -0   sys/amd64/amd64/support.S
+9 -0   1 file
