FreeBSD/src 97741df (r354628) contrib/netbsd-tests/usr.bin/grep t_grep.sh, usr.bin/grep util.c file.c

MFC bsdgrep(1) fixes: r320414, r328559, r332805-r332806, r332809, r332832,
r332850-r332852, r332856, r332858, r332876, r333351, r334803,
r334806-r334809, r334821, r334837, r334889, r335188, r351769, r352691

r320414:
Expect :mmap_eof_not_eol to fail

It relies on a jemalloc feature (opt.redzone) no longer available after
r319971.

r328559:
Remove t_grep:mmap_eof_not_eol test

The test was marked as an expected failure in r320414 after r319971's import
of a newer jemalloc removed an essential feature (opt.redzone) for
reproducing the behavior it was testing. Since then, no way has been found
or demonstrated to reliably test the behavior, so remove the test.

r332805:
bsdgrep: Split match processing out of procfile

procfile is getting kind of hairy, and it's not going to get better as we
correct some more bits that assume we process one line at a time.

r332806:
bsdgrep: Clean up procmatches a little bit

r332809:
bsdgrep: Add some TODOs for future work on operating on chunks

r332832:
bsdgrep: Break procmatches down a little bit more

Split the matching and non-matching cases out into their own functions to
reduce future complexity. As the name implies, procmatches will eventually
process more than one match itself in the future.

r332850:
bsdgrep: Some light cleanup

There's no point checking for a bunch of file modes if we're not a
practicing believer of DIR_SKIP or DEV_SKIP.

This also reduces some style violations that were particularly ugly looking
when browsing through.

r332851:
bsdgrep: More trivial cleanup/style cleanup

We can avoid branching for these easily reduced patterns

r332852:
bsdgrep: if chain => switch

This makes some of this a little easier to follow (in my opinion).

r332856:
bsdgrep: Fix --include/--exclude ordering issues

Prior to r332851:
* --exclude always wins out over --include
* --exclude-dir always wins out over --include-dir

r332851 broke that behavior, resulting in:
* First of --exclude, --include wins
* First of --exclude-dir, --include-dir wins

As it turns out, both behaviors are wrong by modern grep standards: the
last rule given wins, e.g.:

`grep --exclude foo --include foo 'thing' foo`
foo is included

`grep --include foo --exclude foo 'thing' foo`
foo is excluded

As tested with GNU grep 3.1.

This commit makes bsdgrep follow this behavior.
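
The "last rule wins" ordering can be sketched roughly as follows. This is an
illustration only, not the code from util.c: the type and function names are
hypothetical, and the default for a name that no rule matches (searched only
when no --include rule was given) is an assumption.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the names used in bsdgrep. */
enum rule_mode { RULE_INCLUDE, RULE_EXCLUDE };

struct rule {
	const char	*pattern;	/* glob as given on the command line */
	enum rule_mode	 mode;
};

/*
 * Decide whether fname should be searched. Rules are evaluated in
 * command-line order and the last one that matches decides the outcome.
 * Assumption: a name matched by no rule is searched only if no
 * --include rule was given at all.
 */
bool
file_wanted(const struct rule *rules, size_t nrules, const char *fname)
{
	bool have_include = false;
	bool wanted;
	size_t i;

	assert(fname != NULL);
	for (i = 0; i < nrules; i++)
		if (rules[i].mode == RULE_INCLUDE)
			have_include = true;
	wanted = !have_include;

	/* No early exit: a later rule may override an earlier one. */
	for (i = 0; i < nrules; i++)
		if (fnmatch(rules[i].pattern, fname, 0) == 0)
			wanted = (rules[i].mode == RULE_INCLUDE);
	return (wanted);
}
```

With the rules { exclude "foo", include "foo" } the file foo is searched; with
the order reversed it is skipped, matching the two examples above.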

r332858:
bsdgrep: Use grep_strdup instead of grep_malloc+strcpy
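
For readers unfamiliar with the pattern, the change amounts to something like
the sketch below. The function names are hypothetical and the error handling
is an assumption; it is not the bsdgrep source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Before: allocate and copy by hand; the +1 for the NUL is easy to forget. */
char *
copy_by_hand(const char *s)
{
	char *p;

	assert(s != NULL);
	if ((p = malloc(strlen(s) + 1)) == NULL) {
		perror("malloc");
		exit(2);
	}
	strcpy(p, s);
	return (p);
}

/* After: a single checked strdup(3) wrapper does the same job. */
char *
copy_with_strdup(const char *s)
{
	char *p;

	assert(s != NULL);
	if ((p = strdup(s)) == NULL) {
		perror("strdup");
		exit(2);
	}
	return (p);
}
```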

r332876:
bsdgrep: Fix build failure WITHOUT_LZMA (incorrect bracket placement)

r333351:
bsdgrep: Allow "-" to be passed to -f to mean "standard input"

A version of this patch was originally sent to me by se@, matching behavior
from newer versions of GNU grep.

While there have been some differences of opinion on whether stdin should be
closed or not after depleting it in process of -f, I've opted to leave stdin
open and just let the later matching stuff fail and result in a no-match.
I'm not married to the current behavior- it was generally chosen since we
are adopting this in particular from GNU grep, and I would like to stay
consistent without a strong argument to the contrary. The current behavior
isn't technically wrong, it's just fairly unfriendly to the developer-user
of grep that may not realize their usage is trivially invalid.
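
A minimal sketch of the "-" handling described above, assuming hypothetical
helper names (the real change lives in the pattern-file reader and may look
different). The point is that stdin is handed back as-is and is never closed
by the cleanup path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "-" selects standard input, else open the path. */
FILE *
open_pattern_source(const char *name)
{
	assert(name != NULL);
	if (strcmp(name, "-") == 0)
		return (stdin);
	return (fopen(name, "r"));
}

/* Leave stdin open: only close streams we opened ourselves. */
void
close_pattern_source(FILE *f)
{
	if (f != NULL && f != stdin)
		fclose(f);
}
```

Because stdin is left open but already drained, a later attempt to search
stdin simply reads nothing and yields no match.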

r334803:
netbsd-tests: grep(1): Add test for -c flag

Someone might be inclined to accidentally break this. Someone might have
written said test because they broke it locally.

r334806:
bsdgrep(1): Do some less dirty things with return types

Neither procfile nor grep_tree return anything meaningful to their callers.
None of the callers actually care about how many lines were matched in all
of the files they processed; it's all about "did anything match?"

This is generally just a light refactoring to remind me of what actually
matters as I'm rewriting these bits to care less about 'stuff'.
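
The shape of that refactoring, sketched with hypothetical names and an
in-memory stand-in for files (this is not the actual procfile or grep_tree
code): each layer reports only whether anything matched, and the exit status
is derived from that single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a file: just an array of lines. */
struct fake_file {
	const char *const	*lines;
	size_t			 nlines;
};

/* Visit every line, but report only whether any of them matched. */
bool
file_has_match(const struct fake_file *f, const char *needle)
{
	bool matched = false;
	size_t i;

	assert(f != NULL && needle != NULL);
	for (i = 0; i < f->nlines; i++)
		if (strstr(f->lines[i], needle) != NULL)
			matched = true;
	return (matched);
}

/* The tree-level caller only needs to OR the per-file results together. */
bool
any_file_matches(const struct fake_file *files, size_t nfiles,
    const char *needle)
{
	bool matched = false;
	size_t i;

	for (i = 0; i < nfiles; i++)
		if (file_has_match(&files[i], needle))
			matched = true;
	return (matched);
}

/* grep exits 0 when something matched and 1 when nothing did. */
int
exit_status(bool matched)
{
	return (matched ? 0 : 1);
}
```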

r334807:
bsdgrep(1): whoops, garbage collect the now write-only variable

r334808:
bsdgrep(1): Don't initialize fts_flags twice

Admittedly, this is a clang-scan complaint... but it wasn't wrong. fts_flags
is initialized by all cases in the switch(), which should be fairly obvious.
Annotate this anyways.

r334809:
netbsd-tests: bsdgrep(1): Add a test for -m, too

r334821:
bsdgrep(1): Slooowly peel away the chunky onion

(or peel off the band-aid, whatever floats your boat)

This addresses two separate issues:

1.) Nothing within bsdgrep actually knew whether it cared about line numbers
  or not.

2.) The file layer knew nothing about the context in which it was being
  called.

#1 is only important when we're *not* processing line-by-line. #2 is
debatably a good idea; the parsing context is only handy because that's
where we store current offset information and, as of this commit, whether or
not it needs to be line-aware.

r334837:
bsdgrep(1): Evict character sequence that moved in

r334889:
bsdgrep(1): Some more int -> bool conversions and name changes

Again motivated by upcoming work to rewrite a bunch of this- single-letter
variable names and slightly misleading variable names ("lastmatches" to
indicate that the last line matched) are not helpful.

r335188:
bsdgrep(1): Remove redundant initialization; unconditionally assigned later

r351769:
bsdgrep(1): add some basic tests for some GNU Extension support

These will be expanded later as I come up with good test cases; for now,
these seem to be enough to trigger bugs in base gnugrep and expose missing
features in bsdgrep.

r352691:
bsdgrep(1): various fixes of empty pattern/exit code/-c behavior

When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:

- The -v flag semantics were not quite right; lines matched should have been
  counted differently based on whether the -v flag was set or not. procline
  now definitively returns whether it's matched or not, and interpreting
  that result has been kicked up a level.
- Empty patterns with the -x flag were broken similarly to empty patterns
  with the -w flag. The former is a whole-line match and should be more
  strict, only matching blank lines. No -x and no -w will match the
  empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grep will
  exit(0) if any file that didn't match was output, so our interpretation
  was simply backwards. The new interpretation makes sense to me.
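
The three rules above can be summarized in a small sketch. The function names
are hypothetical, lines are assumed to have their trailing newline already
stripped, and -w is deliberately left out because its exact empty-pattern
behavior is not spelled out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Empty pattern: with -x only a blank line matches; otherwise any line does. */
bool
empty_pattern_matches(const char *line, bool xflag)
{
	assert(line != NULL);
	if (xflag)
		return (line[0] == '\0');
	return (true);
}

/* The matcher reports a plain match; the caller applies -v afterwards. */
bool
line_selected(bool matched, bool vflag)
{
	return (matched != vflag);
}

/*
 * Exit status for -L as described above: success if at least one
 * non-matching file was found (and therefore printed).
 */
int
exit_status_L(const bool *file_matched, size_t nfiles)
{
	size_t i;

	for (i = 0; i < nfiles; i++)
		if (!file_matched[i])
			return (0);
	return (1);
}
```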

Tests updated and added to try and catch some of this.

This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.
Delta       File
+188 -139   usr.bin/grep/util.c
+76  -55    usr.bin/grep/file.c
+97  -24    contrib/netbsd-tests/usr.bin/grep/t_grep.sh
+17  -21    usr.bin/grep/grep.c
+26  -0     usr.bin/grep/tests/grep_freebsd_test.sh
+18  -3     usr.bin/grep/grep.h
+8   -1     usr.bin/grep/grep.1
+430 -243   7 files