Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement PPC0027: any() and all() tests #22773

Open
wants to merge 6 commits into
base: blead
Choose a base branch
from
Open

Conversation

leonerd
Copy link
Contributor

@leonerd leonerd commented Nov 22, 2024

As per https://github.com/Perl/PPCs/blob/main/ppcs/ppc0027-any-and-all.md

Some still open questions:

  • Name of feature. I currently called it any_all with approximately 7 seconds thought. Maybe list_utils would be better? It reminds of List::Util without listing the specific functions involved.
  • Should $_ alias or copy in these blocks?

Both questions are asked by Perl/PPCs#60

@scottchiefbaker
Copy link
Contributor

Since List::Utils is already a core module why are any and all getting promoted into core? What makes something core worthy?

@Grinnz
Copy link
Contributor

Grinnz commented Nov 22, 2024

So are all the Scalar::Util functions that have been added to builtin. There are a number of benefits, primarily that the op can be written more efficiently (in this case, the List::Util versions suffer from the performance difference such that grep is usually more efficient even though it doesn't short-circuit); and that when it is deemed acceptable for a feature/builtin bundle, it will be automatically available under "use VERSION" and -E.

@leonerd leonerd added the squash-before-merge Author must squash the commits down before merging to blead label Nov 22, 2024
@leonerd leonerd marked this pull request as ready for review November 22, 2024 23:16
@leonerd
Copy link
Contributor Author

leonerd commented Nov 22, 2024

Undrafted, ready(ish) for review. Still some outstanding questions that could do with concrete answers.

@tonycoz
Copy link
Contributor

tonycoz commented Nov 25, 2024

The PPC doesn't say why any EXPR, LIST is disallowed, but one of the obvious advantages would be not having to disambiguate between a block and a hashref literal:

$ ./perl -Ilib -Mfeature=any_all -e 'print "ok\n" if any { d => 1, /a/ } qw(b c a)'
any is experimental at -e line 1.
any EXPR, LIST is not allowed at -e line 1.

You do allude to this, I'll have a fiddle.

@tonycoz
Copy link
Contributor

tonycoz commented Nov 25, 2024

This prevents the ambiguity for me:

diff --git a/toke.c b/toke.c
index 40ab911e66..6d1931e079 100644
--- a/toke.c
+++ b/toke.c
@@ -8001,7 +8001,7 @@ yyl_word_or_keyword(pTHX_ char *s, STRLEN len, I32 key, I32 orig_keyword, struct
     case KEY_all:
         Perl_ck_warner_d(aTHX_
             packWARN(WARN_EXPERIMENTAL__ANY_ALL), "all is experimental");
-        LOP(OP_ALLSTART, XREF);
+        LOP(OP_ALLSTART, XBLOCK);
 
     case KEY_and:
         if (!PL_lex_allbrackets && PL_lex_fakeeof >= LEX_FAKEEOF_LOWLOGIC)
@@ -8011,7 +8011,7 @@ yyl_word_or_keyword(pTHX_ char *s, STRLEN len, I32 key, I32 orig_keyword, struct
     case KEY_any:
         Perl_ck_warner_d(aTHX_
             packWARN(WARN_EXPERIMENTAL__ANY_ALL), "any is experimental");
-        LOP(OP_ANYSTART, XREF);
+        LOP(OP_ANYSTART, XBLOCK);
 
     case KEY_atan2:
         LOP(OP_ATAN2,XTERM);

It even worked for the any({block} LIST) syntax, which you don't test.

@tonycoz
Copy link
Contributor

tonycoz commented Nov 25, 2024

I do think it could use some minimal tests for any(BLOCK, LIST) form, especially since Deparse can generate that:

$ ./perl -Ilib -MO=Deparse,-p -e 'use feature "any_all"; if (any { /a/ } some_list() ) { print "ok\n" }'
any is experimental at -e line 1.
use feature 'any_all';
if (any({/a/;} some_list())) {
    print("ok\n");
}
-e syntax OK

This experiment needs an entry in pod/perlexperiment.pod.

op.c Outdated
/* any { BLOCK } would create an OP_NULL[OP_SCOPE[...]] or
* OP_NULL[OP_LEAVE[...]] here. If we don't see this structure
* then it must have been any EXPR, ... which we forbid
* TODO: See if we can forbid this somehow in perly.y itself
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bison supports conditions on the grammar rules, but you have to opt into them with %glr-parser. They're more intended for disambiguation than controlling rules in this way.

The most obvious way I could see was to make any/all return a new token instead of LSTOP and ended up with a proof of concept:

diff --git a/perly.y b/perly.y
index 6a6510823f..2df4a28557 100644
--- a/perly.y
+++ b/perly.y
@@ -85,7 +85,7 @@
 %token <opval> PLUGEXPR PLUGSTMT
 %token <opval> LABEL
 %token <ival> LOOPEX DOTDOT YADAYADA
-%token <ival> FUNC0 FUNC1 FUNC UNIOP LSTOP
+%token <ival> FUNC0 FUNC1 FUNC UNIOP BLKLSTOP LSTOP
 %token <ival> POWOP MULOP ADDOP
 %token <ival> DOLSHARP HASHBRACK NOAMP
 %token <ival> COLONATTR FORMLBRACK FORMRBRACK
@@ -126,7 +126,7 @@
 %left <ival> OROP <pval> PLUGIN_LOGICAL_OR_LOW_OP
 %left <ival> ANDOP <pval> PLUGIN_LOGICAL_AND_LOW_OP
 %right <ival> NOTOP
-%nonassoc LSTOP LSTOPSUB
+%nonassoc BLKLSTOP LSTOP LSTOPSUB
 %left PERLY_COMMA
 %right <ival> ASSIGNOP <pval> PLUGIN_ASSIGN_OP
 %right <ival> PERLY_QUESTION_MARK PERLY_COLON
@@ -1084,6 +1084,10 @@ listop	:	LSTOP indirob listexpr /* map {...} @args or print $fh @args */
 			{ $$ = op_convert_list($LSTOP, OPf_STACKED,
 				op_prepend_elem(OP_LIST, newGVREF($LSTOP,$indirob), $listexpr) );
 			}
+        |	BLKLSTOP block listexpr /* all/any { ... } @args */
+                        { $$ = op_convert_list($BLKLSTOP, OPf_STACKED,
+                                               op_prepend_elem(OP_LIST, newUNOP(OP_NULL, 0, op_scope($block)), $listexpr) );
+                        }
 	|	FUNC PERLY_PAREN_OPEN indirob expr PERLY_PAREN_CLOSE      /* print ($fh @args */
 			{ $$ = op_convert_list($FUNC, OPf_STACKED,
 				op_prepend_elem(OP_LIST, newGVREF($FUNC,$indirob), $expr) );
diff --git a/toke.c b/toke.c
index dc2239e76a..87ddc05b08 100644
--- a/toke.c
+++ b/toke.c
@@ -2183,6 +2183,34 @@ S_lop(pTHX_ I32 f, U8 x, char *s)
     }
 }
 
+#define BLKLOP(f, x) return S_blklop(aTHX_ f, x, s)
+
+STATIC I32
+S_blklop(pTHX_ I32 f, U8 x, char *s)
+{
+    PERL_ARGS_ASSERT_LOP;
+
+    pl_yylval.ival = f;
+    CLINE;
+    PL_bufptr = s;
+    PL_last_lop = PL_oldbufptr;
+    PL_last_lop_op = (OPCODE)f;
+    if (PL_nexttoke)
+        goto lstop;
+    PL_expect = x;
+    if (*s == '(')
+        return REPORT(FUNC);
+    s = skipspace(s);
+    if (*s == '(')
+        return REPORT(FUNC);
+    else {
+        lstop:
+        if (!PL_lex_allbrackets && PL_lex_fakeeof > LEX_FAKEEOF_LOWLOGIC)
+            PL_lex_fakeeof = LEX_FAKEEOF_LOWLOGIC;
+        return REPORT(BLKLSTOP);
+    }
+}
+
 /*
  * S_force_next
  * When the lexer realizes it knows the next token (for instance,
@@ -8001,7 +8029,7 @@ yyl_word_or_keyword(pTHX_ char *s, STRLEN len, I32 key, I32 orig_keyword, struct
     case KEY_all:
         Perl_ck_warner_d(aTHX_
             packWARN(WARN_EXPERIMENTAL__ANY_ALL), "all is experimental");
-        LOP(OP_ALLSTART, XBLOCK);
+        BLKLOP(OP_ALLSTART, XBLOCK);
 
     case KEY_and:
         if (!PL_lex_allbrackets && PL_lex_fakeeof >= LEX_FAKEEOF_LOWLOGIC)
@@ -8011,7 +8039,7 @@ yyl_word_or_keyword(pTHX_ char *s, STRLEN len, I32 key, I32 orig_keyword, struct
     case KEY_any:
         Perl_ck_warner_d(aTHX_
             packWARN(WARN_EXPERIMENTAL__ANY_ALL), "any is experimental");
-        LOP(OP_ANYSTART, XBLOCK);
+        BLKLOP(OP_ANYSTART, XBLOCK);
 
     case KEY_atan2:
         LOP(OP_ATAN2,XTERM);

This needs some clean up* but it does prevent parsing of all EXPR, LIST, and the tests for the message fail because we don't manage to actually parse it.

* combining S_lop and S_blklop mostly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I did think of adding a new grammar token type. It seems a bit heavy for just this use-case but perhaps if we add more, like first() or reduce() or whatever, it won't seem so bad.

@leonerd leonerd force-pushed the any_all branch 2 times, most recently from 55c4f67 to e57e680 Compare November 26, 2024 14:48
toke.c Show resolved Hide resolved
@tonycoz
Copy link
Contributor

tonycoz commented Nov 27, 2024

Is the any(BLOCK LIST) form meant to be allowed? There's no tests either way.

As to copy or alias the topic, I'm inclined towards a read-only copy, a copy to prevent modification of the original list/container, read-only to make it obvious that modification isn't allowed.

Making it a copy doesn't match the behaviour of grep/map, nor ListUtil::any/all from a quick look, which may be a negative.

@leonerd
Copy link
Contributor Author

leonerd commented Nov 27, 2024

Is the any(BLOCK LIST) form meant to be allowed? There's no tests either way.

Yes it is. Three times now I've opened the test file to go to write a test for it, then got distracted into fixing something else and forgetting. I'll do that now before I even finish writing this comment.

t/op/any_all.t lines 19+20.

As to copy or alias the topic, I'm inclined towards a read-only copy, a copy to prevent modification of the original list/container, read-only to make it obvious that modification isn't allowed.

Making it a copy doesn't match the behaviour of grep/map, nor ListUtil::any/all from a quick look, which may be a negative.

Yes; it's a deliberate deviation from the previous things, but I think it's justifable. Already we don't support deferred expressions, only blocks, so these new any/all are different from grep. Having a further difference in the behaviour of $_ seems OK as it's intending to trend towards a better design.

Comment on lines +12069 to +12070
case OP_ANYSTART:
case OP_ALLSTART:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we still get here? The new BLKLSTOP grammar rule doesn't call newGVREF()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems so. I put an assert(0) in there and the t/op/any_all.t file immediately provokes it at compiletime.

pp_ctl.c Outdated
SV *src = PL_stack_base[TOPMARK];
if (SvPADTMP(src)) {
SV *newsrc = sv_mortalcopy(src);
PL_stack_base[TOPMARK] = newsrc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation nit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@tonycoz
Copy link
Contributor

tonycoz commented Nov 28, 2024

I assume the copy/alias thing is waiting on the PPC discussion in Perl/PPCs#60.

@leonerd
Copy link
Contributor Author

leonerd commented Nov 28, 2024

I assume the copy/alias thing is waiting on the PPC discussion in Perl/PPCs#60.

Yeah; but it seems the decision there is "not currently". I've moved it to a "Future Scope" section to maybe think about for another time. For now then I suppose I ought to add a unit test of the current behaviour, alias-modifications and all.

As these only permit `any BLOCK LIST` and not the deferred-expression
form like `grep EXPR, LIST`, we create a whole new token type, BLKLSTOP,
to represent these. The parser then only accepts the block form and not
the expression form when parsing these.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
squash-before-merge Author must squash the commits down before merging to blead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants