<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<font face="Helvetica, Arial, sans-serif">Hi Elias,<br>
<br>
      I believe it is better to keep things together, i.e. in a single ⎕
      function rather than in several.<br>
<br>
      It may be intuitive to use the character ⊂ instead of B in
      the axis argument to indicate<br>
      that the result is meant for dyadic ⊂.<br>
      <br>
    </font><font
face="Helvetica, Arial, sans-serif">/// Jürgen<br>
<br>
    </font>
<div class="moz-cite-prefix">On 10/02/2017 10:47 AM, Elias Mårtenson
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CADtN0W+pMrpBXuDhimYoJc6VfPr9-***@mail.gmail.com">
<div dir="ltr">In playing around with this, I realise that the "B"
mode is quite useful. So much so, in fact, that I'm wondering if
it's warranted to have a dedicated quad-function for this
specific behaviour.
<div><br>
</div>
<div>Here's an example of extracting sequences of 4 characters:</div>
<div><br>
</div>
<div>
<div><font face="monospace, monospace"><b> {⍵ ⊂⍨
"[a-z]{4}" ⎕RE['B'] ⍵} 'abcdef45abchello9'</b></font></div>
<div><font face="monospace, monospace">┏→━━━━━━━━━━━━━━━━━━━┓</font></div>
<div><font face="monospace, monospace">┃"abcd" "abch" "ello"┃</font></div>
<div><font face="monospace, monospace">┗∊━━━━━━━━━━━━━━━━━━━┛</font></div>
</div>
<div><br>
</div>
<div>Regards,</div>
<div>Elias</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On 2 October 2017 at 16:27, Elias
Mårtenson <span dir="ltr"><<a
href="mailto:***@gmail.com" target="_blank"
moz-do-not-send="true">***@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Some progress:
<div><br>
</div>
          <div>The behaviour I described earlier still works, but it
            can now also work on N-dimensional arrays of strings,
            compiling the regex only once and then applying it to all
            the cells.</div>
<div><br>
</div>
<div>In addition to this, I have now also added a flag "B"
(meaning "bitmap") that creates a bitmap of all matches
and can be used in conjunction with ⊂ to split strings
by regex.</div>
<div><br>
</div>
<div>Here's an example:</div>
<div>
<div><font face="monospace, monospace"><b><br>
</b></font></div>
<div>
              <div><font face="monospace, monospace"><b>      " +" ⎕RE["B"] "this is   a     test"</b></font></div>
<div><font face="monospace, monospace">┏→━━━━━━━━━━━━━━━━━━━━━━━━━━━━<wbr>━━━━━━━━━━┓</font></div>
<div><font face="monospace, monospace">┃0 0 0 0 1 0 0
2 2 2 0 3 3 3 3 3 0 0 0 0┃</font></div>
<div><font face="monospace, monospace">┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━<wbr>━━━━━━━━━━┛</font></div>
</div>
</div>
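          <div><br>
          </div>
          <div>To make the numbering concrete, here is a rough C++
            sketch of how such a vector could be produced. It is not the
            ⎕RE implementation: the name match_bitmap is made up for
            this example, and std::regex is used only as a stand-in
            regex engine. Each character covered by the k-th
            non-overlapping match gets the value k, everything else
            stays 0.</div>
<pre>
```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of GNU APL: returns one integer per
// character of `text`. 0 means "not part of any match", k > 0 means
// "this character belongs to the k-th match".
std::vector<int> match_bitmap(const std::string& pattern,
                              const std::string& text) {
    const std::regex re(pattern);            // compiled once per call
    std::vector<int> bitmap(text.size(), 0);
    int match_no = 0;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        ++match_no;
        const std::ssub_match& m = (*it)[0]; // the whole match
        for (auto p = m.first; p != m.second; ++p)
            bitmap[static_cast<std::size_t>(p - text.begin())] = match_no;
    }
    return bitmap;
}
```
</pre>
          <div>Under these assumptions, matching a single space against
            a string of four spaces gives 1 2 3 4, while matching one or
            more spaces against the same string gives 1 1 1 1.</div>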
<div><br>
</div>
<div>This matches any sequence of spaces, and we can
easily use ⊂ to split the string:</div>
<div><br>
</div>
<div>
<div><font face="monospace, monospace"><b> {⍵ ⊂⍨
0=" +" ⎕RE["B"] ⍵} "this is a test"</b></font></div>
<div><font face="monospace, monospace">┏→━━━━━━━━━━━━━━━━━━━━━┓</font></div>
<div><font face="monospace, monospace">┃"this" "is" "a"
"test"┃</font></div>
<div><font face="monospace, monospace">┗∊━━━━━━━━━━━━━━━━━━━━━┛</font></div>
</div>
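          <div><br>
          </div>
          <div>The splitting step can be sketched in C++ as well. The
            helper below imitates the way ⊂ is used in these examples:
            positions marked 0 are dropped, and a new piece starts
            whenever the mark is greater than the one before it. The
            name partition_by is an assumption made for this example,
            and the code is an illustration rather than GNU APL's
            actual partition code.</div>
<pre>
```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `text` according to `marks`, which holds
// one integer per character of `text`.
//  - a character whose mark is 0 (or negative) is dropped,
//  - a new piece starts whenever the mark is greater than the previous one.
std::vector<std::string> partition_by(const std::vector<int>& marks,
                                      const std::string& text) {
    std::vector<std::string> pieces;
    int previous = 0;
    for (std::size_t i = 0; i < text.size() && i < marks.size(); ++i) {
        const int mark = marks[i];
        if (mark <= 0) {                 // dropped position
            previous = 0;
            continue;
        }
        if (mark > previous)
            pieces.emplace_back();       // mark rose: open a new piece
        pieces.back().push_back(text[i]);
        previous = mark;
    }
    return pieces;
}
```
</pre>
          <div>Because a rise in the mark opens a new piece, two
            adjacent matches numbered 2 and 3 end up as separate pieces
            even though no 0 lies between them.</div>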
<div><br>
</div>
          <div>However, I'm not sure if the values returned from the
            function are ideal. The idea of the increasing numbers
            is to be able to differentiate between the result of:</div>
<div><br>
</div>
<div>
            <div><font face="monospace, monospace"><b>      " " ⎕RE["B"] "    "</b></font></div>
<div><font face="monospace, monospace">┏→━━━━━━┓</font></div>
<div><font face="monospace, monospace">┃1 2 3 4┃</font></div>
<div><font face="monospace, monospace">┗━━━━━━━┛</font></div>
<div><br>
</div>
<div>vs:</div>
<div><br>
</div>
            <div><font face="monospace, monospace"><b>      " +" ⎕RE["B"] "    "</b></font></div>
<div><font face="monospace, monospace">┏→━━━━━━┓</font></div>
<div><font face="monospace, monospace">┃1 1 1 1┃</font></div>
<div><font face="monospace, monospace">┗━━━━━━━┛</font></div>
</div>
<div><br>
</div>
<div>Should it be left like this, or should it be done in
some other way?</div>
<div><br>
</div>
<div>Regards,</div>
<div>Elias</div>
</div>
<div class="HOEnZb">
<div class="h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On 25 September 2017 at
20:10, Juergen Sauermann <span dir="ltr"><<a
href="mailto:***@t-online.de"
target="_blank" moz-do-not-send="true">***@t-online.de</a><wbr>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Hi
Elias,<br>
<br>
making a quad function an operator is simple if
the function argument(s) is/are primitive
functions<br>
and a little more complicated if not.<br>
<br>
                      First of all you have to implement (read:
                      overload) some of the eval_XXX() functions that
                      have function<br>
                      arguments. For monadic operators these eval_XXX()
                      functions are:<br>
<br>
virtual Token eval_ALB(Value_P A, Token &
LO, Value_P B)<br>
virtual Token eval_ALXB(Value_P A, Token &
LO, Value_P X, Value_P B)<br>
virtual Token eval_LB(Token & LO, Value_P
B)<br>
virtual Token eval_LXB(Token & LO, Value_P
X, Value_P B)<br>
<br>
                      where L resp. LO stands for the left function
                      argument. For dyadic operators they are:<br>
<br>
virtual Token eval_ALRB(Value_P A, Token &
LO, Token & RO, Value_P B)<br>
virtual Token eval_ALRXB(Value_P A, Token &
LO, Token & RO, Value_P X, Value_P B)<br>
virtual Token eval_LRB(Token & LO, Token
& RO, Value_P B)<br>
virtual Token eval_LRXB(Token & LO, Token
& RO, Value_P X, Value_P B)<br>
<br>
where L resp. LO and R resp. RO stand for the left
and right function argument(s), A and B<br>
are the value arguments, and X the axis.<br>
<br>
                      Not all of them need to be implemented, only those
                      whose function signatures<br>
                      are supported by the operator (mainly in terms of
                      allowing an axis argument X or a<br>
                      left value argument A).<br>
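                      <br>
                      As a rough illustration of "implement only what the
                      operator supports", here is a self-contained C++
                      sketch. Token and Value_P are reduced to plain
                      strings, and OperatorSketch and DemoOperator are
                      made-up names, so none of this is the actual GNU APL
                      class hierarchy. How the real base class reacts to a
                      signature that was not implemented is not stated
                      above; the sketch simply throws.<br>
<pre>
```cpp
#include <stdexcept>
#include <string>

// Stand-ins for illustration only. The real Token and Value_P types in
// GNU APL are much richer; here they are reduced to plain strings.
struct Token { std::string text; };
using Value_P = std::string;

// Hypothetical base class: each signature reports an error unless a
// derived operator overrides it.
struct OperatorSketch {
    virtual ~OperatorSketch() = default;
    virtual Token eval_ALB(Value_P, Token&, Value_P)           { return unsupported(); }
    virtual Token eval_ALXB(Value_P, Token&, Value_P, Value_P) { return unsupported(); }
    virtual Token eval_LB(Token&, Value_P)                     { return unsupported(); }
    virtual Token eval_LXB(Token&, Value_P, Value_P)           { return unsupported(); }

protected:
    [[noreturn]] static Token unsupported() {
        throw std::runtime_error("signature not supported");
    }
};

// Hypothetical monadic operator that allows neither an axis X nor a
// left value A, so it overrides eval_LB() only.
struct DemoOperator : OperatorSketch {
    Token eval_LB(Token& LO, Value_P B) override {
        return Token{LO.text + " applied to " + B};
    }
};
```
</pre>
                      In this sketch a call through eval_LB() succeeds,
                      while a call that supplies an axis (eval_LXB()) or a
                      left value (eval_ALB()) ends in the error path.<br>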
<br>
If an operator supports defined functions (as
opposed to primitive functions) then it will
typically<br>
implement the operator itself as a macro, which
means that the implementation is written in APL<br>
rather than in C++ (similar to "magic functions"
in NARS). This is needed because primitive
functions<br>
are atomic (they either succeed or fail, but
cannot be continued after a failure) while defined
functions<br>
                      (and operators) can continue at the point of
                      interruption after having fixed the values that
                      have caused<br>
                      the fault.<br>
<br>
                      Some of the built-in operators in GNU APL have
                      both a primitive implementation (which is used
                      when<br>
                      the function arguments are primitive) and a
                      macro-based implementation (which is used if not).
                      This is for performance<br>
                      reasons, so that the ability to take defined
                      functions as arguments does not slow down the<br>
                      cases where the function arguments are primitive.<br>
<br>
                      The macro definitions are contained in Macro.def.<br>
<br>
Please note that in GNU APL functions cannot
return functions, which may or may not be a
problem<br>
in your case, depending on whether the function
argument(s) of the ⎕-operator is/are primitive or
not.<br>
                      In standard APL you cannot assign a function to a
                      name. The usual work-around is to return a string
                      and ⍎ it.<br>
<br>
                      My gut feeling is that if you need function
                      arguments for implementing regular expressions
                      then<br>
                      something has gone in the wrong direction
                      somewhere else.<br>
<br>
Best Regards,<br>
/// Jürgen<span><br>
<br>
<br>
<br>
On 09/25/2017 05:18 AM, Elias Mårtenson wrote:<br>
</span>
<blockquote class="gmail_quote" style="margin:0 0
0 .8ex;border-left:1px #ccc
solid;padding-left:1ex"><span>
Dyalog's implementation is much more
expressive than what I had proposed.<br>
<br>
There are technical reasons why we have no
hope of replicating their functionality (in
particular, GNU APL does not have support for
namespaces).<br>
<br>
Their function takes arguments and returns a
function, which is a matcher function that can
be reused, which is useful since you'd only
compile the regexp once. Jürgen, how can I
make a quad-function behave like below? It
seems to be similar in behaviour to ⍤ and ⍣.<br>
<br>
</span>
* ('.at' ⎕R '\u0') 'The cat sat on the mat'
*<span><br>
The CAT SAT on the MAT<br>
<br>
It can also accept a function, in which case
the function is called for each match, to
return a replacement string. Can you explain
how to make a quad-function an operator?<br>
</span>
*<br>
*<br>
* ('\w+' ⎕R {⌽⍵.Match}) 'The cat sat on the
mat'*<span><br>
ehT tac tas no eht tam<br>
<br>
As you can see, they leverage namespaces in
order to pass a lot of different fields to the
replace-function. If we want to do something
similar, ⍵ would probably have to be the match
string, and we'll have to live without the
remaining fields.<br>
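                        <br>
                        A minimal C++ sketch of that reduced interface,
                        where the callback receives nothing but the match
                        string. The name replace_each is hypothetical and
                        std::regex is only a stand-in engine; this is not
                        how Dyalog or GNU APL implement it.<br>
<pre>
```cpp
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper: replace every match of `pattern` in `text` by
// whatever `fn` returns for the matched string. `fn` sees the match
// text only, none of the additional fields Dyalog provides.
std::string replace_each(
        const std::string& pattern, const std::string& text,
        const std::function<std::string(const std::string&)>& fn) {
    const std::regex re(pattern);            // compiled once per call
    std::string out;
    std::string::const_iterator last = text.begin();
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        const std::ssub_match& m = (*it)[0]; // the whole match
        out.append(last, m.first);           // unmatched text before it
        out += fn(m.str());                  // replacement from the callback
        last = m.second;
    }
    out.append(last, text.end());            // unmatched tail
    return out;
}
```
</pre>
                        With a callback that reverses its argument and the
                        pattern \w+, this sketch turns 'The cat sat on the
                        mat' into 'ehT tac tas no eht tam', so the match
                        string alone is enough for that example.<br>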
<br>
Regards,<br>
Elias<br>
<br>
<br>
</span><span>
On 23 September 2017 at 00:08, Juergen
Sauermann <<a
href="mailto:***@t-online.de"
target="_blank" moz-do-not-send="true">***@t-online.de</a>
<mailto:<a
href="mailto:***@t-online.de"
target="_blank" moz-do-not-send="true">***@t-on<wbr>line.de</a>>>
wrote:<br>
<br>
Hi,<br>
<br>
                            I have not looked into Dyalog's
                        implementation myself, but if they<br>
                            have it then we should aim at being as
                        compatible as makes sense.<br>
                            No problem if some of their capabilities
                        are not supported (please avoid<br>
                            going over the top in the GNU APL
                        implementation).<br>
<br>
Unfortunately ⎕R is already occupied in
GNU APL (inherited from<br>
IBM APL2),<br>
so some other name(s) are needed.<br>
<br>
Before implementing too much in advance,
it would be good to<br>
present the<br>
intended syntax and semantics on bug-apl
and solicit opinions.<br>
<br>
/// Jürgen<br>
<br>
<br>
On 09/22/2017 04:59 PM, Elias Mårtenson
wrote:<br>
</span>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex"><span>
I did not know this. I took a look at
Dyalog's API and it's not<br>
possible to implement it fully, as it
relies on their object<br>
oriented features. However, the basic
functionality wouldn't be<br>
hard to replicate, if that is something
that is desired.<br>
<br>
Jürgen, what is your opinion on this?<br>
<br>
On 22 September 2017 at 20:21, Jay Foad
<<a href="mailto:***@gmail.com"
target="_blank" moz-do-not-send="true">***@gmail.com</a><br>
</span><span>
<mailto:<a
href="mailto:***@gmail.com"
target="_blank" moz-do-not-send="true">***@gmail.com</a>>>
wrote:<br>
<br>
FYI Dyalog has operators ⎕S (search)
and ⎕R (replace) which<br>
are implemented with PCRE:<br>
<br>
('[Aa]..'⎕S'&')'Dyalog APL'<br>
┌───┬───┐<br>
│alo│APL│<br>
└───┴───┘<br>
('red' 'green'⎕R'green' 'blue')'red
orange yellow green blue'<br>
green orange yellow blue blue<br>
<br>
<a
href="http://help.dyalog.com/16.0/Content/Language/System%20Functions/r.htm"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://help.dyalog.com/16.0/Co<wbr>ntent/Language/System%20Functi<wbr>ons/r.htm</a><br>
</span>
<br>
Jay.<br>
<br>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>