Discussion:
[Bug-apl] startup time, and, is there a way to run under FastCGI?
Patrick Giagnocavo
2018-10-15 14:40:59 UTC
First, thanks for GNU APL!

I have written a simple script that compares one list of approximately 5000 10-digit phone numbers with another list of 1700 10-digit phone numbers and tells me which numbers (items) in the second list are not in the larger list.

So I call ⎕FIO[49] once for each of the 2 files, assigning each result to a variable, then

result ← large5klist ~ smallerlist
⍴result
47 1 ⍴ result   ⍝ to print it out; I know in this case there are 47 results and I want them printed in just 1 column, i.e. one phone number per line
)OFF

When I run this under Linux (a recent SVN trunk), without the banner etc., it completes in approximately 0.520 seconds; actually, the banner doesn't seem to make much difference.

When I run "comm -23 list1.txt list2.txt" it takes 0.016 seconds on the same hardware.

Now, I don't expect such performance, but, is there a way to reduce the time? Is there a way to start APL such that it can "fork" a task to handle this, so that startup time is almost zero?

And (I think about doing this via a web interface) is there a way to run APL under FastCGI? My guess is that the interpreter startup time is the issue, rather than the actual execution of the commands. I will try to test just an "empty" APL startup e.g. a script which contains only )OFF , and see what amount of time that takes.
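
One way to separate interpreter startup from the actual work is to time an "empty" run and a full run with the same harness. The Python below is a minimal sketch of that measurement, not part of GNU APL; the helper name `best_wall_time` is made up for this example, and the command shown is a stand-in that should be replaced by whatever command line starts apl on the script in question.

```python
import subprocess
import sys
import time


def best_wall_time(argv, runs=5):
    """Run argv several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in command so the sketch runs anywhere: an "empty" Python start.
# For the real comparison, substitute the command line used to start apl,
# once with a script containing only )OFF and once with the full script.
empty_cmd = [sys.executable, "-c", "pass"]
print(f"empty start: {best_wall_time(empty_cmd) * 1000:.1f} ms")
```

Taking the fastest of several runs reduces noise from disk caching; the difference between the two measurements approximates the time spent in the script itself.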

Cheers

Patrick Giagnocavo
***@zill.net
Juergen Sauermann
2018-10-16 18:00:54 UTC
Juergen Sauermann
2018-10-16 21:02:59 UTC
Patrick Giagnocavo
2018-10-18 04:07:45 UTC
Hi Juergen,

That is amazing! I was able to download and recompile, and it is indeed much faster!

On my virtualized machine the time went from 540ms down to about 138ms. I think I ran some other commands in it, will look over the script again tomorrow and see if I can make it even faster. It uses FIO [49] I think, to load the 2 data files.

Do you think I would get faster results by using FIO [3] to get a file handle and then using the fscanf available via FIO?

Cheers,

Patrick

----- Original Message -----
From: "Juergen Sauermann" <***@t-online.de>
To: "Patrick Giagnocavo" <***@zill.net>, bug-***@gnu.org
Sent: Tuesday, October 16, 2018 3:02:59 PM GMT -07:00 US/Canada Mountain
Subject: Re: [Bug-apl] startup time, and, is there a way to run under FastCGI?

Hi,

Fixed in SVN 1083. Time is down to 11 ms:

F5000←⊂[2]'0123456789'[?5000 10⍴10]
F1750←⊂[2]'0123456789'[?1750 10⍴10]

WITHOUT:
T←⎕TS
D←F5000 ∼ F1750
(365 12 30 24 60 60 1000⊥⎕TS-T) 'ms'
11 ms
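
For readers unfamiliar with the timing idiom, the expression decodes the element-wise difference of two ⎕TS timestamps (year, month, day, hour, minute, second, millisecond) as a mixed-radix number, which gives milliseconds. The Python below is only an illustration of that arithmetic; the names `decode` and `elapsed_ms` and the sample timestamps are made up for this example.

```python
# Radix vector from the APL expression; the leading 365 never contributes,
# because decode only uses the radices to the right of each position.
RADIX = [365, 12, 30, 24, 60, 60, 1000]


def decode(radix, digits):
    """Mixed-radix decode (Horner scheme), analogous to APL's radix ⊥ digits."""
    total = 0
    for r, d in zip(radix, digits):
        total = total * r + d
    return total


def elapsed_ms(start, end):
    """Milliseconds between two 7-element timestamps shaped like ⎕TS."""
    diff = [e - s for s, e in zip(start, end)]
    return decode(RADIX, diff)


# The difference is 0 0 0 0 1 -59 -800; negative "digits" are fine here.
start = [2018, 10, 16, 12, 0, 59, 900]
end = [2018, 10, 16, 12, 1, 0, 100]
print(elapsed_ms(start, end), "ms")  # prints: 200 ms
```

Because every month is counted as 30 days, the result is exact only when both timestamps fall in the same month, which is fine for short benchmarks like this one.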

/// Jürgen



On 10/16/2018 08:00 PM, Juergen Sauermann wrote:


Hi Patrick,

as far as I can see most of the time is spent in the WITHOUT function (A∼B):

F5000←⊂[2]'0123456789'[?5000 10⍴10]
F1750←⊂[2]'0123456789'[?1750 10⍴10]

WITHOUT:
T←⎕TS
D←F5000∼F1750
(365 12 30 24 60 60 1000⊥⎕TS-T) 'ms'
512 ms

Please note that the comm command works on sorted lists, so that
comparing them can be done in linear time. I could do the same
in GNU APL:

T←⎕TS
D←⍋F5000
(365 12 30 24 60 60 1000⊥⎕TS-T) 'ms'
20 ms

which should reduce the execution time from currently O(m×n)
down to O(m log m + n log n). I will look into this.
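
The sort-first idea can be sketched outside APL. The Python below is only an illustration of the sort-then-merge approach that comm relies on, not the code that went into GNU APL; the function name `without_sorted` and the sample numbers are made up for this example, and unlike A∼B it returns its result in sorted order rather than in the original order of A.

```python
def without_sorted(a, b):
    """Return the items of a that do not occur in b, in sorted order.

    Sorting costs O(m log m + n log n); the merge walk afterwards is
    linear in m + n, like comm -23 on two sorted files.
    """
    a_sorted = sorted(a)
    b_sorted = sorted(b)
    result = []
    j = 0
    for item in a_sorted:
        # Advance in b past everything smaller than the current item.
        while j < len(b_sorted) and b_sorted[j] < item:
            j += 1
        # Keep the item only if b has no equal element at this position.
        if j == len(b_sorted) or b_sorted[j] != item:
            result.append(item)
    return result


large = ["5550000003", "5550000001", "5550000002", "5550000004"]
small = ["5550000002", "5550000004", "5550000009"]
print(without_sorted(large, small))  # ['5550000001', '5550000003']
```

The naive alternative compares every item of one list with every item of the other, which is where the O(m×n) cost comes from.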

Regarding FastCGI, I am not familiar with its details, but looking
at the Wikipedia description of it, calling apl from it should be rather
easy (⎕FIO[34] to listen() on TCP ports and ⎕FIO[35] to accept()
TCP connections for apl as a server, or ⎕FIO[36] for apl as
a client).

Alternatively, if apl is supposed to do something else in parallel
you can connect apl with some other process via ⎕FIO[57] (which
is a socket pair and probably the fastest method) and either use
raw bytes, or TLVs encoded with 33/34 ⎕CR . See

http://svn.savannah.gnu.org/viewvc/apl/trunk/HOWTOs/APL-Communication-Cookbook.html?revision=1077

for details.
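
To make the socket-pair idea concrete, here is a minimal Python sketch of two endpoints exchanging raw bytes over a socket pair. It is an assumption-laden illustration only: a thread stands in for the second process, the names `worker` and `round_trip` are made up, and it does not show the APL side or how ⎕FIO[57] is implemented.

```python
import socket
import threading


def worker(sock):
    """Stand-in for the other process: read one request, reply in upper case."""
    with sock:
        request = sock.recv(1024)
        sock.sendall(request.upper())


def round_trip(payload):
    """Send raw bytes over a socket pair and return the reply."""
    ours, theirs = socket.socketpair()
    thread = threading.Thread(target=worker, args=(theirs,))
    thread.start()
    with ours:
        ours.sendall(payload)
        reply = ours.recv(1024)
    thread.join()
    return reply


print(round_trip(b"5551234567 ok"))  # b'5551234567 OK'
```

A socket pair needs no port and no connection setup, which is why it suits a long-running helper that is started once and then answers many requests.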

Best Regards,
/// Jürgen Sauermann



Juergen Sauermann
2018-10-19 11:37:35 UTC