Compiling APL Home About us Contact info Vector graphics SVG Fills & Filters Products Price list CSharp translator Dyalog APL2000 Support CUSP CausewayPro RainPro Newleaf Helpstuf Leafhtm Tutorials Demos Climate Charts VML graphics SVG examples Free Stuff CSS Editor Publications Seminars Articles Compiling APL Stonewalling in APL HTML Basics HTML Tables |
Causeway Graphical SystemsA Strategy for Compiling APL for the .Net Framework
|
rifreturn x
[1] © Simple Return example
[2] ©:Public
[3] ©: int x
[4] ©: string r
[5]
[6] :If x>0
[7] r'Out of here'
[8] :Return
[9] :Else
[10] r'Done'
[11] :End
|
// ============ ifreturn ============
// Simple Return example
public string ifreturn(int x)
{
string r;
if (x > 0) {
r = "Out of here";
return r;
} else {
r = "Done";
}
return r;
}
|
At this point, I will have upset about 50% of the C# programmers in the world. Just as we APLers have religious wars about IO, conventional programmers have wars about code layout conventions. I went for the old K&R style, mainly because it generates a lot less white space, and the lines match quite closely with the lines of APL code they came from. Here is a slightly more complex case:
rifandifandifelse x
[1] © Basic IF block example
[2] ©:Public
[3] ©: int r,x
[4]
[5] :If x>0
[6] :AndIf x>1
[7] :AndIf x>2
[8] r0
[9] :Else
[10] r12
[11] :End
|
public int ifandifandifelse(int x)
{
int r;
if (x > 0
&& (x > 1)
&& (x > 2)) {
r = 0;
} else {
r = 12;
}
return r;
}
|
Again, the match is very accurate, and we get the same behaviour that the ‘and’ clauses are only evaluated if all the prior expressions succeed. Note in passing that the conversion process recognises where a C# native function (like greaterthan on two scalars) is adequate to do the job, and doesn’t bother to call our APL library for these. It is slightly depressing just how often this seems to happen in ‘realistic’ application code. Anyway, here’s a For loop to see how that goes:
rforloop n;ct;stuff;nv
© Basic looper
©:Public
©: int n,stuff
©: string r
nv¼32 ª stuff0
:For ct :In ¼½nv
stuff+nv[ct]+1000
:End
r'Done ',stuff
|
public string forloop(int n)
{
string r; // Result
int stuff;
int[] nv;
nv = AE.Range(32);
stuff = 0;
for (int ct = 1; ct <= nv.Length; ct++) {
stuff = stuff + (nv[ct-1] + 1000);
}
r = "Done " + AE.Format(stuff);
return r;
}
|
The conversion special-cases loops over iota as true for(;;) structures. Anything else turns into a C# foreach() block which is much more general, but less efficient.
A couple of other things to note here:
So far, so good. While loops fall out easily enough:
whileleave arg
© Check out conversion
:While true
arg
:If arg<3
:Leave
:End
argarg-1
:EndWhile
|
public void whileleave(int arg)
{
while (true) {
AE.Display(arg);
if (arg < 3) {
break;
}
arg = arg - 1;
}
}
|
Note in passing that we chose to treat assignment to quad as a function (which echos its argument to the console, then returns it untouched).
That leaves Select/Case as the only slight oddball.
rcaselist item;opt
© Test caselist block
©:Public
©: string r,item,opt
r'(not set)'
opt'One'
:Select item
:Case 'fat'
r'fatter'
:CaseList 'cat' 'moggy'
:Select opt
:Case 'One'
r'Hello cat'
:Case 'Two'
r'Hello moggy'
:Else
r'not my cat'
:EndSelect
:CaseList 'bonzo' 'snapper'
r'not my dog'
:Case 'sat'
r'Sitting'
:End
|
public string caselist(string item)
{
string r;
string opt;
r = "(not set)";
opt = "One";
switch (item) {
case "fat":
r = "fatter";
break;
case "cat": case "moggy":
switch (opt) {
case "One":
r = "Hello cat";
break;
case "Two":
r = "Hello moggy";
break;
default:
r = "not my cat";
break;
}
break;
case "bonzo": case "snapper":
r = "not my dog";
break;
case "sat":
r = "Sitting";
break;
}
return r;
}
|
This one (as you can see) was a touch tricky to get right. You have to add in all those break; statements, as well as chopping up the CaseList clause into a list of cases. However (major bonus) in C# we can use strings in a case block which is a huge step forward from Java where you can only switch/case on integers.
Of course, control structures are not the only block element supported by APL. There is each to cope with, and in Dyalog APL we might encounter composition and dynamic functions, probably used in combination with operators like each or reduction. How do we get around these?
In sketching the initial design for the C# generator, we had decided to follow A+ and simply ‘can’ all the common reductions, scans and inner/outer products as primitives. This remains a good approach (for example, expressions like +\vec can run much faster when special-cased), but most modern APLs make life a little harder by permitting user-defined functions to the left of operators like reduction and scan, for example:
plus/¼3 6
... where plus has the obvious (dyadic) definition. The way interpreters handle this is to create a temporary ‘derived function’ plus-reduction and then proceed to apply this monadically to the right argument. The only solution I can see is to copy this approach exactly, for example:
cc.chk'plus/¼3'
int plusReduce(int[] myrarg) // Monadic Reduce
{
int res = myrarg[0];
int elem = myrarg.Length;
if( elem > 1 ) {
res = plus(myrarg[elem-2],myrarg[elem-1]);
for (int i=elem-3;i>=0;i--)
res = plus(myrarg[i],res);
}
return res;
}
plusReduce(AE.Range(3)); int
The target function is type-checked, and assuming it is dyadic, and supports matching left and right argument types (both must be of rank one less than the argument to the generated temporary function) then we just make a block of C# code to do what reduction does (walk back down the array from right to left) and return as the parsed expression a simple monadic call to it.
Clearly, the same approach will work for Scan, but Each is a lot nastier, as the operator itself can be monadic or dyadic, which affects the acceptable valence of its function argument(s). Here is a simple example of the dyadic case:
cc.chk '3 take ¨(1 2)(3 4 5)'
int[][] takeEach(int mylarg,int[][] myrarg) // Dyadic Each
{
int[][] res = new int[myrarg.Length][];
for (int i=0;i<myrarg.Length;i++)
res[i] = take(mylarg,myrarg[i]);
return res;
}
takeEach(3,new int[][]{new int[]{1,2},new int[]{3,4,5}}); int[][]
... and when we start allowing dynamic functions, you really have to cross your fingers and hope it works:
cc.chk '3 {¸¾} ¨(1 2)(3 4 5)'
int[] noname_d1(int alpha, int[] omega) { return AE.Take(alpha,omega); } // Dfn dyad
int[][] noname_d1Each(int mylarg,int[][] myrarg) // Dyadic Each
{
int[][] res = new int[myrarg.Length][];
for (int i=0;i<myrarg.Length;i++)
res[i] = noname_d1(mylarg,myrarg[i]);
return res;
}
noname_d1Each(3,new int[][]{new int[]{1,2},new int[]{3,4,5}}); int[][]
The principle is the same, but we must make a unique name for the generated Dfn, as C# (unfortunately) does not permit local functions. In real code, the Dfn would very likely be local, and would be prefixed with the name of its caller.
Every so often, you paint yourself well into a corner with a series of perfectly reasonable decisions, all of which seem fine at the time. Remembering that C# is strictly scoped, we have a cast-iron rule that a function cannot see local variables in its caller. Fine, we just need to be sure to pass in everything as arguments, which is good APL practice. The snag is that APL has only one way to do this, which is to construct a nested array from the argument set, pass this in as a single item, and then unpick it inside the called function.
“Oh bother”, said Pooh. We just took the decision to support arrays of arrays which are homogeneous all the way down (C# types like int[][]) and to avoid the overhead of the general nested structure, which C# represents as object[] and which implies a lot of runtime expense in recovering (and casting) the contents.
When in doubt, cheat! Here is the dodge we use to get around this particular fix:
msgmultadic arg;first;second;third
[1] © Sample multi-argument function
[2] ©:Args first,second,third
[3] ©: double first
[4] ©: double second
[5] ©: string third,msg
[6]
[7] (first second third)arg ª first1first
[8]
[9] msg'Third argument was ',third
... which is converted to C# as:
string multadic(double first,double second,string third)
{
string msg;
first = AE.Max(1,first);
msg = "Third argument was " + third;
return msg;
}
... and called appropriately:
cc.chk 'multadic 12.3 23.4 ''Hello'' '
multadic 12.3 23.4 'Hello'¤
MARK VERB NOUN
multadic(12.3,23.4,"Hello"); string
... which is why the word-formation stage is very concerned to flag up strands. If the parser finds itself formatting a strand as part of a function argument, it wraps it in parentheses and separates the elements with commas. When a function with an Args comment is processed, the header is built from this comment, and the (hopefully obvious) strand assignment is eliminated from the generated code. Note the difference from:
cc.chk 'qqq„ 12.3 23.4 ''Hello'' '
qqq „ 12.3 23.4 'Hello'¤
MARK NOUN ASGN NOUN
qqq = new object[] {12.3,23.4,"Hello"}; object[]
So, you can have nilads, monads, dyads and multiads in this strange new world. What you cannot have is a function with a complex left argument as well as a complex right argument. I am sure I could cope with these if I had to, but we have to draw the line somewhere.
From this point, the way the code-conversion works is quite specific to Dyalog version 10. Everything else could run in Dyalog, APL+Win or APLX with very little modification. What we have discovered along the way is that there are quite a few places where the ability to instance namespaces (and pass references as function arguments) is extremely helpful in migrating towards a pure C# coding style. A good example is the StringBuilder class, which comes free with C# and is extremely useful when building up longish strings (like the SVG representation of a towerchart) efficiently. To use it, you simply create a new instance, run its Append method as often as you want, and eventually collect the result by running its ToString method.
Something very similar in APL could look like:
strsbt;fluff;nl;word
[1] © Give the StringBuilder a whirl
[2] ©:Public
[3] ©: string str,word
[4] nlAV[4]
[5] fluffnew #.StringBuilder
[6] fluff.Append'Hello, world?',nl
[7] fluff.Append'Wow!',nl
[8] fluff.Append'Farewell, cruel world.',nl
[9] © Check string substitution
[10] fluff.Replace'world' 'Universe'
[11] © Return completed string
[12] strfluff.ToString
Of course, we could use the Dyalog-10 capability of talking to .Net to get straight to the real thing, but somehow this feels like overkill. Why not just make a StringBuilder namespace with a couple of utterly trivial functions called Append and ToString that do the same job? Then we need a mildly hairy function called new to get a true instance of the namespace. This does some pretty horrid stuff with NS to copy all the functions and variables, and then it runs the constructor, if present. The constructor is defined to be any function with the same name as the class. In C#, we have exactly this behaviour already available via the new keyword, so when I run the code-generator on this function:
public string sbt()
{
string str;
char nl;
string word;
StringBuilder fluff;
nl = '\n';
fluff = new StringBuilder();
fluff.Append("Hello, world?" + nl);
fluff.Append("Wow!" + nl);
fluff.Append("Farewell, cruel world." + nl);
// Check string substitution
fluff.Replace("world","Universe");
// Return completed string
str = fluff.ToString();
return str;
}
... it looks remarkably similar, and behaves in exactly the same way. Of course you can use the same trick to define little structures (the legend definition in RainPro is a good example) as namespaces, and include these in the generated DLL. It seems moderately efficient in the source APL, and certainly moves one some way towards an object-oriented approach. Definitely worth doing.
OK, you can convert a line of code, so you can convert a function, so you can convert a whole namespace full of functions, and call it a class. Alternatively you can just take a few functions at random, write these to your source file and call these a class for testing purposes. Here is the way I am doing this at the moment:
'd:\tools\aplc\bench' Make csbench © The Eratos benchmark Bench - done header ... (#.Eratos)(#.Looper) All Done. 1303 bytes written to d:\tools\aplc\bench.cs
... where csbench is mostly just raw C# stuff with a few ‘special’ lines:
using System;
using AE=Causeway.SharpArrays;
public class MainClass
{
static public void Main()
{
Bench test = new Bench();
Console.WriteLine("Off we go");
Console.WriteLine((test.Eratos(24000)).Length); // 1 sec
Console.WriteLine(test.Looper(160000000)); // 1 sec
}
}
!Bench=Eratos,Looper © Benchmarks, assorted
The last line creates a C# class called Bench out of the functions listed after the ‘=’ sign. Here is the complete C# source which it produces:
using System;
using AE=Causeway.SharpArrays;
public class MainClass
{
static public void Main()
{
Bench test = new Bench();
Console.WriteLine("Off we go");
Console.WriteLine((test.Eratos(24000)).Length); // approx 1 sec
Console.WriteLine(test.Looper(160000000)); // approx 1 sec
}
}
public class Bench
{
// ============ Eratos ============
// Returns primes up to n by the sieve of Eratosthenes
public int[] Eratos(int max)
{
int[] primes; // Result
int check;
int[] candidates;
primes = AE.Ravel(1);
candidates = AE.Drop(1,AE.Range(max));
while (0 < candidates.Length) {
primes = AE.Join(primes,check = candidates[0]);
candidates = AE.Compress(AE.LT(0,AE.Residue(check,candidates)),candidates);
}
return primes;
}
// ============ Looper ============
// Simple looper to test speed when we do very little!
public string Looper(int n)
{
string r; // Result
int ct;
ct = 0;
while (ct < n) {
ct = ct + 1;
}
r = "Done looping";
return r;
}
}
The Looper example copies exactly the ‘daft benchmark’ which the STSC guys used back in 1982 to put an upper bound on the possible speedup of APL code. They got a factor of 350 (compiling to 370 Assembler) which I was hoping to get close to. In fact we do rather better – to get this to run for 1sec in C# I have to loop 160,000,000 times and (yes, I did try it) this takes a little over 17min in Dyalog APL, which gives me a factor of around 950. The Eratos example is very comparable with the Dyalog version, so your ‘typical’ commercial APL will be somewhere in between. I think we are getting about ×12 on the chart viewers (which basically convert PostScript macros into XML strings) and maybe ×3 on the graph-generation (the ch namespace) which is playing more to APL’s strengths. A simple simulation (like the Monty Hall problem in the next Vector) would probably get you a factor of around 100 as it is mostly interpreter overhead with very little exploitation of the array stuff.
Given a lightweight library (currently 504K, and unlikely to grow much beyond that) of APL-like functions, converting an APL namespace to a C# class turns out to be rather easier than expected. So far, our experience has been that the older (and hence simpler) the APL code, the easier the job becomes.
A major exception to this rule would be an application written in pure Dfns, which could be handled very cleanly, and rather more completely than an application written in traditional APL style. For example, D has just one control-structure (the guard) and one method of error-trapping (the error-guard) both of which convert very simply to C#. D has strictly lexical scope, which is much closer to the C# model, and D programmers take a pride in avoiding re-use of local variables. Ambivalent functions are easy to spot syntactically, and should be easy to create overloads for. Of course a D compiler should be written in D, and then (naturally) used to compile itself. Maybe someone with time on their hands would like to give it a try.