Pachuco ported to ARM

I had a few days of vacation to use up recently, and I spent some of the time working on pachuco. The main achievement was to port it to ARM. So now the compiler supports x86, x86-64, and ARM. The code is on GitHub.

My main motivation for this project was to learn ARM machine code. The only general-purpose ISAs with a healthy future seem to be x86, x86-64 and ARM, but I hadn't done any low-level development on ARM until now. The port also proves that pachuco isn't tied to x86/x86-64. It didn't require any significant changes to the core of the compiler, though a lot of code got moved around to separate the target-specific parts from the target-independent parts.

The ARM machine I used for development is an NSLU2. This has a 266MHz XScale-IXP42x chip implementing the ARMv5 architecture, and 32MB of memory. It supports the THUMB instruction set, but I just used the main ARM instruction set.

A couple of things that arose in the process of developing the port strike me as worthy of note:

The first relates to the bootstrapping process. Although pachuco can compile itself, I still tend to develop under sbcl, because it makes identifying the causes of bugs much easier. But sbcl hasn't been ported to ARM, so I couldn't follow exactly the same process I followed on x86. The traditional way to port a self-compiler to a new platform is to cross-compile: run the compiler on a supported machine, generating code for ARM, then copy the results across to the ARM machine to run them or, more realistically, to find out how they fail to run. But following this process literally would introduce cumbersome steps into the edit-test cycle.

What I did instead was to replace sbcl with a wrapper script that runs sbcl on a remote x86 system via ssh. The script automatically copies the necessary files back and forth. This is still cross-compiling, but that fact is hidden from everything but the wrapper script. This required almost no changes to the main Makefile and build scripts, and allowed me to maintain a simple and rapid edit-test cycle.
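To make the idea concrete, here is a rough sketch in Python of what such a wrapper has to do. It is not the actual script from the pachuco repository: the function names, the remote scratch directory, and the assumption that the remote command writes its result to standard output are all invented for illustration. The planning step is kept separate from execution so the command sequence can be inspected without a remote machine.

```python
import posixpath
import shlex
import subprocess


def plan_remote_run(host, remote_cmd, inputs, output,
                    remote_dir="/tmp/remote-bootstrap"):
    """Return the ssh/scp command lines for one remote compiler run.

    All names here are hypothetical. The sketch assumes the remote
    command takes the input file names as arguments and writes its
    result to standard output.
    """
    names = [posixpath.basename(p) for p in inputs]
    out_name = posixpath.basename(output)
    # Shell line executed on the remote host, with every word quoted.
    remote_line = "cd {} && {} {} > {}".format(
        shlex.quote(remote_dir),
        shlex.quote(remote_cmd),
        " ".join(shlex.quote(n) for n in names),
        shlex.quote(out_name),
    )
    return [
        # 1. Make sure the scratch directory exists on the remote host.
        ["ssh", host, "mkdir -p " + shlex.quote(remote_dir)],
        # 2. Copy the source files across.
        ["scp", *inputs, f"{host}:{remote_dir}/"],
        # 3. Run the compiler remotely.
        ["ssh", host, remote_line],
        # 4. Copy the result back to where the local build expects it.
        ["scp", f"{host}:{remote_dir}/{out_name}", output],
    ]


def run_remote(host, remote_cmd, inputs, output):
    """Execute the planned commands in order, stopping on the first failure."""
    for argv in plan_remote_run(host, remote_cmd, inputs, output):
        subprocess.run(argv, check=True)
```

Because the local build only ever sees a command that takes source files and produces an output file, it does not need to know that the work happened on another machine.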

The second interesting obstacle became evident as I got close to completing the bootstrap process. It turned out that the bootstrap process would take almost an hour, rather than the one or two minutes I was expecting. The cause was the assembler. The pachuco compiler produces assembly code, and uses the system assembler (specifically gas) to turn that into an executable. The assembly file produced when pachuco compiles itself is about 130k lines, and with 32MB of memory, gas swaps a lot while processing that file. I can't see a good reason for gas to use so much memory (more than the pachuco compiler uses to hold the program), except that it is most often used in conjunction with gcc, and C source files tend to be limited in size.

The solution was to split the output of the pachuco compiler into many smaller 10k-line files. gas can assemble these without swapping, and the linker connects the program back together to make the executable. Achieving this involved shuffling the order of the generated assembly code, and using global rather than local assembly labels in the appropriate places.
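The splitting step can be sketched roughly as follows. This is an illustration in Python, not the code pachuco uses (the real change lives in the compiler's code generator). It assumes labels are plain identifiers at the start of a line, that no label shares a name with an instruction mnemonic, and that the whole file sits in a single section, so that cutting at an arbitrary line is safe. Any label referenced from a different chunk than the one defining it gets a `.globl` directive so the linker can resolve it.

```python
import re

# Assumed label syntax: an identifier at the start of a line, then a colon.
LABEL_DEF = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*):")
SYMBOL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def split_assembly(lines, max_lines):
    """Split assembly lines into chunks and export cross-chunk labels.

    Each chunk holds at most max_lines original lines; the added
    .globl directives come on top of that.
    """
    chunks = [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]

    # Record which chunk defines each label.
    defined_in = {}
    for idx, chunk in enumerate(chunks):
        for line in chunk:
            match = LABEL_DEF.match(line)
            if match:
                defined_in[match.group(1)] = idx

    # A label must be global if any other chunk mentions it.
    exported = [set() for _ in chunks]
    for idx, chunk in enumerate(chunks):
        for line in chunk:
            for name in SYMBOL.findall(line):
                owner = defined_in.get(name)
                if owner is not None and owner != idx:
                    exported[owner].add(name)

    # Prepend the .globl directives to the chunk that defines each label.
    return [
        [f"\t.globl {name}" for name in sorted(exported[idx])] + chunk
        for idx, chunk in enumerate(chunks)
    ]
```

With a 130k-line input and `max_lines=10000`, this yields 13 chunks. Labels used only within their own chunk stay local, which keeps the symbol tables small.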

Pachuco on ARM now bootstraps for me in a couple of minutes (compared to 20 seconds on my Core2 laptop). It's necessary to set several environment variables and makefile variables to get there, but most of those should go away as I refine the port.

dwragg@bb5a:/tmp/pachuco$ make clean ; HEAP_SIZE=8 BOOTSTRAP_HOST=192.168.1.65 BOOTSTRAP_COMPILER_REMOTE=/home/dwragg/work/pachuco/scripts/sbcl-wrapper time make BOOTSTRAP_COMPILER=scripts/remote-bootstrap CODEGEN=simple COMPILEOPTS="-S -s" 
rm -rf build
mkdir -p build
scripts/compile -C scripts/remote-bootstrap -S -s -o build/stage0-test test/test.pco
build/stage0-test
Tests done
mkdir -p build
scripts/compile -C scripts/remote-bootstrap -S -s -o build/stage0-gc-test test/gc-test.pco
build/stage0-gc-test
GC tests done
mkdir -p build
scripts/compile -C scripts/remote-bootstrap -S -s -o build/stage1 language/util.pco language/expander.pco language/interpreter.pco compiler/walker.pco compiler/mach.pco compiler/mach-32bit.pco compiler/mach-arm.pco compiler/compiler.pco compiler/codegen-simple.pco compiler/codegen-generic.pco compiler/codegen-arm.pco compiler/driver.pco compiler/drivermain.pco
mkdir -p build
scripts/compile -C build/stage1 -S -s -o build/stage2 language/util.pco language/expander.pco language/interpreter.pco compiler/walker.pco compiler/mach.pco compiler/mach-32bit.pco compiler/mach-arm.pco compiler/compiler.pco compiler/codegen-simple.pco compiler/codegen-generic.pco compiler/codegen-arm.pco compiler/driver.pco compiler/drivermain.pco
mkdir -p build
scripts/compile -C build/stage2 -S -s -o build/stage3 language/util.pco language/expander.pco language/interpreter.pco compiler/walker.pco compiler/mach.pco compiler/mach-32bit.pco compiler/mach-arm.pco compiler/compiler.pco compiler/codegen-simple.pco compiler/codegen-generic.pco compiler/codegen-arm.pco compiler/driver.pco compiler/drivermain.pco
cmp -s build/stage2.s build/stage3.s
114.33user 11.71system 2:37.19elapsed 80%CPU (0avgtext+0avgdata 0maxresident)k
109088inputs+42832outputs (526major+137598minor)pagefaults 0swaps

Comment from ManoSesu:-)

Hi:
I'm asking whether sbcl for ARM could be compiled with pachuco. Is there any ARM version of sbcl?
Thanks