Having pinned down the synchronisation of the three clocks following a reset, I'm in a position to count clock pulses of ΦM in order to know when I can read MO and get data out of the chip without also spending time reading ΦSY and ΦSH.
Last night I did exactly that, with the intention of timing the results to see how fast the process is.
I have a loop of code which does the following:
Set a pin: either something useful, such as changing the reset signal to the IC, or something benign, such as setting a pin to its current value.
Raise the clock signal ΦM.
Read a pin: either something useful, such as MO (based on the current count of ΦM), or something not useful, such as reading MO and ignoring the result.
Lower the clock signal ΦM.
So the loop represents a single clock pulse, and the time taken to execute it is somewhat consistent: by no means perfect, but it should be good enough.
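In outline, the loop looks something like this. This is a sketch using wiringPi rather than my actual program; the pin numbers are placeholders and the "useful or benign" decisions are stubbed out:

    #include <wiringPi.h>

    #define PIN_PHI_M  0    /* placeholder wiringPi pin numbers */
    #define PIN_RESET  1
    #define PIN_MO     2

    int main(void)
    {
        long cycles = 50000L * 60L * 72L;   /* ~60 seconds at an ideal 3.57MHz */
        long i;

        wiringPiSetup();
        pinMode(PIN_PHI_M, OUTPUT);
        pinMode(PIN_RESET, OUTPUT);
        pinMode(PIN_MO, INPUT);

        for (i = 0; i < cycles; i++) {
            /* 1: set a pin - sometimes useful (e.g. the reset line), sometimes benign */
            digitalWrite(PIN_RESET, LOW);

            /* 2: raise the clock */
            digitalWrite(PIN_PHI_M, HIGH);

            /* 3: read a pin - MO when the count says it carries data, otherwise ignored */
            int mo = digitalRead(PIN_MO);
            (void)mo;

            /* 4: lower the clock */
            digitalWrite(PIN_PHI_M, LOW);
        }
        return 0;
    }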
I ran this for 50,000 × 60 × 72 iterations. If the clock were running at an ideal 3.57MHz, this would take just about 60 seconds, because the clock should deliver 72 cycles per sample, at almost 50,000 samples per second.
I timed the results using the Linux time command. The elapsed real time was between 98 and 109 seconds. This gives a lower speed estimate of 1.98MHz, which is just shy of the documented minimum 2MHz signal required by the chip. I think that will probably do. What I have to be careful of is not to cram too much additional decision-making into the loop.
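(For the record, the arithmetic behind that lower estimate: 50,000 × 60 × 72 = 216,000,000 clock cycles, and 216,000,000 ÷ 109 seconds ≈ 1.98MHz; the faster 98-second run works out at about 2.2MHz.)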
There are, I'm aware, some flaws in the methodology, some of them quite serious:
The methodology takes no account of the set-up and tear-down time for the program that I wrote (this will artificially depress the results).
The methodology makes no attempt to measure the consistency of the clock, nor its duty cycle.
The methodology ignores the multi-tasking nature of the Linux operating system, and to some extent of the Raspberry Pi itself. This is probably the most serious issue. The clock might be running well above 2MHz for much of the time, but with relatively long periods in which it stops completely. If the registers of the chip are prone to decay without a sustained clock, then these long delay periods may significantly compromise any output data, and throw any findings into question.
Mitigations:
I could raise the process priority of the program. This would likely make interruptions less frequent, but would probably do little to shorten them when they still inevitably occurred.
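For what it's worth, the priority change itself would only be a few lines; a sketch below, on the assumption that a real-time policy such as SCHED_FIFO is the right tool for this (it needs root, and I haven't tested how much difference it actually makes):

    #include <sched.h>
    #include <stdio.h>

    /* Ask the kernel to treat this process as real-time (SCHED_FIFO),
       so that when it is runnable it is preferred over ordinary tasks. */
    static int raise_priority(void)
    {
        struct sched_param sp;
        sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
            perror("sched_setscheduler");   /* typically fails without root */
            return -1;
        }
        return 0;
    }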
I could write to the GPIO more directly. The wiringPi library is convenient and relatively portable, but it does contain options to access the GPIO more efficiently; alternatively, I could forego wiringPi altogether and write assembly routines.
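By "more directly" I mean mapping the GPIO registers into the program myself rather than going through library calls for every pin change. Something along these lines; this is only a sketch, it assumes the 0x20000000 peripheral base of the original BCM2835 boards, and it assumes the pins have already been configured as inputs or outputs elsewhere:

    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define BCM2835_PERI_BASE  0x20000000                 /* original boards */
    #define GPIO_BASE          (BCM2835_PERI_BASE + 0x200000)

    static volatile uint32_t *gpio;

    /* Map the GPIO registers so that set/clear/read become single
       register accesses, avoiding per-call library overhead.
       Pin direction (GPFSEL) is assumed to be set up elsewhere. */
    static int gpio_map(void)
    {
        int fd = open("/dev/mem", O_RDWR | O_SYNC);
        if (fd < 0)
            return -1;
        gpio = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, GPIO_BASE);
        close(fd);
        return (gpio == MAP_FAILED) ? -1 : 0;
    }

    /* GPSET0 is word offset 7, GPCLR0 is 10, GPLEV0 is 13 (BCM2835 datasheet) */
    static void gpio_set(int pin)   { gpio[7]  = 1u << pin; }
    static void gpio_clear(int pin) { gpio[10] = 1u << pin; }
    static int  gpio_read(int pin)  { return (gpio[13] >> pin) & 1; }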
I could overclock the processor. The latest OS releases have the potential to get considerably more processing power out of the board, but it's not clear to me yet if I can apply this to my revision 1 board.
I could forego Linux altogether. Courtesy of Cambridge University, there's sufficient information out there for me to write my own operating system. But that would leave me with the options of either writing a new program for every experiment, or writing a considerable amount of operating system, with perhaps even a rudimentary file system, to make code more re-usable. It also doesn't address the issue of hardware interrupts generated by, amongst other things, the GPU.