Using a Raspberry Pi, I've been examining the timing control for the YM3812.
The input clock (ΦM) is a square wave, ideally between 2 and 4MHz. From this are derived output clocks (ΦSY and ΦSH) which synchronise the digital output datastream (MO).
The nominal frequency for ΦM (fM) is 3.57 MHz, and the corresponding output sample rate (fSH) is just below 50 kHz.
So fM = 72 × fSH.
The datasheet for the YM3812 is not particularly clear on the timings for ΦSY, ΦSH and MO, but these outputs are the inputs to the Yamaha YM3014B DAC chip.
The YM3014B datasheet shows a timing diagram which depicts 16 bits of data in each sample. The first 3 are redundant and should be ignored, the next 10 are the mantissa and the remaining 3 are the exponent.
Getting 16 bits of data out during 72 cycles of ΦM would suggest that fM = 4.5 × fSY. This would be a little odd but not completely infeasible.
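As a quick check of the arithmetic so far, here is a minimal Python sketch. The constant names are my own, and 3.57 MHz is just the nominal figure quoted above, so treat the printed values as approximate.

```python
# Nominal input clock quoted above; the real clock may differ slightly.
F_M_HZ = 3.57e6
# Number of ΦM cycles per output sample (fM = 72 × fSH).
CYCLES_PER_SAMPLE = 72

# Output sample rate derived from the input clock.
f_sh_hz = F_M_HZ / CYCLES_PER_SAMPLE

# ΦM cycles per bit if a sample really held 16 bits, as the
# YM3014B datasheet diagram suggests.
cycles_per_bit_16 = CYCLES_PER_SAMPLE / 16

print(f"fSH = {f_sh_hz / 1000:.2f} kHz")
print(f"ΦM cycles per bit for 16 bits: {cycles_per_bit_16}")
```

The non-integer result for 16 bits is what makes the datasheet diagram look suspicious.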
In fact, contrary to the diagram in the YM3014B datasheet, there are 18 bits of data delivered for each sample, the first 5 of which are redundant. I will label these 18 bits as:
X0 X1 X2 X3 X4 D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 S0 S1 S2
So fM = 4 × fSY (72 cycles / 18 bits).
The data on MO changes on a rising edge of ΦM and persists for 4 cycles of ΦM.
ΦSY rises on the rising edge of ΦM 25% of the way through those 4 cycles and falls on the following rising edge of ΦM, giving it a duty cycle of 25%.
ΦSH has a duty cycle of 32 / 72 (the last 8 of the 18 bits), rising as MO changes to the 11th bit (D5) and falling at the end of the sample (after S2).
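The timing described above can be modelled one ΦM cycle at a time. This is only a sketch of my reading of the measurements, not anything taken from a datasheet: the function names are made up, and cycle 0 is assumed to be the ΦM cycle in which MO changes to the first bit of a sample.

```python
CYCLES_PER_BIT = 4        # MO holds each bit for 4 cycles of ΦM
BITS_PER_SAMPLE = 18      # 5 redundant + 10 mantissa + 3 exponent
CYCLES_PER_SAMPLE = CYCLES_PER_BIT * BITS_PER_SAMPLE  # 72

def sy_level(cycle):
    """ΦSY is high only during the second ΦM cycle of each bit."""
    return 1 if cycle % CYCLES_PER_BIT == 1 else 0

def sh_level(cycle):
    """ΦSH is high from the 11th bit (index 10) to the end of the sample."""
    bit_index = (cycle % CYCLES_PER_SAMPLE) // CYCLES_PER_BIT
    return 1 if bit_index >= 10 else 0

# Levels for one full sample period.
sy = [sy_level(c) for c in range(CYCLES_PER_SAMPLE)]
sh = [sh_level(c) for c in range(CYCLES_PER_SAMPLE)]

print("ΦSY high cycles:", sum(sy), "of", len(sy))
print("ΦSH high cycles:", sum(sh), "of", len(sh))
```

Counting the high cycles gives the 25% and 32 / 72 duty cycles, which is a useful sanity check when comparing against a logic capture.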
So the easiest way to read the data is to detect the falling edge of ΦSY and read the value of MO into an 18-bit* serial-to-parallel shift register. When ΦSH falls, the parallel output should be latched for processing.
* Because the first 5 bits are to be ignored, you would probably use a 16-bit shift register and just let the first two bits fall out of the end. What remains is 3 redundant bits, 10 mantissa bits and 3 exponent bits.
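The read-out scheme can be sketched in software as follows. This is a simulation under assumptions rather than real GPIO code: the helper names are hypothetical, the signals are lists of levels sampled once per ΦM cycle, and treating D0 and S0 as the least-significant bits is my guess, since the bit order is not established above.

```python
CYCLES_PER_BIT = 4
CYCLES_PER_SAMPLE = 72

def make_waveforms(frames):
    """Build per-ΦM-cycle levels of ΦSY, ΦSH and MO from 18-bit frames."""
    # One extra cycle so the final falling edge of ΦSH is visible.
    total = len(frames) * CYCLES_PER_SAMPLE + 1
    sy, sh, mo = [], [], []
    for c in range(total):
        frame = frames[(c // CYCLES_PER_SAMPLE) % len(frames)]
        bit = (c % CYCLES_PER_SAMPLE) // CYCLES_PER_BIT
        sy.append(1 if c % CYCLES_PER_BIT == 1 else 0)
        sh.append(1 if bit >= 10 else 0)
        mo.append(frame[bit])
    return sy, sh, mo

def read_words(sy, sh, mo, width=16):
    """Shift MO in on each falling edge of ΦSY; latch on each falling edge of ΦSH."""
    register, latched = [], []
    for c in range(1, len(sy)):
        if sy[c - 1] == 1 and sy[c] == 0:
            # Keep only the newest `width` bits; older bits fall out.
            register = (register + [mo[c]])[-width:]
        if sh[c - 1] == 1 and sh[c] == 0:
            latched.append(list(register))
    return latched

def bits_to_int(bits):
    """Assumption: the first bit received is the least significant."""
    return sum(b << i for i, b in enumerate(bits))

def split_word(word):
    """Drop the 3 remaining redundant bits, then take 10 mantissa and 3 exponent bits."""
    return bits_to_int(word[3:13]), bits_to_int(word[13:16])

# One made-up frame: X0..X4, then D0..D9, then S0..S2.
frame = [0, 0, 0, 0, 0] + [1, 0, 1, 1, 0, 0, 1, 0, 0, 1] + [1, 1, 0]
sy, sh, mo = make_waveforms([frame])
words = read_words(sy, sh, mo)
mantissa, exponent = split_word(words[0])
print(words[0], mantissa, exponent)
```

On real hardware the two edge tests would be driven by GPIO edge detection or a fast capture of the pins instead of pre-built lists, but the shifting and latching logic would be the same.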