I'm having trouble loading data into the ym3812... I think.
It's hard to tell, because the trouble might just as easily be in reading data back out. Either way, the datasheet is frustratingly unclear about a few things.
I've wired up a total of 17 inputs and outputs to the chip, which is the maximum number of IO pins readily available to me on the Pi. I NEED 16 of these to do the whole job, though theoretically I don't need all 16 of them at once. All of this has changed the physical setup, and I've also changed the code that I'm using, so I might currently have a basic mistake in either of those to contend with. I'm going to strip down the software a bit at a time and then build it back up to try to narrow down the issues.
According to the datasheet, writing a byte into a register goes something like this.
pull down A0
pull down CS
pull down WR
Set the register address in D0 - D7
pull up WR
pull up CS
pull up A0
pull down CS
pull down WR
set the data in D0 - D7
pull up WR
pull up CS
That's all there is to it. The actual order of the CS and WR changes shouldn't matter: the datasheet says that the value on D0 - D7 is taken when either CS or WR goes high.
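For reference, here is the sequence above as a minimal Python sketch. It is not my actual driver code: write_register, set_pin and set_bus are names made up for illustration, and the two callables stand in for whatever GPIO library drives the real pins. No delays are included yet. Passing in recording functions makes it easy to check the order of operations without the chip attached.

```python
from typing import Callable, List, Tuple


def write_register(set_pin: Callable[[str, int], None],
                   set_bus: Callable[[int], None],
                   address: int, value: int) -> None:
    """Replay the address-then-data write sequence (hypothetical helper)."""
    # Address phase: A0 low selects the address latch.
    set_pin("A0", 0)
    set_pin("CS", 0)
    set_pin("WR", 0)
    set_bus(address & 0xFF)   # register address on D0 - D7
    set_pin("WR", 1)          # bus value is taken on this rising edge
    set_pin("CS", 1)
    # Data phase: A0 high selects the data register.
    set_pin("A0", 1)
    set_pin("CS", 0)
    set_pin("WR", 0)
    set_bus(value & 0xFF)     # data byte on D0 - D7
    set_pin("WR", 1)
    set_pin("CS", 1)


# Dry run: record every step instead of touching hardware.
log: List[Tuple[str, int]] = []
write_register(lambda name, level: log.append((name, level)),
               lambda byte: log.append(("D0-D7", byte)),
               0x20, 0x01)
for step in log:
    print(step)
```

Swapping the two lambdas for real pin-setting functions should be the only change needed to drive the chip, assuming the GPIO layer is working.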
What's not in the datasheet is any information about the time required between writes. There's plenty of information about the setup and hold times during the process of writing either an address or a byte, but nothing to suggest that there's a minimum delay between writes. I thought I was being generous leaving 32 cycles of ΦM between the address write and the data write, and then another 32 cycles after the data write.
But Vladimir Arnost's document on programming the OPL-3, which seems to have been largely based on observations of what the Adlib actually did, suggests that a delay of 3.3 μs after the address write and 23 μs after the data write is necessary for the OPL-2 (ym3812). At the nominal frequency of 3.58 MHz this is equivalent to 12 cycles of ΦM and then a further 82 or so cycles. He goes on to say that the OPL-3 requires virtually no delay, but that is at odds with the OPL-3 datasheet itself, which mandates a minimum of 32 cycles for each delay. The nominal frequency for the OPL-3 is 4 times higher than that of the OPL-2, but these would still seem to be in the same ballpark, at around 2.2 μs for the OPL-3.
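The conversions are easy to check. This is just a quick sketch: the function names are mine, and the OPL-3 clock is taken as exactly four times 3.58 MHz, which is an approximation based on the figures above.

```python
OPL2_MHZ = 3.58            # nominal OPL-2 (ym3812) clock
OPL3_MHZ = 4 * OPL2_MHZ    # assumed: four times the OPL-2 clock


def us_to_cycles(microseconds: float, clock_mhz: float) -> float:
    """Clock cycles that elapse in the given time (MHz = cycles per μs)."""
    return microseconds * clock_mhz


def cycles_to_us(cycles: float, clock_mhz: float) -> float:
    """Time in μs taken by the given number of clock cycles."""
    return cycles / clock_mhz


print(us_to_cycles(3.3, OPL2_MHZ))   # delay after the address write
print(us_to_cycles(23, OPL2_MHZ))    # delay after the data write
print(cycles_to_us(32, OPL2_MHZ))    # my original 32-cycle gap on the OPL-2
print(cycles_to_us(32, OPL3_MHZ))    # the OPL-3 datasheet minimum
```

The first figure rounds up to 12 cycles and the second comes out near 82. It also shows that my original 32 cycles is only about 9 μs, comfortably more than the 3.3 μs address delay but well short of the 23 μs suggested after a data write.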
The figures quoted by Vladimir appear time and again in open source drivers and emulators, and are consistent with the document written 2 years earlier by Jeffrey S. Lee, where he says the information comes from the Adlib manual.
I'm surprised not to find any of this in the chip datasheet itself, but it seems to work for everyone else, so I'll have to extend my delays.
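One way to get delays this short from Python is a busy-wait on a monotonic clock, since an ordinary sleep is unlikely to be accurate at a few microseconds. This is a sketch under that assumption rather than tested driver code: wait_us is a made-up helper, and on a Pi the scheduler can still stretch the wait, which is harmless here because these are minimum delays.

```python
import math
import time

ADDRESS_DELAY_US = 3.3   # after the address write
DATA_DELAY_US = 23       # after the data write


def wait_us(microseconds: float) -> int:
    """Spin until at least the given time has passed; return elapsed ns."""
    target_ns = math.ceil(microseconds * 1000)
    start = time.perf_counter_ns()
    while True:
        elapsed = time.perf_counter_ns() - start
        if elapsed >= target_ns:
            return elapsed


# The actual wait is never shorter than requested, but may be longer.
print(wait_us(ADDRESS_DELAY_US), "ns after address write")
print(wait_us(DATA_DELAY_US), "ns after data write")
```

The returned value makes it possible to log how long each wait really took, which should help tell a timing problem apart from a wiring one.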